
This work has appeared in:

Andrei Barbu, Siddharth Narayanaswamy, Jeffrey Mark Siskind, 'Learning physically-instantiated game play through visual observation', IEEE International Conference on Robotics and Automation, May 2010.

We develop a system that learns to play board games from visual observation. The goal is to learn to play legally, in contrast to previous work, which learned to play well. Two robotic agents that know the rules play a board game. A third agent, which does not know the rules, observes them and learns the rules after watching a small number of games, typically 3 to 5. It can then use this knowledge to step in and play against another robot or a human.

The system learns to play a number of games, including Tic-tac-toe, Hexapawn, and a class of Hexapawn variants with different rules for moving and capturing pieces.

Board games are a microcosm for social interactions, races, and wars. This work is the first in a line of work that will explore how general concepts such as threats, blocks, and forks apply to different games, and how knowledge can be transferred between games both to help acquire the rules of a game and to play it well. In addition, board games allow us to explore the student/teacher relationship, in which robots and humans interact and teach each other different skills through natural language and observation.

This work employs Progol, an inductive logic programming system, to learn the rules of the game given some common background knowledge that would be available to any child. It learns the rules in stages: first how to set up an initial board, then the legal move generator, and finally, using that knowledge, the outcome predicate. It learns from both positive evidence (the legal moves it sees) and negative evidence (the fact that a game has not ended means no one has won in that position). The negative evidence is important because the outcome predicate must be learned from a tiny number of examples: there is only one outcome per game.
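
The two kinds of evidence described above can be illustrated with a minimal Python sketch. It is not the representation used in this work: the function name `collect_evidence`, the dictionary keys, and the way boards and outcomes are encoded are all assumptions made for illustration. It only shows how a single observed game yields many legal-move examples and many negative outcome examples, but just one positive outcome example.

```python
from typing import Dict, List, Tuple

Board = Tuple[Tuple[str, ...], ...]  # hypothetical encoding: a tuple of rows


def collect_evidence(game: List[Board], movers: List[str], final_outcome: str) -> Dict[str, list]:
    """Turn one observed game into training examples (illustrative only).

    game          -- successive board states; game[0] is the initial board
    movers        -- movers[i] is the player who moved from game[i] to game[i + 1]
    final_outcome -- label observed when the game ended, e.g. "x_wins"
    """
    # Positive evidence for the initial-board predicate.
    initial_pos = [game[0]]
    # Positive evidence for legal moves: every observed transition.
    legal_move_pos = [(movers[i], game[i], game[i + 1]) for i in range(len(game) - 1)]
    # Negative evidence for the outcome: play continued from these positions,
    # so no one had won in any of them.
    outcome_neg = list(game[:-1])
    # Positive evidence for the outcome: only the final position of the game.
    outcome_pos = [(game[-1], final_outcome)]
    return {
        "initial_pos": initial_pos,
        "legal_move_pos": legal_move_pos,
        "outcome_neg": outcome_neg,
        "outcome_pos": outcome_pos,
    }
```

With three to five observed games, the `outcome_pos` list would hold only three to five entries, which is why the positions collected in `outcome_neg` matter.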

Source code for this work.

Tic-tac-toe

Every cache square for every player in the initial state has some piece of
that player.

A player moves by moving some piece of that player from some cache square for
that player to some empty board square.

A player wins when every square in some row has some piece of that player.
A player wins when every square in some column has some piece of that player.
A player wins when every square in some diagonal has some piece of that
player.

A player draws when no player wins and that player has no move.

initial_board([[none,none,none],[none,none,none],[none,none,none]],
	player_x).

legal_move(A,B,C) :- row(D), col(E), owns(A,F), empty(G), at(D,
	E,B,G,H), at(D,E,C,F,I), frame_obj(I,I,H,H,C,B).

outcome(A,B,C) :- opponent(A,D), owns_outcome(D,C), owns_piece(C,
	E), at(F,G,B,E,H), at(I,J,B,E,K), at(L,M,B,E,N), linear_obj(K,
	H,N).
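
For readers who prefer a procedural view, the English rules for Tic-tac-toe above can be sketched in Python. This is a hand-written illustration, not a translation of the learned clauses: the function names are hypothetical, the board encoding merely mirrors the `none` entries of `initial_board`, and the cache squares are left implicit (a move simply places a piece on an empty board square).

```python
EMPTY = "none"


def initial_board():
    """A 3x3 board with every square empty."""
    return [[EMPTY] * 3 for _ in range(3)]


def legal_moves(board, piece):
    """Every board reachable by placing `piece` on some empty square."""
    moves = []
    for r in range(3):
        for c in range(3):
            if board[r][c] == EMPTY:
                nxt = [row[:] for row in board]  # copy so the input is untouched
                nxt[r][c] = piece
                moves.append(nxt)
    return moves


def lines():
    """Coordinates of the three rows, three columns, and two diagonals."""
    rows = [[(r, c) for c in range(3)] for r in range(3)]
    cols = [[(r, c) for r in range(3)] for c in range(3)]
    diags = [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]
    return rows + cols + diags


def wins(board, piece):
    """True when every square in some row, column, or diagonal holds `piece`."""
    return any(all(board[r][c] == piece for r, c in line) for line in lines())


def draws(board, piece):
    """True when no player wins and the player of `piece` has no move."""
    return not wins(board, "x") and not wins(board, "o") and not legal_moves(board, piece)
```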

Hexapawn

Every square in the close row for every player in the initial state has some
piece of that player.

A player moves by moving some piece of that player from some square to some
empty forward-adjacent square for that player of that square.
A player moves by moving some piece of the opponent of that player from some
forward-diagonal square for that player of some square to some cache square
for that opponent then moving the piece of that player from that square to
that forward-diagonal square.

A player wins when some square in the distant row for that player has some
piece of that player.

A player draws when no player wins and that player has no move.

initial_board([[x,x,x],[none,none,none],[o,o,o]],player_x).

legal_move(A,B,C) :- row(D), col(E), owns(A,F), empty(G), forward(A,
	H,D), at(H,E,B,F,I), at(H,E,C,G,J), at(D,E,B,G,K), at(D,
	E,C,F,L), frame_obj(I,K,J,L,B,C).
legal_move(A,B,C) :- row(D), col(E), opponent(A,F), owns(A,G),
	empty(H), forward(F,D,I), owns(F,J), sideways(E,K),
	at(I,K,B,G,L), at(I,K,C,H,M), at(D,E,C,G,N), at(D,E,B,
	J,O), frame_obj(L,O,M,N,B,C).

outcome(A,B,C) :- row(D), opponent(A,E), forward(E,D,F), forward(E,
	F,G), owns_outcome(E,C), owns_piece(C,H), at(G,I,B,H,
	J).
outcome(A,B,C) :- opponent(A,D), has_no_move(A,B), owns_outcome(D,
	C).
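
The Hexapawn moving rules and the distant-row win can likewise be sketched in Python. As before, this is an illustration under stated assumptions rather than the system's own code: the names are hypothetical, `x` is assumed to start on row 0 and advance toward row 2 (matching the layout in `initial_board`), captured pieces are simply removed instead of being moved to a cache square, and the outcome for a player with no move is not modelled.

```python
EMPTY = "none"
FORWARD = {"x": 1, "o": -1}       # assumption: x advances down the board, o up
OPPONENT = {"x": "o", "o": "x"}


def initial_board():
    """Each player's pieces fill that player's close row."""
    return [["x"] * 3, [EMPTY] * 3, ["o"] * 3]


def legal_moves(board, player):
    """Every board reachable in one move by `player`."""
    step, foe = FORWARD[player], OPPONENT[player]
    result = []
    for r in range(3):
        for c in range(3):
            if board[r][c] != player:
                continue
            nr = r + step
            if not 0 <= nr < 3:
                continue
            # Straight ahead requires an empty square; a forward-diagonal
            # move requires an opponent piece, which is captured.
            targets = [(c, EMPTY), (c - 1, foe), (c + 1, foe)]
            for nc, required in targets:
                if 0 <= nc < 3 and board[nr][nc] == required:
                    nxt = [row[:] for row in board]
                    nxt[r][c] = EMPTY
                    nxt[nr][nc] = player
                    result.append(nxt)
    return result


def wins(board, player):
    """True when some square in the player's distant row holds that player's piece."""
    distant_row = 2 if FORWARD[player] == 1 else 0
    return player in board[distant_row]
```

From the initial board, `legal_moves(initial_board(), "x")` yields the three straight-ahead moves, since no opponent piece is yet on a forward-diagonal square.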