Transfer Learning from Play and Language - Nailing the Baseline

State space model completing a sequence of goals, which are visualised as the transparent objects. Plans sampled by the planner are shown projected into the planner's latent space. In particular, look for where plans are sampled from when interacting with the block and cupboard, and when trying to open the drawer. This video is slightly cherry-picked: the average success rate on this sequence of tasks is ~11/13. The most difficult steps are the block reorientations and stand-up.
Read More

Linearised State Representations for Reinforcement Learning

Recently I found a way to learn state representations such that linear interpolation between the latent representations of two states yields near-optimal trajectories between those states in the original state space. The representations are learnt by optimising the latents of expert trajectories to lie along straight lines in a higher-dimensional latent space. The hard problem of finding the best path between states is reduced to the simple problem of drawing a straight line between their latent representations, with the complexity absorbed by the mapping to and from the latent space. This even worked for image-based object manipulation tasks, and might be an interesting way to approach sub-goal generation for temporally extended manipulation tasks, or to provide a dense reward where latent Euclidean distance is a ‘true’ measure of progress towards the goal state.
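To make the training objective concrete, here is a minimal sketch of one way this could be set up, assuming a simple autoencoder in PyTorch: the latents of each expert trajectory are pulled towards the straight line between the latents of its first and last states, while a reconstruction term keeps the mapping back to the original state space. The module names, network sizes, and loss weighting are illustrative assumptions, not the actual implementation.

```python
import torch
import torch.nn as nn


class LinearisedStateEncoder(nn.Module):
    """Sketch: encode states so expert trajectories lie on straight lines in latent space."""

    def __init__(self, state_dim: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, state_dim),
        )

    def loss(self, trajectory: torch.Tensor) -> torch.Tensor:
        """trajectory: (T, state_dim) expert trajectory."""
        T = trajectory.shape[0]
        z = self.encoder(trajectory)  # (T, latent_dim)
        # Target latents lie on the straight line between the first and last latents.
        alpha = torch.linspace(0.0, 1.0, T, device=z.device).unsqueeze(1)
        z_line = (1 - alpha) * z[0] + alpha * z[-1]
        linearity_loss = ((z - z_line) ** 2).mean()
        # Reconstruction keeps the latent space informative about the original states.
        recon_loss = ((self.decoder(z) - trajectory) ** 2).mean()
        return recon_loss + linearity_loss

    @torch.no_grad()
    def plan(self, start: torch.Tensor, goal: torch.Tensor, steps: int) -> torch.Tensor:
        """Interpolate in latent space and decode to get intermediate sub-goal states."""
        z0, z1 = self.encoder(start), self.encoder(goal)
        alpha = torch.linspace(0.0, 1.0, steps).unsqueeze(1)
        return self.decoder((1 - alpha) * z0 + alpha * z1)
```

At planning time, `plan` would give a sequence of decoded intermediate states that can serve as sub-goals or as targets for a dense distance-to-goal reward.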

Read More