Teaser Image

Imitation learning is a popular paradigm for teaching robots new tasks, but collecting robot demonstrations through teleoperation or kinesthetic teaching is tedious and time-consuming. In contrast, directly demonstrating a task with our human embodiment is much easier, and such data is abundantly available, yet transferring it to the robot can be non-trivial. In this work, we propose Real2Gen to train a manipulation policy from a single human demonstration. Real2Gen extracts the required information from the demonstration and transfers it to a simulation environment, where a programmable expert agent can demonstrate the task arbitrarily many times, generating an unlimited amount of data to train a flow matching policy. We evaluate Real2Gen on human demonstrations from three different real-world tasks and compare it to a recent baseline. Real2Gen achieves an average improvement of 26.6% in success rate and better generalization of the trained policy, owing to the abundance and diversity of training data. We further deploy our purely simulation-trained policy zero-shot in the real world.
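The flow matching objective underlying the policy can be illustrated with a minimal sketch: sample a data point and a noise point, interpolate along the straight path between them, and regress a velocity model onto the path's constant velocity. The sketch below is a toy NumPy version with a linear model standing in for the network and random vectors standing in for expert actions collected in simulation; the paper's actual architecture, inputs, and training details are not reproduced here.

```python
import numpy as np

def train_flow_matching_sketch(actions: np.ndarray, steps: int = 2000,
                               lr: float = 0.05, seed: int = 0):
    """Toy conditional flow matching on straight paths.

    A linear model v(x_t, t) = W @ [x_t, t] is regressed onto the target
    velocity x1 - x0, where x1 is a data sample and x0 a noise sample.
    """
    rng = np.random.default_rng(seed)
    dim = actions.shape[1]
    W = np.zeros((dim, dim + 1))  # acts on concatenated features [x_t, t]
    losses = []
    for _ in range(steps):
        idx = rng.integers(0, len(actions), size=64)
        x1 = actions[idx]                    # "expert action" sample (toy data)
        x0 = rng.normal(size=x1.shape)       # noise sample
        t = rng.uniform(size=(len(idx), 1))  # random time along the path
        xt = (1 - t) * x0 + t * x1           # point on the straight path
        target = x1 - x0                     # constant velocity of that path
        feats = np.concatenate([xt, t], axis=1)
        pred = feats @ W.T
        losses.append(float(np.mean((pred - target) ** 2)))
        # Gradient step on the mean squared velocity-regression error.
        W -= lr * (pred - target).T @ feats / len(idx)
    return W, losses

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    demo_actions = rng.normal(loc=2.0, scale=0.3, size=(1024, 4))
    W, losses = train_flow_matching_sketch(demo_actions)
    print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

At inference time, such a velocity model would be integrated from a noise sample toward an action, e.g. with a few Euler steps; the real policy conditions the velocity field on observations, which this unconditional toy omits.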

Video

Code

This work is released under the GPLv3 license. For any commercial purpose, please contact the authors. A software implementation of this project can be found on GitHub.

Publications

If you find our work useful, please consider citing our paper(s):

Nick Heppert*, Minh Quang Nguyen*, Abhinav Valada
Scaling Single Human Demonstrations for Imitation Learning using Generative Foundational Models
ICRA, 2026.

(PDF) (BibTeX)

Nick Heppert*, Minh Quang Nguyen*, Abhinav Valada
Real2Gen: Imitation Learning from a Single Human Demonstration with Generative Foundational Models
ICRA Workshop on Foundation Models and Neuro-Symbolic AI for Robotics, 2025.

(PDF) (BibTeX)

Authors

Nick Heppert*

University of Freiburg, Zuse School ELIZA

Minh Quang Nguyen*

University of Freiburg

Abhinav Valada

University of Freiburg

*equal contribution, alphabetical order

The code implementation of this project was done by Minh during his Master's thesis at the University of Freiburg. Nick provided the initial idea, supervised the thesis, and wrote the paper.

Acknowledgment

This work was partially funded by the Carl Zeiss Foundation with the ReScaLe project.

Nick Heppert is supported by the Konrad Zuse School of Excellence in Learning and Intelligent Systems (ELIZA) through the DAAD programme Konrad Zuse Schools of Excellence in Artificial Intelligence, sponsored by the Federal Ministry of Education and Research.