A Modular Open-Vocabulary Planning System for Robotic Manipulation
TiPToP solves complex real-world manipulation tasks directly from raw pixels and natural-language commands by combining Task-and-Motion Planning with perception and language models through inference-time search — with zero robot training data.
See the documentation for installation, setup, and usage instructions.
The documentation is hosted at tiptop-robot.readthedocs.io and automatically rebuilds on every push to main. To build and serve it locally for previewing changes:
    pixi run docs-install  # Install doc dependencies
    pixi run docs-build    # Build HTML docs
    pixi run docs-serve    # Serve with live reload

See the Contributing Guide for development setup and guidelines.
    @article{shen2026tiptop,
      title={{TiPToP}: A Modular Open-Vocabulary Planning System for Robotic Manipulation},
      author={Shen, William and Kumar, Nishanth and Chintalapudi, Sahit and Wang, Jie and Watson, Christopher and Hu, Edward S. and Cao, Jing and Jayaraman, Dinesh and Kaelbling, Leslie Pack and Lozano-P\'{e}rez, Tom\'{a}s},
      journal={arXiv preprint arXiv:2603.09971},
      year={2026}
    }