Text-Guided 6D Object Pose Rearrangement via Closed-Loop VLM Agents
arXiv, 2026
Without additional training, a VLM agent iteratively refines an object's 6D pose to follow text instructions.
I am an MS/PhD student at Seoul National University, advised by Prof. Hanbyul Joo. My research interests are in computer vision, especially 3D/4D reconstruction and 6D object pose estimation.
I'm joining the Visual Computing Lab as an MS/PhD student.
Our work, PhysGaia, got accepted to CVPR 2026.
CVPR, 2026
PhysGaia is a novel physics-aware dataset designed for Dynamic Novel View Synthesis (DyNVS), encompassing both structured objects and unstructured physical phenomena.