About me

I am a third-year Ph.D. student in the Robotics Perception and Learning Lab (RIPL) at Georgia Tech, advised by Prof. Zsolt Kira. My research aims to improve the generalizability of foundation models, especially Vision–Language Models (VLMs). I am particularly interested in robust fine-tuning, reasoning, VLM-as-Judge, and Vision–Language–Action Models (VLAs).

During Summer 2026, I am a Research Scientist Intern at Adobe Research, working with Dr. Kushal Kafle on VLM-as-Judge.

During Spring and Summer 2023, I was a Research Scientist Intern at Microsoft Research Asia, working on AI4Science.

I graduated with a BS in Statistics from Renmin University of China, where I was fortunate to work with Prof. Hongteng Xu in the Structured Data Science Lab (SDSL).

You can find more details in my CV here.

News

  • [2026.05] I will be joining Adobe Research as a Research Scientist Intern, working with Dr. Kushal Kafle on VLM-as-Judge! See you in the Bay Area!
  • [2026.05] SafeManip is online! It is the first property-driven benchmark for temporal safety evaluation in robotic manipulation.
  • [2026.03] Selected as a Qualcomm Innovation Fellowship Finalist with my amazing teammate Mellon Zhang!
  • [2026.02] MAPS was accepted to CVPR 2026! See you in Denver!
  • [2025.06] Mimicking or Reasoning was featured on YouTube by Discover AI!
  • [2025.04] I received the CVPR 2025 Travel Support Award, thanks! See you in Nashville!
  • [2025.04] I passed my Qualifying Exam!
  • [2025.02] FRAMES-VQA was accepted to CVPR 2025!
  • [2025.01] DiGraP was accepted to ICLR 2025!
  • [2023.08] I joined the Georgia Institute of Technology for the Machine Learning Ph.D. program!

Selected Publications

SafeManip: A Property-Driven Benchmark for Temporal Safety Evaluation in Robotic Manipulation

arXiv 2026

MAPS: Preserving Vision-Language Representations via Module-Wise Proximity Scheduling for Better Vision-Language-Action Generalization

CVPR 2026

FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering

CVPR 2025

Directional Gradient Projection for Robust Fine-tuning of Foundation Models

ICLR 2025