Yuhang Liu | AI4GC Lab

About Me

Hi! I am Yuhang Liu, a Master's graduate of the AI4GC Lab at Zhejiang University, where I was advised by Prof. Shengyu Zhang.

My research interests include multimodal GUI agents, general-purpose agent construction, and agentic RL. I am especially interested in building agents that can reason, learn from interaction, operate software, and complete complex tasks across environments.

I am an incoming Ph.D. student in the Department of Computing at The Hong Kong Polytechnic University.

During AI4GC

During my time at AI4GC Lab, I worked on multimodal GUI agents, focusing on visual interface understanding, action grounding, and reasoning for real software environments.

My early work on InfiGUIAgent studied how to build a generalist GUI agent from raw screenshots. We used two-stage supervised fine-tuning to combine GUI understanding and grounding with hierarchical reasoning and expectation-reflection reasoning. The paper was accepted to EACL 2026 as an oral presentation.

I then explored InfiGUI-R1, which reframes GUI automation as a transition from reactive acting to deliberative reasoning. The work uses spatial reasoning distillation and reinforcement learning signals for sub-goal planning and error recovery, making planning and reflection central parts of GUI-agent training.

My later work on InfiGUI-G1 focused on GUI grounding, especially the semantic-alignment bottleneck that remains after spatial alignment improves. We designed Adaptive Exploration Policy Optimization to encourage broader and more purposeful search over interface elements. The paper was accepted to AAAI 2026 as an oral presentation.

Selected Papers

AAAI 2026

Oral

InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization

Yuhang Liu, Zeyu Liu, Shuanghe Zhu, Pengxiang Li, Congkai Xie, Jiasheng Wang, Xueyu Hu, Xiaotian Han, Jianbo Yuan, Xinyao Wang, Shengyu Zhang^✉, Hongxia Yang, Fei Wu


EACL 2026

Oral

InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection

Yuhang Liu, Pengxiang Li, Zishu Wei, Congkai Xie, Xueyu Hu, Xinchen Xu, Shengyu Zhang^✉, Xiaotian Han, Hongxia Yang, Fei Wu


arXiv 2025

InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners

Yuhang Liu, Pengxiang Li, Congkai Xie, Xueyu Hu, Xiaotian Han, Shengyu Zhang^✉, Hongxia Yang, Fei Wu


Now

Incoming Ph.D. student in the Department of Computing at The Hong Kong Polytechnic University.