Wenkai Wang | AI4GC Lab

About Me

Hi! I'm Wenkai Wang, a master's student in the AI4GC Lab at Zhejiang University, advised by Shengyu Zhang.

My research focuses on building and evaluating computer-use agents that can operate real software through command-line and graphical interfaces. I am especially interested in agentic reinforcement learning for grounding and planning, as well as benchmarks that capture long-horizon professional workflows and human-in-the-loop collaboration.

Open to collaborations on computer-use agents, agentic RL, and LLM evaluation.

Research Interests

CLI/GUI Agent — Grounding, planning, and benchmark design for agents that interact with desktop and application UIs in realistic workflows.
Agentic Reinforcement Learning — RL pipelines for post-training LLMs/MLLMs, including reward design, data synthesis, and stable self-improvement.
LLM Evaluation — Building rigorous benchmarks and evaluation protocols for long-horizon agent behavior and multimodal reasoning.

Education

2025 – present · M.S. in Artificial Intelligence, Zhejiang University
2021 – 2025 · B.S. in Computer Science and Technology, Huazhong University of Science and Technology

Selected Papers

arXiv 2026

DeskCraft: Benchmarking Desktop Agents on Professional Workflows and Human-in-the-Loop Collaboration

Wenkai Wang, Tao Xiong, Jingchen Ni, Yunpeng Bao, Xiyun Li, Tianqi Liu, Hongcan Guo, Zilong Huang, Shengyu Zhang^✉

Paper

AAAI 2026

Oral

A Rolling Stone Gathers No Moss: Adaptive Policy Optimization for Stable Self-Evaluation in Large Multimodal Models

Wenkai Wang, Hongcan Guo, Zheqi Lv, Shengyu Zhang^✉

Paper

ACL 2026

Measure Twice, Click Once: Co-evolving Proposer and Visual Critic via Reinforcement Learning for GUI Grounding

Wenkai Wang, Xiyun Li, Hongcan Guo, Wenhao Yu, Tianqing Fang, Haitao Mi, Dong Yu, Shengyu Zhang^✉

Paper Blog

Blog

On-site

Jun 2026

Measure Twice, Click Once: Co-evolving Proposer and Visual Critic for GUI Grounding

COPC reframes GUI grounding from single-shot coordinate regression to a Propose-then-Critic paradigm with maturity-aware co-evolutionary RL, unlocking latent Pass@k potential through visual discrimination rather than geometric clustering. Accepted to ACL 2026.