Hi, I’m Jiwen, a researcher and frontier explorer in generative AI.

I’m building Video World Models: AI systems that generate consistent, controllable video environments with real-time interactivity, memory, and reasoning. As Genie 3 and Project Genie have shown, such models are the foundation for next-generation simulation, robotics, and interactive media, yet there is still a long way to go.

🤗 I’m fortunate to have Yiran Qin as a close friend and collaborator. We share a deep interest in the intersection of world models and robotics.

🚀 In 2026, I’m working toward releasing this technology as several open-source projects.

🤝 Open to collaborations from academia, industry, or investment. If you’re interested in video world models, let’s talk!

VideoWorldModel CVPR 2026 Workshop

We're organizing a workshop on Video World Models at CVPR 2026. Submissions are welcome across two tracks: Proceedings and Non-Proceedings.

Learn More & Submit →

Contact me via 📬 Email / WeChat

Research Vision

My long-term research goal is to build the ideal Video World Model. I’m currently focused on three core challenges:

Real-Time Streaming Generation. Generating high-quality streaming video with interactive control and memory, which requires substantial engineering and infrastructure work beyond algorithms alone.
🧠 Memory Systems. Building complex, comprehensive memory architectures that integrate diverse tools (context, 3D representations, learnable parameters, etc.) and support a wide range of functions including retrieval, querying, compression, and updates.
👁️ Visual Intelligence. Can large-scale video training alone give rise to advanced intelligence? Not short-term dynamics prediction, but long-horizon analysis, reasoning, and planning. I believe video data holds great potential for emergent capabilities.
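To make the memory-retrieval idea above concrete, here is a toy sketch (my own illustration, not an actual system; all names such as `FrameMemory` are hypothetical): past frames are stored alongside camera poses, and generation queries the store for the frames whose poses are closest to the current viewpoint.

```python
from dataclasses import dataclass, field

@dataclass
class FrameMemory:
    """Toy context-as-memory store: keeps (pose, frame) pairs and
    retrieves the frames whose poses best match a queried pose."""
    frames: list = field(default_factory=list)

    def write(self, pose, frame):
        # Append a new observation to the memory bank.
        self.frames.append((pose, frame))

    def retrieve(self, query_pose, k=2):
        # Rank stored frames by pose similarity (here: L1 distance).
        scored = sorted(
            self.frames,
            key=lambda pf: sum(abs(a - b) for a, b in zip(pf[0], query_pose)),
        )
        return [frame for _, frame in scored[:k]]

mem = FrameMemory()
mem.write((0.0, 0.0), "frame_a")
mem.write((5.0, 5.0), "frame_b")
mem.write((0.1, 0.2), "frame_c")
print(mem.retrieve((0.0, 0.0), k=2))  # → ['frame_a', 'frame_c']
```

A real system would replace the L1 pose distance with learned relevance and condition the video model on the retrieved context, but the interface (write, retrieve, and eventually compress and update) is the same.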

Selected Publications

(*: indicates equal contribution; #: indicates corresponding author)

Research Topics: World Model / Interactive Video Generation / Embodied AI

SIGGRAPH Asia 2025

Context as Memory: Scene-Consistent Interactive Long Video Generation with Memory Retrieval

Jiwen Yu, Jianhong Bai, Yiran Qin, Quande Liu#, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu#

SIGGRAPH Asia 2025

Paper | Project Page | Dataset

ICCV 2025

GameFactory: Creating New Games with Generative Interactive Videos

Jiwen Yu*, Yiran Qin*, Xintao Wang#, Pengfei Wan, Di Zhang, Xihui Liu#

ICCV 2025 Highlight

Paper | Project Page | GitHub | Dataset

Preprint

Survey of Interactive Generative Video

Jiwen Yu*, Yiran Qin*, Haoxuan Che*, Quande Liu#, Xintao Wang, Pengfei Wan, Di Zhang, Kun Gai, Hao Chen, Xihui Liu#

Position: Interactive Generative Video as Next-Generation Game Engine

Jiwen Yu*, Yiran Qin*, Haoxuan Che, Quande Liu, Xintao Wang#, Pengfei Wan, Di Zhang, Xihui Liu#

Survey Paper | Position Paper

ICML 2025

WorldSimBench: Towards Video Generation Models as World Simulators

Yiran Qin*, Zhelun Shi*, Jiwen Yu, Xijun Wang, Enshen Zhou, Lijun Li, Zhenfei Yin, Xihui Liu, Lu Sheng, Jing Shao, Lei Bai, Wanli Ouyang, Ruimao Zhang

ICML 2025

Paper | Project Page

CVPR 2025

SkillMimic: Learning Reusable Basketball Skills from Demonstrations

Yinhuai Wang*, Qihan Zhao*, Runyi Yu*, Ailing Zeng, Jing Lin, Zhengyi Luo, Hok Wai Tsui, Jiwen Yu, Xiu Li, Qifeng Chen, Jian Zhang, Lei Zhang, Ping Tan

CVPR 2025 Highlight

Paper | Project Page | GitHub

Past Research Topic: Training-free Applications of Diffusion Model

My research journey began during my Master’s studies (2021-2023), coinciding with the paradigm shift that diffusion models brought to generative AI. That shift inspired my early work on zero-shot applications of diffusion models across multiple domains, including image restoration, generation, editing, steganography, and video synthesis.

Project

AnimateZero: Video Diffusion Models are Zero-Shot Image Animators

Jiwen Yu, Xiaodong Cun#, Chenyang Qi, Yong Zhang, Xintao Wang, Ying Shan, Jian Zhang#

Project 2023

Paper | Project Page

NeurIPS 2023

CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganography

Jiwen Yu, Xuanyu Zhang, Youmin Xu, Jian Zhang#

NeurIPS 2023

Paper | GitHub

ICCV 2023

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

Jiwen Yu, Yinhuai Wang, Chen Zhao#, Bernard Ghanem, Jian Zhang#

ICCV 2023

Paper | GitHub

ICLR 2023

Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model

Yinhuai Wang*, Jiwen Yu*, Jian Zhang#

ICLR 2023 Spotlight

Paper | GitHub | Project Page

Education

2024.09 - Now

Ph.D. Student, University of Hong Kong, HKU-MMLab

Advisor: Prof. Xihui Liu

2021.09 - 2024.06

M.S., Peking University, VILLA Lab

Advisor: Prof. Jian Zhang

Internships

2026.01 - Now

Student Researcher at Anuttacon, Mountain View, CA, US

Advisor: Dr. Xin Tong

2024.09 - 2026.01

Student Researcher (Kuai Star) at Kling team, Shenzhen, China

Advisor: Dr. Xintao Wang

2023.04 - 2024.01

Student Researcher at Tencent AI Lab, Shenzhen, China

Advisor: Prof. Xiaodong Cun

Talks

  • Dec 2025 Controllable, Generalizable, and Memory-Enabled: Interactive Video World Models
  • Dec 2025 Context as Memory: Scene-Consistent Interactive Long Video Generation with Memory Retrieval
  • Oct 2025 Toward Higher-Level Intelligence in Interactive Generative Video for World Model
  • Jul 2025 Toward Higher-Level Intelligence of Interactive Generative Video

Academic Service