About me

Hi, I am Yu Liu (刘毓, pronunciation: Yoo Lee-oh), a final-year M.S. candidate in Artificial Intelligence at the School of Intelligent Science and Technology, Hangzhou Institute for Advanced Study (HIAS), University of Chinese Academy of Sciences (UCAS).

My research focuses on long-lived, human-aware multimodal agents with a robust perception→cognition→feedback loop, spanning multimodal affective computing, cross-modal alignment, long-horizon memory and reasoning, multi-agent coordination, and safe embodied interaction.

I’m currently seeking PhD positions (Fall 2026 start) aligned with the focus above; I’m happy to connect regarding well-matched openings.

I am very fortunate to be supervised by Prof. Taihao Li and Dr. Leyuan Qu. I also greatly enjoy collaborating with Haoxun Li and Hanlei Shi; I truly value our inspiring discussions and seamless teamwork.

You can find my CV here: Yu Liu’s Curriculum Vitae (Updated: Feb 6, 2026).

Research Highlights

Legend: 🌟 Lead contribution

Affective & Social AI

Core Project: Multimodal Empathetic Dialogue Agent
Role: Lead for multimodal fusion across text–audio–visual streams.
Agent pipeline: Perception → Dialogue Understanding → Expression

  • GRACE for Multimodal Facial Emotion Recognition [IEEE Transactions on Affective Computing under review, arXiv] 🌟:
    Introduces GRACE for dynamic facial emotion recognition by aligning refined linguistic cues with salient facial dynamics; achieves SOTA on DFEW, FERV39k, and MAFW. Agent module: instant emotion perception for the dialogue agent.
  • Centering Emotion Hotspots [under review; arXiv (Oct 2025)] 🌟:
    Extends GRACE from frame-level signals to conversation-level tracking by centering multimodal hotspots and fusing them with dialogue context, improving stability and generalization across turns. Agent module: dialogue-level emotion understanding.
  • Think-Before-Draw [Pattern Recognition under review · arXiv]:
    A two-stage framework for disentangled, controllable talking-head synthesis. I designed the multimodal fusion module integrating audio–text–visual cues to preserve identity and sharpen affect control. Agent module: controllable affective expression for responses.

Digital Therapeutics

Zhejiang Vanguard Project: Digital Therapeutics for Depression Detection
Research Highlight: Unified visual biomarker extraction with LLM-augmented questionnaire semantics via multimodal fusion for depression screening under privacy/low-resource constraints; incubated HOPE for subject-level estimation.
Role: Led multimodal fusion (video/audio/text); built the visual-biomarker pipeline; drove cross-module integration aligned to subject-level decisions.

  • HOPE: Hierarchical Fusion for Optimized and Personality-Aware Estimation of Depression [ACM MM ’25 · MPDD Challenge (Young Track) · DOI · GitHub] 🌟:
    Subject-level depression detection under privacy constraints via hierarchical multimodal fusion (audio–video with personalized textual cues) and cross-task/sample consistency; won 1st place in the MPDD Young Track.
    Role: Developed the consistency-aware subject-level fusion strategy; led the writing; completed final code integration and open-source release preparation.

Earlier Work: Aviation Analytics

Education

  • University of Chinese Academy of Sciences — M.S. in Artificial Intelligence, Hangzhou Institute for Advanced Study (GPA 3.76/4.00), Sep 2023 – Jun 2026 (expected).
  • Civil Aviation University of China — B.S. in Transportation (GPA 3.36/4.00), Sep 2015 – Jun 2019.

Work Experience

China Southern Airlines — Data Analyst, Airlines Operations Center (Sep 2019 – May 2023)

  • Built Python/MySQL analytics and monitoring; automated Tableau/QuickBI reporting, reducing manual report time by >70% and enabling shared data services across departments.
  • Led development of a Django+SQL delay/fault analytics platform, cutting report generation time by ~85% and supporting operational safety and decision-making.

Hangzhou Institute for Advanced Study, UCAS — Lab Coordinator & Administrator (Sep 2024 – Present)

  • Managed onboarding, event organization, and GPU resource scheduling; oversaw equipment procurement and reimbursement workflows to keep projects moving.

Professional Service

  • Journal Reviewer: Pattern Recognition (Elsevier), 2025–present.