I am currently a Master's student at the School of Artificial Intelligence, Beijing University of Posts and Telecommunications (BUPT), and a Research Intern at Zhipu AI (AutoGLM Group). My research interests lie at the intersection of Large Speech Models (LSM), Singing Voice Conversion (SVC), and Expressive Text-to-Speech (TTS).
If you are seeking any form of academic cooperation or have inquiries regarding my open-source projects, please feel free to email me at shawnpi@qq.com.
I graduated from Ningbo University (Yangming Innovation Class) with a bachelor's degree in Computer Science and Technology and am currently pursuing my master's degree at BUPT (expected 2026). I have gained extensive industry experience through internships at Zhipu AI (ζΊθ°±), Tencent Music (QQι³δΉ) and Momo (ιι).
I have been awarded the Zhejiang Government Scholarship (3 times) and the BUPT First-Class Scholarship (2 times). My research has been published or submitted to top-tier conferences such as AAAI, Interspeech, ICASSP, and ISCSLP.
- 2025.12: One paper (HQ-SVC) accepted by AAAI 2026 as the first author!
- 2025.10: Joined Zhipu AI as a Speech Large Model Research Intern.
- 2025.07: Joined Tencent Music (QQ Music) focusing on multi-speaker conversational podcast TTS.
- 2025.03: Joined Momo focusing on paralinguistic TTS and understanding.
- 2024.06: One paper (SPA-SVC) accepted by Interspeech 2024 as the first author.
For a full list of publications, please visit my Google Scholar.
HQ-SVC: High-Quality Zero-Shot Singing Voice Conversion in Low-Resource Scenarios, Bingsong Bai, et al., AAAI 2026. [CCF-A]
SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding, Bingsong Bai, et al., ICASSP 2026 (under review). [CCF-B]
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion, Bingsong Bai, et al., Interspeech 2024. [CCF-C]
ExpressiveSinger: Synthesizing Expressive Singing Voice as an Instrument, Fengping Wang, Bingsong Bai, et al., ISCSLP 2024.
- GLM-ASR Nano: Participated in training and SFT of the SOTA open-source ASR model, reaching #1 on Hugging Face speech model download charts (440k+ downloads in 2 weeks).
- Multi-Speaker Conversational TTS: Improved rhythm/pauses by 68.49% in AI podcasts (internal project @ Tencent Music). Participated in QinYu-TTS.
- 2023, 2024: BUPT First-Class Academic Scholarship
- 2020, 2021, 2022: Zhejiang Provincial Government Scholarship (3 consecutive years)
- 2021: Mathematical Contest in Modeling (MCM) - International Second Prize
- 2020: Contemporary Undergraduate Mathematical Contest in Modeling (CUMCM) - Provincial Second Prize



