Yu Zhang AaronZ345

Hi there 👋

I am Yu Zhang (张彧). I am currently a Research Scientist at ByteDance. If you are interested in any form of academic collaboration, please feel free to email me at [email protected].

I earned my PhD from the College of Computer Science and Technology, Zhejiang University (浙江大学计算机科学与技术学院), under the supervision of Prof. Zhou Zhao (赵洲). Before that, I graduated from Chu Kochen Honors College, Zhejiang University (浙江大学竺可桢学院), with dual bachelor's degrees in Computer Science and Automation. I have also been a visiting scholar at the University of Rochester with Prof. Zhiyao Duan and at the University of Massachusetts Amherst with Prof. Przemyslaw Grabowicz.

My research interests primarily focus on Multi-Modal Generative AI, specifically in Spatial Audio, Music, Singing, and Speech. I have published 10+ first-author papers at top international AI conferences, such as NeurIPS, ACL, and AAAI.

📎 Homepages

Personal Page: https://aaronz345.github.io (updated recently🔥)
LinkedIn: www.linkedin.com/in/yuzhang34
Google Scholar: https://scholar.google.com/citations?user=kA9A6LsAAAAJ
DBLP: https://dblp.org/pid/50/671-126.html

💻 First-Author Papers

\* denotes co-first authors

🔊 Spatial Audio

ACM-MM 2025 ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting, Yu Zhang, Wenxiang Guo, Changhao Pan, et al.
ACM-MM 2025 A Multimodal Evaluation Framework for Spatial Audio Playback Systems: From Localization to Listener Preference, Changhao Pan*, Wenxiang Guo*, Yu Zhang*, et al.
NeurIPS 2025 MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations, Wenxiang Guo*, Changhao Pan*, Zhiyuan Zhu*, Xintong Hu*, Yu Zhang*, et al.
Preprint ASAudio: A Survey of Advanced Spatial Audio Research, Zhiyuan Zhu*, Yu Zhang*, Wenxiang Guo*, et al.

🎼 Music Generation

EMNLP 2025 Versatile Framework for Song Generation with Prompt-based Control, Yu Zhang, Wenxiang Guo, Changhao Pan, et al.

🎙️ Singing Voice Synthesis

ACL 2025 TCSinger 2: Customizable Multilingual Singing Voice Synthesis, Yu Zhang, Wenxiang Guo, Changhao Pan, et al.
EMNLP 2024 TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control, Yu Zhang, Ziyue Jiang, Ruiqi Li, et al.
NeurIPS 2024 Spotlight GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks, Yu Zhang, Changhao Pan, Wenxiang Guo, et al.
AAAI 2024 StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis, Yu Zhang, Rongjie Huang, Ruiqi Li, et al.
ACL 2025 STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation, Wenxiang Guo*, Yu Zhang*, Changhao Pan*, et al.
Under Review Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches, Changhao Pan*, Dongyu Yao*, Yu Zhang*, et al.

💬 Speech Synthesis

ASRU 2025 Conan: A Chunkwise Online Network for Zero-Shot Adaptive Voice Conversion, Yu Zhang, Baotong Tian, Zhiyao Duan.
