AaronZ345/README.md

Hi there 👋

I am Yu Zhang (张彧), currently a Research Scientist at ByteDance. If you are interested in academic collaboration of any kind, please feel free to email me at [email protected].

I earned my PhD from the College of Computer Science and Technology, Zhejiang University (浙江大学计算机科学与技术学院), under the supervision of Prof. Zhou Zhao (赵洲). Before that, I graduated from Chu Kochen Honors College, Zhejiang University (浙江大学竺可桢学院), with dual bachelor's degrees in Computer Science and Automation. I have also been a visiting scholar at the University of Rochester with Prof. Zhiyao Duan and at the University of Massachusetts Amherst with Prof. Przemyslaw Grabowicz.

My research focuses on Multi-Modal Generative AI, specifically Spatial Audio, Music, Singing, and Speech. I have published 10+ first-author papers at top international AI conferences, such as NeurIPS, ACL, and AAAI.

📎 Homepages

💻 First-Author Papers

\* denotes co-first authors

🔊 Spatial Audio

🎼 Music Generation

🎙️ Singing Voice Synthesis

💬 Speech Synthesis

Pinned

  1. ISDrama

    Dataset and evaluation code for ISDrama (ACM-MM 2025): Immersive Spatial Drama Generation through Multimodal Prompting

    Python · 121 stars

  2. GTSinger

    Dataset and code for GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks

    Python · 327 stars · 13 forks

  3. VersBand

    PyTorch implementation of VersBand (EMNLP 2025): Versatile Framework for Song Generation with Prompt-based Control

    Python · 217 stars · 42 forks

  4. TCSinger2

    PyTorch implementation of TCSinger 2 (ACL 2025): Customizable Multilingual Zero-shot Singing Voice Synthesis

    Python · 149 stars · 27 forks

  5. TCSinger

    PyTorch implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control

    Python · 356 stars · 39 forks

  6. StyleSinger

    PyTorch implementation of StyleSinger (AAAI 2024): Style Transfer for Out-of-Domain Singing Voice Synthesis

    Python · 409 stars · 26 forks