Sori: Real-time General Audio to MIDI Transformation with Accordance-Musicality Trade-off Controllability
Kyungsu Kim, Yejin Kim, Kyogu Lee
Primary Subject: Early Research
Musical experience arises not from sound itself but from how the listener interprets it. This perspective has inspired artistic practices that recontextualize non-musical sounds as art. Motivated by this view, we define a novel task, General Audio to Symbolic Music Transformation (GASMT), which aims to generate symbolic music from arbitrary audio with both high accordance to the input and high musical quality. We also present Sori, a real-time system for GASMT built on a streaming encoder-decoder architecture and trained with domain adversarial training; classifier-free guidance controls the trade-off between accordance and musicality. Audio demos are available at: https://tinyurl.com/sori-demo
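To illustrate how a guidance scale can act as a trade-off knob, the following is a minimal sketch of the standard classifier-free guidance formulation applied to next-token logits. It is not taken from the Sori implementation: the function names, the three-token vocabulary, and the logit values are all hypothetical, and the abstract does not specify where in the decoder guidance is applied. The sketch assumes the decoder can be run once conditioned on the audio and once unconditioned.

```python
import math


def cfg_logits(cond_logits, uncond_logits, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond).

    scale = 0 reproduces the unconditional logits, scale = 1 reproduces the
    audio-conditioned logits, and scale > 1 extrapolates past them.
    """
    if len(cond_logits) != len(uncond_logits):
        raise ValueError("logit vectors must have the same length")
    return [u + scale * (c - u) for c, u in zip(cond_logits, uncond_logits)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits over a 3-token vocabulary (illustrative values only).
cond = [2.0, 0.5, -1.0]    # decoder conditioned on the input audio
uncond = [0.0, 1.0, 0.5]   # decoder with the audio condition dropped

for scale in (0.0, 1.0, 2.0):
    probs = softmax(cfg_logits(cond, uncond, scale))
    print(scale, [round(p, 3) for p in probs])
```

Under this assumed formulation, a small scale leans on the unconditional prior over symbolic music (favoring musicality), while a larger scale pushes the output toward the audio-conditioned prediction (favoring accordance).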