Harmonization using a Style-Controllable Latent Space Based on VAE
Itsuki Nakada, Shinji Sako
Primary Subject: Software/Library Demo
Recent attempts to automate harmonization have been advanced by machine learning. In particular, architectures that combine a Variational Autoencoder (VAE) \cite{kingma2022autoencodingvariationalbayes} with time-series processing methods such as the Transformer and LSTM are commonly employed \cite{morphing_reharmonization, structured_and_flexible, ji2023emotionconditionedmelodyharmonizationhierarchical}. Training with a VAE enables diverse outputs through interpolation and extrapolation in the latent space. However, these approaches may not fully exploit the entire learned latent space. In addition, the generated accompaniments often hold a single chord for an entire bar or exhibit monotonous rhythmic patterns. This paper investigates a method for style control by training an LSTM-VAE on melodies and chord progressions, and by measuring and comparing features of the accompaniments generated across the entire latent space.
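The style control via interpolation mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation; it only assumes that a trained encoder maps two melody/chord sequences to latent vectors, between which a straight-line path is traversed and each point decoded into an accompaniment:

```python
import numpy as np

def interpolate_latents(z1, z2, steps=5):
    """Linearly interpolate between two latent vectors.

    Hypothetical illustration: z1 and z2 stand in for VAE encodings of two
    melody/chord sequences; decoding each interpolated point would yield an
    accompaniment blending the two styles.
    """
    ts = np.linspace(0.0, 1.0, steps)
    return [(1.0 - t) * z1 + t * z2 for t in ts]

# Toy latent vectors standing in for encoder outputs.
z_a = np.zeros(4)
z_b = np.ones(4)
path = interpolate_latents(z_a, z_b, steps=3)
print(path[1])  # midpoint of the path: [0.5 0.5 0.5 0.5]
```

Extrapolation corresponds to evaluating the same line for t outside [0, 1]; sampling a grid of such points is one way to measure generated features across the whole latent space, as the paper proposes.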