P6-15: The Potential of Unsupervised Induction of Harmonic Syntax for Jazz
Ruben Cartuyvels, John Koslovsky, Marie-Francine Moens
Subjects: Machine learning/artificial intelligence for music; TISMIR; Systematic musicology; Knowledge-driven approaches to MIR; Music signal processing; Computational musicology; MIR fundamentals and methodology
Presented Virtually
4-minute short-format presentation
Hierarchical structures describing a syntax of harmony have long been studied and proposed by music theorists, but algorithms that model these structures either require costly expert annotations for training or are based on music theorists’ predispositions about harmonic syntax. We build upon a line of work that models harmonic sequences with probabilistic context-free grammars (PCFGs), inspired by the well-known formalism for syntax in human language. By using neural networks for parameter sharing when estimating PCFG rule probabilities, we learn the grammar in an entirely unsupervised manner. Our model induces a harmonic syntax purely from data, with minimal bias and with parse trees as latent variables, simply by maximizing the likelihood of the training sequences. For the first time, this removes the need for both expert-annotated harmonic syntax trees and human-defined grammar rules. We propose improvements inspired by music theory, including chord symbol representations and a training objective that incentivizes the inclusion of short, frequent chord progressions that are based on musical relations. Experiments show that our methods can model harmony in datasets of jazz pieces, often resulting in realistic parse trees that overlap with expert annotations, without any access to these annotations during training. Code, models and predictions are publicly available.
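To make "parse trees as latent variables" concrete, the sketch below computes the likelihood of a chord sequence under a PCFG by summing over all of its parse trees with the standard inside algorithm. It is an illustration under stated assumptions, not the authors' model: the function name `inside_likelihood`, the nonterminals `S`, `P`, `D`, `T`, and every rule probability are hypothetical and hand-written here, whereas in the approach described above the rule probabilities would be produced by a neural network and learned by maximizing this kind of likelihood. The grammar is assumed to be in Chomsky normal form.

```python
def inside_likelihood(chords, binary_rules, lexical_rules, start="S"):
    """Marginal probability of `chords` under a PCFG in Chomsky normal form.

    binary_rules:  {(A, B, C): p} for rules A -> B C
    lexical_rules: {(A, chord): p} for rules A -> chord
    The result sums over every parse tree, so the trees stay latent.
    """
    n = len(chords)
    # inside[(i, j, A)] = probability that nonterminal A derives chords[i:j]
    inside = {}

    # Spans of width 1: apply lexical rules A -> chord.
    for i, chord in enumerate(chords):
        for (a, w), p in lexical_rules.items():
            if w == chord:
                inside[(i, i + 1, a)] = inside.get((i, i + 1, a), 0.0) + p

    # Wider spans: combine two adjacent sub-spans with binary rules A -> B C.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between B and C
                for (a, b, c), p in binary_rules.items():
                    left = inside.get((i, k, b), 0.0)
                    right = inside.get((k, j, c), 0.0)
                    if left and right:
                        key = (i, j, a)
                        inside[key] = inside.get(key, 0.0) + p * left * right

    return inside.get((0, n, start), 0.0)


# Hypothetical toy grammar (illustrative values only, not learned from data).
# P = predominant, D = dominant, T = tonic.
binary_rules = {
    ("S", "P", "S"): 0.4,  # S -> P S
    ("S", "D", "T"): 0.6,  # S -> D T
    ("D", "P", "D"): 0.3,  # D -> P D  (a dominant prepared by a predominant)
}
lexical_rules = {
    ("P", "Dm7"): 1.0,
    ("D", "G7"): 0.7,
    ("T", "Cmaj7"): 1.0,
}

# A ii-V-I progression has two parses under this grammar:
# (Dm7 (G7 Cmaj7)) and ((Dm7 G7) Cmaj7); their probabilities are summed.
likelihood = inside_likelihood(["Dm7", "G7", "Cmaj7"], binary_rules, lexical_rules)
```

In this toy setting the two bracketings contribute 0.4 × 0.6 × 0.7 and 0.6 × 0.3 × 0.7 respectively, and the function returns their sum. An unsupervised learner would adjust the rule probabilities to raise this quantity over a training corpus, never observing which bracketing is "correct".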