Structure-informed Positional Encoding for Music Generation - Equipe Signal, Statistique et Apprentissage
Conference paper. Year: 2024

Structure-informed Positional Encoding for Music Generation

Abstract

Music generated by deep learning methods often suffers from a lack of coherence and long-term organization. Yet, multi-scale hierarchical structure is a distinctive feature of music signals. To leverage this information, we propose a structure-informed positional encoding framework for music generation with Transformers. We design three variants in terms of absolute, relative and non-stationary positional information. We comprehensively test them on two symbolic music generation tasks: next-timestep prediction and accompaniment generation. As a comparison, we choose multiple baselines from the literature and demonstrate the merits of our methods using several musically-motivated evaluation metrics. In particular, our methods improve the melodic and structural consistency of the generated pieces.
Main file
svbwdvrdnrztpzxgdsckkhqxkjbjpfzx.pdf (1.16 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04432659 , version 1 (15-02-2024)
hal-04432659 , version 2 (20-02-2024)
hal-04432659 , version 3 (28-02-2024)

Identifiers

HAL Id: hal-04432659, version 3

Cite

Manvi Agarwal, Changhong Wang, Gaël Richard. Structure-informed Positional Encoding for Music Generation. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2024, Seoul, South Korea. ⟨hal-04432659v3⟩
209 views
71 downloads
