APE-GAN: A Novel Active Learning Based Music Generation Model With Pre-Embedding

2021 
Generating realistic music is one of the biggest challenges for artificial intelligence: current models lack the descriptive musical ability that humans have, and the music they produce is not highly realistic. This paper proposes an active-learning-based music generation model with pre-embedding (APE-GAN) that generates music from textual inputs, improving performance through active learning and a checking mechanism that uses the discriminator of the GAN. Experiments show that APE-GAN needs only 5% to 10% human-labelled data to achieve relatively good music generation ability. The method uses BERT to obtain embedding vectors, supplying prior artistic knowledge from humans so that the model can reach a higher level of creativity than most popular methods. Finally, because human judgement of whether a generated music sequence is “good” is too subjective, this paper uses an evaluation metric based on the Kullback–Leibler (KL) divergence between music sequences generated from textual inputs with similar or different meanings. After training APE-GAN on the Lakh Pianoroll Dataset, textual inputs with similar meanings yield outputs with KL divergence close to 0, while textual inputs with different meanings yield outputs with KL divergence close to 1.
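The abstract does not specify how the KL-divergence metric is computed over music sequences; a common choice for pianoroll data is to compare normalized pitch histograms of two generated sequences. The sketch below illustrates that idea under this assumption; the function names and the example note sequences are hypothetical, not taken from the paper.

```python
import numpy as np

def pitch_histogram(notes, n_pitches=128):
    # Normalized histogram over MIDI pitch numbers (0-127);
    # small additive smoothing keeps the KL divergence finite.
    hist = np.bincount(np.asarray(notes), minlength=n_pitches).astype(float)
    hist += 1e-8
    return hist / hist.sum()

def kl_divergence(p, q):
    # Kullback-Leibler divergence D_KL(p || q) between two
    # discrete probability distributions.
    return float(np.sum(p * np.log(p / q)))

# Hypothetical generated sequences: prompts with similar meanings
# should produce distributions with KL divergence near 0.
seq_a = [60, 62, 64, 65, 67, 69, 71, 72]  # C-major scale
seq_b = [60, 62, 64, 65, 67, 69, 71, 72]
seq_c = [61, 63, 66, 68, 70, 61, 63, 66]  # disjoint pitch set

print(kl_divergence(pitch_histogram(seq_a), pitch_histogram(seq_b)))  # → 0.0
print(kl_divergence(pitch_histogram(seq_a), pitch_histogram(seq_c)))  # large
```

Identical distributions give a divergence of exactly 0, while sequences drawn from disjoint pitch sets give a large positive value, matching the paper's reported near-0 versus near-1 separation.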