Brain2Music: reconstructing music from human brain activity

How we experience music is a central question in neuroscience, and functional magnetic resonance imaging (fMRI) makes possible the exciting task of reconstructing music from brain activity.

Several lines of work have investigated how the brain represents musical features such as rhythm, emotion, timbre, and musical genre.

The Brain2Music model [1] reconstructs music from fMRI scans using a deep neural network trained to generate music from high-level, semantically structured music embeddings. At the core of the method is a language model that can create high-fidelity music from inputs such as a text description of the desired song or a hummed melody.

Fig. 1  The model Brain2Music [1] takes as input fMRI responses of people listening to music and generates 128-dimensional music embeddings.
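The decoding step in Fig. 1 can be sketched as a learned linear mapping from fMRI voxel responses to the 128-dimensional embedding space. Below is a minimal ridge-regression sketch on synthetic data; the voxel count, sample count, and regularization strength are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

# Illustrative sizes (assumptions): n_samples fMRI snapshots, n_voxels voxels,
# and 128-dimensional music embeddings as in Fig. 1.
rng = np.random.default_rng(0)
n_samples, n_voxels, emb_dim = 200, 1000, 128

X = rng.standard_normal((n_samples, n_voxels))   # fMRI voxel responses
Y = rng.standard_normal((n_samples, emb_dim))    # target music embeddings

# Ridge regression, closed form: W = (X^T X + lam * I)^{-1} X^T Y
lam = 10.0  # regularization strength (illustrative)
W = np.linalg.solve(X.T @ X + lam * np.eye(n_voxels), X.T @ Y)

# Each fMRI sample is mapped to a predicted 128-dim music embedding,
# which a generative model could then turn into audio.
pred = X @ W
print(pred.shape)  # (200, 128)
```

The predicted embeddings would then condition the music generation model; the regression itself is the only subject-specific component in this sketch.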

Due to anatomical differences between human brains, a model trained on one subject cannot yet be applied to another; future work involves compensating for those differences so that a single model can serve all subjects.

Moreover, fMRI data is very sparse both temporally and spatially, making information extraction extremely challenging and inefficient, and subjects must spend long sessions inside the large fMRI scanner. New music-decoding technologies will therefore be needed in the near future.


[1] Denk, Timo I., et al. "Brain2Music: Reconstructing Music from Human Brain Activity." arXiv preprint (2023). DOI: 10.48550/arXiv.2307.11078