This repository is an implementation of "Music Mood Detection Based On Audio And Lyrics With Deep Neural Net" by R. Delbouys et al. The model uses two CNN layers and two dense layers to solve the Music Emotion Recognition problem. It applies a multi-modal architecture to a regression task. This bi-modal deep learning structure is expected to combine data from two different domains, audio and lyrics, and to capture information that cannot be covered by a single domain.
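To make the bi-modal idea concrete, here is a minimal NumPy sketch of a forward pass that fuses an audio branch and a lyrics branch. It is not the code in this repository. The layer sizes, the placement of one convolution per modality, the max pooling over time, and the two-value output (assumed here to be valence and arousal) are all illustrative assumptions, and the weights are random rather than trained.

```python
import numpy as np

def conv1d_relu(x, w, b):
    """Valid 1D convolution over time followed by ReLU.

    x: (time, in_channels), w: (kernel, in_channels, out_channels), b: (out_channels,)
    """
    k = w.shape[0]
    steps = x.shape[0] - k + 1
    out = np.stack([
        np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1])) for t in range(steps)
    ])
    return np.maximum(out + b, 0.0)

def init_params(rng, audio_ch=40, lyric_ch=100, filters=32, kernel=3, hidden=64):
    """Random weights; every size here is an assumption chosen for illustration."""
    def w(*shape):
        return rng.normal(0.0, 0.1, size=shape)
    return {
        "audio_conv": (w(kernel, audio_ch, filters), np.zeros(filters)),
        "lyric_conv": (w(kernel, lyric_ch, filters), np.zeros(filters)),
        "dense1": (w(2 * filters, hidden), np.zeros(hidden)),
        "dense2": (w(hidden, 2), np.zeros(2)),
    }

def forward(params, audio, lyrics):
    # One convolutional layer per modality, then max pooling over the time axis.
    a = conv1d_relu(audio, *params["audio_conv"]).max(axis=0)
    l = conv1d_relu(lyrics, *params["lyric_conv"]).max(axis=0)
    # Fusion: concatenate the two fixed-size feature vectors.
    fused = np.concatenate([a, l])
    # Two dense layers; the last one is linear because this is a regression task.
    w1, b1 = params["dense1"]
    h = np.maximum(fused @ w1 + b1, 0.0)
    w2, b2 = params["dense2"]
    return h @ w2 + b2

rng = np.random.default_rng(0)
params = init_params(rng)
audio = rng.normal(size=(200, 40))    # stand-in for 200 frames of a 40-band spectrogram
lyrics = rng.normal(size=(50, 100))   # stand-in for 50 words with 100-dim embeddings
prediction = forward(params, audio, lyrics)
print(prediction.shape)  # (2,)
```

Concatenating the pooled features before the dense layers lets those layers learn interactions between the two modalities, which is the motivation for a bi-modal model over two separate single-modality models.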
For datasets, the Deezer Mood Detection Dataset and parts of the Million Song Dataset were used. The Deezer Mood Detection Dataset does not include the audio or the lyrics due to copyright issues, so it had to be supplemented using the Million Song Dataset. However, with a few adjustments, a different dataset can be chosen from this website and used for the task.
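The supplementing step amounts to joining the mood labels with audio and lyrics on a shared track identifier. The sketch below shows one way this could look using only the standard library. The column names (`MSD_track_id`, `valence`, `arousal`), the helper names, the file paths, and the sample rows are all assumptions made for illustration, so check them against the files you actually download.

```python
import csv
import io

def load_labels(csv_file):
    """Read mood labels keyed by Million Song Dataset track id (assumed column names)."""
    labels = {}
    for row in csv.DictReader(csv_file):
        labels[row["MSD_track_id"]] = (float(row["valence"]), float(row["arousal"]))
    return labels

def build_examples(labels, audio_paths, lyrics_by_track):
    """Keep only tracks for which labels, audio, and lyrics are all available."""
    examples = []
    for track_id, (valence, arousal) in labels.items():
        if track_id in audio_paths and track_id in lyrics_by_track:
            examples.append({
                "track_id": track_id,
                "audio": audio_paths[track_id],
                "lyrics": lyrics_by_track[track_id],
                "target": (valence, arousal),
            })
    return examples

# Hypothetical sample data standing in for the real files.
label_csv = io.StringIO(
    "MSD_track_id,valence,arousal\n"
    "TRAAA0001,0.5,-1.0\n"
    "TRAAA0002,-0.25,0.75\n"
)
audio_paths = {"TRAAA0001": "audio/TRAAA0001.mp3"}
lyrics_by_track = {"TRAAA0001": "placeholder lyrics", "TRAAA0002": "more placeholder lyrics"}

examples = build_examples(load_labels(label_csv), audio_paths, lyrics_by_track)
print(len(examples))  # 1
```

Tracks missing either modality are dropped here because the bi-modal model needs both inputs. Swapping in another dataset would mainly mean changing the column names and the two lookup tables.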