MERGE is a comprehensive collection of bimodal research datasets designed to advance Music Emotion Recognition (MER). The field of MER has evolved from audio-centric systems to bimodal ensembles that combine audio and lyrics. However, the development of bimodal systems has been hindered by the lack of large, publicly available datasets. MERGE addresses this gap with three new research datasets covering audio, lyrics, and combined bimodal data. These datasets were created using a semi-automatic approach and are intended to serve as a benchmark for future research. Experiments using feature engineering, machine learning, and deep learning methodologies have demonstrated the viability of the datasets, with a best overall result of 79.21% F1-score for bimodal classification using a deep neural network.
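As a rough illustration of the bimodal classification setup described above, the sketch below trains a small feed-forward network on concatenated audio and lyric feature vectors and reports a macro F1-score. The feature dimensions, the four emotion classes, and the early-fusion strategy are illustrative assumptions rather than the MERGE specification, and random data stands in for the actual feature files.

```python
# Minimal sketch of a bimodal (audio + lyrics) emotion classifier.
# NOTE: feature sizes, class count, and fusion strategy are assumptions
# for illustration; random arrays stand in for real MERGE features.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_songs = 2000                                   # placeholder corpus size
audio_feats = rng.normal(size=(n_songs, 128))    # e.g. pre-extracted audio descriptors
lyric_feats = rng.normal(size=(n_songs, 300))    # e.g. averaged lyric embeddings
labels = rng.integers(0, 4, size=n_songs)        # assumed four emotion classes

# Early fusion: concatenate the two modalities into one feature vector per song.
X = np.hstack([audio_feats, lyric_feats])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.3, stratify=labels, random_state=0)

# A small feed-forward (deep) neural network classifier.
clf = MLPClassifier(hidden_layer_sizes=(256, 64), max_iter=300, random_state=0)
clf.fit(X_train, y_train)

print("macro F1:", f1_score(y_test, clf.predict(X_test), average="macro"))
```

In practice, the random arrays would be replaced by features extracted from the MERGE audio and lyrics files, and the plain concatenation could be swapped for a more elaborate fusion architecture.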
Deep Neural Networks
Bimodal ensemble models
MERGE dataset
F1-score, classification accuracy
On-premises
No
Yes
Bimodal data integration, high classification accuracy
No
Standard computing resources
Linux, Windows
Compatible with existing MER systems
None
None
None
Yes
Active research community
Research team behind the arXiv publication
Large
Low
Standard
Limited
Data privacy
Limited to available data modalities
Music industry, entertainment
Music emotion analysis, recommendation systems
Academic institutions, music industry
Integration with existing MER frameworks
Scalable with data size
Community support
None
Command-line interface
No
None
Open-source
Yes
Collaborations with music research institutions
None
None
1.0
Open-source software
No
None
Open-source
0.00
USD
MIT License
01/03/2023
01/03/2023
+1-800-555-0199
Integration with music analysis tools
Yes