MusicEval is the first generative music assessment dataset. It contains 2,748 music clips generated by 31 prevalent and advanced text-to-music (TTM) models in response to 384 text prompts, along with 13,740 ratings collected from 14 music experts. The total duration is 16.62 hours.

 

The MusicEval dataset contains 384 text prompts in total. Of these, 80 manually written prompts and 20 prompts selected from the MusicCaps dataset were given to the open-access models to generate music, while the remaining 284 descriptions correspond only to music clips from demo-only systems.

 

All music clips are 16 kHz mono audio. Each clip is evaluated by 5 raters on two dimensions: overall musical impression and alignment with the text prompt. These capture, respectively, the quality of the generated music and its consistency with the given text prompt.
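As an illustration of how the per-clip ratings might be used, the sketch below averages the two rating dimensions for each clip and checks that the quoted totals agree (2,748 clips × 5 raters = 13,740 ratings). The record layout, field names, clip and rater IDs, and score values are all assumptions made for this example. They do not describe the released file format, and the rating scale is not stated here.

```python
from collections import defaultdict
from statistics import mean

# Figures quoted in the dataset description.
NUM_CLIPS = 2748
RATERS_PER_CLIP = 5
NUM_RATINGS = 13740

# Each clip is scored by 5 raters, so the totals should agree.
assert NUM_CLIPS * RATERS_PER_CLIP == NUM_RATINGS

# Hypothetical records: (clip_id, rater_id, musical_impression, text_alignment).
# IDs and score values are illustrative only, not taken from the dataset.
ratings = [
    ("clip_0001", "r01", 4, 3),
    ("clip_0001", "r02", 3, 3),
    ("clip_0001", "r03", 5, 4),
    ("clip_0001", "r04", 4, 2),
    ("clip_0001", "r05", 4, 3),
    ("clip_0002", "r01", 2, 4),
    ("clip_0002", "r06", 3, 5),
    ("clip_0002", "r07", 2, 4),
    ("clip_0002", "r08", 1, 4),
    ("clip_0002", "r09", 2, 3),
]


def mean_scores(records):
    """Average both rating dimensions for every clip."""
    by_clip = defaultdict(lambda: ([], []))
    for clip_id, _rater_id, impression, alignment in records:
        by_clip[clip_id][0].append(impression)
        by_clip[clip_id][1].append(alignment)
    return {
        clip_id: {
            "musical_impression": mean(impressions),
            "text_alignment": mean(alignments),
            "num_ratings": len(impressions),
        }
        for clip_id, (impressions, alignments) in by_clip.items()
    }


scores = mean_scores(ratings)
for clip_id, summary in sorted(scores.items()):
    print(clip_id, summary)
```

Keeping the two dimensions separate, rather than merging them into one number, preserves the distinction the dataset draws between musical quality and prompt consistency.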

 

Paper: coming soon

Data download: coming soon

View samples: Demo

MusicEval: Generative Music Rating Dataset

A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation

* This dataset was jointly developed and constructed by the HLT Laboratory of the College of Computer Science at Nankai University and AISHELL.