AISHELL-RealMAN
开 源 数 据 ,助 力 人 工 智 能 发 展
RealMAN
A Real-Recorded and Annotated Microphone Array Dataset
The RealMAN dataset is a multi-channel microphone array dataset for dynamic speech enhancement and localization. Specifically, a 32-channel array with high-fidelity microphones is used for recording. A loudspeaker is used for playing source speech signals. A total of 83-hour speech signals (48 hours for static speaker and 35 hours of moving speaker) are recorded in 32 different scenes, and 144 hours of background noise are recorded in 31 different scenes. Both speech and noise recording scenes cover various common indoor, outdoor, semi-outdoor and transportation environments. See Figure 1 for the recording devices. The azimuth angle of the loudspeaker is annotated with an omni-direction fisheye camera, and is used for the training of source localization networks. The direct-path signal is obtained by filtering the played speech signal with an estimated direct-path propagation filter, and is used for the training of speech enhancement networks.
数据下载
License: CC BY NC 4.0
论 文
基线系统
微信公众号
联系我们
商务合作:bd@aishelldata.com
技术服务:tech@aishelldata.com
联系电话:+86-010-80225006
公司地址:
北京市海淀区西北旺东路10号院东区10号楼新兴产业联盟大厦3层316室
开源数据