
AISHELL-4: A Multi-Channel Mandarin Meeting Speech Corpus

希尔贝壳AISHELL

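AISHELL-4 is an 8-channel Mandarin Chinese speech dataset of meeting scenarios, recorded with a microphone array. It contains 211 meetings with 4 to 8 participants each, about 120 hours in total. The dataset is intended to advance research on multi-speaker processing in real application scenarios. It covers important characteristics of real meetings, such as pauses, overlapping speech, speaker turns, and noise. It also provides accurate transcripts with timestamps, so researchers can work on individual tasks such as front-end processing, speech recognition, and speaker diarization, and can also optimize them jointly.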

AISHELL-4 is a sizable Mandarin speech dataset of real meetings, recorded with an 8-channel circular microphone array for speech processing in conference scenarios. The dataset consists of 211 recorded meeting sessions, each containing 4 to 8 speakers, with a total length of 120 hours. It aims to bridge advanced research on multi-speaker processing and practical application scenarios in three aspects. First, with real recorded meetings, AISHELL-4 provides realistic acoustics and rich natural speech characteristics of conversation, such as short pauses, speech overlap, quick speaker turns, and noise. Second, accurate transcriptions and speaker voice activity are provided for each meeting. This allows researchers to explore different aspects of meeting processing, ranging from individual tasks such as speech front-end processing, speech recognition, and speaker diarization, to multi-modality modeling and joint optimization of related tasks. Third, we also release a PyTorch-based training and evaluation framework as a baseline system to promote reproducible research in this field.
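Since each recording described above carries 8 microphone channels, a common first step for front-end processing is to separate the interleaved samples into one signal per channel. The following is a minimal sketch using only the Python standard library. It assumes 16-bit PCM WAV input and a 16 kHz sample rate; both are assumptions made for illustration, not details stated here, and the helper names `split_channels` and `make_synthetic_wav` are hypothetical. A synthetic in-memory file stands in for a real recording.

```python
import io
import struct
import wave

NUM_CHANNELS = 8      # from the dataset description: 8-channel microphone array
SAMPLE_RATE = 16000   # assumption for this sketch, not stated in the text


def split_channels(wav_bytes):
    """Return one list of int16 samples per channel from 16-bit PCM WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        n_channels = wf.getnchannels()
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_frames = wf.getnframes()
        raw = wf.readframes(n_frames)
    # Samples are interleaved frame by frame: ch0, ch1, ..., ch7, ch0, ch1, ...
    samples = struct.unpack("<%dh" % (n_frames * n_channels), raw)
    return [list(samples[ch::n_channels]) for ch in range(n_channels)]


def make_synthetic_wav(n_frames=4):
    """Build a tiny 8-channel WAV in memory (a stand-in for a real recording)."""
    interleaved = []
    for t in range(n_frames):
        for ch in range(NUM_CHANNELS):
            # Encode the channel index in the sample value so it is easy to check.
            interleaved.append(ch * 100 + t)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(NUM_CHANNELS)
        wf.setsampwidth(2)
        wf.setframerate(SAMPLE_RATE)
        wf.writeframes(struct.pack("<%dh" % len(interleaved), *interleaved))
    return buf.getvalue()


channels = split_channels(make_synthetic_wav())
print(len(channels), "channels,", len(channels[0]), "samples each")
```

For real data in another container or bit depth, an audio library would replace the `wave` and `struct` calls, but the de-interleaving step stays the same.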


120 Hours

211 Meeting Sessions

10 Meeting Rooms

60 Speakers


Speech Front-End Processing

Speech Recognition

Speaker Diarization


Open Source


AISHELL-4 is part of the AISHELL-ASR0055 Corpus


 The setup of the recording environment.


20 Meeting Rooms

639 Meeting Sessions

370 Hours (Single Channel)

162 Speakers

http://www.aishelltech.com/aishell_4
