Abstract:
Many natural language processing groups have recently adopted crowdsourcing as an alternative to costly traditional annotation. In this paper, we explore the use of the WeChat Official Account Platform (WOAP) to build a speech corpus, and we assess the feasibility of using WOAP followers (also known as contributors) to assemble a speech corpus of Mongolian. A Mongolian language qualification test was used to filter out potentially unqualified participants. We gathered natural speech recordings from daily life and constructed a Chinese-Mongolian Speech Corpus (CMSC) of 31,472 utterances from 296 native speakers who are fluent in Mongolian, totalling 30.8 h of speech. An evaluation experiment was then performed in which the contributors were asked to choose the correct sentence from a multiple-choice list, to ensure the quality of the corpus. The results obtained so far suggest that crowdsourcing with an evaluation mechanism could be a more effective way to construct the CMSC than traditional experiments that require expertise.
Document information
Title: Utilizing Crowdsourcing for the Construction of Chinese-Mongolian Speech Corpus with Evaluation Mechanism
Source journal: 国际计算机前沿大会会议论文集 (conference proceedings)
Discipline: Social sciences
Keywords: Crowdsourcing Chinese-Mongolian CORPUS SPEECH CORPUS WOAP Evaluation MECHANISM
Year, volume (issue): 2017, (2)
Page range: 12-14
Page count: 3
Classification number: C5
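Journal impact
Journal: 国际计算机前沿大会会议论文集
Frequency: Semiannual
Address: 北京市海淀区西三旗昌临801号 (No. 801 Changlin, Xisanqi, Haidian District, Beijing)
Articles published: 616
Total downloads: 6
Total citations: 0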