Abstract:
Data sparseness is an inherent issue of statistical language models, and smoothing methods are usually used to resolve the zero-count problem. In this paper, we empirically study and analyze the well-known Good-Turing and advanced Good-Turing smoothing methods for language models on large-scale Chinese corpora. Ten models are generated sequentially on corpora of various sizes, from 30 M to 300 M Chinese words of the CGW corpus. In our experiments, the Good-Turing and advanced Good-Turing smoothing methods are evaluated with inside testing and outside testing. Based on the experimental results, we further analyze the perplexity trends of the smoothing methods, which are useful for choosing effective smoothing methods to alleviate data sparseness in language models of various sizes. Finally, some helpful observations are described in detail.
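For context, Good-Turing smoothing re-estimates the count of an n-gram seen r times as r* = (r + 1) * N_{r+1} / N_r, where N_r is the number of distinct n-grams observed exactly r times, and reserves probability mass N_1 / N for unseen events. The Python sketch below is only an illustration of this classic re-estimation, not the paper's implementation; the fallback for empty N_{r+1} buckets and the toy corpus are assumptions (the advanced variants the paper evaluates instead smooth the N_r curve, as in Simple Good-Turing), and the perplexity helper shows the standard evaluation measure the paper reports.

import math
from collections import Counter

def good_turing(samples):
    """Classic Good-Turing re-estimation: r* = (r + 1) * N_{r+1} / N_r,
    where N_r is the number of distinct types seen exactly r times.
    Returns adjusted counts per type and the probability mass N_1 / N
    reserved for unseen (zero-count) events."""
    counts = Counter(samples)                 # raw count r for each type
    nr = Counter(counts.values())             # frequency of frequencies N_r
    total = sum(counts.values())              # N, total observed tokens
    adjusted = {}
    for word, r in counts.items():
        if nr.get(r + 1, 0) > 0:
            adjusted[word] = (r + 1) * nr[r + 1] / nr[r]
        else:
            # Assumption: fall back to the raw count when N_{r+1} = 0;
            # advanced variants smooth the N_r curve to avoid these gaps.
            adjusted[word] = r
    p_unseen = nr.get(1, 0) / total           # mass given to unseen events
    return adjusted, p_unseen

def perplexity(probs):
    """Perplexity of a test sequence from its per-token probabilities:
    PP = exp(-(1/N) * sum(log p_i)); lower is better."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

tokens = "the cat sat on the mat the cat".split()  # toy stand-in for a corpus
adjusted, p_unseen = good_turing(tokens)
print(adjusted, p_unseen)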
Document Information
Title: An Empirical Study of Good-Turing Smoothing for Language Models on Different Size Corpora of Chinese
Source journal: Journal of Computer and Communications (电脑和通信(英文))
Subject: Medicine
Keywords: Good-Turing methods; smoothing; language models; perplexity
Year, volume (issue): 2013, (5)
Pages: 14-19 (6 pages)
Classification number: R73
Journal Information
Journal: Journal of Computer and Communications (电脑和通信(英文))
Frequency: Monthly
ISSN: 2327-5219
Address: Optics Valley Headquarters Space, 38 Tangxunhu North Road, Jiangxia District, Wuhan
Publications: 783