Abstract:
The aim of this work was to analyze how text mining behaves when a spell checker is integrated as an additional pre-process during its first stage. Different models were analyzed, and the most complete one was chosen, treating the pre-processes as the initial part of the text mining process. Algorithms were developed and adapted for the Spanish language and for testing the methodology, which was done through the analysis of 2363 words. A notation capable of removing special and unwanted characters was created. Execution times of each algorithm were analyzed to test the efficiency of the text mining pre-process with and without orthographic revision. The total time was shorter with the spell checker than without it. The key difference between this work and existing related studies is that, for the first time, a spell checker is used in the text mining pre-processes.
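The pre-process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the vocabulary, the function names, the similarity cutoff, and the use of Python's `difflib` as a stand-in spell checker are all assumptions made for illustration, and stemming (Porter or Snowball) is left out to keep the example short.

```python
import difflib
import re
import time

# Hypothetical toy vocabulary; a real checker would load a full Spanish dictionary.
VOCABULARY = ["casa", "perro", "gato", "minería", "texto", "corrector", "palabra"]


def clean(text: str) -> str:
    """Lowercase the text and replace special or unwanted characters with spaces.

    Only lowercase letters (including Spanish accented letters) and
    whitespace are kept.
    """
    return re.sub(r"[^a-záéíóúüñ\s]", " ", text.lower())


def tokenize(text: str) -> list[str]:
    """Split cleaned text into tokens on whitespace."""
    return text.split()


def spell_check(token: str) -> str:
    """Return the closest vocabulary word, or the token unchanged.

    The 0.8 cutoff is an assumed value, not one taken from the paper.
    """
    if token in VOCABULARY:
        return token
    matches = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=0.8)
    return matches[0] if matches else token


def preprocess(text: str, use_spell_checker: bool = True) -> list[str]:
    """Clean and tokenize, optionally correcting each token."""
    tokens = tokenize(clean(text))
    if use_spell_checker:
        tokens = [spell_check(token) for token in tokens]
    return tokens


def timed_preprocess(text: str, use_spell_checker: bool) -> tuple[list[str], float]:
    """Run the pre-process and return the tokens with the elapsed seconds."""
    start = time.perf_counter()
    tokens = preprocess(text, use_spell_checker)
    return tokens, time.perf_counter() - start


if __name__ == "__main__":
    sample = "¡La cassa y el Perro!"
    for flag in (False, True):
        tokens, seconds = timed_preprocess(sample, flag)
        print(f"spell checker={flag}: {tokens} ({seconds:.6f} s)")
```

Timing only this step says nothing about the total time reported in the abstract, which covers the whole pre-process; a comparison in that spirit would also time the later stages that run on the corrected tokens.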
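Document information
Title: Advantages of Using a Spell Checker in Text Mining Pre-Processes
Source journal: Computer and Communications (English); Subject: Medicine
Keywords: Spell Checker; Text Mining; Stemming; Tokenization; Porter Algorithm; Snowball Algorithm
Year, volume (issue): 2018, (11)
Page range: 43-54
Page count: 12; Classification number: R73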
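Journal information
Computer and Communications (English)
Frequency: monthly
ISSN: 2327-5219
Address: Optics Valley Headquarters Space, 38 Tangxunhu North Road, Jiangxia District, Wuhan
Published articles: 783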