Abstract:
Purpose: This study explores the potential use of natural language processing (NLP) and machine learning (ML) techniques, and aims to find a feasible strategy and an effective approach for the named entity recognition (NER) task in Web-oriented, person-specific information extraction.
Design/methodology/approach: An SVM-based multi-classification approach, combined with a rich set of NLP features derived from state-of-the-art NLP techniques, is proposed for the NER task. A group of experiments was designed to investigate the influence of various NLP-based features, especially semantic features, on system performance. Optimal parameter settings for the SVM models, including the kernel function, the margin parameter and the context window size, were also explored through experiments.
Findings: The SVM-based multi-classification approach proved effective for the NER task. The work shows that NLP-based features, particularly semantic features, are of great importance in data-driven NE recognition. The study indicates that higher-order kernel functions may not be desirable for this specific classification problem in practical applications; the simple linear-kernel SVM model performed better in this case. Moreover, the modified SVM models with an uneven margin parameter are more general and flexible, and were shown to handle the imbalanced data problem better.
Research limitations/implications: The SVM-based approach to the NER problem has been shown to be effective only on limited experimental data. Further research needs to be conducted on large batches of real Web data. In addition, the performance of the NER system needs to be tested when it is incorporated into a complete information extraction (IE) framework.
Originality/value: The specially designed experiments make it feasible to fully explore the characteristics of the data and to obtain the optimal parameter settings for the NER task, leading to favorable recall, precision and F1 measures. The overall system performance (F1 value) for all types of named entities can reach above 88.6%, which can meet the requirements of practical applications.
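The abstract names the context window size as one of the tuned parameters but does not describe the feature templates themselves. The following is only a minimal sketch of what a context-window feature extractor could look like, not the authors' implementation. The function name `window_features`, the `<PAD>` token, the feature keys, the capitalization cue and the sample sentence are all hypothetical and chosen for illustration.

```python
from typing import Dict, List

PAD = "<PAD>"  # hypothetical padding token for positions outside the sentence


def window_features(tokens: List[str], index: int, window: int = 2) -> Dict[str, str]:
    """Collect the tokens within `window` positions of tokens[index].

    Each neighbour is stored under a key that encodes its relative
    offset, so a classifier can tell the left context from the right.
    """
    features: Dict[str, str] = {}
    for offset in range(-window, window + 1):
        pos = index + offset
        # Lower-case real tokens; use the padding token past either end.
        word = tokens[pos].lower() if 0 <= pos < len(tokens) else PAD
        features[f"w[{offset:+d}]"] = word
    # A simple orthographic cue for the focus token (illustrative only).
    features["is_capitalized"] = str(tokens[index][:1].isupper())
    return features


# Invented sample sentence, used only to exercise the function.
tokens = ["Dr.", "Alice", "Smith", "joined", "MIT"]
feats = window_features(tokens, index=1, window=1)
print(feats)
```

Dictionaries of this shape could then be one-hot encoded and passed to a linear-kernel SVM; widening `window` adds more neighbouring tokens per example, which is the kind of trade-off the abstract reports exploring experimentally.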
Document information
Title: Person-specific named entity recognition using SVM with rich feature sets
Source journal: 中国文献情报(英文刊)
Keywords: Named entity recognition; Natural language processing; SVM-based classifier; Feature selection
Year, volume (issue): 2012, (3)
Page range: 27-46
Number of pages: 20
Language: English
Citation network
References: 6
Second-level references: 14
Co-citing documents: 3
Citing documents: 0
Co-cited documents: 0
Second-level citing documents: 0
By year (references / second-level references): 1998 (1/0), 2004 (0/1), 2008 (1/1), 2010 (0/1), 2011 (2/1), 2012 (0/1), 2015 (0/2), 2016 (0/2), 2017 (0/5), 2019 (2/0)
Journal information
Journal: 中国文献情报(英文刊)
Frequency: quarterly
ISSN: 1674-3393
CN: 11-5670/G2
Language: English (eng)
Published documents: 199
Total downloads: 0