基本信息来源于合作网站,原文需代理用户跳转至来源网站获取       
摘要:
In the recent informatization of Chinese courts, the huge amount of law cases and judgment documents, which were digital stored,has provided a good foundation for the research of judicial big data and machine learning. In this situation, some ideas about Chinese courts can reach automation or get better result through the research of machine learning, such as similar documents recommendation, workload evaluation based on similarity of judgement documents and prediction of possible relevant statutes. In trying to achieve all above mentioned, and also in face of the characteristics of Chinese judgement document, we propose a topic model based approach to measure the text similarity of Chinese judgement document, which is based on TF-IDF, Latent Dirichlet Allocation (LDA), Labeled Latent Dirichlet Allocation (LLDA) and other treatments. Combining with the characteristics of Chinese judgment document,we focus on the specific steps of approach, the preprocessing of corpus, the parameters choices of training and the evaluation of similarity measure result. Besides, implementing the approach for prediction of possible statutes and regarding the prediction accuracy as the evaluation metric, we designed experiments to demonstrate the reasonability of decisions in the process of design and the high performance of our approach on text similarity measure. The experiments also show the restriction of our approach which need to be focused in future work.
推荐文章
内容分析
关键词云
关键词热度
相关文献总数  
(/次)
(/年)
文献信息
篇名 Topic Model Based Text Similarity Measure for Chinese Judgment Document
来源期刊 国际计算机前沿大会会议论文集 学科 社会科学
关键词 CHINESE JUDGMENT documents Data science Machine learning Natural language processing Text similarity TF-IDF TOPIC model LATENT DIRICHLET ALLOCATION Labeled LATENT DIRICHLET ALLOCATION
年,卷(期) 2017,(2) 所属期刊栏目
研究方向 页码范围 9-11
页数 3页 分类号 C5
字数 语种
DOI
五维指标
传播情况
(/次)
(/年)
引文网络
引文网络
二级参考文献  (0)
共引文献  (0)
参考文献  (0)
节点文献
引证文献  (0)
同被引文献  (0)
二级引证文献  (0)
2017(0)
  • 参考文献(0)
  • 二级参考文献(0)
  • 引证文献(0)
  • 二级引证文献(0)
研究主题发展历程
节点文献
CHINESE
JUDGMENT
documents
Data
science
Machine
learning
Natural
language
processing
Text
similarity
TF-IDF
TOPIC
model
LATENT
DIRICHLET
ALLOCATION
Labeled
LATENT
DIRICHLET
ALLOCATION
研究起点
研究来源
研究分支
研究去脉
引文网络交叉学科
相关学者/机构
期刊影响力
国际计算机前沿大会会议论文集
半年刊
北京市海淀区西三旗昌临801号
出版文献量(篇)
616
总下载数(次)
6
总被引数(次)
0
论文1v1指导