Abstract:
In a previous study, we introduced dynamical aspects of written texts by treating the serial sentence number, counted from the first to the last sentence of a given text, as discretized time. Using this definition of a textual timeline, we defined an autocorrelation function (ACF) for word occurrences and demonstrated its utility both for representing dynamic word correlations and for measuring word importance within the text. In this study, we seek a stochastic process governing the occurrences of a given word that has strong dynamic correlations. This is valuable because words exhibiting strong dynamic correlations play a central role in developing or organizing textual contexts. In seeking this stochastic process, we find that additive binary Markov chain theory is useful for describing strong dynamic word correlations, in the sense that it can reproduce the characteristics of autocovariance functions (unnormalized versions of ACFs) observed in actual written texts. Using this theory, we propose a model for the time-varying probability of word occurrence in each sentence of a text. The proposed model takes into account hierarchical document structures such as chapters, sections, subsections, paragraphs, and sentences. Because such a hierarchical structure is common to most documents, our model of word occurrence probability applies broadly to interpreting dynamic word correlations in actual written texts. The main contributions of this study are, therefore, showing that additive binary Markov chain theory is usable for analyzing dynamic correlations in written texts and offering a new model of word occurrence probability that takes the common hierarchical structure of documents into account.
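As a rough illustration of the quantities the abstract refers to, the following Python sketch builds a 0/1 occurrence series for a word over sentences (sentence index as discrete time), computes a sample autocovariance at a given lag, and simulates an additive binary Markov chain. It is only a sketch under assumptions: the function names, the naive sentence splitter, the particular autocovariance estimator, and any memory-function values passed in are illustrative and not taken from the paper. The conditional-probability form used is the one commonly given for additive binary Markov chains, and the clipping to [0, 1] is an added safeguard.

```python
import random
import re


def occurrence_series(text, word):
    """Return a 0/1 list with one entry per sentence: 1 if `word` occurs in it.

    Assumption: sentences are split naively on '.', '!' and '?'.
    """
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    target = word.lower()
    return [int(target in re.findall(r"[a-z']+", s.lower())) for s in sentences]


def autocovariance(x, lag):
    """Sample autocovariance of the series `x` at a non-negative `lag`.

    Assumption: a plain estimator that averages over the n - lag available
    pairs; the paper's exact normalization may differ.
    """
    n = len(x)
    if lag < 0 or lag >= n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean = sum(x) / n
    pairs = n - lag
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(pairs)) / pairs


def simulate_additive_chain(n, mean, memory, seed=None):
    """Simulate n symbols of an additive binary Markov chain.

    Uses P(a_i = 1 | past) = mean + sum_r memory[r-1] * (a_{i-r} - mean),
    where `memory` is a hypothetical memory function of length N (the order).
    The first N symbols are drawn independently with probability `mean`.
    """
    rng = random.Random(seed)
    order = len(memory)
    seq = [int(rng.random() < mean) for _ in range(min(order, n))]
    while len(seq) < n:
        p = mean + sum(memory[r - 1] * (seq[-r] - mean) for r in range(1, order + 1))
        p = min(1.0, max(0.0, p))  # safeguard: keep p a valid probability
        seq.append(int(rng.random() < p))
    return seq
```

For instance, `occurrence_series(text, "markov")` followed by `autocovariance(series, lag)` for a range of lags gives an empirical curve that could be compared with the same curve computed from `simulate_additive_chain` output for a chosen memory function.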
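Document information
Title: Origin of Dynamic Correlations of Words in Written Texts
Source journal: Journal of Data Analysis and Information Processing (数据分析和信息处理(英文))
Discipline: Engineering
Keywords: Autocorrelation Function; Autocovariance Function; Word Occurrence; Stochastic Process; Additive Binary Markov Chain
Year, volume (issue): 2019, (4)
Page range: 228-249
Number of pages: 22
Classification number: TP3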
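Journal information
Journal: Journal of Data Analysis and Information Processing (数据分析和信息处理(英文))
Frequency: Quarterly
ISSN: 2327-7211
Address: Optics Valley Headquarters Space, No. 38 Tangxunhu North Road, Jiangxia District, Wuhan
Articles published: 106
Total downloads: 0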