Abstract:
With the rapid growth of web-based social networking technologies in recent years, author identification and analysis have proven increasingly useful. Authorship analysis provides information about a document’s author, often including the author’s gender. Men and women are known to write in distinctly different ways, and these differences can be successfully used to make a gender prediction. Making use of these distinctions between male and female authors, this study demonstrates the use of a simple stream-based neural network to automatically discriminate gender on manually labeled tweets from the Twitter social network. This neural network, the Modified Balanced Winnow, was employed in two ways. First, the effectiveness of data stream mining was examined with an extensive list of n-gram features. Second, feature selection techniques were evaluated by drastically reducing the feature list using WEKA’s attribute selection algorithms. This study demonstrates the effectiveness of the stream mining approach, achieving an accuracy of 82.48%, a 20.81% increase above the baseline prediction. Using feature selection methods improved the results by an additional 16.03%, to an accuracy of 98.51%.
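The stream-based setup described in the abstract can be illustrated with a minimal sketch of a Balanced Winnow-style online learner over n-gram features. The abstract does not give the update rule, hyperparameters, or feature definitions, so everything below is an assumption made for illustration: the character n-gram range, the `<bias>` feature, the class name `BalancedWinnowSketch`, the helper `prequential_accuracy`, and the default values of `alpha`, `beta`, `threshold`, and `margin`. It is not the study's implementation.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n_min=1, n_max=3):
    """L2-normalized character n-gram counts plus a bias feature (assumed feature set)."""
    text = text.lower()
    counts = Counter(
        text[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(text) - n + 1)
    )
    counts["<bias>"] = 1  # hypothetical constant feature, added before normalization
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {gram: c / norm for gram, c in counts.items()}


class BalancedWinnowSketch:
    """Two positive weight vectors (u, v) updated multiplicatively, one example at a time."""

    def __init__(self, alpha=1.5, beta=0.5, threshold=1.0, margin=1.0,
                 init_pos=2.0, init_neg=1.0):
        # All defaults are illustrative assumptions, not values from the abstract.
        self.alpha, self.beta = alpha, beta
        self.threshold, self.margin = threshold, margin
        self.u = defaultdict(lambda: init_pos)  # weights voting for label +1
        self.v = defaultdict(lambda: init_neg)  # weights voting for label -1

    def score(self, x):
        """Signed score: positive values predict +1, others predict -1."""
        return sum(val * (self.u[f] - self.v[f]) for f, val in x.items()) - self.threshold

    def learn(self, x, y):
        """Predict first, then update if the example is wrong or inside the margin."""
        s = self.score(x)
        if y * s <= self.margin:
            for f, val in x.items():
                if y > 0:  # promote u, demote v
                    self.u[f] *= self.alpha * (1 + val)
                    self.v[f] *= self.beta * (1 - val)
                else:      # demote u, promote v
                    self.u[f] *= self.beta * (1 - val)
                    self.v[f] *= self.alpha * (1 + val)
        return 1 if s > 0 else -1


def prequential_accuracy(model, stream):
    """Test-then-train accuracy over a stream of (text, label) pairs, label in {+1, -1}."""
    correct = total = 0
    for text, label in stream:
        prediction = model.learn(char_ngrams(text), label)
        correct += int(prediction == label)
        total += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Toy stream with made-up texts and labels, only to show the calling pattern.
    toy_stream = [("omg love this!!", 1), ("match stats tonight", -1),
                  ("so cute omg", 1), ("stats and scores", -1)]
    print(prequential_accuracy(BalancedWinnowSketch(), toy_stream))
```

Each example is scored before it is used for training, so the model never needs to hold the full data set in memory, which is the property that makes this family of learners suitable for streams such as tweets. Reducing the feature list, as the abstract describes doing with WEKA, would correspond here to filtering the keys returned by `char_ngrams` before calling `learn`.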
Document information
Title: Gender Identification on Twitter Using the Modified Balanced Winnow
Source journal: Communications and Network (通讯与网络(英文))
Subject: Medicine
Keywords: Gender Identification; Twitter; Modified Balanced Winnow; Neural Networks; Stream Data Mining; Feature Selection
Year, volume (issue): 2012, (3)
Page range: 189-195
Page count: 7
Classification number: R73
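Journal impact
Journal: Communications and Network (通讯与网络(英文))
Publication frequency: Quarterly
ISSN: 1949-2421
Address: Optics Valley Headquarters Space, 38 Tangxunhu North Road, Jiangxia District, Wuhan
Articles published: 427
Total downloads: 0
Total citations: 0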