Abstract:
With the abundance of exceptionally high-dimensional data, feature selection has become an essential element of the data mining process. In this paper, we investigate the problem of efficient feature selection for classification on high-dimensional datasets. We present a novel filter-based approach to feature selection that ranks the features by a score, and we then measure the performance of four different data mining classification algorithms on the resulting data. In the proposed approach, we partition the sorted feature list and search for important features in both the forward and the reverse direction, starting simultaneously from the first and the last feature in the sorted list. The proposed approach is highly scalable and effective because it parallelizes over both attributes and tuples simultaneously, allowing us to evaluate many potential features for high-dimensional datasets. Experiments on real and synthetic high-dimensional datasets show that the proposed framework improves the precision of the selected features. We have also compared its classification accuracy against that of various other feature selection methods.
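The abstract does not say which score is used to sort the features, but the paper's title names symmetric uncertainty. As a minimal sketch under that assumption, the following Python ranks discrete features by their symmetric uncertainty with the class label, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The function names and the toy data are hypothetical and are not taken from the paper. The partitioned forward and reverse search and the parallelization described above are not reproduced here.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); 0 means independent, 1 means fully dependent."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    hxy = entropy(list(zip(x, y)))          # joint entropy H(X, Y)
    mutual_info = max(0.0, hx + hy - hxy)   # clamp tiny negative rounding errors
    return 2.0 * mutual_info / (hx + hy)


def rank_features(columns, labels):
    """Return (name, score) pairs sorted by descending SU with the class labels."""
    scores = {name: symmetric_uncertainty(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data: three discrete features and a binary class label.
    labels = [0, 0, 1, 1, 0, 1, 0, 1]
    columns = {
        "f_noise": [0, 1, 0, 1, 1, 0, 0, 1],  # statistically independent of the labels
        "f_copy": [0, 0, 1, 1, 0, 1, 0, 1],   # identical to the labels
        "f_const": [1] * 8,                   # constant, carries no information
    }
    for name, score in rank_features(columns, labels):
        print(f"{name}: {score:.3f}")
```

Continuous attributes would need to be discretized before such a score is computed. A filter method of this kind scores each feature without training a classifier, which is what makes the ranking step cheap on high-dimensional data.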
Document information
Title: A Feature Subset Selection Technique for High Dimensional Data Using Symmetric Uncertainty
Source journal: Journal of Data Analysis and Information Processing (English)
Discipline: Medicine
Keywords: High Dimensional Datasets; Feature Selection; Classification; Predominant Feature
Year, volume (issue): 2014, (4)
Page range: 95-105
Page count: 11
Classification number: R73
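Journal information
Journal of Data Analysis and Information Processing (English)
Publication frequency: Quarterly
ISSN: 2327-7211
Address: Optics Valley Headquarters Space, 38 Tangxunhu North Road, Jiangxia District, Wuhan
Articles published: 106
Total downloads: 0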