Abstract:
Big data analysis requires substantial computing power, which is not always available. It has therefore become necessary to develop new clustering algorithms capable of processing such data. This study proposes a new parallel clustering algorithm based on the k-means algorithm. It significantly reduces the exponential growth of computations. The proposed algorithm splits a dataset into batches while preserving the characteristics of the initial dataset and increasing the clustering speed. The idea is to define cluster centroids for each batch and then cluster those centroids in turn. Each data point is then assigned to the cluster with the nearest of the resulting centroids. Experiments on real large datasets evaluate the effectiveness of the proposed approach, which is compared with k-means and its modification. The experiments show that the proposed algorithm is a promising tool for clustering large datasets in comparison with the k-means algorithm.
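The batch scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`kmeans`, `batch_kmeans`), the random shuffle used to form batches, the choice of the same k for every stage, and the sequential loop over batches (which could be run in parallel) are all assumptions made for illustration, using only the Python standard library.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style k-means; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        new = []
        for i, g in enumerate(groups):
            if g:
                new.append([sum(col) / len(g) for col in zip(*g)])
            else:
                new.append(centroids[i])  # keep the centroid of an empty cluster
        if new == centroids:  # assignments no longer change the centroids
            break
        centroids = new
    return centroids

def batch_kmeans(points, k, n_batches, seed=0):
    """Hypothetical sketch: cluster batches, then cluster the batch centroids."""
    rng = random.Random(seed)
    shuffled = list(points)
    # Assumption: a random split keeps each batch representative of the dataset.
    rng.shuffle(shuffled)
    batches = [shuffled[i::n_batches] for i in range(n_batches)]

    # Step 1: find k centroids per batch (each batch is independent of the others).
    batch_centroids = []
    for batch in batches:
        batch_centroids.extend(kmeans(batch, k, seed=seed))

    # Step 2: cluster the pooled batch centroids to obtain the final centroids.
    final_centroids = kmeans(batch_centroids, k, seed=seed)

    # Step 3: assign every data point to the cluster with the nearest centroid.
    labels = [nearest(p, final_centroids) for p in points]
    return final_centroids, labels
```

Calling `batch_kmeans(data, k=2, n_batches=4)` on a list of coordinate tuples returns the final centroids and one cluster label per input point. Each batch must contain at least k points, since the initial centroids are sampled from the batch itself.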
Article information
Title: Efficient algorithm for big data clustering on single machine
Source journal: 智能技术学报
Discipline: Engineering
Keywords: algorithm; batch
Year, volume (issue): 2020, (1)
Page range: 9-14
Page count: 6
Classification number: TP3
Journal impact
智能技术学报
Quarterly
ISSN: 2468-2322
Address: No. 69 Hongguang Avenue, Banan District, Chongqing
Articles published: 142
Total downloads: 4
Total citations: 0