Abstract:
Spark is a fast, unified analytics engine for big data and machine learning, in which memory is a crucial resource. Resilient Distributed Datasets (RDDs) are parallel data structures that allow users to explicitly persist intermediate results in memory or on disk, and each RDD can be divided into several partitions. During task execution, Spark automatically monitors cache usage on each node. When an RDD needs to be cached and space is insufficient, the system evicts old data partitions in least recently used (LRU) order to free space. However, Spark has no mechanism dedicated to caching RDDs, and LRU takes neither the dependencies between RDDs nor the needs of future stages into consideration. In this paper, we propose an optimization approach for RDD caching and LRU based on the features of partitions, which consists of three parts: a prediction mechanism for persistence, a weight model built with the entropy method, and an update mechanism for weights and memory based on RDD partition features. Finally, experiments on the Spark platform show that our strategy effectively reduces execution time and improves memory usage.
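The abstract names the entropy method as the basis of the weight model but does not list the partition features or the scoring formula. The following is a minimal sketch of the standard entropy weight method applied to cached partitions, not the paper's actual model. The function names, the feature columns, and the rule "evict the lowest-scoring partition" are all assumptions made for illustration, and every feature is assumed to be non-negative and oriented so that a larger value means the partition is more worth keeping.

```python
import math


def column_proportions(rows):
    """Return p[i][j] = x[i][j] / sum_i x[i][j] for non-negative feature values."""
    m = len(rows[0])
    totals = [sum(row[j] for row in rows) for j in range(m)]
    return [
        [row[j] / totals[j] if totals[j] > 0 else 0.0 for j in range(m)]
        for row in rows
    ]


def entropy_weights(rows):
    """Standard entropy weight method: one weight per feature column.

    A column whose values are nearly identical across partitions has
    entropy close to 1 and therefore receives a weight close to 0.
    """
    n, m = len(rows), len(rows[0])
    if n < 2:
        return [1.0 / m] * m  # entropy is undefined for a single row
    props = column_proportions(rows)
    k = 1.0 / math.log(n)  # scales entropy into the range [0, 1]
    divergences = []
    for j in range(m):
        column = [p[j] for p in props]
        if not any(column):
            divergences.append(0.0)  # an all-zero feature carries no information
            continue
        entropy = -k * sum(p * math.log(p) for p in column if p > 0)
        divergences.append(max(0.0, 1.0 - entropy))  # clamp rounding error
    total = sum(divergences)
    if total == 0:
        return [1.0 / m] * m  # no feature discriminates; fall back to equal weights
    return [d / total for d in divergences]


def pick_victim(partition_ids, rows):
    """Hypothetical eviction rule: drop the partition with the lowest weighted score."""
    weights = entropy_weights(rows)
    props = column_proportions(rows)
    scores = [sum(w * p for w, p in zip(weights, row)) for row in props]
    lowest = min(range(len(scores)), key=scores.__getitem__)
    return partition_ids[lowest]
```

Called as `pick_victim(["p0", "p1", "p2"], rows)`, where each row holds the feature values of one partition, this returns the identifier of the partition a weight-based policy would evict first. Unlike plain LRU, the choice depends on whatever features are supplied (for example recomputation cost or expected future use) rather than on access recency alone.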
Document information
Title: An Improved Memory Cache Management Study Based on Spark
Journal: Computers, Materials & Continua (English)
Discipline: Engineering
Keywords: Resilient Distribution Datasets; update mechanism; weight mode
Year, volume (issue): 2018, (9)
Pages: 415-431
Page count: 17
Classification: TP3
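Journal information
Computers, Materials & Continua (English)
Frequency: monthly
ISSN: 1546-2218
Address: Dongda Science Park A, No. 2 Dongda Road, Pukou District, Nanjing, Jiangsu Province
Articles published: 346
Total downloads: 4
Total citations: 0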