Abstract:
In distributed training, increasing the batch size can improve parallelism, but it also makes training harder and can cause training errors. In this work, we analyze theoretically how these training errors arise, and we train ResNet-50 on CIFAR-10 with Stochastic Gradient Descent (SGD) and Adaptive Moment Estimation (Adam) while keeping the total batch size at the parameter server constant and lowering the batch size on each Graphics Processing Unit (GPU). We propose a new method that uses momentum to eliminate training errors in distributed training. We define a Momentum-like Factor (MF) to represent the influence of former gradients on the parameter update in each iteration. We then vary the MF values and run experiments to explore how different MF values affect training performance with SGD, Adam, and Nesterov accelerated gradient. The experimental results show that increasing the MFs is a reliable way to reduce training errors in distributed training. We also present an analysis of the convergence conditions for distributed training with a large batch size and multiple GPUs.
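The abstract does not give the exact definition of the MF, so the following is only a minimal sketch under an assumption: that the MF plays a role comparable to the momentum coefficient in classical heavy-ball SGD, where the velocity is an exponentially weighted sum of former gradients. The function names `sgd_momentum` and `past_gradient_weights` are hypothetical and do not come from the paper.

```python
def sgd_momentum(grad_fn, w0, lr, momentum, steps):
    """Heavy-ball SGD on a scalar parameter (illustrative, not the paper's method).

    Assumption: `momentum` stands in for a momentum-like factor; the paper's
    actual MF definition is not stated in the abstract.
    """
    w = w0
    v = 0.0  # velocity: running weighted sum of former gradients
    history = []
    for _ in range(steps):
        g = grad_fn(w)
        v = momentum * v + g  # a larger coefficient keeps more of the old gradients
        w = w - lr * v
        history.append(w)
    return history


def past_gradient_weights(momentum, k):
    """Weight of each of the last k gradients in the current velocity, newest first."""
    return [momentum ** i for i in range(k)]


if __name__ == "__main__":
    # Toy objective f(w) = w^2, whose gradient is 2w.
    grad = lambda w: 2.0 * w
    for m in (0.0, 0.5, 0.9):
        print(m, past_gradient_weights(m, 4), sgd_momentum(grad, 1.0, 0.1, m, 3))
```

In this sketch, raising the coefficient increases the weight that former gradients carry in each update, which is the kind of influence the MF is described as representing.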
Document information
Title: Increasing Momentum-Like Factors: A Method for Reducing Training Errors on Multiple GPUs
Source journal: Tsinghua Science and Technology (清华大学学报自然科学版(英文版))
Year, volume (issue): 2022, (1)
Journal section: REGULAR ARTICLES
Page range: 114-126
Number of pages: 13
Language: English
DOI: 10.26599/TST.2020.9010023
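Journal information
Journal: Tsinghua Science and Technology (清华大学学报自然科学版(英文版))
Frequency: bimonthly
ISSN: 1007-0214
CN: 11-3745/N
Format: 16mo (16开)
Address: Room 908, Tower B, Xueyan Building, Shuangqing Road, Haidian District, Beijing
Founded: 1996
Language: English
Articles published: 2269
Total downloads: 0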