Abstract:
Objective: Chinese image captioning combines computer vision and natural language processing, and is a typical multimodal, cross-domain problem for artificial intelligence algorithms. A Chinese image captioning model must output a Chinese description for each given test image. The sentence must follow natural language conventions and convey the important information in the image, covering the main characters, scenes, actions, and other content. Because most current open-source datasets are in English, research on image captioning has focused mainly on English. Chinese descriptions usually allow greater flexibility in syntax and word choice, which makes the algorithms harder to implement. Consequently, relatively few studies address image captioning, and Chinese captioning in particular.
Methods: This study trains an image description generation model on the Flickr8k-cn and Flickr30k-cn datasets. At each time step of the description, the model can decide whether to rely more on image information or on text information. The model captures more of the important information in the image to improve the richness and accuracy of the Chinese description. The image description dataset used in this study consists mainly of Chinese description sentences. The method consists of an encoder and a decoder. The encoder is based on a convolutional neural network. The decoder is based on a long short-term memory (LSTM) network and forms a multimodal summary generation network.
Results: Experiments on the Flickr8k-cn and Flickr30k-cn Chinese datasets show that the proposed method outperforms existing Chinese summary generation models.
Conclusion: The proposed method is effective and substantially improves on the baseline model. It also outperforms existing Chinese summary generation models. In the next step, more visual prior i
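The abstract states that at each time step the model decides whether to rely more on image or on text information, but it does not give the equations. The following is a minimal sketch of one common way to express such a decision: a scalar gate that blends an attended visual context with an attended text context. All names here (`attend`, `gated_context`, `gate_w`, `gate_b`) are hypothetical and chosen for illustration; this is not the paper's formulation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys):
    """Dot-product attention: returns the weights and the weighted average of keys."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(keys[0])
    context = [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]
    return weights, context

def gated_context(hidden, region_feats, word_feats, gate_w, gate_b=0.0):
    """Blend a visual and a textual context with a sigmoid gate (assumed form).

    hidden       : decoder state at the current time step
    region_feats : image region features (e.g. from a CNN encoder)
    word_feats   : features of previously generated words
    gate_w/b     : hypothetical gate parameters, learned in a real model
    """
    _, visual = attend(hidden, region_feats)
    _, textual = attend(hidden, word_feats)
    logit = sum(w * h for w, h in zip(gate_w, hidden)) + gate_b
    beta = 1.0 / (1.0 + math.exp(-logit))  # beta near 1: rely on the image
    mixed = [beta * v + (1.0 - beta) * t for v, t in zip(visual, textual)]
    return beta, mixed

# Toy inputs: two image regions and two word features, all 2-dimensional.
hidden = [1.0, 0.0]
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[0.0, 1.0], [0.0, 1.0]]
beta, mixed = gated_context(hidden, regions, words, gate_w=[0.0, 0.0])
print(beta, mixed)
```

In a trained model the gate parameters would be learned jointly with the LSTM decoder, and `mixed` would feed the word prediction layer. With a zero gate, as in the toy call, the two contexts are weighted equally.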
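Document information
Title: Application of Dual Attention Mechanism in Chinese Image Captioning
Source journal: 智能学习系统与应用(英文) (Intelligent Learning Systems and Applications, English-language)
Discipline: Engineering
Keywords: Image Caption in Chinese; Dual Attention Mechanism; Richness; Accuracy
Year, volume (issue): 2020, (1)
Page range: 14-29
Page count: 16
Classification number: TP3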
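Journal
智能学习系统与应用(英文)
Frequency: quarterly
ISSN: 2150-8402
Address: Optics Valley Headquarters Space, 38 Tangxunhu North Road, Jiangxia District, Wuhan
Articles published: 166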