Research on Image Description Algorithms for Assisting the Blind
First published: 2018-10-08
Abstract: This paper uses a convolutional neural network to extract features from images and feeds those features into a recurrent neural network to generate captions for the images. The captions are then read aloud through speech technology to assist blind users. To improve caption quality and shorten experiment time, the work explores replacing the traditional recurrent neural network with a long short-term memory (LSTM) network, uses Adam instead of traditional gradient descent as the optimization algorithm, and uses matrix operations to accelerate computation when the network is trained on a CPU rather than a GPU. Experimental results show that the neural-network-based image description algorithm can largely identify the main subject of an image and predict the subject's behavior well, so it can be of some help in assisting the blind.
Keywords: Computer vision; Deep learning; Recurrent neural network; Image description
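
The abstract mentions choosing Adam over traditional gradient descent. As a rough illustration of how the two update rules differ, the sketch below minimizes a toy one-dimensional quadratic with each. The function names, the toy objective, and the hyperparameter values (learning rate 0.1, beta1 0.9, beta2 0.999) are assumptions chosen for illustration; they are not taken from the paper, whose actual training setup is not described here.

```python
import math


def toy_grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2 (hypothetical example)."""
    return 2.0 * (x - 3.0)


def gd_minimize(grad, x0, lr=0.1, steps=1000):
    """Plain gradient descent: step against the raw gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Adam: scale each step by running estimates of the gradient moments."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1.0 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1.0 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


if __name__ == "__main__":
    print("gradient descent:", gd_minimize(toy_grad, 0.0))
    print("Adam:            ", adam_minimize(toy_grad, 0.0))
```

On a problem this simple both optimizers reach the minimum near x = 3. The per-parameter step scaling in Adam matters more for networks with many parameters whose gradients differ widely in magnitude, which is presumably the motivation for the choice described in the abstract.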