Lip Reading Algorithm Based on Spatio-Temporal Attention
First published: 2021-03-19
Abstract: With the rapid development of deep learning, lip reading algorithms built on it have made considerable progress. However, existing lip reading algorithms fail to extract features effectively. Building on the Convolutional Block Attention Module, this paper designs a new attention mechanism, spatio-temporal attention, and applies it to lip reading. By extracting and fusing the temporal and spatial features of video, the mechanism improves the ability of a three-dimensional convolutional neural network to relate temporal and spatial features and to extract fine lip details. With a self-built dataset and a carefully designed training strategy, the resulting lip reading algorithm reaches 85.5% accuracy in experiments, a 4.1% improvement over existing algorithms.
Keywords: computer applications; lip reading; attention mechanism; deep learning
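The abstract does not spell out how the spatio-temporal attention is built, so the following is only a toy sketch of the general idea it describes: pooling a video feature map along the spatial and temporal axes, in the avg-plus-max pooling style of the Convolutional Block Attention Module, and using the pooled values as gates. Everything here is an assumption for illustration: the single-channel `clip[t][i][j]` layout, the function names, and the fixed mixing weights `w_avg` and `w_max`, which stand in for the learned layers a real module would use.

```python
import math


def sigmoid(z):
    """Squash a real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def temporal_weights(clip, w_avg=1.0, w_max=1.0):
    """One gate per frame, from average- and max-pooled pixel values.

    w_avg and w_max are hypothetical fixed weights, not learned parameters.
    """
    weights = []
    for frame in clip:
        pixels = [p for row in frame for p in row]
        avg_pool = sum(pixels) / len(pixels)
        max_pool = max(pixels)
        weights.append(sigmoid(w_avg * avg_pool + w_max * max_pool))
    return weights


def spatial_weights(clip, w_avg=1.0, w_max=1.0):
    """One gate per pixel position, from values pooled across time."""
    height, width = len(clip[0]), len(clip[0][0])
    weights = []
    for i in range(height):
        row = []
        for j in range(width):
            over_time = [frame[i][j] for frame in clip]
            avg_pool = sum(over_time) / len(over_time)
            max_pool = max(over_time)
            row.append(sigmoid(w_avg * avg_pool + w_max * max_pool))
        weights.append(row)
    return weights


def spatiotemporal_attention(clip):
    """Scale every value by its frame gate and its pixel-position gate."""
    t_w = temporal_weights(clip)
    s_w = spatial_weights(clip)
    return [
        [[p * t_w[t] * s_w[i][j] for j, p in enumerate(row)]
         for i, row in enumerate(frame)]
        for t, frame in enumerate(clip)
    ]


# Hypothetical 3-frame, 2x2 single-channel feature map.
clip = [
    [[0.0, 0.1], [0.2, 0.1]],
    [[0.0, 0.9], [0.8, 0.1]],
    [[0.0, 0.2], [0.1, 0.0]],
]
out = spatiotemporal_attention(clip)
```

In this toy input the middle frame carries the strongest activations, so it receives the largest temporal gate, while positions that stay near zero across all frames receive the smallest spatial gates. A trainable version would replace the fixed weights with learned layers inside the 3D convolutional network.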