Subtitle Text Detection Method Based on Faster RCNN
First published: 2019-04-18
Abstract: Subtitles in videos often carry strong semantic information and can effectively help people understand and analyze video content. This paper uses deep learning to detect subtitle text in video frames quickly and accurately, so that the content of video data can be extracted efficiently and practitioners can be assisted in retrieving and classifying massive amounts of video data. First, a video subtitle dataset is built; its character samples cover 6763 classes of common Chinese characters, 26 classes of English letters, and others, with the diversity, balance, and generalization of the samples taken into account. Second, for the specific scenario of video subtitle detection, the Faster RCNN detection framework is adopted, and a prior loss function, among other methods, is introduced to improve precision and recall. Finally, the complete system is assembled: from video reading to subtitle frame extraction, then to subtitle text line localization and text content detection, giving an end-to-end process from video input to text string output. The resulting video subtitle detection system balances speed and accuracy and can localize and recognize subtitle text in real time. Text detection precision and recall both reach 99.5%, the top-1 text recognition accuracy is 97.5%, and the overall detection speed is 45 fps.
Keywords: artificial intelligence; deep learning; subtitle detection; text localization
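
The abstract reports detection precision and recall but does not state how predicted text boxes are matched to ground truth. The following is a minimal sketch of one common way to compute these two metrics, assuming axis-aligned boxes in (x1, y1, x2, y2) form, greedy one-to-one matching, and an IoU threshold of 0.5. The function names and the threshold are illustrative assumptions, not the paper's evaluation protocol.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], thresh: float = 0.5
) -> Tuple[float, float]:
    """Greedily match each prediction to its best unmatched ground-truth box.

    A prediction counts as a true positive when its best IoU is at least
    `thresh` (0.5 here is an assumption, not a value from the paper).
    """
    matched = set()  # indices of ground-truth boxes already claimed
    true_positives = 0
    for p in preds:
        best_j, best_iou = -1, thresh
        for j, t in enumerate(truths):
            if j in matched:
                continue
            value = iou(p, t)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical boxes for one frame: one correct detection, one false alarm.
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
    print(precision_recall(preds, truths))
```

Under this scheme, precision is the fraction of predicted boxes that match a ground-truth text line, and recall is the fraction of ground-truth text lines that were found.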