The End-to-End Recognition of Printed Mathematical Formulas Based on Deep Learning

ZHAN Yichao; ZHANG Honggang



First published: 2019-12-26

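ZHAN Yichao ¹
ZHAN Yichao (b. 1996), male, master's student; main research interest: computer vision.
ZHANG Honggang ¹
ZHANG Honggang (b. 1974), male, associate professor and graduate supervisor; main research interests: digital image processing, video mining, intelligent transportation. E-mail: zhhg@bupt.edu.cn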

1. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876

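Abstract: Recognizing printed mathematical formulas matters in practice for both online education and the prevention of academic misconduct. Most researchers currently rely on traditional multi-step pipelines, which are hard to implement and generalize poorly. This paper proposes an end-to-end, deep-learning-based method for recognizing printed mathematical formulas: a convolutional neural network (CNN) extracts image features, and a recurrent neural network (RNN) with an encoder-decoder structure translates those features into LaTeX text. The evaluation criterion is the distance between the input image and the image rendered from the recognized formula. Under this criterion, the method achieves good results on formula data extracted from the 2003 KDD Cup dataset.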

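Keywords: artificial intelligence; deep learning; convolutional neural network; formula recognition; optical character recognition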
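The encoder-decoder step described in the abstract can be pictured as a loop that emits one LaTeX token at a time, conditioned on the image features. The following is a minimal sketch, not the authors' implementation: the abstract does not say how tokens are selected, so greedy selection is an assumption, and `greedy_decode`, `toy_step`, and the `<s>`/`</s>` markers are hypothetical names introduced only for illustration. The toy step function stands in for a trained RNN decoder.

```python
from typing import Callable, Dict, List, Sequence, Tuple

START, END = "<s>", "</s>"  # hypothetical start/end-of-sequence markers

# A decoder step maps (image features, previous token, state) to
# (a score per vocabulary token, next state). In the paper this role is
# played by the RNN decoder; here it is an arbitrary callable.
StepFn = Callable[[Sequence[float], str, int], Tuple[Dict[str, float], int]]


def greedy_decode(features: Sequence[float], step: StepFn, max_len: int = 50) -> List[str]:
    """Emit tokens one at a time until END is produced or max_len is reached."""
    tokens: List[str] = []
    prev, state = START, 0
    for _ in range(max_len):
        scores, state = step(features, prev, state)
        prev = max(scores, key=scores.get)  # assumed: take the highest-scoring token
        if prev == END:
            break
        tokens.append(prev)
    return tokens


def toy_step(features: Sequence[float], prev: str, state: int) -> Tuple[Dict[str, float], int]:
    """Stand-in for a trained decoder: replays a fixed token sequence."""
    script = ["x", "^", "{", "2", "}", END]
    target = script[min(state, len(script) - 1)]
    scores = {tok: 0.0 for tok in script}
    scores[target] = 1.0
    return scores, state + 1
```

Calling `" ".join(greedy_decode([0.1, 0.2], toy_step))` yields the LaTeX token string `x ^ { 2 }`. A real system would replace `toy_step` with the trained decoder and might use beam search instead of greedy selection.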


The End-to-End Recognition of Printed Mathematical Formulas Based on Deep Learning

ZHAN Yichao ¹
ZHANG Honggang ¹

1. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876

Abstract: The recognition of printed mathematical formulas has practical significance for both online education and the prevention of academic misconduct. At present, most researchers use traditional multi-step methods, which are difficult to implement and generalize poorly. This paper proposes an end-to-end method for recognizing printed mathematical formulas based on deep learning. It uses a convolutional neural network (CNN) to extract image features and a recurrent neural network (RNN) with an encoder-decoder structure to translate those features into LaTeX text. With the distance between the input image and the image rendered from the recognized formula as the evaluation criterion, the method achieves good results on formula data extracted from the 2003 KDD Cup dataset.

Keywords: artificial intelligence; deep learning; convolutional neural network; formula recognition; OCR
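The evaluation criterion in the abstract compares the input image with an image rendered from the recognized LaTeX. The sketch below shows one way such a distance could be computed. The abstract does not specify the distance, so a normalized pixel-wise mismatch on binarized images is an assumption, and `pad` and `image_distance` are hypothetical names. Rendering LaTeX to an image is outside the sketch; both images are taken as already available.

```python
from typing import List

Image = List[List[int]]  # rows of 0/1 pixels, where 1 means ink


def pad(img: Image, height: int, width: int) -> Image:
    """Pad an image with background (0) pixels to height x width."""
    rows = [row + [0] * (width - len(row)) for row in img]
    rows += [[0] * width for _ in range(height - len(rows))]
    return rows


def image_distance(a: Image, b: Image) -> float:
    """Fraction of pixels that differ after padding both images to a common size."""
    height = max(len(a), len(b))
    width = max(
        max((len(r) for r in a), default=0),
        max((len(r) for r in b), default=0),
    )
    if height == 0 or width == 0:
        return 0.0  # two empty images are treated as identical
    pa, pb = pad(a, height, width), pad(b, height, width)
    diff = sum(x != y for ra, rb in zip(pa, pb) for x, y in zip(ra, rb))
    return diff / (height * width)
```

A distance of 0.0 means the rendered prediction matches the input pixel for pixel. One appeal of an image-level criterion is that two different LaTeX strings that render identically are scored as equally correct.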


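Citation: ZHAN Yichao, ZHANG Honggang. The End-to-End Recognition of Printed Mathematical Formulas Based on Deep Learning [EB/OL]. Beijing: Sciencepaper Online (中国科技论文在线) [2019-12-26]. https://www.paper.edu.cn/releasepaper/content/201912-106.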


Paper No.: 201912-106