
Automatic Extraction Technology of Ionospheric Vertical Sounding Data in Pin Printer Font

Posted: 2024-03-04 09:08

Full-text download: 202401020.pdf


Article ID: 1672-6987(2024)01-0146-13    DOI: 10.16351/j.1672-6987.2024.01.020


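SU Guichang, ZHANG Ruikun, LIU Xiangpeng* (College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China)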


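Abstract: Scanned images of ionospheric vertical sounding data printed in pin printer (dot-matrix) font suffer from low resolution, disconnected character strokes, and touching text lines that defeat detection. To address these problems, an automatic data extraction technique based on the CRNN deep learning framework is proposed. It consists of four modules: image preprocessing, text detection, sequence text recognition, and layout processing of the recognition results. First, scanned images of pin-printed vertical sounding data with three different line-spacing types are preprocessed using image template matching, noise reduction, and skew correction. Next, text in the preprocessed images is detected and segmented by the projection method; the projection segmentation algorithm combines vertical projection, horizontal projection, and correction of candidate detection boxes, which handles touching text regions effectively and improves detection accuracy. Finally, because the segmented images vary in length and splitting them into individual characters should be avoided, recognition of the segmented text is cast as a sequence learning problem and solved with the CRNN deep learning algorithm; a coordinate fusion algorithm then saves the recognition results in a standardized Excel format, completing automatic extraction and storage of the data. Experiments show that the proposed algorithm achieves a text detection recall of 97.7% and, measured by the F-value (the comprehensive text recognition metric), a single-character recognition rate of 97.49% and a whole-group recognition rate of 94.78%. Comparison with other algorithms confirms its effectiveness, so the proposed algorithm is highly practical and meets the needs of engineering applications.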
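The projection-based detection step described in the abstract can be illustrated with a minimal sketch. It assumes a binarized image stored as a NumPy array (1 = ink, 0 = background); the function names `split_runs` and `segment_text_boxes` and the `min_col_gap` parameter are hypothetical and do not come from the paper. The paper's candidate-box correction is not specified on this page, so the gap-merging rule below is only a simple stand-in for keeping disconnected dot-matrix strokes in one box.

```python
import numpy as np

def split_runs(profile, min_ink=1, min_gap=1):
    """Return (start, end) pairs (end exclusive) where profile >= min_ink.

    Runs separated by fewer than min_gap empty positions are merged, which
    keeps broken dot-matrix strokes together (an illustrative heuristic).
    """
    mask = np.asarray(profile) >= min_ink
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    runs = [(int(s), int(e)) for s, e in zip(edges[0::2], edges[1::2])]
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged

def segment_text_boxes(binary, min_col_gap=1):
    """Return boxes (top, bottom, left, right) for a 2-D array with 1 = ink."""
    binary = np.asarray(binary)
    boxes = []
    # Horizontal projection: ink count per row separates text lines.
    for top, bottom in split_runs(binary.sum(axis=1)):
        line = binary[top:bottom]
        # Vertical projection inside the line: ink count per column
        # separates the text groups on that line.
        for left, right in split_runs(line.sum(axis=0), min_gap=min_col_gap):
            boxes.append((top, bottom, left, right))
    return boxes
```

Raising `min_col_gap` trades finer splitting for robustness to gaps inside a character; a suitable value would depend on the scan resolution and is not given in the source.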


Keywords: ionosphere; pin printer font; projection segmentation; text detection; CRNN; text recognition


CLC number: TP 301.6    Document code: A

Citation: SU Guichang, ZHANG Ruikun, LIU Xiangpeng. Automatic extraction technology of ionospheric vertical data with pin printer font[J]. Journal of Qingdao University of Science and Technology (Natural Science Edition), 2024, 45(1): 146-158.


Automatic Extraction Technology of Ionospheric Vertical Data with Pin Printer Font


SU Guichang, ZHANG Ruikun, LIU Xiangpeng

(College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China)


Abstract: To address the low resolution, disconnected characters, and undetectable touching text lines in scanned images of vertical ionospheric data printed in pin printer font, an automatic data extraction technique based on the CRNN deep learning framework is proposed. It comprises four modules: image preprocessing, text detection, sequence text recognition, and result layout processing. First, image template matching, noise reduction, and tilt correction are used to preprocess the scanned images of pin-printed vertical data with three different line-spacing types. Then, text detection and segmentation are performed on the preprocessed images by the projection method. The projection segmentation detection algorithm incorporates vertical projection, horizontal projection, and correction of candidate detection boxes, so it can handle touching text regions effectively and improve detection accuracy. Finally, because the segmented images differ in length and character-level splitting is to be avoided, recognition of the segmented text is transformed into a sequence learning problem, and the CRNN deep learning algorithm, composed of CNN+RNN+CTC, is used for text recognition. The recognition results are then saved in a standardized Excel format by a coordinate fusion algorithm, achieving automatic data extraction and storage. The experimental results show that the proposed algorithm attains a text detection recall of 97.7% and, in terms of the F-value (the comprehensive text recognition metric), a single-character recognition rate of 97.49% and a whole-group recognition rate of 94.78%. Comparison with other algorithms verifies its effectiveness. The proposed algorithm is therefore highly practical and can meet the needs of engineering applications.
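To make the CTC part of the CNN+RNN+CTC pipeline concrete, here is a minimal sketch of greedy (best-path) CTC decoding. It is an illustration under assumptions, not the authors' implementation: the function name `ctc_greedy_decode`, the choice of class 0 as the blank, and the digit-only alphabet are all hypothetical, and the page does not say which decoding strategy the paper uses.

```python
import numpy as np

BLANK = 0  # assumed index of the CTC blank class

def ctc_greedy_decode(scores, alphabet):
    """Decode a (T, C) array of per-frame class scores into a string.

    Class 0 is the blank; class k > 0 maps to alphabet[k - 1].
    Greedy decoding takes the best class per frame, collapses
    consecutive repeats, then drops blanks.
    """
    best = np.asarray(scores).argmax(axis=1)
    chars = []
    prev = BLANK
    for k in best:
        # Emit only when the class changes and is not the blank.
        if k != prev and k != BLANK:
            chars.append(alphabet[k - 1])
        prev = k
    return "".join(chars)
```

Because repeated frames collapse and a blank separates genuine double characters, the network can label a whole text strip without segmenting it into individual characters, which is the property the abstract relies on.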


Key words: ionosphere; pin printer font; projection segmentation; text detection; CRNN; text recognition


Received: 2023-08-31

Funding: National Natural Science Foundation of China (62103215, 12001308).

About the author: SU Guichang (b. 1999), male, master's student. *Corresponding author.





