Article ID: 1672-6987(2021)04-0094-08; DOI: 10.16351/j.1672-6987.2021.04.014
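ZHANG Xing, WANG Xu, ZHAO Wencang* (College of Automation and Electronic Engineering, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China)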
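Abstract: Because convolutional neural networks are locally connected, a model that learns local features can generate texture and style information well, but it learns high-level semantic features poorly, so semantic objects in the generated images appear blurred and distorted. To improve the network's ability to process global features and make the semantic objects in generated images clearer and more realistic, this study proposes a supervised attention mechanism and applies it to the end-to-end cascaded refinement network image generation model. The multi-dimensional, low-resolution, large-receptive-field features output by the first refinement module are fused with the multi-dimensional semantic features in the semantic label, which strengthens global consistency among the multi-dimensional features inside the network. The semantic layout then guides the model to generate photorealistic images from global information, improving the clarity and realism of semantic objects in images generated from semantic labels. On the semantic labels of the Cityscapes validation set, the method raises the mean pixel accuracy of semantic segmentation of the images generated by the cascaded refinement network model by 8.6% and the mIoU by 26.7%.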
Keywords: semantic label; image generation; cascaded refinement network; attention mechanism; end-to-end
CLC number: TP 183; Document code: A
Citation: ZHANG Xing, WANG Xu, ZHAO Wencang. Image generated from semantic labels based on supervised attention[J]. Journal of Qingdao University of Science and Technology (Natural Science Edition), 2021, 42(4): 94-101.
Image Generated from Semantic Labels Based on Supervised Attention
ZHANG Xing, WANG Xu, ZHAO Wencang
(College of Automation and Electronic Engineering, Qingdao University of Science and Technology, Qingdao 266061, China)
Abstract: Owing to the local connectivity of convolutional neural networks, a model that learns local features generates texture and style information well but learns high-level semantic features poorly, which leaves semantic objects in the generated images blurred and distorted. To improve the network's handling of global features and make the semantic objects in the generated images clearer and more realistic, we propose a supervised attention mechanism for the end-to-end cascaded refinement network image generation model. The multi-dimensional, low-resolution features with a large receptive field output by the first refinement module are fused with the multi-dimensional semantic features in the semantic labels, strengthening the global consistency among the multi-dimensional features in the network. Using the semantic layout to guide the model to generate images from global information improves the clarity and realism of semantic objects in images generated from semantic labels. For images generated by the cascaded refinement network from the semantic labels of the Cityscapes validation set, the method increases the mean pixel accuracy of semantic segmentation by 8.6% and the mIoU by 26.7%.
Key words: semantic label; image generation; cascaded refinement network; attention mechanism; end-to-end
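The abstract reports its results as the pixel accuracy and mIoU of a semantic segmentation of the generated images. As a minimal sketch of how such metrics are commonly computed from a confusion matrix, the Python below uses hypothetical helper names (`confusion_matrix`, `pixel_accuracy`, `mean_iou`) and a toy label map. It assumes overall pixel accuracy and a per-class intersection-over-union averaged over the classes that occur. The exact evaluation protocol of the paper is not given in this section and may differ.

```python
def confusion_matrix(true_labels, pred_labels, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of all pixels whose predicted class equals the true class."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def mean_iou(matrix):
    """Average of TP / (TP + FP + FN) over classes, skipping absent classes."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # true c, predicted other
        fp = sum(matrix[r][c] for r in range(n)) - tp   # predicted c, true other
        union = tp + fp + fn
        if union:  # assumption: ignore classes absent from both maps
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (made-up data): a flattened 2x3 label map with three classes.
true_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(true_labels, pred_labels, num_classes=3)
print("pixel accuracy:", pixel_accuracy(cm))
print("mIoU:", mean_iou(cm))
```

In an evaluation of this kind, `true_labels` would be the input semantic label map and `pred_labels` the output of a segmentation network applied to the generated image. Both are flattened to one class index per pixel.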
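Received: 2020-05-03
Funding: China Scholarship Council project (201608370049); National Natural Science Foundation of China (61171131); Shandong Province Key Research and Development Program (YD01033).
About the author: ZHANG Xing (born 1994), female, master's student. *Corresponding author.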