An Ensemble Learning Algorithm Based on Sampling and Reduction

Posted: 2016-08-22

JIANG Feng, ZHANG Youqiang, DU Junwei, LIU Guozhu, FENG Yunxia

(College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China)

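Abstract: An important goal of ensemble learning is to obtain a set of highly diverse base classifiers from which to build an ensemble classifier. To this end, an ensemble learning algorithm based on sampling and reduction, ELSR, is proposed. The algorithm trains its base classifiers with a multimodal perturbation strategy. First, k sampling sets are drawn from the training set by repeated sampling; second, each sampling set is reduced with the attribute reduction technique from rough sets; third, one base classifier is trained on each reduced sampling set; finally, a validation set is used to test the performance of each base classifier, and a suitable group of base classifiers is selected according to the test results to build the ensemble classifier. Experiments on UCI data sets show that, when the base classifiers are trained with the KNN algorithm or the C4.5 algorithm, the classification performance of ELSR is always better than that of existing ensemble learning algorithms.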

Keywords: selective ensemble; sampling; attribute reduction; rough sets; multimodal perturbation

CLC number: TP 181; Document code: A

An Ensemble Learning Algorithm Based on Sampling and Reduction

JIANG Feng, ZHANG Youqiang, DU Junwei, LIU Guozhu, FENG Yunxia

(College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China)

Abstract: An important goal of ensemble learning is to obtain a set of diverse base classifiers for constructing an ensemble classifier. To achieve this goal, an ensemble learning algorithm based on sampling and reduction, called ELSR, is proposed. ELSR uses a multimodal perturbation strategy to train its base classifiers. First, a multiple sampling strategy generates k sampling sets from the training set. Second, the attribute reduction technique from rough sets is applied to reduce each sampling set. Third, a base classifier is trained on each reduced sampling set. Finally, a validation set is used to test the performance of each base classifier and, according to the test results, a set of appropriate base classifiers is selected to construct the ensemble classifier. Experimental results on UCI data sets show that, when the KNN algorithm or the C4.5 algorithm is used to train the base classifiers, the classification performance of ELSR is always better than that of current ensemble learning algorithms.

Key words: selective ensemble; sampling; attribute reduction; rough sets; multimodal perturbation
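
The four steps listed in the abstract (sample, reduce, train, select) can be illustrated with a minimal Python sketch. The abstract does not give the details of any step, so everything below is an assumption made for illustration: bootstrap sampling for step 1, a QuickReduct-style greedy reduct based on the rough-set dependency degree for step 2, 1-nearest-neighbour with Hamming distance on discrete attributes as the base learner, keeping the better half of the base classifiers by validation accuracy for step 4, and majority voting to combine them. All function names are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def dependency(rows, labels, attrs):
    """Rough-set dependency degree: the fraction of rows that fall in an
    equivalence class (rows equal on `attrs`) containing a single label."""
    groups = defaultdict(list)
    for row, y in zip(rows, labels):
        groups[tuple(row[a] for a in attrs)].append(y)
    consistent = sum(len(ys) for ys in groups.values() if len(set(ys)) == 1)
    return consistent / len(rows)


def greedy_reduct(rows, labels):
    """Assumed stand-in for the reduction step: greedily add the attribute
    that raises the dependency degree most, until it matches the degree
    obtained with the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        best = max((a for a in all_attrs if a not in chosen),
                   key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    return chosen or all_attrs  # fall back to all attributes if none is needed


def knn_predict(rows, labels, attrs, x, k=1):
    """k-NN with Hamming distance restricted to the reduced attributes."""
    def dist(row):
        return sum(row[a] != x[a] for a in attrs)
    nearest = sorted(range(len(rows)), key=lambda i: dist(rows[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


def train_elsr_sketch(train, val, n_sets=5, seed=0):
    """`train` and `val` are lists of (row, label) pairs.
    Returns the selected base classifiers as (accuracy, rows, labels, attrs)."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_sets):
        sample = [rng.choice(train) for _ in train]   # step 1: bootstrap sample
        rows = [r for r, _ in sample]
        labels = [y for _, y in sample]
        attrs = greedy_reduct(rows, labels)            # step 2: attribute reduction
        # Step 3: k-NN is lazy, so "training" just stores the reduced sample.
        hits = sum(knn_predict(rows, labels, attrs, x) == y for x, y in val)
        members.append((hits / len(val), rows, labels, attrs))  # step 4: validate
    keep = max(1, (n_sets + 1) // 2)                   # assumed selection rule
    return sorted(members, key=lambda m: m[0], reverse=True)[:keep]


def predict(members, x):
    """Combine the selected base classifiers by majority vote (assumed)."""
    votes = Counter(knn_predict(rows, labels, attrs, x)
                    for _, rows, labels, attrs in members)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy discrete data: the label equals attribute 0; attributes 1-2 are noise.
    data = [((a, b, c), a) for a in (0, 1) for b in range(3) for c in range(3)]
    ensemble = train_elsr_sketch(train=data, val=data[::2], n_sets=5)
    print(predict(ensemble, (1, 2, 0)))
```

Perturbing both the examples (sampling) and the attributes (reduction) is what makes the strategy multimodal: each base classifier sees a different sample and, potentially, a different attribute subset. A C4.5-style decision tree could be substituted for the k-NN base learner without changing the rest of the sketch.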

Received: 2015-09-05

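Funding: National Natural Science Foundation of China (61303193, 61273180); Natural Science Foundation of Shandong Province (ZR2011FQ005, ZR2012FL17); Science and Technology Program of Higher Education Institutions of Shandong Province (J11LG05).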

About the author: JIANG Feng (born 1978), male, associate professor.

Article ID: 1672-6987(2016)04-0451-06; DOI: 10.16351/j.1672-6987.2016.04.019

Copyright © 2011-2017 Journal of Qingdao University of Science and Technology (Natural Science Edition)