Decision Tree Classification Algorithm Based on Dependency Decision Entropy
WANG Xiling, JIANG Feng*, ZHANG Youqiang, LIU Guozhu
(College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China)
Abstract: To address the problems of traditional decision tree algorithms based on information entropy, this paper considers information entropy from the perspective of rough set theory. We define a new concept, dependency decision entropy, and propose a decision tree algorithm based on it, called DTDDE. In DTDDE, dependency decision entropy is used to measure the significance of each condition attribute, and the attribute with the greatest significance is selected as the current splitting attribute. Experiments on several UCI data sets show that DTDDE achieves better classification performance than existing decision tree algorithms.
Key words: decision tree; information entropy; rough sets; dependency decision entropy; attribute significance
CLC number: TP 181; Document code: A
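Received: 2015-11-03
Foundation items: National Natural Science Foundation of China (60802042, 61273180); Natural Science Foundation of Shandong Province (ZR2011FQ005, ZR2012FL17); Science and Technology Program of Higher Education Institutions of Shandong Province (J11LG05).
About the authors: WANG Xiling (born 1988), female, master's student. *Corresponding author.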
Article ID: 1672-6987(2016)06-0687-06; DOI: 10.16351/j.16726987.2016.06.018