Article ID: 1672-6987(2022)01-0111-09; DOI: 10.16351/j.1672-6987.2022.01.016
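LIU Hui1, SHAO Fubo2,3, GONG Xiang1* (1. College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China; 2. State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044, China;
3. Technical Department, CRRC Academy Co., Ltd., Beijing 100070, China)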
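Abstract: This work compares several classical correlation coefficients: the Pearson correlation coefficient, the Spearman correlation coefficient, the distance correlation coefficient, the maximum information coefficient, and the HHG correlation coefficient. Specifically, under different sample sizes and noise levels, the associations between variables of different types (linear, nonlinear monotonic, non-monotonic, and non-functional) are examined separately, and the statistical power of each coefficient is obtained. The analysis shows that the Pearson and Spearman correlation coefficients are better suited to measuring linear and nonlinear monotonic relationships, the maximum information coefficient is better suited to relationships with periodicity, and HHG is better suited to non-functional relationships. This study provides a basis for selecting a correlation coefficient when mining different kinds of relationships.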
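Key words: correlation; Pearson correlation coefficient; Spearman correlation coefficient; distance correlation coefficient; maximum information coefficient; HHG; statistical power
CLC number: O 221.5; Document code: A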
Citation: LIU Hui, SHAO Fubo, GONG Xiang. Comparison of classical correlation coefficients and statistical power[J]. Journal of Qingdao University of Science and Technology (Natural Science Edition), 2022, 43(1): 111-119.
Comparison of Classical Correlation Coefficients and Statistical Power
LIU Hui1, SHAO Fubo2,3, GONG Xiang1
(1.College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China;
2.State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044, China;
3.Technical Department, CRRC Academy Co., Ltd., Beijing 100070, China)
Abstract: This paper compares several classical correlation coefficients: the Pearson product-moment correlation coefficient, the Spearman correlation coefficient, the distance correlation coefficient, the maximum information coefficient, and HHG. Specifically, the association between linear, nonlinear, and non-functional variables is studied under different data scales and noise levels, and the statistical power of each correlation coefficient is obtained. The Pearson product-moment and Spearman correlation coefficients are found to be more suitable for measuring linear and nonlinear monotonic associations, the maximum information coefficient is more suitable for associations with periodicity, and HHG is a better measure of non-functional associations. This research can guide the choice of correlation coefficient when mining different types of association.
Key words: correlation; Pearson correlation coefficient; Spearman correlation coefficient; distance correlation coefficient; maximum information coefficient; HHG; statistical power
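To make the notion of statistical power used in the abstract concrete, the following is a minimal sketch of how power might be estimated by simulation for three of the coefficients compared. It is not the authors' procedure: the uniform design for x, the additive Gaussian noise model, the permutation-based null threshold, and the names `estimate_power` and `relation` are all assumptions made for illustration. The maximum information coefficient and HHG are omitted because they need dedicated implementations.

```python
import numpy as np


def pearson(x, y):
    """Pearson product-moment correlation of two 1-D samples."""
    return float(np.corrcoef(x, y)[0, 1])


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks.

    Ranks come from a double argsort, which breaks ties arbitrarily.
    That is adequate for continuous data, where ties have probability zero.
    """
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)


def distance_correlation(x, y):
    """Sample distance correlation (biased V-statistic form), 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise distance matrices.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double centering: subtract row and column means, add the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom <= 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def estimate_power(stat, relation, n=100, noise=1.0, trials=200, alpha=0.05, seed=0):
    """Estimate the power of `stat` to detect y = relation(x) + noise.

    Assumed scheme (hypothetical, for illustration): the rejection threshold
    is the (1 - alpha) quantile of |stat| on data where x has been permuted,
    which breaks any dependence; power is the fraction of dependent samples
    whose |stat| exceeds that threshold.
    """
    rng = np.random.default_rng(seed)
    null_vals = np.empty(trials)
    alt_vals = np.empty(trials)
    for t in range(trials):
        x = rng.uniform(-1.0, 1.0, n)
        y = relation(x) + noise * rng.normal(size=n)
        alt_vals[t] = abs(stat(x, y))
        null_vals[t] = abs(stat(rng.permutation(x), y))
    threshold = np.quantile(null_vals, 1.0 - alpha)
    return float(np.mean(alt_vals > threshold))


if __name__ == "__main__":
    for name, stat in [("Pearson", pearson), ("Spearman", spearman),
                       ("dCor", distance_correlation)]:
        linear = estimate_power(stat, lambda v: v, n=50, noise=0.5)
        quadratic = estimate_power(stat, lambda v: v ** 2, n=50, noise=0.1)
        print(name, "linear:", linear, "quadratic:", quadratic)
```

Varying `n` and `noise` over a grid and repeating the call for each relation type would give power curves of the kind the abstract refers to, under the assumptions stated above.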
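Received: 2021-02-10
Funding: Joint Fund of the National Natural Science Foundation of China and Shandong Province (U1906215); Open Fund of the Key Laboratory of Coastal Environmental Processes and Ecological Remediation, Chinese Academy of Sciences (Yantai Institute of Coastal Zone Research) (2020KFJJ04).
Biography: LIU Hui (born 1996), male, master's student. *Corresponding author.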