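Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences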

Chinese Journal of Management Science ›› 2024, Vol. 32 ›› Issue (10): 41-55. doi: 10.16381/j.cnki.issn1003-207x.2021.2499 cstr: 32146.14.j.cnki.issn1003-207x.2021.2499


A Corporate Loan Default Prediction Model Based on JS-Divergence Indicator Discretization

Long Shen, Ying Zhou

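  1. School of Economics and Management, Dalian University of Technology, Dalian, Liaoning 116024
  • Received: 2021-12-02 Revised: 2024-02-03 Online: 2024-10-25 Published: 2024-11-09
  • Corresponding author: Ying Zhou E-mail: zhouying@dlut.edu.cn
  • Funding:
    Liaoning Provincial Social Science Planning Fund Project (L21BGL011)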

Credit Scoring Model Based on JS Divergence Feature Discretization

Long Shen, Ying Zhou

  1. School of Economics and Management, Dalian University of Technology, Dalian 116024, China
  • Received: 2021-12-02 Revised: 2024-02-03 Online: 2024-10-25 Published: 2024-11-09
  • Contact: Ying Zhou E-mail: zhouying@dlut.edu.cn

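Abstract (translated from the Chinese):

Accurately predicting whether a listed company will default on its bank loans is critically important both for the company's own management and for investors' decisions. This study makes three contributions. First, it introduces JS divergence into credit risk analysis as a means of discretizing indicators, which ensures that the resulting value intervals can discriminate between defaulting and non-defaulting enterprises. Each interval is then assigned a WOE score based on the proportions of defaulting and non-defaulting enterprises it contains, so that a higher WOE score corresponds to better credit standing. This avoids the problem in existing studies where subjectively chosen interval boundaries leave classical models such as logistic regression unable to exploit the data fully. Second, it uses least-angle regression (Lars) with an L1-norm penalty (Lasso) and, taking the minimum Bayesian information criterion (BIC) as the objective, selects the indicators with nonzero weights in the Lasso-Lars regression as the optimal indicator set, which ensures that the selected set has the strongest default prediction ability and the least redundancy. This addresses a shortcoming of existing studies, which often ignore redundancy among indicators when selecting an indicator set and thereby increase model complexity. Third, it fits a logistic regression model and derives the optimal default classification threshold by maximizing the G-mean of prediction accuracy, which avoids the misclassification of defaulting enterprises that results from the conventional threshold of 0.5. The results show that the 43 discretized indicators selected from 159 candidates both discriminate default status and conform to the "5C" principles of credit, so the indicator system is scientifically sound and reasonable. In addition to financial factors such as the cash ratio, the enterprise value multiple, and the gross profit margin on sales, which reflect solvency, profitability, operating capacity, and growth capacity, non-financial factors such as the management shareholding ratio, the number of shareholders' meetings held, and whether an internal control evaluation report is disclosed, as well as external macroeconomic factors such as the Engel coefficient, regional GDP per capita, and the final consumption rate, also have a significant effect on corporate default.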
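The first step described above (choosing interval boundaries by JS divergence, then scoring each interval with WOE) can be illustrated with a minimal sketch. The abstract does not give the authors' exact algorithm, so everything here is an assumption made for illustration: a single cut point is chosen by exhaustive search, the helper names (`js_divergence`, `bin_counts`, `best_cut`, `woe_scores`) are hypothetical, the smoothing constant `eps` is our own addition to avoid taking the logarithm of zero, and the toy data are invented.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    def kl(a, b):
        # Terms with a_i = 0 contribute nothing to the KL sum.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def bin_counts(values, labels, cuts):
    """Count non-defaulters (label 0) and defaulters (label 1) in each interval."""
    good = [0] * (len(cuts) + 1)
    bad = [0] * (len(cuts) + 1)
    for v, y in zip(values, labels):
        idx = sum(v > c for c in cuts)  # values <= cuts[0] fall in interval 0
        if y == 1:
            bad[idx] += 1
        else:
            good[idx] += 1
    return good, bad

def best_cut(values, labels):
    """Assumed criterion: the single cut that maximizes the JS divergence between
    the distributions of non-defaulters and defaulters over the two intervals."""
    def score(cut):
        good, bad = bin_counts(values, labels, [cut])
        p = [g / sum(good) for g in good]
        q = [b / sum(bad) for b in bad]
        return js_divergence(p, q)
    candidates = sorted(set(values))[:-1]  # cutting at the maximum leaves an empty interval
    return max(candidates, key=score)

def woe_scores(good, bad, eps=0.5):
    """WOE per interval = ln(share of non-defaulters / share of defaulters),
    so a larger score means better credit. eps is an assumed smoothing term."""
    g_total = sum(good) + eps * len(good)
    b_total = sum(bad) + eps * len(bad)
    return [math.log(((g + eps) / g_total) / ((b + eps) / b_total))
            for g, b in zip(good, bad)]

# Invented toy data: one indicator (e.g. a cash ratio) and default flags (1 = default).
cash_ratio = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
default = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]

cut = best_cut(cash_ratio, default)
good, bad = bin_counts(cash_ratio, default, [cut])
scores = woe_scores(good, bad)
print(cut, scores)
```

In this toy example the interval dominated by defaulters receives a negative WOE score and the other a positive one, matching the convention that a higher WOE indicates better credit. A multi-interval version would apply the same search recursively or over several cut points at once.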

Keywords: JS divergence, indicator discretization, BIC, Lasso Lars regression, default prediction

Abstract:

Accurately predicting whether listed companies will default on their bank loans is very important for the management of those companies and for investors' investment decisions. This study makes three contributions. First, JS divergence is introduced into the field of credit risk for the first time to discretize features, which ensures that the resulting feature intervals can discriminate the default status of enterprises. Each feature interval is scored by WOE according to the proportions of defaulting and non-defaulting enterprises in it, so that a larger WOE score indicates better credit status. This addresses the problem that classical models such as logistic regression cannot fully exploit the data when feature interval boundaries are set subjectively, as in existing studies. Second, least-angle regression (Lars) with an L1-norm penalty term (Lasso) is adopted, with the minimum Bayesian information criterion (BIC) as the objective. The features whose weights are nonzero in the Lasso-Lars regression equation are taken as the optimal feature combination, which ensures that the selected combination has the maximum default discrimination ability and the minimum redundancy. This remedies a deficiency of existing research, which often ignores redundancy within the selected feature combination and thereby increases model complexity. Third, the optimal default classification threshold of the logistic regression model is derived by maximizing the G-mean, which avoids the inaccurate identification of defaulting enterprises caused by the empirical threshold of 0.5. The results show that the 43 discrete features selected from 159 features not only distinguish default status but also conform to the "5C" principles of credit, so the feature system is scientific and reasonable. In addition to financial factors such as the cash ratio, the enterprise value multiple, and the gross profit margin on sales, which reflect corporate solvency, profitability, operating capacity, and growth capacity, non-financial factors such as the management shareholding ratio, the number of shareholders' meetings, and whether an internal control evaluation report is disclosed, as well as external macroeconomic factors such as the Engel coefficient, regional GDP per capita, and the final consumption rate, significantly affect corporate default.
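For reference, the feature selection step can be written in a generic textbook form. This is a sketch in standard notation, not the paper's own formulation: the symbols $y$, $X$, $\beta$, $\lambda$, $n$, $k(\lambda)$, $\hat{L}$ and $S$ are introduced here for illustration, and the exact BIC variant used by the authors is not stated in the abstract.

```latex
\hat{\beta}(\lambda) = \arg\min_{\beta}\; \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2 + \lambda\,\lVert \beta \rVert_1 ,
\qquad
\mathrm{BIC}(\lambda) = -2\ln \hat{L}(\lambda) + k(\lambda)\,\ln n ,
\qquad
\lambda^{*} = \arg\min_{\lambda}\,\mathrm{BIC}(\lambda) ,
\qquad
S = \{\, j : \hat{\beta}_j(\lambda^{*}) \neq 0 \,\}
```

Here $\hat{L}(\lambda)$ is the maximized likelihood of the fitted model and $k(\lambda)$ is the number of nonzero coefficients, the usual degrees-of-freedom estimate for the Lasso. The Lars algorithm traces the whole path of $\hat{\beta}(\lambda)$, so the BIC can be evaluated at each point of the path, and $S$ is the retained feature set.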

Key words: JS divergence, feature discretization, BIC, Lasso Lars regression, default prediction
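The threshold selection step named in the abstract, choosing the classification cutoff that maximizes G-mean rather than fixing it at 0.5, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names `g_mean` and `best_threshold` are hypothetical, labels use 1 for default, an enterprise is classified as defaulting when its predicted default probability is at or above the threshold, the candidate thresholds are simply the distinct predicted probabilities, and the data are invented.

```python
import math

def g_mean(y_true, y_prob, threshold):
    """Geometric mean of the recall on defaulters (label 1) and the recall
    on non-defaulters (label 0) at the given probability threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return math.sqrt((tp / pos) * (tn / neg))

def best_threshold(y_true, y_prob):
    """Search the distinct predicted probabilities for the G-mean-maximizing cutoff."""
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda t: g_mean(y_true, y_prob, t))

# Invented, imbalanced toy sample: 2 defaulters among 10 enterprises.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.45, 0.30, 0.35, 0.20, 0.15, 0.10, 0.10, 0.05, 0.05, 0.02]

t_star = best_threshold(y_true, y_prob)
print(t_star, g_mean(y_true, y_prob, t_star), g_mean(y_true, y_prob, 0.5))
```

With imbalanced data such as this, no predicted probability reaches 0.5, so the default cutoff identifies no defaulters and its G-mean is zero, whereas a lower cutoff balances the two recall rates. In practice the probabilities would come from the fitted logistic regression model, and the threshold would be chosen on training or validation data.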

CLC number: