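Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences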

Chinese Journal of Management Science, 2024, Vol. 32, Issue (10): 41-55. doi: 10.16381/j.cnki.issn1003-207x.2021.2499


Credit Scoring Model Based on JS Divergence Feature Discretization

Long Shen, Ying Zhou

  1. School of Economics and Management, Dalian University of Technology, Dalian 116024, China
  • Received: 2021-12-02 Revised: 2024-02-03 Online: 2024-10-25 Published: 2024-11-09
  • Contact: Ying Zhou, E-mail: zhouying@dlut.edu.cn

Abstract:

Accurately predicting whether listed companies will default on their bank loans matters both to the management of those companies and to investors' investment decisions. This study makes three contributions. First, Jensen-Shannon (JS) divergence is introduced into credit risk modelling to discretize features, which the authors describe as a first in this field, so that the resulting numerical intervals retain the ability to discriminate between defaulting and non-defaulting enterprises. Each interval is then scored by weight of evidence (WOE) according to the proportions of defaulting and non-defaulting enterprises it contains, such that a higher WOE score indicates better credit status. This addresses a weakness of existing studies, in which subjectively set interval boundaries leave classical models such as logistic regression unable to extract enough information from the data. Second, a least-angle regression (Lars) equation with an L1-norm penalty (Lasso) is fitted with the objective of minimizing the Bayesian information criterion (BIC). The features whose weights in the Lasso-Lars regression equation are nonzero are taken as the optimal feature combination, so that the selected combination has maximum default discrimination ability and minimum redundancy. This remedies the tendency of existing research to ignore redundancy among the selected features, which increases model complexity. Third, the optimal default-discrimination threshold of the logistic regression model is derived by maximizing the G-mean, which avoids the misjudgment of defaulting enterprises caused by a fixed threshold of 0.5. The results show that the 43 discretized features selected from 159 candidate features not only discriminate default status but also conform to the "5C" principles of credit.
The resulting feature system is scientifically sound. Financial factors such as the cash ratio, enterprise value multiple and gross profit margin on sales, which reflect corporate solvency, profitability, operating capacity and growth capacity, significantly influence corporate default. So do non-financial factors such as the management shareholding ratio, the number of shareholders' meetings and whether an internal control evaluation report is disclosed, as well as external macroeconomic factors such as Engel's coefficient, per capita GDP and the final consumption rate.
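
The abstract names its building blocks without giving formulas, so the following is only a minimal Python sketch of how JS-divergence-based discretization, WOE scoring and G-mean threshold selection might look. Everything specific here is an assumption for illustration rather than the authors' procedure: the helper names (`js_divergence`, `best_cut`, `woe_scores`, `gmean_threshold`), the restriction to a single binary cut point per feature, the coding of default as label 1, and the requirement that every interval contain both classes. The WOE sign is chosen so that a higher score means better credit, as the abstract states.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) of two discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute 0 by convention.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def bin_shares(values, labels, cut, target):
    """Shares of class `target` falling at or below `cut` and above it."""
    xs = [v for v, y in zip(values, labels) if y == target]
    low = sum(1 for v in xs if v <= cut)
    return [low / len(xs), (len(xs) - low) / len(xs)]

def best_cut(values, labels):
    """Hypothetical helper: the midpoint cut that maximizes the JS divergence
    between defaulters (label 1) and non-defaulters (label 0)."""
    pts = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    return max(
        candidates,
        key=lambda c: js_divergence(
            bin_shares(values, labels, c, 1), bin_shares(values, labels, c, 0)
        ),
    )

def woe_scores(good, bad):
    """WOE per interval from counts of non-defaulters (good) and defaulters (bad).
    Assumes every interval contains at least one enterprise of each class."""
    g_tot, b_tot = sum(good), sum(bad)
    return [math.log((g / g_tot) / (b / b_tot)) for g, b in zip(good, bad)]

def gmean_threshold(probs, labels):
    """Threshold on predicted default probability that maximizes
    G-mean = sqrt(true positive rate * true negative rate)."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_g = 0.5, -1.0
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= t)
        tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < t)
        g = math.sqrt((tp / pos) * (tn / neg))
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g

# Toy data, invented purely for illustration.
feature = [2, 3, 5, 8, 12, 15, 20, 30, 40]
default = [1, 1, 1, 0, 1, 0, 0, 0, 0]
cut = best_cut(feature, default)

scores = woe_scores(good=[10, 30, 60], bad=[6, 3, 1])

predicted = [0.05, 0.1, 0.15, 0.2, 0.3, 0.35, 0.6, 0.9]
observed = [0, 0, 0, 0, 1, 0, 1, 1]
threshold, gmean = gmean_threshold(predicted, observed)
```

On this toy data the G-mean-optimal threshold falls below 0.5, which is the situation the abstract has in mind when defaults are rare. A fuller discretization would presumably apply the cut search recursively or over several cut points at once; the abstract does not say which.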

Key words: JS divergence, feature discretization, BIC, Lasso Lars regression, default prediction
