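Can Dimensionality Reduction Enhance Deep Neural Networks in Learning Credit Risk? A Study Based on Factor-Augmented Explainable Learning Models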

doi:10.16381/j.cnki.issn1003-207x.2024.1256

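Abstract

Abstract (translated from the Chinese):

This paper combines big-data dimensionality reduction with the approximation power of deep learning to propose a new interpretable deep learning method, the factor-enhanced additive neural network model. Based on this method, it builds a credit risk measurement model, conducts an empirical analysis using company data from China's bond market, and compares the results with currently popular interpretable machine learning methods. The findings are: (1) the proposed model outperforms external ratings, linear regression, and currently popular interpretable machine learning methods on every evaluation metric, and it generalizes better; (2) price indicators dynamically reflect corporate risk premia, are the most important and among the most timely inputs for credit risk measurement, and improve the model's ability to capture risk dynamically; (3) the proposed method also supports the case for introducing dimensionality reduction and deep learning into rating processes and methods in order to improve rating quality. The proposed method is a useful complement to research on interpretable deep learning and corporate credit risk measurement, and it offers a practically valuable credit risk measurement approach for credit risk prevention and credit rating.

Keywords: credit risk, interpretable deep learning, FAAN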

Abstract:

Accurately assessing credit risk has become harder as financial markets grow more complex and big data becomes more widely available, and traditional models often fail to capture dynamic risk factors. This paper addresses the problem by proposing the Factor-Enhanced Additive Neural Network (FAAN) model, which combines big-data dimensionality reduction with deep learning approximation to improve both accuracy and interpretability in credit risk measurement. The central research question is how to design an interpretable deep learning model that outperforms existing methods in measuring corporate credit risk. FAAN approaches this by using factor-based dimensionality reduction to remove noise and deep learning to capture complex non-linear relationships. It proves more effective than traditional models such as linear regression and than popular interpretable machine learning methods (e.g., GAM, EBM). Empirical analysis using real-world data on companies in China's bond market shows that the FAAN model: (1) consistently outperforms external credit ratings and the other models across the evaluation metrics, demonstrating superior generalization; (2) highlights price indicators as crucial dynamic factors that enhance credit risk detection; and (3) supports the view that combining dimensionality reduction with deep learning significantly improves credit rating quality. The study contributes to research on interpretable deep learning and corporate credit risk, and it offers a practical tool for dynamic risk assessment and credit rating.

Key words: credit risk, interpretable deep learning, FAAN
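
The abstract describes FAAN only at a high level: factor-based dimensionality reduction followed by an additive neural network. The sketch below illustrates that general shape and nothing more. It assumes the factor is the leading principal component (computed by power iteration), uses synthetic data, and uses tiny one-input ReLU sub-networks with random, untrained weights; the helper names (`first_factor`, `make_subnet`, `additive_forward`) are hypothetical and are not taken from the paper.

```python
import math
import random


def standardize(columns):
    """Center each indicator column and scale it to unit variance."""
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
        out.append([(v - mean) / std for v in col])
    return out


def first_factor(columns, n_iter=200):
    """Leading principal-component loadings and scores via power iteration."""
    p, n = len(columns), len(columns[0])
    cov = [[sum(a * b for a, b in zip(columns[i], columns[j])) / n
            for j in range(p)] for i in range(p)]
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(cov[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    scores = [sum(w[j] * columns[j][t] for j in range(p)) for t in range(n)]
    return w, scores


def make_subnet(rng, hidden=4):
    """One-input ReLU network with random, untrained weights (illustration only)."""
    w1 = [rng.gauss(0, 1) for _ in range(hidden)]
    b1 = [rng.gauss(0, 1) for _ in range(hidden)]
    w2 = [rng.gauss(0, 1) for _ in range(hidden)]

    def net(x):
        return sum(c * max(0.0, a * x + b) for a, b, c in zip(w1, b1, w2))

    return net


def additive_forward(inputs, subnets, bias=0.0):
    """Sum the per-input sub-network outputs; return the logit and each part."""
    parts = [net(x) for net, x in zip(subnets, inputs)]
    return bias + sum(parts), parts


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


rng = random.Random(0)
n_obs, n_indicators = 200, 5
common = [rng.gauss(0, 1) for _ in range(n_obs)]  # synthetic shared driver
raw = [[common[t] + 0.5 * rng.gauss(0, 1) for t in range(n_obs)]
       for _ in range(n_indicators)]
X = standardize(raw)
loadings, factor = first_factor(X)

# Inputs for observation 0: the factor score plus two raw indicators.
subnets = [make_subnet(rng) for _ in range(3)]
obs = [factor[0], X[0][0], X[1][0]]
logit, parts = additive_forward(obs, subnets)
prob = sigmoid(logit)
```

Because the logit is a plain sum, each entry of `parts` can be read as the contribution of one input, which is the property that makes additive models interpretable. A real model would train the sub-networks and choose the number of factors from data.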

CLC number:

F832.5

Pu Wang, Kunpeng Li, Li Su. Can Dimensionality Reduction Enhance Deep Neural Networks in Learning Credit Risk? A Study Based on Factor-Augmented Explainable Learning Models[J]. Chinese Journal of Management Science, 2026, 34(5): 57-71.

Figures/Tables 12

Figure 1

Table 1

Simulation results for the additive network structure with relationship-type identification

Distribution	Coefficient	X1	X2	X3	X4	X5	X6	X7
Normal	$\beta_i^{l}$	0.441	0.000	0.000	0.406	0.000	0.410	0.409
Normal	$\beta_i^{nl}$	0.001	0.024	0.188	0.000	0.722	0.000	0.000
Uniform	$\beta_i^{l}$	0.237	0.000	0.000	0.238	0.000	0.237	0.238
Uniform	$\beta_i^{nl}$	0.000	0.008	0.306	0.000	0.943	0.000	0.000
Gamma	$\beta_i^{l}$	0.197	0.000	0.000	0.191	0.000	0.200	0.198
Gamma	$\beta_i^{nl}$	0.000	0.002	0.059	0.000	0.185	0.000	0.000
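
One way to read the table above is to label each feature by whichever of its two coefficients is larger. The sketch below applies that rule to the reported values. It assumes the first coefficient row of each distribution is the linear weight and the second the nonlinear weight; the decision rule and the name `relationship_type` are illustrative assumptions, not the paper's stated procedure.

```python
features = ["X1", "X2", "X3", "X4", "X5", "X6", "X7"]

# (linear row, nonlinear row) copied from the table above, per distribution.
table = {
    "normal": ([0.441, 0.000, 0.000, 0.406, 0.000, 0.410, 0.409],
               [0.001, 0.024, 0.188, 0.000, 0.722, 0.000, 0.000]),
    "uniform": ([0.237, 0.000, 0.000, 0.238, 0.000, 0.237, 0.238],
                [0.000, 0.008, 0.306, 0.000, 0.943, 0.000, 0.000]),
    "gamma": ([0.197, 0.000, 0.000, 0.191, 0.000, 0.200, 0.198],
              [0.000, 0.002, 0.059, 0.000, 0.185, 0.000, 0.000]),
}


def relationship_type(b_linear, b_nonlinear):
    """Label a feature by whichever coefficient is larger (assumed reading)."""
    if b_linear == 0.0 and b_nonlinear == 0.0:
        return "inactive"
    return "linear" if b_linear > b_nonlinear else "nonlinear"


labels = {
    dist: {name: relationship_type(bl, bn)
           for name, bl, bn in zip(features, lin, nonlin)}
    for dist, (lin, nonlin) in table.items()
}

for dist, result in labels.items():
    print(dist, result)
```

Under this rule the three distributions give the same split: X1, X4, X6, and X7 come out linear, while X2, X3, and X5 come out nonlinear.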

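Table 2

Validation of Lasso's improvement of model generalization

Distribution	LASSO screening of partially linear features	Training $R^2$	Test $R^2$
Normal	Yes	0.9965	0.9962
Normal	No	0.8152	0.8099
Uniform	Yes	0.9994	0.9994
Uniform	No	0.8207	0.8187
Gamma	Yes	0.5794	0.5783
Gamma	No	0.0000	0.0000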
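
For reference, the sketch below computes the standard coefficient of determination, $R^2 = 1 - SS_{res}/SS_{tot}$, and the gap between training and test $R^2$ for each row of the table above. The toy vector `y` is made up for illustration. An $R^2$ of zero is what a constant prediction equal to the sample mean produces, but the source does not say whether that is what happened in the gamma case without screening.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot


# Toy data (made up): a mean-only prediction versus a perfect one.
y = [1.0, 2.0, 4.0, 7.0]
mean_only = [sum(y) / len(y)] * len(y)
r2_mean = r_squared(y, mean_only)
r2_perfect = r_squared(y, y)

# (training R^2, test R^2) copied from the table above.
reported = {
    ("normal", "with LASSO"): (0.9965, 0.9962),
    ("normal", "without LASSO"): (0.8152, 0.8099),
    ("uniform", "with LASSO"): (0.9994, 0.9994),
    ("uniform", "without LASSO"): (0.8207, 0.8187),
    ("gamma", "with LASSO"): (0.5794, 0.5783),
    ("gamma", "without LASSO"): (0.0000, 0.0000),
}
gaps = {key: train - test for key, (train, test) in reported.items()}

for key, gap in gaps.items():
    print(key, round(gap, 4))
```

The train-test gaps are small in every row; the larger difference in the table is between the rows with and without screening.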

Figure 2

Table 3

Table 4

Table 5

Model parameter analysis results

Parameter	Setting	Recall	Precision	F1	AUC
Learning rate	$10^{-3}$	0.833	0.341	0.484	0.893
	$10^{-5}$	0.889	0.340	0.492	0.941
	$10^{-7}$	0.861	0.272	0.413	0.906
Number of layers	2	0.861	0.307	0.453	0.917
	3	0.889	0.274	0.418	0.916
	7	0.933	0.298	0.452	0.951
	10	0.944	0.312	0.469	0.928
Neurons per layer	3	0.833	0.303	0.444	0.918
	5	0.917	0.275	0.423	0.920
	20	0.944	0.279	0.430	0.937
	70	0.889	0.364	0.516	0.929
	100	0.917	0.402	0.559	0.942
Number of factors	1	0.889	0.360	0.512	0.935
	2	0.917	0.351	0.508	0.929
	4	0.944	0.327	0.486	0.932
	5	0.935	0.330	0.487	0.949
	6	0.967	0.345	0.509	0.963
Indicator period replaced		0.835	0.256	0.391	0.898
Indicator period replaced, model retrained		0.949	0.389	0.552	0.943

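Table 6

Mechanism analysis of the independent variables

Variable category	Variable name	Importance	Total
Price	Issuance rate of the issuer's outstanding bonds	0.130	0.477
	Valuation yield of the issuer's outstanding bonds	0.348
Scale	Operating revenue	0.000	0.000
Profitability	Year-on-year growth rate of total operating revenue	0.027	0.082
	Year-on-year growth rate of net profit	0.023
	Gross profit margin	0.004
	EBITDA margin	0.028
Cash flow	Net cash flow from operating activities / operating revenue	0.000	0.046
	Net cash flow from operating activities / net income from operating activities	0.046
	Cash operating index	0.001
Operations	Inventory turnover	0.050	0.050
	Accounts receivable turnover	0.000
Solvency	Cash and cash equivalents / short-term debt	0.042	0.078
	Total debt / EBITDA	0.005
	EBITDA / interest expense	0.031
Interaction terms	$f_1(\cdot)$	0.142	0.267
	$f_2(\cdot)$	0.075
	$f_3(\cdot)$	0.051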
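
The category totals in the table above can be checked by summing the per-variable importances. The sketch below does that with the reported values; the English category keys and the 0.0015 tolerance are choices made here, on the assumption that small mismatches come from rounding each entry to three decimals.

```python
# Per-variable importances copied from the table above, grouped by category.
importance = {
    "price": [0.130, 0.348],
    "scale": [0.000],
    "profitability": [0.027, 0.023, 0.004, 0.028],
    "cash flow": [0.000, 0.046, 0.001],
    "operations": [0.050, 0.000],
    "solvency": [0.042, 0.005, 0.031],
    "interaction terms": [0.142, 0.075, 0.051],
}

# Category totals as reported in the table.
reported_total = {
    "price": 0.477,
    "scale": 0.000,
    "profitability": 0.082,
    "cash flow": 0.046,
    "operations": 0.050,
    "solvency": 0.078,
    "interaction terms": 0.267,
}

recomputed = {cat: sum(vals) for cat, vals in importance.items()}
max_gap = max(abs(recomputed[cat] - reported_total[cat]) for cat in importance)
top_category = max(reported_total, key=reported_total.get)

for cat in importance:
    print(f"{cat}: recomputed {recomputed[cat]:.3f}, reported {reported_total[cat]:.3f}")
```

The reported totals add up to 1.000, and the price category carries the largest share, in line with finding (2) of the abstract.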

Figure 3

Table 7

Table 8

Figure 4

Model	Training set				Test set
	Recall	Precision	F1	AUC	Recall	Precision	F1	AUC
Issuer rating	0.192	0.217	0.204	0.696	0.111	0.160	0.131	0.669
Issue rating	0.143	0.423	0.214	0.742	0.083	0.273	0.128	0.720
Logistic regression	0.910	0.346	0.502	0.883	0.871	0.300	0.446	0.857
EBM	0.855	0.348	0.494	0.862	0.758	0.284	0.413	0.799
GAMI-Net	0.987	0.178	0.301	0.899	0.939	0.157	0.270	0.871
FANAM	0.959	0.321	0.481	0.945	0.139	0.094	0.112	0.582
FAAN	0.945	0.342	0.502	0.955	0.944	0.391	0.553	0.940
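
F1 is the harmonic mean of precision and recall, $F1 = 2PR/(P+R)$, so the F1 column in the table above can be recomputed from the other two. The sketch below does this for the test-set columns. The 0.002 tolerance is an assumption that allows for the inputs having been rounded to three decimals.

```python
# Test-set (recall, precision, reported F1) copied from the table above.
test_metrics = {
    "Issuer rating": (0.111, 0.160, 0.131),
    "Issue rating": (0.083, 0.273, 0.128),
    "Logistic regression": (0.871, 0.300, 0.446),
    "EBM": (0.758, 0.284, 0.413),
    "GAMI-Net": (0.939, 0.157, 0.270),
    "FANAM": (0.139, 0.094, 0.112),
    "FAAN": (0.944, 0.391, 0.553),
}


def f1_score(recall, precision):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Absolute difference between the recomputed and the reported F1.
gaps = {
    model: abs(f1_score(recall, precision) - reported_f1)
    for model, (recall, precision, reported_f1) in test_metrics.items()
}
best_by_f1 = max(test_metrics, key=lambda m: test_metrics[m][2])

for model, gap in gaps.items():
    print(f"{model}: |recomputed - reported| = {gap:.4f}")
```

Every recomputed value agrees with the reported F1 to within rounding, and FAAN has the highest test-set F1 among the listed models.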