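Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences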

Chinese Journal of Management Science ›› 2023, Vol. 31 ›› Issue (4): 111-120. doi: 10.16381/j.cnki.issn1003-207x.2020.1344

• Articles •

Research on Default Discrimination Based on Sentiment Data

董冰洁, 迟国泰   

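  1. School of Economics and Management, Dalian University of Technology, Dalian, Liaoning 116024, China
  • Received: 2020-07-13 Revised: 2020-11-18 Published: 2023-05-06
  • Corresponding author: 董冰洁 (DONG Bing-jie, born 1992), male, Han ethnicity, from Jingshan, Hubei; PhD candidate, School of Economics and Management, Dalian University of Technology; research interest: credit rating. E-mail: 330081293@qq.com
  • Funding:
    National Natural Science Foundation of China, Key Program (71731003); National Natural Science Foundation of China, General Program (72071026, 72201098, 72271040, 72173096, 71971051, 71971034, 71873103); National Natural Science Foundation of China, Young Scientists Fund (71901055, 71903019); National Natural Science Foundation of China, Regional Science Fund (72161033); National Social Science Fund of China, Major Program (18ZDA095)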

Study on Default Prediction Based on Sentiment Data

DONG Bing-jie, CHI Guo-tai   

  1. School of Economics and Management, Dalian University of Technology, Dalian 116024, China
  • Received: 2020-07-13 Revised: 2020-11-18 Published: 2023-05-06
  • Contact: 董冰洁 E-mail:330081293@qq.com

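Abstract: Default discrimination is important to the credit decisions of commercial banks and lending institutions. This paper studies how to use sentiment data to build a default discrimination model that improves the ability of financial institutions to identify defaulting customers. Its contributions are twofold. First, it measures the emotional characteristics of loan descriptions with 21 emotions in 7 categories, which captures the multi-dimensional emotions implicit in a loan description and avoids the drawbacks of computing an overall sentiment polarity from only positive and negative classes, namely too few emotion types and ambiguous emotional meaning. Second, it builds the default discrimination model from both sentiment data and numerical data, avoiding the limited accuracy of models built on a single type of data. The results show that, compared with models built on a single type of data, the optimal critical point discrimination model built on both sentiment and numerical data achieves the highest default discrimination accuracy, with a significant drop in Type II error. Regression analysis shows that, among positive emotions, customers whose loan descriptions convey "belief" and "praise" are less likely to default than customers who convey emotions such as "wish", "love", "happiness", "respect", and "relief"; the "derogatory" emotion has a significant positive effect on default status; and the "sadness" emotion has a significant negative effect on default status.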

Keywords: sentiment data; critical point model; default discrimination

Abstract: The application of text data in finance has long been an active research topic. This paper studies how to use sentiment data to build a default prediction model that improves the ability of financial institutions to identify defaulting customers. Its contributions are twofold: (1) it measures the emotional characteristics of loan descriptions with 21 emotions in 7 categories, which captures the multi-dimensional emotions implicit in a loan description; (2) it uses sentiment data together with numerical data to construct the optimal critical point default discrimination model, which avoids the shortcomings of building such a model from a single type of data. The comparative analysis shows that the optimal critical point hybrid model built on both sentiment and numerical data achieves the highest default prediction accuracy, with a significant drop in Type II error. Regression analysis shows that, among positive emotions, customers whose loan descriptions convey "belief" and "praise" are less likely to default than customers who convey "wish", "love", "happiness", "respect", "relief", and other emotions; negative emotions have a weaker ability to identify defaults; the "derogatory" emotion is significantly positively correlated with default; and the "sadness" emotion is significantly negatively correlated with default. After controlling for customers' economic characteristics, these relationships still hold, and the results are robust.

Key words: sentiment data; text analysis; optimal critical point; mixed model; default prediction
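
The abstract does not say how the optimal critical point is chosen. As a rough illustration only, the sketch below assumes the critical point is the score cutoff that minimizes the sum of the Type I and Type II error rates, and that a Type II error means classifying a defaulting customer as non-defaulting. The function names `error_rates` and `best_cutoff`, the labeling convention (1 = default), and the toy scores are all hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def error_rates(scores: List[float], labels: List[int], cutoff: float) -> Tuple[float, float]:
    """Return (type_1, type_2) error rates when score >= cutoff predicts default.

    Assumed convention: label 1 = default, label 0 = non-default.
    Type I  = share of non-defaulters flagged as default.
    Type II = share of defaulters classified as non-default.
    """
    goods = [s for s, y in zip(scores, labels) if y == 0]
    defaults = [s for s, y in zip(scores, labels) if y == 1]
    type_1 = sum(s >= cutoff for s in goods) / len(goods)
    type_2 = sum(s < cutoff for s in defaults) / len(defaults)
    return type_1, type_2


def best_cutoff(scores: List[float], labels: List[int]) -> float:
    """Scan every observed score and keep the cutoff with the smallest total error rate."""
    candidates = sorted(set(scores))
    # Ties are broken in favor of the lower cutoff, which flags more customers.
    return min(candidates, key=lambda c: (sum(error_rates(scores, labels, c)), c))


# Toy predicted default scores, e.g. from a model fit on sentiment plus numerical features.
scores = [0.1, 0.2, 0.3, 0.5, 0.4, 0.7, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1]
cutoff = best_cutoff(scores, labels)
print(cutoff, error_rates(scores, labels, cutoff))
```

A lender that treats missed defaults as costlier than rejected good customers could weight the Type II term more heavily in the objective; that weighting is likewise an assumption, not something the abstract specifies.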
