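Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences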

Chinese Journal of Management Science ›› 2023, Vol. 31 ›› Issue (4): 111-120. doi: 10.16381/j.cnki.issn1003-207x.2020.1344

• Articles •

Study on Default Prediction Based on Sentiment Data

DONG Bing-jie, CHI Guo-tai   

  1. School of Economics and Management, Dalian University of Technology, Dalian 116024, China
  • Received:2020-07-13 Revised:2020-11-18 Published:2023-05-06
  • Contact: DONG Bing-jie E-mail: 330081293@qq.com

Abstract: The application of text data in finance is an active research topic. This paper studies how to use sentiment data to build a default prediction model that improves the ability of financial institutions to identify defaulting customers. The paper makes two contributions: (1) it measures the emotional characteristics of each loan description along 21 emotions, which captures the multi-dimensional emotional characteristics implied in the loan description; (2) it combines sentiment data and numerical data to construct an optimal critical point default discrimination model, which avoids the shortcomings of building a default discrimination model from a single type of data. In the comparative analysis, the optimal critical point hybrid model built from sentiment data and numerical data achieves the highest default prediction accuracy, and its Type II error drops significantly. Regression analysis shows that, among positive emotions, customers who convey “believe” and “praise” in the loan description are less likely to default than customers who convey “wish”, “love”, “happiness”, “respect”, “relief” and other emotions. Negative emotions have a weaker ability to identify defaults: “derogatory” emotions are significantly positively correlated with default, while “sadness” is significantly negatively correlated with default. After controlling for the customer’s economic characteristics, these relationships still hold, and the results are robust.

Key words: sentiment data; text analysis; optimal critical point; mixed model; default prediction
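
The abstract does not state how the optimal critical point is chosen, so the following is only a minimal sketch of one common reading: the critical point is the cutoff on a predicted default probability that maximizes overall accuracy on labelled data. The function names, the toy scores, and the definitions of the two error types (Type I: non-defaulters flagged as default; Type II: defaulters predicted as non-default) are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def error_rates(probs: List[float], labels: List[int], cutoff: float) -> Tuple[float, float, float]:
    """Return (accuracy, type1, type2) when predicting default for prob >= cutoff.

    Assumed definitions (not taken from the paper):
      type1 = share of actual non-defaulters predicted as default
      type2 = share of actual defaulters predicted as non-default
    """
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= cutoff else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    accuracy = (tp + tn) / len(labels)
    type1 = fp / (fp + tn) if (fp + tn) else 0.0
    type2 = fn / (fn + tp) if (fn + tp) else 0.0
    return accuracy, type1, type2


def optimal_critical_point(probs: List[float], labels: List[int]) -> Tuple[float, float]:
    """Scan every observed score as a cutoff and keep the most accurate one.

    Ties are resolved in favour of the lowest cutoff, which flags more
    customers as default and therefore keeps the assumed Type II error lower.
    """
    # float("inf") covers the case where nobody is predicted to default.
    candidates = sorted(set(probs)) + [float("inf")]
    best_cutoff, best_acc = candidates[0], -1.0
    for c in candidates:
        acc, _, _ = error_rates(probs, labels, c)
        if acc > best_acc:
            best_cutoff, best_acc = c, acc
    return best_cutoff, best_acc


# Hypothetical scores from some model fitted on sentiment plus numerical features.
scores = [0.05, 0.10, 0.20, 0.30, 0.45, 0.50, 0.70, 0.90]
defaulted = [0, 0, 0, 1, 0, 1, 1, 1]  # 1 = default, 0 = non-default

cutoff, accuracy = optimal_critical_point(scores, defaulted)
_, type1, type2 = error_rates(scores, defaulted, cutoff)
```

On this toy data the scan selects a cutoff of 0.30 with accuracy 0.875. A different selection criterion, for example one that weights the two error types unequally, would only change the quantity compared inside the loop.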
