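Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences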

The Default Prediction Combined with Soft Information in Online Peer-to-Peer Lending

  • School of Management, Hefei University of Technology, Hefei 230009, China

Received date: 2016-07-06

Revised date: 2017-04-20

Online published: 2018-01-31

Abstract

Peer-to-peer (P2P) lending is a new lending model formed at the intersection of the Internet and traditional finance. It provides a more convenient loan platform and has been developing rapidly in China. However, platform collapses are becoming more frequent as P2P loans face serious default risk and bad-debt losses. Credit evaluation is an important basis for managing loan default risk and supporting lending decisions. Compared with traditional loans, the financial data that a P2P platform collects about borrowers, known as hard information, is limited. However, a large amount of soft information is generated during the loan application, such as the loan description text, which also carries information about loans and borrowers. Therefore, a default prediction method that incorporates soft information is proposed for P2P lending. First, the soft information is categorized according to the characteristics of P2P lending, and the latent Dirichlet allocation (LDA) topic model is used to quantify valuable factors in the soft-information text. Second, regression analysis and comparative experiments are performed to test the effect of soft information on P2P default probability. Moreover, a two-stage method is designed to select effective variable sets for default modeling, and the default prediction model is constructed with the random forest (RF) method. Finally, an experimental study based on data from a Chinese P2P platform, eloan.com, is conducted to verify the effectiveness of the proposed methods. The results show that soft information can improve the recognition rate of loan defaults and can therefore serve as a basis for P2P credit evaluation.
The feature combination selection method proposed in this paper and the credit evaluation model based on random forest achieve good classification accuracy. The proposed method also clearly improves prediction performance compared with the platform's own rating method, which offers a useful reference for credit evaluation in P2P online lending.
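
The pipeline outlined in the abstract (LDA topic features from loan descriptions, combined with hard information, fed to a random forest) might be sketched roughly as follows. This is only an illustrative sketch using scikit-learn, not the authors' implementation: the toy descriptions, the three hard-information columns, the labels, the number of topics, and the helper name `topic_features` are all assumptions made for the example, and the two-stage variable selection step is omitted because the abstract does not specify it.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical loan descriptions (real data would be tokenized Chinese text).
descriptions = [
    "need funds to expand my small shop and buy inventory",
    "stable salary, loan for home decoration, will repay on time",
    "urgent cash needed to pay other debts",
    "borrow to buy equipment for family business",
    "medical emergency, need money quickly to pay debts",
    "steady job at a state company, loan for wedding expenses",
    "cash flow shortage, need to repay credit card debts",
    "loan for shop renovation, business income is stable",
]

# Hypothetical hard information: [amount (10k CNY), term (months), annual rate (%)].
hard_features = np.array([
    [5.0, 12, 12.0],
    [3.0, 6, 10.5],
    [8.0, 24, 18.0],
    [6.0, 12, 13.0],
    [7.5, 18, 17.5],
    [2.0, 6, 10.0],
    [9.0, 24, 19.0],
    [4.0, 12, 11.5],
])

# Hypothetical outcomes: 1 = default, 0 = repaid.
labels = np.array([0, 0, 1, 0, 1, 0, 1, 0])


def topic_features(texts, n_topics=3, seed=0):
    """Return one topic-proportion vector per text, estimated with LDA."""
    counts = CountVectorizer(stop_words="english").fit_transform(texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    return lda.fit_transform(counts)


# Soft information quantified as document-topic proportions.
soft_features = topic_features(descriptions)

# Combine hard and soft information column-wise into one feature matrix.
combined = np.hstack([hard_features, soft_features])

# Fit a random forest and read off the estimated default probability.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(combined, labels)
default_probability = model.predict_proba(combined)[:, 1]
```

In a real study the topic model and classifier would be fitted on training loans only and scored on held-out loans, so that a model using `combined` can be compared fairly with one using `hard_features` alone.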

Cite this article

JIANG Cui-qing, WANG Rui-ya, DING Yong. The Default Prediction Combined with Soft Information in Online Peer-to-Peer Lending[J]. Chinese Journal of Management Science, 2017, 25(11): 12-21. DOI: 10.16381/j.cnki.issn1003-207x.2017.11.002
