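Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences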

Chinese Journal of Management Science, 2026, Vol. 34, Issue (2): 67-78. doi: 10.16381/j.cnki.issn1003-207x.2023.1719 cstr: 32146.14.j.cnki.issn1003-207x.2023.1719


Test for Anomalous Distribution of Information Disclosure: A New Strategy for Financial Fraud Detection

Guowen Li1, Yuhao Gong2, Jingyu Li3, Shuai Wang1

  1. School of Management Science and Engineering, Central University of Finance and Economics, Beijing 102206, China
  2. School of Economics and Management, Tsinghua University, Beijing 100084, China
  3. School of Economics and Management, Beijing University of Technology, Beijing 100124, China
  • Received: 2023-10-25 Revised: 2024-09-16 Online: 2026-02-25 Published: 2026-02-04
  • Contact: Jingyu Li E-mail: lijy@bjut.edu.cn
  • Funding:
    National Natural Science Foundation of China (72201287); National Natural Science Foundation of China (72571297); National Natural Science Foundation of China (72201012); National Natural Science Foundation of China (72331010); Fundamental Research Funds for the Central Universities

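Abstract (translated from the Chinese version):

Financial fraud causes serious damage to financial markets, and traditional methods based on financial indicators struggle to identify it precisely. When fraud occurs, management has tampered with the disclosed content, so the company's disclosure characteristics deviate from their original pattern. This paper studies how to characterize that deviation and proposes a new financial fraud detection strategy based on the anomalous distribution features of disclosure data. Using Chinese market data from 2010 to 2020, the paper confirms that, under natural conditions, the numerical and textual distributions of disclosures conform to Benford's Law and Zipf's Law, respectively, both in aggregate and at the industry level; that companies whose data deviate from these laws are more likely to have committed financial fraud; and that the larger the deviation, the higher the likelihood of fraud. Using classical financial fraud detection models, the study also confirms that incorporating anomalous distribution features of disclosures significantly improves detection performance.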
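For reference, the two laws named above have standard textbook forms. The abstract does not give the paper's own formulation, so the notation below ($d$, $r$, $s$, $C$) is an assumption made for illustration:

```latex
% Benford's Law: probability that the leading significant digit equals d
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \dots, 9

% Zipf's Law: frequency of the word with frequency rank r, exponent s close to 1
f(r) = \frac{C}{r^{s}}, \qquad s \approx 1
```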


Abstract:

This study introduces a strategy for detecting financial fraud by examining abnormal distribution characteristics in corporate information disclosures. Financial fraud not only causes substantial losses for investors but also undermines the stability of capital markets. Using Benford’s Law and Zipf’s Law, the study develops a set of indicators that identify financial fraud by detecting anomalies in the distribution patterns of numerical and textual disclosures, and it demonstrates the effectiveness of these indicators in the Chinese market. Unlike traditional methods, which typically rely on financial data or on the analysis of tone and thematic content, this approach offers superior interpretability, does not depend on time-series or cross-sectional data, and applies to a broad range of financial fraud scenarios. The research first investigates whether large-scale numerical and textual disclosure data adhere to the expected distributional laws. The findings show that over 80% of companies’ annual financial data conform to Benford’s Law, while textual data in Management Discussion and Analysis (MD&A) disclosures largely follow Zipf’s Law. Building on these findings, the study introduces two anomaly detection metrics: the KS statistic for numerical disclosures and the ε-value for textual disclosures. Subsequent empirical analysis confirms significant differences in the KS statistic and the ε-value between fraudulent and non-fraudulent firms. Specifically, a greater deviation from the expected natural distribution, as indicated by higher KS and ε-values, correlates with a higher probability of financial fraud. Applying machine learning and deep learning models, the study finds that incorporating these abnormal distribution features improves fraud detection accuracy, precision, recall, and F1 scores, with gains of up to 17% to 26%.
These results offer valuable incremental insights for financial fraud detection in the Chinese market and can strengthen the detection capabilities of both regulators and investors. The key conclusion of this study is that abnormal distribution characteristics in both numerical and textual disclosures serve as effective tools for financial fraud detection, significantly improving detection accuracy. However, the research has certain limitations, such as sample matching challenges and data filtering constraints. Future studies should expand the scope to include a wider range of company types and smaller sample sizes. Additionally, future research could explore technological advancements to further optimize the identification of abnormal distribution features, thereby improving the applicability and predictive power of financial fraud detection strategies.

Key words: financial fraud detection, disclosure characteristics, data distribution patterns, text analysis
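
The two deviation measures described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the KS statistic is taken as the largest gap between the empirical and Benford first-digit cumulative distributions, and, because the abstract does not define the ε-value, `zipf_deviation` uses the distance of the fitted log-log slope from -1 as an assumed stand-in.

```python
import math
from collections import Counter

# Benford's expected first-digit probabilities: P(d) = log10(1 + 1/d), d = 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x):
    """Return the leading significant digit of a number, or None for zero."""
    x = abs(x)
    if x == 0:
        return None
    # Scientific notation puts the leading significant digit first, e.g. '3.14...e+02'.
    return int(f"{x:.15e}"[0])


def benford_ks(values):
    """Largest gap between the empirical and Benford first-digit CDFs."""
    digits = [d for d in map(first_digit, values) if d is not None]
    if not digits:
        raise ValueError("no non-zero values")
    counts = Counter(digits)
    n = len(digits)
    cum_obs = cum_exp = 0.0
    ks = 0.0
    for d in range(1, 10):
        cum_obs += counts.get(d, 0) / n
        cum_exp += BENFORD[d - 1]
        ks = max(ks, abs(cum_obs - cum_exp))
    return ks


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens")
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def zipf_deviation(tokens):
    """Assumed stand-in for the paper's epsilon-value: |slope - (-1)|."""
    return abs(zipf_slope(tokens) + 1.0)


if __name__ == "__main__":
    # Powers of 2 are a classic Benford-conforming sequence, so the gap is small.
    print(benford_ks([2 ** k for k in range(1, 200)]))
    # Token counts of 12, 6, 4, 3 follow f(r) = 12 / r, so the deviation is near 0.
    print(zipf_deviation(["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3))
```

Under these assumptions, larger return values from either function indicate a stronger departure from the corresponding law, which is the direction the abstract associates with a higher probability of fraud.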
