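Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences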

Chinese Journal of Management Science ›› 2026, Vol. 34 ›› Issue (2): 67-78. doi: 10.16381/j.cnki.issn1003-207x.2023.1719


Test for Anomalous Distribution of Information Disclosure: A New Strategy for Financial Fraud Detection

Guowen Li 1, Yuhao Gong 2, Jingyu Li 3, Shuai Wang 1

  1. School of Management Science and Engineering, Central University of Finance and Economics, Beijing 102206, China
  2. School of Economics and Management, Tsinghua University, Beijing 100084, China
  3. School of Economics and Management, Beijing University of Technology, Beijing 100124, China
  • Received: 2023-10-25 Revised: 2024-09-16 Online: 2026-02-25 Published: 2026-02-04
  • Contact: Jingyu Li E-mail: lijy@bjut.edu.cn

Abstract:

This study introduces a strategy for detecting financial fraud by examining abnormal distribution characteristics in corporate information disclosures. Financial fraud not only causes substantial losses for investors but also undermines the stability of capital markets. Drawing on Benford’s Law and Zipf’s Law, the study develops a set of indicators that identify financial fraud by detecting anomalies in the distribution patterns of numerical and textual disclosures, and it demonstrates the effectiveness of these indicators in the Chinese market. Unlike traditional methods, which typically rely on financial data or on the analysis of tone and thematic content, this approach offers better interpretability, does not depend on time-series or cross-sectional data, and applies to a broad range of financial fraud scenarios. The research first investigates whether large-scale numerical and textual disclosure data adhere to the expected distributional laws. The findings show that the annual financial data of over 80% of companies conform to Benford’s Law, while textual data in MD&A disclosures largely follow Zipf’s Law. Building on these findings, the study introduces two anomaly detection metrics: the KS statistic for numerical disclosures and the ε-value for textual disclosures. Subsequent empirical analysis confirms significant differences in the KS statistic and the ε-value between fraudulent and non-fraudulent firms. Specifically, a greater deviation from the expected natural distribution, as indicated by a higher KS statistic and a higher ε-value, correlates with a higher probability of financial fraud. Applying machine learning and deep learning models, the study finds that incorporating these abnormal distribution features yields maximum gains of 17% to 26% in accuracy, precision, recall, and F1 score.
These results offer valuable incremental information for financial fraud detection in the Chinese market and can strengthen the detection capabilities of both regulators and investors. The key conclusion of this study is that abnormal distribution characteristics in both numerical and textual disclosures are effective signals for financial fraud detection and significantly improve detection accuracy. However, the research has certain limitations, such as sample matching challenges and data filtering constraints. Future studies should expand the scope to include a wider range of company types and smaller sample sizes. Future research could also explore technological advances that further optimize the identification of abnormal distribution features, thereby improving the applicability and predictive power of financial fraud detection strategies.
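
As a rough illustration of the numerical indicator described above, the following Python sketch computes a KS-type statistic between the observed first-digit distribution of a set of reported figures and the distribution expected under Benford’s Law, P(d) = log10(1 + 1/d). It is a minimal sketch under stated assumptions, not the authors’ implementation: the abstract does not specify how the KS statistic is constructed, so the use of first digits only, the function names `leading_digit` and `benford_ks`, and the handling of zeros are all hypothetical choices.

```python
import math
from collections import Counter

# Expected first-digit probabilities under Benford's Law, for d = 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def leading_digit(x):
    """Return the first significant digit of x, or None for zero/non-finite input.

    Hypothetical helper: zeros carry no leading digit, so they are skipped.
    """
    x = abs(x)
    if x == 0 or not math.isfinite(x):
        return None
    # Scientific notation places the first significant digit at index 0.
    return int(f"{x:.16e}"[0])


def benford_ks(values):
    """Maximum absolute gap between observed and Benford cumulative
    first-digit distributions (assumed form of the KS statistic)."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    if not digits:
        raise ValueError("no usable (nonzero, finite) values")
    counts = Counter(digits)
    n = len(digits)
    ks = cum_obs = cum_exp = 0.0
    for d in range(1, 10):
        cum_obs += counts.get(d, 0) / n
        cum_exp += BENFORD[d - 1]
        ks = max(ks, abs(cum_obs - cum_exp))
    return ks


if __name__ == "__main__":
    # Powers of 2 are a classic sequence that tracks Benford's Law closely.
    conforming = [2 ** k for k in range(1000)]
    # Figures that all start with 9 deviate strongly from it.
    suspicious = [9, 90, 900, 9500, 98000]
    print("conforming KS:", benford_ks(conforming))
    print("suspicious KS:", benford_ks(suspicious))
```

Under this sketch, a larger value means the figures sit further from the Benford distribution, which matches the direction of the relationship reported in the abstract. The textual ε-value is not sketched here because the abstract gives no definition for it.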

Key words: financial fraud detection, disclosure characteristics, data distribution patterns, text analysis
