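Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economical Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences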

Chinese Journal of Management Science ›› 2025, Vol. 33 ›› Issue (4): 36-49. doi: 10.16381/j.cnki.issn1003-207x.2023.1961


Measurement and Evolution of Digitization Level of Chinese Listed Companies: Empirical Evidence from Annual Report Text

Zhongyi Hu1,2, Diancheng Shui1, Jiang Wu1,2

  1. School of Information Management, Wuhan University, Wuhan 430072, China
  2. The Center for Electronic Commerce Research and Development, Wuhan University, Wuhan 430072, China
  • Received: 2023-11-21 Revised: 2024-07-15 Online: 2025-04-25 Published: 2025-04-29
  • Contact: Zhongyi Hu E-mail: Zhongyi.hu@whu.edu.cn
  • Funding:
    National Natural Science Foundation of China, General Program (72171183); National Natural Science Foundation of China, Key Program (72232006); Major Project of Philosophy and Social Science Research, Ministry of Education of China (20JZD024)

Abstract:

Measuring enterprise digitalization systematically and comprehensively is key to assessing the effectiveness of digital transformation. Counting the frequency of digitalization-related terms in annual report text has become a widely used way to estimate a firm's digitalization level. However, the term dictionaries used in previous studies are small, narrow in coverage, and hard to extend, so the resulting measurements are incomplete. To address this, this paper uses natural language processing and deep learning to build two term recognition models, BERT-GlobalPointer and BERT-GlobalPointer-Mask, that identify digital terms efficiently. Using the large-scale term dictionary built with BERT-GlobalPointer-Mask, the paper develops a Digital Transformation Index (DTI) for Chinese listed companies through text analysis and analyzes how it has evolved over nearly two decades. The results show that the proposed models significantly outperform the benchmark models in identifying digital terms, new terms, and long terms. The resulting dictionary contains 273,634 terms and covers a broader and more diverse range of digital terms than those used in previous studies. It supports a longer-term and more comprehensive measurement of the digitalization of China's listed companies over the past two decades. The study provides foundational data for the quantitative measurement and performance analysis of digital transformation, and offers methodological and practical guidance for evaluating its effectiveness.

Key words: digital transformation, term extraction, text analysis, deep learning, evolution analysis
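The dictionary-based term-frequency idea described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names `count_terms` and `digitalization_score`, the toy dictionary, the toy report sentence, and the `log(1 + hits)` scoring rule are all hypothetical and are not the paper's DTI definition. The sketch only shows how a term dictionary can be matched against report text while preferring the longest term at each position, so that a long term is not also counted as its shorter prefix.

```python
import math
import re
from collections import Counter


def count_terms(text: str, terms: set[str]) -> Counter:
    """Count non-overlapping dictionary hits, preferring the longest term at each position."""
    if not terms:
        return Counter()
    # Longest first: regex alternation tries alternatives left to right,
    # so "big data platform" is tried before "big data" at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    return Counter(m.group(0) for m in pattern.finditer(text))


def digitalization_score(text: str, terms: set[str]) -> float:
    """Hypothetical index: log(1 + total hits). Not the paper's DTI formula."""
    return math.log1p(sum(count_terms(text, terms).values()))


# Toy dictionary and report sentence, both invented for illustration.
toy_terms = {"big data", "big data platform", "cloud computing", "artificial intelligence"}
toy_report = (
    "The company built a big data platform and uses cloud computing "
    "and artificial intelligence to mine big data."
)

hits = count_terms(toy_report, toy_terms)
score = digitalization_score(toy_report, toy_terms)
print(hits)
print(round(score, 3))
```

Real annual reports are in Chinese and the actual dictionary holds hundreds of thousands of terms, so a production version would likely need a trie or Aho-Corasick matcher rather than one large regular expression; that choice is outside what the abstract specifies.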

CLC Number: