Group Intelligence Consensus Decision Modeling Based on Reinforcement Learning with the “Structure-Information” Coupled Network

doi:10.16381/j.cnki.issn1003-207x.2024.2237

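Abstract

Abstract (translated from the Chinese):

Rapid advances in artificial intelligence and information technology are making collaborative group decision-making a key means of solving complex system decision problems such as emergency management and engineering feasibility assessment, and its large-scale, socialized, and dynamic character places a fundamental demand on the adaptive capability of decision models. However, the traditional paradigm of identifying non-consensus step by step cannot deliver efficient aggregation of group preferences or real-time response to dynamic environments, so adaptive, intelligent group decision-making methods are urgently needed. Because the trust structure and preference information of decision members reveal the latent connections among individuals and the basis for consensus, and the coupling of the two during dynamic interaction directly affects the group decision-making process, this paper proposes a reinforcement learning-based group intelligence consensus decision model under a “structure-information” coupled network. First, the directed trust relationships and preference information similarity among decision members are quantified to build a structure network and an information network, and their coupled interaction mechanism is analyzed. Next, a composite reward function is designed that integrates group consensus level, collaborative incentives, and collaborative costs; given that decision behaviors (the action space) are discrete and the decision environment (the state space) is continuous in group decision problems on the coupled network, the deep Q-network (DQN) algorithm is introduced to learn consensus strategies adaptively. Finally, simulation experiments show that the proposed model improves the efficiency of reaching consensus while effectively reducing collaborative costs, and a parameter sensitivity analysis verifies the model's applicability under differentiated behavioral mechanisms.

Keywords (translated from the Chinese): collaborative intelligence, consensus decision-making, DQN algorithm, social network analysis, preference evolution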

Abstract:

The rapid development of artificial intelligence and information technology has made collaborative group decision-making a key approach to complex system decision problems in emergency management, engineering feasibility assessment, and other fields. Its large scale, social interactivity, and dynamism make adaptive capability a fundamental requirement for decision-making models. However, the traditional paradigm of identifying non-consensus opinions sequentially struggles to aggregate group preferences efficiently and to respond in real time to dynamic environments, so adaptive, intelligent group decision-making methods are needed. The trust structure and preference information of decision-makers reveal the latent relationships among individuals and the basis for consensus, and their coupling during dynamic interaction directly influences the group decision-making process. Therefore, this study proposes a reinforcement learning-based group intelligence consensus decision-making model with the “structure-information” coupled network. First, trust relationships and preference information similarity among decision-makers are quantified to construct the structure network and the information network, and the coupling interaction mechanisms of the two networks are analyzed. Furthermore, an integrated reward function combining group consensus level, collaborative incentives, and collaborative costs is designed. To handle the discrete decision-making behaviors (action space) and the continuous decision environment (state space) of group decision-making with the coupled network, the deep Q-network (DQN) algorithm is introduced to learn consensus strategies adaptively. Finally, simulation experiments show that the proposed model improves consensus-reaching efficiency while reducing collaborative costs, and parameter sensitivity analysis verifies its applicability under different behavioral mechanisms.
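The abstract pairs a discrete action space with a continuous state space, which is the setting DQN is designed for. The sketch below shows two standard DQN ingredients in isolation: epsilon-greedy selection over discrete actions and the bootstrapped target r + γ·max Q_target(s', a'). It is a minimal illustration, not the authors' implementation: the function names, the toy Q-values, and the rewards are hypothetical, and the discount factor 0.9 is taken from the DQN parameter settings in Table 3.

```python
import numpy as np

GAMMA = 0.9  # discount factor, as listed in the DQN parameter settings


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a discrete action index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))


def td_targets(rewards, next_q_target, done, gamma=GAMMA):
    """Standard DQN targets: r + gamma * max_a' Q_target(s', a').

    Terminal transitions (done=True) get no bootstrap term.
    """
    rewards = np.asarray(rewards, dtype=float)
    next_q_target = np.asarray(next_q_target, dtype=float)
    done = np.asarray(done, dtype=bool)
    bootstrap = next_q_target.max(axis=1)
    return rewards + gamma * np.where(done, 0.0, bootstrap)


# Toy values (hypothetical): three discrete actions, a batch of two transitions.
rng = np.random.default_rng(0)
action = epsilon_greedy(np.array([0.1, 0.7, 0.2]), epsilon=0.0, rng=rng)
targets = td_targets(
    rewards=[1.0, 0.5],
    next_q_target=[[0.2, 0.4, 0.1], [0.9, 0.3, 0.0]],
    done=[False, True],
)
```

With `epsilon=0.0` the selection is purely greedy, so the example is deterministic; in training, the Q-values would come from the online and target networks rather than fixed arrays.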

Key words: collaborative intelligence, consensus decision-making, DQN algorithm, social network analysis, preference evolution

CLC number: C934

Xiao Tan, Jiaqi Cai, Zaiwu Gong. Group Intelligence Consensus Decision Modeling Based on Reinforcement Learning with the “Structure-Information” Coupled Network[J]. Chinese Journal of Management Science, 2026, 34(3): 357-368.

Figures and tables: 10

Figure 1

Figure 2

Table 1. Initial preferences of decision members

Alternative	$d_1$	$d_2$	$d_3$	$d_4$	$d_5$	...	$d_{36}$	$d_{37}$	$d_{38}$	$d_{39}$	$d_{40}$
$o_1$	9	6	4	1	5	...	1	5	0	8	9
$o_2$	3	0	8	4	5	...	1	0	2	2	6
$o_3$	2	8	9	4	5	...	7	6	4	2	3
$o_4$	10	0	0	10	9	...	4	8	5	3	3
$o_5$	2	8	5	5	0	...	0	10	3	1	8
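The information network is built from the similarity of members' preference information. As a rough illustration of that step, the sketch below computes pairwise similarity for the first five members of Table 1. The measure used here (one minus the mean absolute difference of normalized preferences) and the 0 to 10 scale are assumptions made for this example; the paper's own similarity definition is not given in this section.

```python
import numpy as np

# Columns d_1..d_5 of Table 1; rows are alternatives o_1..o_5.
initial = np.array([
    [9, 6, 4, 1, 5],
    [3, 0, 8, 4, 5],
    [2, 8, 9, 4, 5],
    [10, 0, 0, 10, 9],
    [2, 8, 5, 5, 0],
], dtype=float)

SCALE_MAX = 10.0  # assumption: preferences are scored on a 0-10 scale


def preference_similarity(prefs, scale_max=SCALE_MAX):
    """Pairwise similarity = 1 - mean absolute difference of normalized preferences.

    `prefs` has shape (alternatives, members); the result is (members, members).
    """
    members = (prefs / scale_max).T  # one row per member
    diffs = np.abs(members[:, None, :] - members[None, :, :])
    return 1.0 - diffs.mean(axis=2)


similarity = preference_similarity(initial)
```

Under this assumed measure, `similarity[0, 1]` (members d_1 and d_2) is 0.44: their absolute differences over the five alternatives are 3, 3, 6, 10, and 6, giving a mean normalized difference of 0.56.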

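Table 2. Environment parameter settings

Parameter	Description	Value
$m$	Number of decision members	40
$n$	Number of alternatives	5
$\alpha$	Trust update rate	0.4
$\beta$	Preference update rate	0.4
$\lambda$	Sensitivity of trust updates to preference differences	10
$\theta$	Threshold on preference differences	0.5
$k$	Tuning parameter of the incentive function	10000
$\zeta$	Weight coefficient of the group consensus level reward	6
$\mu_T$	Threshold for identifying key decision members in the structure network	0.8
$\mu_P$	Threshold for identifying key decision members in the information network	0.8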

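Table 3. DQN parameter settings

Parameter	Description	Value
$L$	Number of layers in the deep neural network	4
$\eta$	Learning rate	0.001
$N_{\text{episodes}}$	Total number of training episodes	1000
$\gamma$	Discount factor	0.9
$\epsilon_0$	Initial exploration rate	0.8
$C$	Number of episodes between target network updates	15
$\xi_{\max}$	Maximum capacity of the experience replay buffer	20000
$\xi_{\min}$	Minimum number of experiences required in the replay buffer	1024
$B$	Number of experiences sampled per training step	256
$\vartheta$	Decay rate of the exploration rate	0.9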
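To show how the settings in Table 3 would typically interact during training, the sketch below collects them in a configuration object and adds three small helpers: an exploration schedule, a target-network sync check, and a replay-buffer sampler with a warm-up threshold. The class and function names are hypothetical, and the per-episode multiplicative decay of the exploration rate is an assumption; the table gives only the decay rate, not the schedule.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class DQNConfig:
    """Values copied from Table 3; field names are illustrative."""
    layers: int = 4
    learning_rate: float = 0.001
    episodes: int = 1000
    gamma: float = 0.9
    epsilon_start: float = 0.8
    target_update_every: int = 15
    buffer_capacity: int = 20000
    buffer_min_size: int = 1024
    batch_size: int = 256
    epsilon_decay: float = 0.9


def epsilon_at(episode, cfg):
    """Assumed schedule: multiply the exploration rate by the decay rate once per episode."""
    return cfg.epsilon_start * cfg.epsilon_decay ** episode


def should_sync_target(episode, cfg):
    """True on episodes where the target network would copy the online network."""
    return episode > 0 and episode % cfg.target_update_every == 0


def sample_batch(buffer, cfg, rng):
    """Return a training batch once the buffer holds enough transitions, else None."""
    if len(buffer) < cfg.buffer_min_size:
        return None
    return rng.sample(list(buffer), cfg.batch_size)


cfg = DQNConfig()
buffer = deque(maxlen=cfg.buffer_capacity)  # oldest transitions are dropped when full
rng = random.Random(0)
```

Under the assumed schedule, the exploration rate after two episodes is 0.8 × 0.9² = 0.648. A `deque` with `maxlen` gives first-in, first-out eviction, a common way to cap an experience replay buffer.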

Figure 3

Table 4. Final preferences of selected decision members

Alternative	$d_1$	$d_2$	$d_3$	$d_4$	$d_5$	...	$d_{36}$	$d_{37}$	$d_{38}$	$d_{39}$	$d_{40}$
$o_1$	4.933	4.932	4.932	4.933	4.932	...	4.932	4.932	4.933	4.932	4.932
$o_2$	3.782	3.783	3.783	3.782	3.783	...	3.782	3.783	3.782	3.783	3.784
$o_3$	4.978	4.976	4.977	4.977	4.977	...	4.977	4.976	4.977	4.977	4.978
$o_4$	5.338	5.338	5.338	5.338	5.338	...	5.338	5.338	5.337	5.338	5.340
$o_5$	4.474	4.474	4.474	4.473	4.473	...	4.474	4.473	4.473	4.473	4.472
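One quick way to read Table 4 is to check how tightly the members agree on each alternative and which alternative the group ends up favoring. The sketch below does this for the ten members whose columns are shown. Ranking alternatives by the mean of those ten columns is an assumption for illustration only, not the paper's aggregation rule, and the elided members d_6 to d_35 are not included.

```python
import numpy as np

alternatives = ["o1", "o2", "o3", "o4", "o5"]

# Visible columns of Table 4: d_1..d_5 and d_36..d_40.
final = np.array([
    [4.933, 4.932, 4.932, 4.933, 4.932, 4.932, 4.932, 4.933, 4.932, 4.932],
    [3.782, 3.783, 3.783, 3.782, 3.783, 3.782, 3.783, 3.782, 3.783, 3.784],
    [4.978, 4.976, 4.977, 4.977, 4.977, 4.977, 4.976, 4.977, 4.977, 4.978],
    [5.338, 5.338, 5.338, 5.338, 5.338, 5.338, 5.338, 5.337, 5.338, 5.340],
    [4.474, 4.474, 4.474, 4.473, 4.473, 4.474, 4.473, 4.473, 4.473, 4.472],
])

# Largest disagreement between any two shown members, per alternative.
spread = final.max(axis=1) - final.min(axis=1)

# Mean preference per alternative and the resulting order, best first.
group_mean = final.mean(axis=1)
ranking = [alternatives[i] for i in np.argsort(-group_mean)]
```

For the shown columns, the spread stays at or below about 0.003 for every alternative, and the mean-based order is o4, o3, o1, o5, o2.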

Figure 4

Figure 5

Figure 6
