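Chinese Journal of Management Science, 2022, Vol. 30, Issue 12: 131-140. doi: 10.16381/j.cnki.issn1003-207x.2021.2694
NIU Ben1, 2, GUO Chen3, TANG Heng3
Received: 2021-08-10
Revised: 2022-02-17
Online: 2022-12-20
Published: 2022-12-20
Corresponding author: GUO Chen (born 1991), female, Han ethnicity, from Xiaogan, Hubei; Ph.D. candidate, Faculty of Business Administration, University of Macau; research interests: swarm intelligence and cluster analysis. E-mail: chen.guo@connect.um.edu.mo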
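Abstract: To address the clustering of mixed-attribute data, this paper proposes a multi-objective multi-learning bacterial foraging optimization algorithm. First, a multi-objective optimization framework is built on an improved bacterial foraging optimization algorithm. Second, a multi-learning strategy is introduced to improve the algorithm's performance. Specifically, bacteria learn from one another through a ring topology, in which each bacterium may learn only from the best individual in its neighborhood; a bacterium may also learn from the non-dominated individuals stored in an external archive. This strategy both preserves population diversity and accelerates convergence. For the non-dominated individuals in the external archive, the algorithm records their trend of change; when they stagnate, an elite learning strategy applies a small perturbation to them, increasing the diversity of the non-dominated solutions. Finally, to handle mixed-attribute data, a mixed-attribute conversion strategy with attribute weights is designed. To evaluate the proposed algorithm, it is compared with two multi-objective evolutionary algorithms and three classical clustering algorithms on six benchmark datasets. The experimental results show that the proposed algorithm has a significant advantage in clustering numerical, categorical, and mixed-attribute data. In addition, a case study on credit card applicant data from the financial sector further confirms the feasibility of the algorithm and indicates its application potential in fields that involve mixed-attribute datasets, such as healthcare, management, and engineering.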
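The abstract names three generic building blocks (Pareto non-dominance, an external archive, and a ring-topology neighborhood) without giving their formulas. The following Python sketch illustrates those concepts only; it is not the authors' implementation. It assumes all objectives are minimized, and the function names, the `radius` parameter, and the use of a single scalar score to rank neighbors are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a Pareto-dominates b, assuming every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def update_archive(archive: List[Objectives], candidate: Objectives) -> List[Objectives]:
    """Return a new archive that stays mutually non-dominated.

    The candidate is rejected if an archive member dominates or equals it;
    otherwise it is added and every member it dominates is dropped.
    """
    if any(dominates(m, candidate) or m == candidate for m in archive):
        return archive
    kept = [m for m in archive if not dominates(candidate, m)]
    kept.append(candidate)
    return kept


def ring_neighbors(i: int, n: int, radius: int = 1) -> List[int]:
    """Indices of individual i and its neighbors on a ring of n individuals.

    Assumes n > 2 * radius so that no index is repeated.
    """
    return [(i + k) % n for k in range(-radius, radius + 1)]


def neighborhood_best(scores: Sequence[float], i: int, radius: int = 1) -> int:
    """Index of the lowest-score individual in i's ring neighborhood.

    Ranking neighbors by one scalar score is an assumption of this sketch;
    the abstract does not say how the neighborhood best is chosen.
    """
    return min(ring_neighbors(i, len(scores), radius), key=lambda j: scores[j])
```

In a loop of this kind, each individual would look up `neighborhood_best` to choose whom to learn from, and every newly evaluated solution would be passed through `update_archive`. Restricting learning to a small ring neighborhood is a common way to slow the spread of one good solution through the whole population, which is consistent with the diversity argument made in the abstract.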
NIU Ben, GUO Chen, TANG Heng. Multi-objective Multi-learning Bacterial Foraging Optimization Algorithm for Mixed Data Clustering[J]. Chinese Journal of Management Science, 2022, 30(12): 131-140.