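Chinese Journal of Management Science (中国管理科学) ›› 2022, Vol. 30 ›› Issue (12): 131-140. doi: 10.16381/j.cnki.issn1003-207x.2021.2694 cstr: 32146.14.j.cnki.issn1003-207x.2021.2694
NIU Ben1,2, GUO Chen3, TANG Heng3
Received: 2021-08-10
Revised: 2022-02-17
Online: 2022-12-20
Published: 2022-12-20
Corresponding author: GUO Chen (born 1991), female (Han ethnicity), from Xiaogan, Hubei; PhD candidate, Faculty of Business Administration, University of Macau; research interests: swarm intelligence and cluster analysis.
E-mail: chen.guo@connect.um.edu.mo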
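Abstract: To address the problem of clustering mixed-attribute data, this paper proposes a multi-objective multi-learning bacterial foraging optimization algorithm. First, a multi-objective optimization framework is built on an improved bacterial foraging optimization algorithm. Then, a multi-learning strategy is proposed to improve the algorithm's performance. Specifically, individual bacteria learn from one another through a ring-topology learning strategy in which each bacterium can learn only from the best individual in its neighborhood; bacteria can also learn from the non-dominated individuals in the external archive. This learning strategy both maintains population diversity and accelerates convergence. For the non-dominated individuals in the external archive, the trend of their change is recorded; when that change stagnates, an elite learning strategy applies a small perturbation to them, which increases the diversity of the non-dominated solutions. Finally, to handle mixed-attribute data clustering, a mixed-attribute transformation strategy with attribute weights is designed. To verify its performance, the proposed algorithm is compared with two multi-objective evolutionary algorithms and three classical clustering algorithms on six benchmark datasets. The experimental results show that the proposed algorithm has significant advantages in clustering numerical, categorical, and mixed-attribute data. In addition, a case study on credit card applicant data from the financial sector further confirms the feasibility of the proposed algorithm and indicates that it has application potential in fields that involve mixed-attribute datasets, such as healthcare, management, and engineering.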
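The two learning sources named in the abstract, the ring-topology neighborhood best and the external archive of non-dominated individuals, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `radius` parameter, the use of a single scalar fitness to pick the neighborhood best, and the assumption that all objectives are minimized are illustrative choices made here, since the abstract does not specify them.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a Pareto-dominates b (assumption: every objective is minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def update_archive(archive: List[Objectives], candidate: Objectives) -> List[Objectives]:
    """Return a new archive that keeps only mutually non-dominated points."""
    # Reject the candidate if an archived point dominates or duplicates it.
    if any(dominates(kept, candidate) or kept == candidate for kept in archive):
        return list(archive)
    # Otherwise drop every archived point the candidate dominates, then add it.
    return [kept for kept in archive if not dominates(candidate, kept)] + [candidate]


def ring_neighbors(i: int, n: int, radius: int = 1) -> List[int]:
    """Indices of individual i and its neighbors on a ring of n individuals."""
    return [(i + k) % n for k in range(-radius, radius + 1)]


def neighborhood_best(i: int, fitness: Sequence[float], radius: int = 1) -> int:
    """Index of the lowest-fitness individual in the ring neighborhood of i.

    Hypothetical simplification: a scalar fitness ranks the neighbors.
    """
    return min(ring_neighbors(i, len(fitness), radius), key=lambda j: fitness[j])


if __name__ == "__main__":
    fitness = [5.0, 3.0, 4.0, 1.0, 2.0]
    print(neighborhood_best(0, fitness))  # neighbors are indices 4, 0, 1

    archive: List[Objectives] = []
    for point in [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (1.0, 1.0)]:
        archive = update_archive(archive, point)
    print(archive)
```

Restricting each individual to its ring neighbors slows the spread of any single good solution through the population, which is one common way to preserve diversity, while the archive supplies the globally non-dominated guidance that speeds convergence.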
NIU Ben, GUO Chen, TANG Heng. Multi-objective Multi-learning Bacterial Foraging Optimization Algorithm for Mixed Data Clustering (基于多目标多元学习细菌觅食优化算法的混合数据聚类)[J]. Chinese Journal of Management Science (中国管理科学), 2022, 30(12): 131-140.