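Social networks host a massive number of users, and recommending friends effectively is both an important factor in the sustainable development of social networks and a major topic in social network research. Current practice and existing studies usually recommend friends based on users' explicit information while overlooking the implicit social relationships among users; moreover, explicit information is often incomplete and may be false. To recommend friends effectively, this paper proposes a friend recommendation algorithm based on users' social relationships. It applies association rule mining to analyze the implicit degree of association between users and constructs a directed network graph and a relationship transition matrix among users; it then combines the relationship transition matrix with the PageRank algorithm to compute a score for each user and recommends the higher-scoring users to the target user. On this basis, the paper introduces user influence and proposes the PeopleRank algorithm, which considers both users' social relationships and user influence. To verify the soundness and effectiveness of the algorithms, the two proposed algorithms are compared with the traditional social filtering algorithm and the PageRank algorithm, using user data crawled from Twitter. The experimental results show that the proposed algorithms recommend well; in particular, the friend recommendation algorithm that considers both social relationships and user influence has a clear advantage in both recommendation accuracy and recall.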
Because online social networks have a vast number of users, it is difficult for a user to connect effectively with others who share common interests. Friend recommendation on online social networks has therefore become a challenging research issue, and it may significantly affect the sustainable development of social networks. Most existing friend recommendation methods rely on users' explicit information, such as background, demographics, interests and posts, while ignoring implicit information such as their social relationships. Notably, explicit information is often incomplete and untrustworthy, so it cannot be used appropriately to measure user similarity. To recommend friends effectively, a recommendation algorithm based on user relationship information in online social networks is proposed. In this algorithm, user relationships are characterized by association analysis, and a weighted, directed graph over the network users is then constructed. Based on this graph, the algorithm builds a transition matrix and uses the PageRank algorithm to calculate users' scores, which indicate acceptance probabilities, and then recommends the users with high scores to the target user. In addition, an enhanced friend recommendation algorithm is developed that also takes into account user authority in a specific social network. To validate the proposed approaches, friend recommendation experiments are conducted on Twitter, from which users' information and relationship data are extracted. The two proposed approaches are compared with two traditional methods, the social filtering algorithm and the PageRank algorithm, on two measures, accuracy and recall rate. Experimental results show that the proposed recommendation algorithms yield clearly better accuracy and recall rate than the traditional recommendation algorithms.
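The pipeline described above (association analysis, weighted directed graph, transition matrix, PageRank scoring, top-score recommendation) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact weighting formula, so the sketch assumes the edge weight from u to v is the association rule confidence conf(u => v), taking each user's followee list as one transaction. The function names (`confidence_graph`, `pagerank`, `recommend`), the damping factor 0.85, the iteration count and the toy data are all hypothetical, and the user-authority extension is not covered.

```python
from collections import defaultdict
from itertools import permutations


def confidence_graph(follows):
    """Assumed weighting: edge u -> v carries conf(u => v), i.e. among the
    users who follow u, the fraction who also follow v."""
    support = defaultdict(int)       # how many users follow u
    pair_support = defaultdict(int)  # how many users follow both u and v
    for followees in follows.values():
        for u in followees:
            support[u] += 1
        for u, v in permutations(followees, 2):
            pair_support[(u, v)] += 1
    graph = defaultdict(dict)
    for (u, v), count in pair_support.items():
        graph[u][v] = count / support[u]
    return graph


def pagerank(graph, nodes, damping=0.85, iterations=50):
    """Power iteration; each node's outgoing weights are normalised to sum
    to 1, which is the transition matrix applied one row at a time."""
    n = len(nodes)
    score = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        nxt = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            out = graph.get(u, {})
            total = sum(out.values())
            if total == 0:
                # Dangling node: spread its score evenly over all nodes.
                for v in nodes:
                    nxt[v] += damping * score[u] / n
            else:
                for v, w in out.items():
                    nxt[v] += damping * score[u] * w / total
        score = nxt
    return score


def recommend(follows, target, k=2):
    """Return the k highest-scoring users the target does not follow yet."""
    nodes = sorted(set(follows) | {v for f in follows.values() for v in f})
    score = pagerank(confidence_graph(follows), nodes)
    excluded = set(follows.get(target, ())) | {target}
    candidates = [u for u in nodes if u not in excluded]
    return sorted(candidates, key=lambda u: (-score[u], u))[:k]


# Hypothetical toy data: user -> set of accounts that user follows.
toy_follows = {
    "a": {"b", "c"},
    "b": {"c", "d"},
    "c": {"d"},
    "d": {"c"},
    "e": {"b", "c", "d"},
}
print(recommend(toy_follows, "a"))
```

In this sketch the scores are global rather than personalised to the target user; excluding the target and the accounts already followed is the only target-specific step. Whether the paper personalises the ranking is not stated in the abstract.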