ABSTRACT
Given a bipartite graph of users and the products that they review, or followers and followees, how can we detect fake reviews or follows? Existing fraud detection methods (e.g., spectral methods) try to identify dense subgraphs of nodes that are sparsely connected to the remaining graph. Fraudsters can evade these methods using camouflage, by adding reviews or follows with honest targets so that they look "normal". Even worse, some fraudsters use hijacked accounts from honest users, in which case the camouflage is genuinely organic. Our focus is to spot fraudsters in the presence of camouflage or hijacked accounts. We propose FRAUDAR, an algorithm that (a) is camouflage-resistant, (b) provides upper bounds on the effectiveness of fraudsters, and (c) is effective in real-world data. Experimental results under various attacks show that FRAUDAR outperforms the top competitor in accuracy of detecting both camouflaged and non-camouflaged fraud. Additionally, in real-world experiments with a Twitter follower-followee graph of 1.47 billion edges, FRAUDAR successfully detected a subgraph of more than 4000 accounts, of which a majority had tweets showing that they used follower-buying services.
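The abstract does not describe how FRAUDAR works internally. As a rough illustration of the dense-subgraph idea it refers to, the following minimal Python sketch applies plain greedy peeling with the average-degree density |E| / |V| to a bipartite edge list. This is a generic technique, not the authors' suspiciousness metric or implementation, and it makes no attempt at camouflage resistance; the function name `greedy_densest_subgraph` and the `(user, product)` input format are assumptions made for the example.

```python
import heapq
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Greedy peeling: repeatedly remove the minimum-degree node and return
    the intermediate node set with the highest density |E| / |V|.

    `edges` is an iterable of (user, product) pairs (hypothetical format).
    """
    adj = defaultdict(set)
    for u, p in edges:
        a, b = ("user", u), ("product", p)  # tag sides so ids cannot collide
        adj[a].add(b)
        adj[b].add(a)

    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    heap = [(d, n) for n, d in degree.items()]
    heapq.heapify(heap)

    alive = set(adj)
    num_edges = sum(degree.values()) // 2
    best_density = num_edges / len(alive) if alive else 0.0
    removal_order = []
    best_removed = 0  # number of removals that produced the best set so far

    while alive:
        d, n = heapq.heappop(heap)
        if n not in alive or d != degree[n]:
            continue  # stale heap entry left over from a degree update
        alive.remove(n)
        removal_order.append(n)
        num_edges -= degree[n]  # degree[n] counts only neighbours still alive
        for m in adj[n]:
            if m in alive:
                degree[m] -= 1
                heapq.heappush(heap, (degree[m], m))
        if alive:
            density = num_edges / len(alive)
            if density > best_density:
                best_density = density
                best_removed = len(removal_order)

    best_nodes = set(adj) - set(removal_order[:best_removed])
    return best_nodes, best_density


# Toy graph: three accounts that all review the same three products,
# plus four unrelated accounts with one review each.
toy_edges = [(f, p) for f in ("f1", "f2", "f3") for p in ("p1", "p2", "p3")]
toy_edges += [("h1", "q1"), ("h2", "q2"), ("h3", "q3"), ("h4", "q4")]
block, density = greedy_densest_subgraph(toy_edges)
```

On the toy graph, the returned block is the fully connected group of three accounts and three products. An unweighted density of this kind is exactly what camouflage edges can dilute, which is the weakness the abstract says FRAUDAR is designed to address.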
Index Terms
- FRAUDAR: Bounding Graph Fraud in the Face of Camouflage