

Unsupervised adaptive microblog filtering for broad dynamic topics
Institution: 1. Qatar Computing Research Institute, HBKU, Doha, Qatar; 2. Department of Computer Science and Engineering, College of Engineering, Qatar University, Doha, Qatar; 1. Institute of Computing, Federal University of Amazonas, AM, Brazil; 2. Department of Computer Science, Federal University of Minas Gerais, MG, Brazil; 3. Institute of Computing, University of Campinas, SP, Brazil; 1. Pattern Recognition and Human Language Technology (PRHLT) Research Center, Universitat Politècnica de València, Camino de Vera s/n, Valencia 46022, Spain; 2. Computer Science Department, Instituto Nacional de Astrofísica, Óptica y Electrónica, Luis Enrique Erro 1, Puebla 72840, Mexico; 1. Department of Information Management, National Sun Yat-Sen University, No. 70, Lienhai Rd., Kaohsiung 80424, Taiwan; 2. School of Information Sciences, University of Pittsburgh, 135 North Bellefield Avenue, Pittsburgh, PA 15260, USA; 1. Institute of Computing, Federal University of Amazonas – Av. Gen. Rodrigo Otávio, 3000, Manaus 69077-000, AM, Brazil; 2. Neemu S/A, Av. Via Lactea, 1374, Manaus 69060-020, AM, Brazil
Abstract: Information filtering has long been a major area of study in information retrieval (IR), with a focus on filtering well-formed documents such as news articles. More recently, interest has turned to filtering user-generated content such as microblogs. Several earlier studies investigated microblog filtering for focused topics. Another vital filtering scenario in microblogs is the detection of posts relevant to long-standing, broad, and dynamic topics, i.e., topics spanning several subtopics that change over time. This type of filtering is essential for many applications, such as social studies of large events and news tracking of temporal topics. In this paper, we introduce an adaptive microblog filtering task that focuses on tracking topics of a broad and dynamic nature. We propose an entirely unsupervised approach that adapts to new aspects of the topic in order to retrieve relevant microblogs. We evaluated our filtering approach on 6 broad topics, each tested on 4 different time periods over 4 months. Experimental results showed that, on average, our approach achieved an 84% increase in recall relative to the baseline approach while maintaining acceptable precision, which dropped by about 8%. Our filtering method is currently implemented in TweetMogaz, a news portal generated from tweets. The website processes the stream of Arabic tweets and detects the tweets relevant to different regions in the Middle East, presenting them as comprehensive reports that include the top stories and news in each region.
This article is indexed in ScienceDirect and other databases.