Automatic new topic identification using multiple linear regression
Author: Seda Ozmutlu
Institution: Department of Industrial Engineering, Uludag University, Muhendislik-Mimarlik Fakultesi, Gorukle, 16059 Bursa, Turkey
Abstract: The purpose of this study is to identify new topics in search engine query logs automatically, and to estimate the effect of the statistical characteristics of search engine queries on new topic identification. By applying multiple linear regression and multi-factor ANOVA to a sample data log from the Excite search engine, we demonstrate that statistical characteristics of Web search queries, such as the time interval, the search pattern, and the position of a query in a user session, affect whether a user shifts to a new topic. Multiple linear regression is also a successful tool for estimating topic shifts and continuations. The findings of this study provide statistical evidence for the relationship between the non-semantic characteristics of Web search queries and the occurrence of topic shifts and continuations.
Keywords: Search engine, Topic identification, Regression, ANOVA, Information retrieval
This article is indexed in ScienceDirect and other databases.
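
The idea described in the abstract, regressing a topic-shift indicator on non-semantic query characteristics, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy data are invented, the search pattern is reduced to a single hypothetical 0/1 flag, and the 0.5 cutoff for calling a shift is arbitrary. The abstract names only the three characteristics (time interval, search pattern, position in session) and does not give the model's encoding or coefficients.

```python
import numpy as np

def add_intercept(X):
    """Prepend a column of ones so the model has an intercept term."""
    return np.column_stack([np.ones(len(X)), X])

def fit_linear_regression(X, y):
    """Ordinary least squares fit; returns [intercept, b1, b2, ...]."""
    coef, *_ = np.linalg.lstsq(add_intercept(X), y, rcond=None)
    return coef

def predict_shift(coef, X, threshold=0.5):
    """Label a query as a topic shift (1) when the fitted value reaches
    the threshold, otherwise as a continuation (0). The threshold is an
    illustrative assumption, not a value from the paper."""
    return (add_intercept(X) @ coef >= threshold).astype(int)

# Invented toy data, one row per query. Columns (hypothetical encoding):
#   minutes since the previous query, search-pattern flag (1 = new,
#   unrelated query terms), position of the query in the session.
X = np.array([
    [0.5,  0, 2],
    [1.0,  0, 3],
    [20.0, 1, 4],
    [0.2,  0, 5],
    [35.0, 1, 2],
    [2.0,  0, 6],
    [1.0,  1, 2],
    [25.0, 0, 5],
    [15.0, 1, 3],
    [0.8,  0, 4],
], dtype=float)

# 1 = topic shift, 0 = topic continuation (invented labels).
y = np.array([0, 0, 1, 0, 1, 0, 0, 1, 1, 0], dtype=float)

coef = fit_linear_regression(X, y)
pred = predict_shift(coef, X)

print("coefficients:", coef)
print("predicted labels:", pred)
```

Fitting a linear model to a 0/1 outcome keeps the sketch close to the multiple linear regression named in the title; a real analysis would also need held-out data and the significance tests (such as the multi-factor ANOVA mentioned in the abstract) that this sketch omits.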