首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Efficient online index maintenance for contiguous inverted lists
Authors:Nicholas Lester  Justin Zobel  Hugh Williams
Institution:1. School of Computer Science and Information Technology, RMIT University, Victoria 3001, Australia;2. MSN Search, Microsoft Corporation, One Microsoft Way, Redmond, WA 98052, United States
Abstract:Search engines and other text retrieval systems use high-performance inverted indexes to provide efficient text query evaluation. Algorithms for fast query evaluation and index construction are well-known, but relatively little has been published concerning update. In this paper, we experimentally evaluate the two main alternative strategies for index maintenance in the presence of insertions, with the constraint that inverted lists remain contiguous on disk for fast query evaluation. The in-place and re-merge strategies are benchmarked against the baseline of a complete re-build. Our experiments with large volumes of web data show that re-merge is the fastest approach if large buffers are available, but that even a simple implementation of in-place update is suitable when the rate of insertion is low or memory buffer size is limited. We also show that with careful design of aspects of implementation such as free-space management, in-place update can be improved by around an order of magnitude over a naïve implementation.
Keywords:Text indexing  Search engines  Index construction  Index update
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号