Compression of index term dictionary in an inverted-file-orientated database: Some effective algorithms
Authors:Janusz L. Wiśniewski
Affiliation:1. Department of Computer Science, University of Helsinki, Finland;2. Department of Computer Science, University of Chile, Chile;3. CeBiB — Center for Biotechnology and Bioengineering, Chile
Abstract:A new method of index term dictionary compression in an inverted-file-orientated database is discussed. A word-coding technique is described that generates short fixed-length codes from the index terms themselves by analysing monogram and bigram statistical distributions. Transforming the index term dictionary into a code dictionary preserves word-to-word discrimination at a rate of three synonyms per 1300 terms, with a compression ratio of up to 90% and a low CPU time cost. When applied in a computer network environment, the method offers substantial savings in communication channel utilization with negligible response time degradation. Experimental data are presented for the 26,113-term index dictionary of the New York Times Info Bank, which is available via a computer network.
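The two figures quoted in the abstract, the synonym rate and the compression ratio, can be made concrete with a small sketch. The abstract does not spell out the coding algorithm, so the coder below is a hypothetical stand-in, not the paper's method: `toy_code` keeps the rarest letters of each term using monogram counts only and ignores bigram statistics. "Synonyms" is read here as distinct terms that receive the same code, which is an interpretation of the abstract's wording. All function names and the `width` parameter are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def toy_code(term: str, counts: Counter, width: int = 4) -> str:
    """Hypothetical fixed-length coder (NOT the paper's algorithm).

    Keeps the `width` letters of `term` that are rarest in the whole
    dictionary, in their original order, and pads short terms with "_".
    """
    # Rank positions by letter frequency; ties are broken by position.
    rarest = sorted(range(len(term)), key=lambda i: (counts[term[i]], i))[:width]
    return "".join(term[i] for i in sorted(rarest)).ljust(width, "_")


def evaluate(terms: list[str], width: int = 4) -> dict:
    """Measure synonym count and compression ratio for a term dictionary."""
    counts = Counter("".join(terms))  # monogram statistics over all terms
    buckets = defaultdict(list)
    for term in terms:
        buckets[toy_code(term, counts, width)].append(term)

    # Every term beyond the first in a bucket shares its code with another term.
    synonyms = sum(len(group) - 1 for group in buckets.values())
    original_size = sum(len(term) for term in terms)
    coded_size = width * len(terms)
    return {
        "synonyms": synonyms,
        "compression": 1 - coded_size / original_size,
        "buckets": dict(buckets),
    }


if __name__ == "__main__":
    sample = ["compression", "dictionary", "inverted", "database"]
    report = evaluate(sample)
    print(report["synonyms"], round(report["compression"], 3))
```

On a real dictionary, the synonym count divided by the number of terms gives a rate comparable to the three-per-1300 figure, and `compression` corresponds to the fraction of character storage saved.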
This article is indexed in ScienceDirect and other databases.