Improving semistatic compression via phrase-based modeling
Authors: Nieves R. Brisaboa, Antonio Fariña, Gonzalo Navarro, José R. Paramá
Institution: 1. Database Lab, Facultade de Informática, University of A Coruña, Campus de Elviña s/n, 15071 A Coruña, Spain; 2. Department of Computer Science, University of Chile, Blanco Encalada 2120, Santiago, Chile
Abstract: In recent years, new semistatic word-based byte-oriented text compressors, such as Tagged Huffman and those based on Dense Codes, have shown that it is possible to perform fast direct search over compressed text and decompression of arbitrary text passages over collections reduced to around 30–35% of their original size. Much of their success is due to the use of words as source symbols and a byte-oriented target alphabet. This approach broke with traditional statistical compressors, which use characters as source symbols and a bit-oriented target alphabet.
Keywords: Text compression, Direct search
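
To make the word-based, byte-oriented idea in the abstract concrete, the following is a minimal sketch of one dense code, the End-Tagged Dense Code, assuming whitespace tokenization and ignoring separators. It illustrates only the baseline family of compressors the abstract refers to, not the phrase-based modeling the paper proposes. All function names are hypothetical and chosen for illustration.

```python
from collections import Counter


def etdc_codeword(rank):
    """Return the byte codeword for a 0-based frequency rank."""
    # The last byte carries the end tag: its high bit is set (values 128..255).
    code = [128 + rank % 128]
    rank //= 128
    while rank > 0:
        rank -= 1
        # Continuation bytes keep the high bit clear (values 0..127).
        code.insert(0, rank % 128)
        rank //= 128
    return bytes(code)


def build_model(words):
    """First pass of the semistatic scheme: rank words by frequency."""
    ranked = [word for word, _ in Counter(words).most_common()]
    return {word: etdc_codeword(i) for i, word in enumerate(ranked)}


def compress(words, model):
    """Second pass: replace each word by its codeword."""
    return b"".join(model[word] for word in words)


def decompress(data, model):
    """Split the byte stream at tagged bytes and map codewords back to words."""
    inverse = {code: word for word, code in model.items()}
    words, start = [], 0
    for pos, byte in enumerate(data):
        if byte >= 128:  # a tagged byte ends the current codeword
            words.append(inverse[data[start:pos + 1]])
            start = pos + 1
    return words


def count_occurrences(data, codeword):
    """Search the compressed bytes directly, without decompressing."""
    hits, pos = 0, data.find(codeword)
    while pos != -1:
        # A true match starts at a codeword boundary: either the start of
        # the data or right after a tagged (end-of-codeword) byte.
        if pos == 0 or data[pos - 1] >= 128:
            hits += 1
        pos = data.find(codeword, pos + 1)
    return hits
```

In this sketch a word is searched for by compressing the pattern with the same model and scanning the byte stream with an ordinary byte-string search; the end tag is what allows codeword boundaries to be checked without decoding from the beginning.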
This article is indexed in ScienceDirect and other databases.