A mathematical model for estimating the effectiveness of bigram coding |
| |
Authors: | Abraham Bookstein; Gary Fouty |
| |
Institution: | Graduate Library School, The University of Chicago, Chicago, IL 60637, U.S.A.; Iowa State University Library, Ames, IA 50010, U.S.A. |
| |
Abstract: | This paper discusses bigram coding as a technique for compacting data. A mathematical model is developed that estimates the effectiveness of such a code as a function of the fraction of bigram tokens that are encodeable; this model accounts for the degree of overlap of encodeable tokens by assuming that bigram token occurrences have a Markov property. The model requires that a single parameter be fit to the data. The results of an experiment testing this model on a file of library catalog data are given, and excellent agreement is found. This model provides a substantial improvement over an earlier model in which bigrams are assumed to occur independently of each other. |
| |
This article is indexed in ScienceDirect and other databases. |
|