Sorting of textual data bases: A variety generation approach to distribution sorting
Authors:David Cooper  Mary E Dicker  Michael F Lynch
Institution:Postgraduate School of Librarianship and Information Science, University of Sheffield, Sheffield S10 2TN, England
Abstract:A method of sorting large textual data-bases by computer using external storage is proposed. The range of sort-keys in a sample of data to be sorted is divided into a fixed set of partitions, which should also give an adequate representation of new data from a similar source. The partitions are composed of ordered key ranges. An incoming data stream is distributed into a series of bins according to the partition in which the key lies, and the bins are then separately sorted, using an internal sort, to give an ordered file. It is shown how the number of disc accesses needed depends on the manner in which the bins become filled, and thus on the statistics of the data. Experiments using an INSPEC data-base give information on which estimates of the efficiency of the method can be based.
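The distribution step described in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' implementation: the function names `choose_boundaries` and `distribution_sort`, the choice of boundaries at evenly spaced ranks of the sorted sample, and the use of Python lists in place of disc-resident bins are all assumptions made for illustration. The variety generation technique named in the title is not modelled here.

```python
import bisect

def choose_boundaries(sample_keys, n_bins):
    """Pick n_bins - 1 boundary keys at evenly spaced ranks of a sorted sample.

    Assumption: equal-rank cut points stand in for the paper's partitioning
    scheme, which the abstract does not specify in detail.
    """
    ordered = sorted(sample_keys)
    return [ordered[(i * len(ordered)) // n_bins] for i in range(1, n_bins)]

def distribution_sort(records, boundaries, key=lambda r: r):
    """Distribute records into ordered key-range bins, then sort each bin.

    Returns the fully ordered list and the bin sizes; the sizes show how
    evenly the fixed partitions fit the incoming data.
    """
    bins = [[] for _ in range(len(boundaries) + 1)]
    for rec in records:
        # bisect_right finds the partition whose key range contains the key.
        bins[bisect.bisect_right(boundaries, key(rec))].append(rec)
    ordered = []
    for b in bins:
        b.sort(key=key)      # internal sort of one bin
        ordered.extend(b)    # bins are already in key-range order
    return ordered, [len(b) for b in bins]

if __name__ == "__main__":
    sample = ["lynch", "cooper", "dicker", "sheffield", "inspec", "sort"]
    incoming = ["bins", "keys", "disc", "text", "range", "file", "access"]
    bounds = choose_boundaries(sample, 3)
    result, sizes = distribution_sort(incoming, bounds)
    print(bounds, sizes, result)
```

Because the partitions are fixed from the sample, new data with a different key distribution would fill the bins unevenly. In an external-storage setting that unevenness is what drives the number of disc accesses the abstract refers to.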
This article is indexed in ScienceDirect and other databases.