Implementing agglomerative hierarchic clustering algorithms for use in document retrieval

Authors:	Ellen M Voorhees

Abstract:	Searching hierarchically clustered document collections can be effective [6], but creating the cluster hierarchies is expensive, since there are both many documents and many terms. However, the information in the document-term matrix is sparse: documents are usually indexed by relatively few terms. This paper describes the implementations of three agglomerative hierarchic clustering algorithms that exploit this sparsity so that collections much larger than the algorithms' worst-case running times would suggest can be clustered. The implementations described in the paper have been used to cluster a collection of 12,000 documents.
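
The sparsity argument in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the function names (`sparse_similarities`, `single_link`), the dict-of-weights document representation, and the dot-product similarity are all assumptions made for illustration, and single link is chosen only for brevity, since the abstract does not say which three algorithms the paper covers. The idea shown is that an inverted index lets the similarity computation skip every pair of documents that shares no term, so the work depends on the number of nonzero entries rather than on all pairs.

```python
from collections import defaultdict
from itertools import combinations


def sparse_similarities(docs):
    """Return {(i, j): dot product} for document pairs that share a term.

    `docs` is a list of {term: weight} dicts (an assumed representation).
    Pairs with no term in common are never visited.
    """
    # Inverted index: term -> list of (doc id, weight) postings.
    postings = defaultdict(list)
    for doc_id, terms in enumerate(docs):
        for term, weight in terms.items():
            postings[term].append((doc_id, weight))

    sims = defaultdict(float)
    for plist in postings.values():
        # Only documents on the same postings list can be similar.
        # Doc ids are appended in increasing order, so i < j always holds.
        for (i, wi), (j, wj) in combinations(plist, 2):
            sims[(i, j)] += wi * wj
    return dict(sims)


def single_link(docs):
    """Single-link agglomerative clustering over the nonzero similarities.

    Processes pairs in decreasing order of similarity and merges the
    clusters they belong to (union-find). Returns the merges as
    (doc_i, doc_j, similarity) tuples in the order they happen.
    """
    parent = list(range(len(docs)))

    def find(x):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    merges = []
    sims = sparse_similarities(docs)
    for (i, j), sim in sorted(sims.items(), key=lambda kv: -kv[1]):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i
            merges.append((i, j, sim))
    return merges


if __name__ == "__main__":
    # Toy collection: document 3 shares no term with the others.
    toy_docs = [
        {"cluster": 1.0, "search": 1.0},
        {"cluster": 1.0, "search": 1.0, "index": 1.0},
        {"index": 1.0, "sparse": 1.0},
        {"matrix": 1.0},
    ]
    for merge in single_link(toy_docs):
        print(merge)
```

In this sketch, documents that share no term with any other document are never merged; a full hierarchy would join the remaining clusters at similarity zero as a final step.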

This paper is indexed in ScienceDirect and other databases.