
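An Implementation Study of a Document Clustering Algorithm Based on Improved K-means
Cite this article: Cen Yonghua, Wang Xiaorong, Ji Yonghui. An Implementation Study of a Document Clustering Algorithm Based on Improved K-means [J]. New Technology of Library and Information Service (现代图书情报技术), 2008, 3(12): 73-79.
Authors: Cen Yonghua (岑咏华), Wang Xiaorong (王晓蓉), Ji Yonghui (吉雍慧)
Affiliations: 1. Department of Information Management, Nanjing University, Nanjing 210093; School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094
2. School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094
3. Department of Information Management, Nanjing University, Nanjing 210093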
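Abstract: Building on an account of the meaning, role, and general process of document clustering, this paper analyzes the basic idea of an improved K-means clustering method that selects its initial centroids according to a "min-max" principle. It then concentrates on designing the corresponding clustering algorithm, implements a clustering system, and uses that system to run clustering experiments on 300 academic documents and their associated feature terms. The experimental results show that the improved K-means clustering algorithm designed and implemented in the paper performs well.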

Keywords: document clustering; K-means
Received: 2008-08-18
Revised: 2008-09-05

Algorithm and Experiment Research of Textual Document Clustering Based on Improved K-means
Cen Yonghua, Wang Xiaorong, Ji Yonghui. Algorithm and Experiment Research of Textual Document Clustering Based on Improved K-means [J]. New Technology of Library and Information Service, 2008, 3(12): 73-79.
Authors: Cen Yonghua, Wang Xiaorong, Ji Yonghui
Institution: 1 (Department of Information Management, Nanjing University, Nanjing 210093, China) 2 (Department of Information Management, Nanjing University of Science & Technology, Nanjing 210094, China)
Abstract: After a concise introduction to the connotation, functions, and general process of textual document clustering, this paper expounds the basic mechanism of an improved K-means clustering method that selects initial centroids by a minimum-maximum principle, designs its algorithm, implements the clustering system, and conducts several experiments on 300 academic articles and their related characteristic words. The results indicate that the proposed algorithm performs well.
Keywords:Textual document clustering  K-means
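
The abstract names the technique but does not spell out the algorithm, so the following is only an illustrative sketch of one common reading of "min-max" initial centroid selection: farthest-first (maximin) seeding, where each new centroid is the point whose minimum distance to the already chosen centroids is largest. Seeding with the first point, using Euclidean distance, and the function names `maximin_init` and `kmeans` are all assumptions made for this example; the paper's actual procedure, distance measure, and document representation may differ.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def maximin_init(points, k, dist=euclidean):
    """Choose k initial centroids by farthest-first (maximin) selection.

    Assumption: the first point seeds the selection. Each later centroid
    is the point that maximizes the minimum distance to those chosen so far.
    """
    chosen = [0]
    # min_d[i] = distance from point i to its nearest chosen centroid
    min_d = [dist(p, points[0]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: min_d[i])
        chosen.append(nxt)
        min_d = [min(d, dist(p, points[nxt])) for d, p in zip(min_d, points)]
    return [list(points[i]) for i in chosen]


def kmeans(points, k, max_iter=100, dist=euclidean):
    """Standard K-means iterations started from maximin-selected centroids."""
    centroids = maximin_init(points, k, dist)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Toy 2-D vectors standing in for document feature vectors.
    docs = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    labels, centroids = kmeans(docs, k=2)
    print(labels, centroids)
```

Spreading the initial centroids apart in this way tends to avoid the poor local optima that random seeding can produce when two seeds fall in the same natural cluster. For real documents the vectors would typically be term-weight vectors, and a cosine-based distance could be passed as `dist`.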
This article is indexed in databases including CNKI and Wanfang Data.