
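Clustering of Text Subject Terms and Topic Discovery Based on Co-word Analysis
Cite this article: WANG Xiao-hua, XU Ning, CHEN Zhi-qun. Clustering of Text Subject Terms and Topic Discovery Based on Co-word Analysis [J]. Information Science (情报科学), 2011, 0(11)
Authors: WANG Xiao-hua  XU Ning  CHEN Zhi-qun
Affiliation: Institute of Computer Application Technology, Hangzhou Dianzi University
Funding: Zhejiang Provincial Natural Science Foundation (Y1100176)
Abstract: Text topic detection can effectively mine the key factors in massive amounts of information. This paper clusters text subject terms using co-word analysis in order to discover current topics. First, subject-term strings are extracted through stop-word filtering and TF-IDF keyword extraction. Next, a co-word matrix is constructed. Finally, the Bisecting K-means algorithm is used to cluster the subject-term strings and thereby discover topics. Experimental results show that the method is reasonably effective for extracting hot topics.

Keywords: co-word analysis  TF-IDF  co-word matrix  Bisecting K-means  topic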

Discovering of Subjects and Clustering of Textual Subject Terms Based on Co-word Analysis
WANG Xiao-hua, XU Ning, CHEN Zhi-qun. Discovering of Subjects and Clustering of Textual Subject Terms Based on Co-word Analysis[J]. Information Science, 2011, 0(11)
Authors: WANG Xiao-hua  XU Ning  CHEN Zhi-qun
Affiliation: Institute of Computer Application Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Abstract: Text topic detection can identify the most important aspects of vast amounts of information. This article clusters subject terms using co-word analysis and then finds the current themes. First, we extract keyword strings by filtering stop words and applying TF-IDF keyword extraction. Next, we construct the co-word matrix. Finally, we cluster the keyword strings with the Bisecting K-means algorithm to find the themes. Experimental results show that this method is reasonably effective for hot-subject extraction.
Keywords: co-word analysis  TF-IDF  co-word matrix  bisecting k-means  theme
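
The pipeline named in the abstract (stop-word filtering, TF-IDF keyword extraction, co-word matrix, Bisecting K-means) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the documents are already tokenized, represents each term by its row of the co-word matrix, and always splits the cluster with the largest sum of squared errors; the abstract specifies none of these choices. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def top_terms(docs, stop_words, per_doc=3):
    """Drop stop words, then keep each document's highest TF-IDF terms."""
    docs = [[w for w in d if w not in stop_words] for d in docs]
    df = Counter(w for d in docs for w in set(d))  # document frequency
    n = len(docs)
    selected = []
    for d in docs:
        tf = Counter(d)
        score = {w: (c / len(d)) * math.log(n / df[w]) for w, c in tf.items()}
        ranked = sorted(score, key=lambda w: (-score[w], w))
        selected.append(set(ranked[:per_doc]))
    return selected

def coword_matrix(term_sets):
    """Count, for every pair of terms, the documents in which both occur."""
    terms = sorted(set().union(*term_sets))
    index = {t: i for i, t in enumerate(terms)}
    m = [[0] * len(terms) for _ in terms]
    for s in term_sets:
        for t in s:
            m[index[t]][index[t]] += 1  # diagonal: documents containing t
        for a, b in combinations(sorted(s), 2):
            m[index[a]][index[b]] += 1
            m[index[b]][index[a]] += 1
    return terms, m

def dist2(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def two_means(idx, vectors, iters=20):
    """Plain 2-means on the given indices, seeded deterministically (assumption)."""
    c0 = vectors[idx[0]]
    c1 = max((vectors[i] for i in idx), key=lambda v: dist2(v, c0))
    left, right = list(idx), []
    for _ in range(iters):
        left = [i for i in idx if dist2(vectors[i], c0) <= dist2(vectors[i], c1)]
        right = [i for i in idx if dist2(vectors[i], c0) > dist2(vectors[i], c1)]
        if not right:
            break
        c0 = mean([vectors[i] for i in left])
        c1 = mean([vectors[i] for i in right])
    return left, right

def bisecting_kmeans(vectors, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    def sse(c):
        m = mean([vectors[i] for i in c])
        return sum(dist2(vectors[i], m) for i in c)

    clusters = [list(range(len(vectors)))]
    while len(clusters) < k:
        target = max(clusters, key=sse)
        if len(target) < 2:
            break
        left, right = two_means(target, vectors)
        if not right:  # all points identical, cannot split further
            break
        clusters.remove(target)
        clusters += [left, right]
    return clusters

# Toy, pre-tokenized documents (hypothetical data, not from the paper).
docs = [
    ["the", "stock", "market", "bank", "stock"],
    ["bank", "market", "stock", "the"],
    ["the", "football", "goal", "match"],
    ["match", "goal", "football", "the", "goal"],
]
terms, matrix = coword_matrix(top_terms(docs, stop_words={"the"}))
clusters = bisecting_kmeans(matrix, k=2)
topics = sorted(sorted(terms[i] for i in c) for c in clusters)
print(topics)
```

Each resulting cluster is a set of subject terms that tend to appear in the same documents, which is the sense in which a cluster can be read as a topic. Real Chinese text would additionally need word segmentation before the stop-word filter, a step this sketch leaves out.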
This article is indexed in CNKI and other databases.