
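A document classification method based on a rough-set corner classification neural network
Cite this article: Zhang Weifeng, Xu Baowen, Cui Zifeng, Xu Junling. A document classification method based on a rough-set corner classification neural network[J]. Journal of Southeast University, 2006, 22(3): 439-444.
Authors: Zhang Weifeng, Xu Baowen, Cui Zifeng, Xu Junling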
Affiliations: College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003; College of Computer Science and Engineering, Southeast University, Nanjing 210096; Jiangsu Institute of Software Quality, Nanjing 210096
Funding: The National Natural Science Foundation of China (No. 60503020, 60373066, 60403016, 60425206), the Natural Science Foundation of Jiangsu Higher Education Institutions (No. 04KJB520096), the Doctoral Foundation of Nanjing University of Posts and Telecommunications (No. 0302).
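Abstract: To address the problems of representing documents of different sizes, document feature selection, and document feature encoding in document classification, a rough-set-based corner classification neural network, Rough-CC4, is proposed. Equivalence classes formed from near-synonyms are used to represent documents, which reduces the dimensionality of the document representation, resolves the precision problems caused by differing document sizes, and blurs the differences between near-synonyms. Encoding document features with a binary encoding method improves the precision of Rough-CC4 while reducing its space complexity. Rough-CC4 can be widely used for the automatic classification of large document collections.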

Keywords: document classification; neural network; rough set; meta search engine
Received: 04 6 2006

Document classification approach by rough-set-based corner classification neural network
Zhang Weifeng, Xu Baowen, Cui Zifeng, Xu Junling. Document classification approach by rough-set-based corner classification neural network[J]. Journal of Southeast University (English Edition), 2006, 22(3): 439-444.
Authors: Zhang Weifeng, Xu Baowen, Cui Zifeng, Xu Junling
Institution: 1. College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; 2. College of Computer Science and Engineering, Southeast University, Nanjing 210096, China; 3. Jiangsu Institute of Software Quality, Nanjing 210096, China
Abstract: A rough-set-based corner classification neural network, Rough-CC4, is presented to solve document classification problems such as representing documents of different sizes, document feature selection, and document feature encoding. In Rough-CC4, documents are described by the equivalence classes of approximate words (near-synonyms). This reduces the number of dimensions needed to represent a document, which addresses the precision problems caused by differing document sizes and also blurs the differences between approximate words. Rough-CC4 also introduces a binary encoding method that encodes the importance of a document relative to each equivalence class. This encoding greatly improves the precision of Rough-CC4 and reduces its space complexity. Rough-CC4 can be used for the automatic classification of documents.
Keywords: document classification; neural network; rough set; meta search engine
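
The abstract describes two steps: representing a document by equivalence classes of approximate words, and binary-encoding the importance of the document relative to each class. The Python sketch below illustrates that idea only. It is not the paper's implementation: the synonym dictionary `SYNONYM_CLASS`, the use of relative class frequency as the importance measure, and the fixed-width quantization in `encode_binary` are all assumptions made for illustration, since the abstract does not specify them. Training the corner classification network itself is not shown.

```python
from collections import Counter

# Hypothetical near-synonym dictionary (an assumption, not from the paper):
# each word maps to the id of its equivalence class.
SYNONYM_CLASS = {
    "car": 0, "automobile": 0, "vehicle": 0,
    "buy": 1, "purchase": 1,
    "cheap": 2, "inexpensive": 2,
}
NUM_CLASSES = 3


def class_frequencies(tokens):
    """Relative frequency of each equivalence class among the known tokens.

    Using relative (not absolute) counts is an assumed way to make the
    representation independent of document size.
    """
    counts = Counter(SYNONYM_CLASS[t] for t in tokens if t in SYNONYM_CLASS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * NUM_CLASSES
    return [counts[c] / total for c in range(NUM_CLASSES)]


def encode_binary(freqs, bits=2):
    """Quantize each frequency to one of 2**bits levels, written as bits.

    The quantization scheme is an assumption; it yields a binary input
    vector of length NUM_CLASSES * bits.
    """
    max_level = (1 << bits) - 1
    vector = []
    for f in freqs:
        level = round(f * max_level)
        vector.extend(int(b) for b in format(level, f"0{bits}b"))
    return vector


short_doc = "buy cheap car".split()
long_doc = ("purchase inexpensive automobile " * 50).split()

short_code = encode_binary(class_frequencies(short_doc))
long_code = encode_binary(class_frequencies(long_doc))
```

In this sketch, `short_doc` and `long_doc` use different words and differ in length by a factor of 50, yet they receive the same binary code, because synonyms collapse into one class and the frequencies are normalized by document length. The vector length depends only on the number of classes and the bit width, not on vocabulary size.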
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.