
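Automatic Identification of Web Page Genres
Cite this article: Wang Haiyang. Automatic Identification of Web Page Genres[J]. 人天科学研究, 2013(4): 1-3.
Author: Wang Haiyang (王海洋)
Affiliation: College of Computer Science, Sichuan University, Chengdu, Sichuan 610065
Abstract: With the rapid growth of the Internet, the number of web pages has swelled dramatically, growing exponentially in recent years. Search engines face increasingly severe challenges, and it is difficult to find pages that meet users' needs accurately and quickly among the massive number of pages. Web page classification is one effective way to address this problem. Topic-based classification and genre-based classification are its two main approaches, and both effectively improve search engine retrieval efficiency. Web page genre classification means classifying web pages by their form of presentation and their purpose. This paper introduces the definition of web page genre and the classification features commonly used in web page genre classification research, and describes several commonly used feature selection methods, classification models, and classifier evaluation methods, giving researchers an overview of web page genre classification.

Keywords: web page classification; web page genre; feature selection; machine learning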

Automatic Identification of Web Page Genres
Abstract: With the rapid development of the Internet, the number of web pages keeps expanding, and in recent years it has grown exponentially. Search engines face increasingly severe challenges: it is difficult to find, accurately and quickly, the pages that meet users' needs among a huge number of web pages. Web page classification is one of the effective methods for solving this problem. There are two mainstream approaches to web page classification: classification by subject and classification by genre. Both can effectively improve the search efficiency of search engines. Web page genre classification distinguishes web pages according to their form and purpose. This paper describes the definition of web page genre and introduces the features commonly used in web page genre classification. It also introduces several feature selection methods, several commonly used classification models, and assessment methods for classifiers. It provides researchers with a general understanding of web page genre classification.
Keywords:Web Page Classification  Web Page Genre  Feature Selection  Machine Learning
This article is indexed in VIP (维普) and other databases.