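Extracting Topic Sentences from Web Text Based on Sentence Relationship Map
Cite this article: He Wei, Wang Yu. Extracting Topic Sentences from Web Text Based on Sentence Relationship Map[J]. 现代图书情报技术 (New Technology of Library and Information Service), 2009, 3(3): 57-61
Authors: He Wei (何维), Wang Yu (王宇)
Affiliation: School of Management, Dalian University of Technology, Dalian 116024
Abstract: Web text carries little structural information and a great deal of noise. To address this, each sentence is treated as a node and the similarity between two sentences as an edge, so that a sentence relationship map describes how the sentences of a text relate to one another. Extracting the topic sentence of a text then becomes a search for the node with the most edges. Using a semantic dictionary, sentence similarity is defined as the semantic similarity between sentences, which addresses the low word-frequency similarity of short texts. In tests on a publicly available Internet corpus, the extracted topic sentences reached an average acceptability of 80.6%.

Keywords: topic sentence; sentence relationship map; sentence similarity
Received: 2008-12-29
Revised: 2009-01-21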

Extracting Topic Sentences from Web Text Based on Sentence Relationship Map
He Wei,Wang Yu. Extracting Topic Sentences from Web Text Based on Sentence Relationship Map[J]. New Technology of Library and Information Service, 2009, 3(3): 57-61
Authors:He Wei  Wang Yu
Affiliation:(School of Management, Dalian University of Technology, Dalian 116024, China)
Abstract:Web text has little structural information and a high level of noise. To address this, sentences are viewed as nodes and the similarities between them as edges, and a relationship map is used to describe the relationships between sentences. The topic sentence of a text can then be obtained by searching for the node that has the most edges. Using a semantic dictionary, sentence similarity is defined as semantic similarity, which addresses the low word-frequency similarity of short texts. A publicly available Internet corpus was chosen for testing, and the extracted topic sentences achieved an average acceptability of 80.6%.
Keywords:Topic Sentence  Sentence Relationship Map  Sentence Similarity
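
The abstract's core idea (sentences as nodes, similarity as edges, topic sentence as the node with the most edges) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the paper defines similarity through a semantic dictionary that the abstract does not name, so the `jaccard` word-overlap function below is only a stand-in, and the `threshold` value, the function names, and the sample sentences are all hypothetical.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Stand-in similarity: word-set overlap.

    Assumption: the paper uses a dictionary-based semantic similarity;
    plain word overlap is used here only to keep the sketch self-contained.
    """
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def topic_sentence(sentences, similarity=jaccard, threshold=0.2):
    """Return (index, degree) of the sentence with the most edges.

    An edge joins two sentences whose similarity reaches `threshold`
    (a hypothetical cut-off, not a value taken from the paper).
    Ties are resolved in favour of the earliest sentence.
    """
    if not sentences:
        raise ValueError("need at least one sentence")
    degree = [0] * len(sentences)
    for i, j in combinations(range(len(sentences)), 2):
        if similarity(sentences[i], sentences[j]) >= threshold:
            degree[i] += 1
            degree[j] += 1
    best = max(range(len(sentences)), key=lambda k: degree[k])
    return best, degree[best]


sentences = [
    "web text is short and noisy",
    "short web text has little structure",
    "noisy text needs robust methods",
    "click here to subscribe",  # page noise: shares no words with the rest
]
print(topic_sentence(sentences))  # (0, 2)
```

Passing the similarity function as a parameter keeps the graph search separate from the similarity measure, so a dictionary-based semantic measure could be swapped in without changing `topic_sentence`.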
This article is indexed in Wanfang Data (万方数据) and other databases.