The Technology of Web Information Extraction and Its Application in the TBT Early-Warning System
Cite this article: Zhai Dongsheng, Yu Yang, Li Li. The Technology of Web Information Extraction and Its Application in the TBT Early-Warning System[J]. New Technology of Library and Information Service (现代图书情报技术), 2005, 21(9): 76-79
Authors: Zhai Dongsheng, Yu Yang, Li Li
Affiliation: School of Economics and Management, Beijing University of Technology, Beijing 100022, China
Funding: Beijing Natural Science Foundation (9042001); National Social Science Foundation of China (04BJY061).
Abstract: This paper studies a technique for collecting information from data-oriented Web pages in real time. The technique identifies table structures and separates the data items automatically. To analyze and classify the data items, it combines classification by words (By Words) with partitioning by table layout (By Structure). The approach is grounded in the idea of ontology and integrates established models such as the support vector machine (SVM) and the hidden Markov model (HMM). After testing, the technique was applied in the dynamic information collection subsystem of a TBT early-warning system, where it gave good results.

Keywords: Ontology; Information extraction; TBT
Received: 2005-06-08
Revised: 2005-06-25
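
As a rough illustration of the combination described in the abstract, the sketch below extracts the cells of an HTML table and labels each one by combining a word-level cue with a structural cue. It is not the authors' method: the abstract does not give their features or models, so simple hand-written rules stand in here for the trained SVM and HMM, and every name (`TableCells`, `word_cue`, `classify`) is hypothetical.

```python
import re
from html.parser import HTMLParser


class TableCells(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list per row; each entry is (tag, text)
        self._cell = None   # text fragments of the cell being read, or None
        self._tag = None    # "td" or "th" for the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._cell, self._tag = [], tag

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join fragments and collapse runs of whitespace.
            text = " ".join("".join(self._cell).split())
            self.rows[-1].append((self._tag, text))
            self._cell = None


# Assumed word-level rule: digits plus common separators count as a value.
NUMERIC = re.compile(r"^[\d.,%/\-: ]+$")


def word_cue(text):
    """By-words cue: True when the cell text looks like a value, not a label."""
    return bool(NUMERIC.match(text)) and any(ch.isdigit() for ch in text)


def classify(html):
    """Label every cell 'header' or 'data' from both cues combined."""
    parser = TableCells()
    parser.feed(html)
    labelled = []
    for r, row in enumerate(parser.rows):
        for c, (tag, text) in enumerate(row):
            # By-structure cue: <th> cells and first-row cells sit in header position.
            header_position = tag == "th" or r == 0
            is_header = header_position and not word_cue(text)
            labelled.append((r, c, text, "header" if is_header else "data"))
    return labelled
```

Calling `classify` on a small table string returns one `(row, column, text, label)` tuple per cell. A cell is labelled a header only when both cues agree, so a numeric value that happens to sit in the first row is still treated as data. In a fuller system the two rules would presumably be replaced by learned models.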
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.