
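Research on Chinese Term Recognition Based on the Hidden Markov Model (基于隐马尔科夫模型的中文术语识别研究)
Cite this article: Cen Yonghua, Han Zhe, Ji Peipei. Research on Chinese Term Recognition Based on the Hidden Markov Model [J]. New Technology of Library and Information Service (现代图书情报技术), 2008, 24(12): 54-58.
Authors: Cen Yonghua, Han Zhe, Ji Peipei
Affiliations: 1. School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094; Department of Information Management, Nanjing University, Nanjing 210093
2. Department of Information Management, Nanjing University, Nanjing 210093
3. National Science Library, Chinese Academy of Sciences, Beijing 100190; Graduate University of Chinese Academy of Sciences, Beijing 100049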
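Abstract: Based on an analysis of the probabilistic characteristics of the grammatical composition of Chinese text, in particular part-of-speech collocation, this paper proposes an approach and a system framework for recognizing and extracting Chinese general terms (泛术语) with a two-layer hidden Markov model, and implements the corresponding system. Using a training corpus, term extraction was tested on texts from several domains. The experimental results indicate that the proposed HMM-based approach to recognizing and extracting Chinese general terms is of practical reference value.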

Keywords: Chinese term recognition and extraction; hidden Markov model; HMM
Received: 2008-08-13
Revised: 2008-09-09

Chinese Term Recognition Based on Hidden Markov Model
Cen Yonghua, Han Zhe, Ji Peipei. Chinese Term Recognition Based on Hidden Markov Model [J]. New Technology of Library and Information Service, 2008, 24(12): 54-58.
Authors: Cen Yonghua, Han Zhe, Ji Peipei
Institution: (School of Economics and Management, Nanjing University of Science & Technology, Nanjing 210094, China) (Department of Information Management, Nanjing University, Nanjing 210093, China) (National Science Library, Chinese Academy of Sciences, Beijing 100190, China) (Graduate University of Chinese Academy of Sciences, Beijing 100049, China)
Abstract: Following an analysis of the probabilistic characteristics of the syntactic composition of Chinese text, especially part-of-speech (POS) collocation, a system framework for Chinese term recognition and extraction based on a dual-layer HMM is presented and implemented. The proposed method performs well in tests on texts from different domains. The terms recognized and extracted by the implemented system can serve as candidate terms for subsequent false-positive elimination and refinement using measures such as mutual information, log-likelihood, and domain dependency.
Keywords: Chinese term recognition; hidden Markov model; HMM
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.