
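A String Similarity Computation Model Based on Multi-Level Features
Cite this article: Zhang Chengzhi. A String Similarity Computation Model Based on Multi-Level Features[J]. Journal of the China Society for Scientific and Technical Information, 2005, 24(6): 696-701
Author: Zhang Chengzhi
Affiliation: Department of Information Management, Nanjing University, Nanjing 210093
Abstract: To address the shortcomings of traditional methods for computing string similarity, this paper proposes a method that takes the similarity unit as the basic processing unit of a string and jointly considers multi-level features of similarity units: literal, semantic, and statistical-association features. It also corrects a problem in conventional methods, namely the loss of similarity-unit position information caused by sorting the similarity units. Experimental results demonstrate the effectiveness of the algorithm, which also offers insights for computing similarity between sentences and between paragraphs.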

Keywords: string similarity  similarity unit  literal similarity  semantic similarity  multi-feature measure
Revision date: 2004-12-14

A Model for Chinese String Similarity Based on Multi-Level Features
Zhang Chengzhi. A Model for Chinese String Similarity Based on Multi-Level Features[J]. Journal of the China Society for Scientific and Technical Information, 2005, 24(6): 696-701
Authors: Zhang Chengzhi
Abstract: String similarity computation is widely used in Chinese information processing. This paper presents a unifying model for string similarity computation based on multi-level features. The approach uses the literal, semantic, and statistical association features of strings, and builds on conventional approaches to improve computational accuracy. Experiments show that the proposed method is an effective solution to the Chinese string similarity computation problem, and that it can be generalized to measure the similarity of other components of Chinese text, such as sentences and paragraphs.
Keywords: Chinese string similarity  similarity unit  multiple-features measuring  literal similarity  semantic similarity
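
The abstract gives no formulas, so the following is only an illustrative sketch of the general idea it describes: score pairs of similarity units on literal, semantic, and statistical-association features, then combine the unit scores while keeping position information instead of discarding it. Everything specific here is an assumption rather than the paper's method: the `SYNONYMS` and `ASSOCIATION` tables, the Dice coefficient used for literal similarity, the weights, and the position discount are all hypothetical placeholders.

```python
from typing import Dict, FrozenSet, List

# Hypothetical resources; the abstract does not say which thesaurus or corpus is used.
SYNONYMS: List[FrozenSet[str]] = [frozenset({"计算机", "电脑"})]
ASSOCIATION: Dict[FrozenSet[str], float] = {frozenset({"网络", "互联网"}): 0.8}


def literal_sim(a: str, b: str) -> float:
    """Dice coefficient over the character sets of two units (an assumed choice)."""
    if a == b:
        return 1.0
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))


def semantic_sim(a: str, b: str) -> float:
    """1.0 if the units are identical or share a synonym group, else 0.0."""
    if a == b:
        return 1.0
    return 1.0 if any(a in g and b in g for g in SYNONYMS) else 0.0


def association_sim(a: str, b: str) -> float:
    """Look up a precomputed statistical association score (placeholder table)."""
    if a == b:
        return 1.0
    return ASSOCIATION.get(frozenset({a, b}), 0.0)


def unit_sim(a: str, b: str, w_lit=0.4, w_sem=0.4, w_stat=0.2) -> float:
    """Weighted combination of the three feature levels (weights are assumed)."""
    return (w_lit * literal_sim(a, b)
            + w_sem * semantic_sim(a, b)
            + w_stat * association_sim(a, b))


def _directed(xs: List[str], ys: List[str], pos_weight: float) -> float:
    """Average, over units in xs, of the best position-discounted match in ys."""
    total = 0.0
    for i, x in enumerate(xs):
        best = 0.0
        for j, y in enumerate(ys):
            # Discount matches whose relative positions differ.
            shift = abs(i / len(xs) - j / len(ys))
            best = max(best, unit_sim(x, y) * (1 - pos_weight * shift))
        total += best
    return total / len(xs)


def string_sim(xs: List[str], ys: List[str], pos_weight: float = 0.2) -> float:
    """Symmetric similarity between two strings given as lists of similarity units."""
    if not xs or not ys:
        return 0.0
    return (_directed(xs, ys, pos_weight) + _directed(ys, xs, pos_weight)) / 2


if __name__ == "__main__":
    print(string_sim(["计算机", "网络"], ["计算机", "网络"]))
    print(string_sim(["网络", "计算机"], ["计算机", "网络"]))
    print(string_sim(["电脑"], ["计算机"]))
```

In this sketch the strings are assumed to be already segmented into similarity units. Because the position discount is applied per matched pair, two strings with the same units in a different order score lower than two identical strings, which is the kind of position information that sorting the units would lose.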
This article is indexed in CNKI, Wanfang Data, and other databases.