
A String Similarity Computation Model Based on Multi-Level Features (基于多层特征的字符串相似度计算模型)

Cite this article:	章成志. 基于多层特征的字符串相似度计算模型[J]. 情报学报 (Journal of the China Society for Scientific and Technical Information), 2005, 24(6): 696-701

Author:	Zhang Chengzhi (章成志)

Affiliation:	Department of Information Management, Nanjing University, Nanjing 210093

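Abstract (translated from the Chinese):	To address the shortcomings of traditional methods for computing string similarity, this paper proposes a method that takes the similarity unit as the basic processing unit of a string and jointly considers multi-level features of similarity units: literal, semantic, and statistical association. It also corrects a problem in conventional methods, in which sorting the similarity units discards their position information. Experimental results demonstrate the effectiveness of the algorithm, and the approach is instructive for computing similarity between sentences and between paragraphs.
Keywords:	string similarity; similarity unit; literal similarity; semantic similarity; multi-feature measurement
Revised:	2004-12-14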
A Model for Chinese String Similarity Based on Multi-Level Features

Zhang Chengzhi. A Model for Chinese String Similarity Based on Multi-Level Features[J]. Journal of the China Society for Scientific and Technical Information, 2005, 24(6): 696-701

Authors:	Zhang Chengzhi

Abstract:	String similarity computation is widely used in Chinese information processing. This paper presents a unified model for string similarity computation based on multi-level features. The approach uses the literal, semantic, and statistical association features of strings, combining the strengths of conventional approaches to improve computation accuracy. Experiments show that the proposed method is an effective solution to the Chinese string similarity computation problem, and that it can be generalized to measure the similarity of other components of Chinese text, such as sentences and paragraphs.

Keywords:	Chinese string similarity; similarity unit; multiple-feature measuring; literal similarity; semantic similarity
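
The abstract names the ingredients of the model (similarity units, with literal, semantic, and statistical features) but this record does not give the formulas. The following is only a rough sketch of how such features might be combined, not the paper's method. It assumes strings are already segmented into units; the function names, the `synonym_groups` and `association` inputs, and the weights are all hypothetical. A longest-common-subsequence ratio stands in for an order-sensitive term and should not be read as the paper's own treatment of position.

```python
from collections import Counter


def dice(a, b):
    """Literal overlap of two unit lists, ignoring order (Dice coefficient)."""
    if not a and not b:
        return 1.0
    overlap = sum((Counter(a) & Counter(b)).values())
    return 2.0 * overlap / (len(a) + len(b))


def lcs_ratio(a, b):
    """Order-sensitive overlap: normalised longest common subsequence length."""
    if not a or not b:
        return 1.0 if not a and not b else 0.0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return 2.0 * prev[-1] / (len(a) + len(b))


def best_match_average(a, b, unit_score):
    """Symmetric average of each unit's best score against the other string."""
    if not a or not b:
        return 1.0 if not a and not b else 0.0
    forward = sum(max(unit_score(x, y) for y in b) for x in a) / len(a)
    backward = sum(max(unit_score(x, y) for x in a) for y in b) / len(b)
    return (forward + backward) / 2.0


def string_similarity(a, b, synonym_groups, association,
                      weights=(0.4, 0.2, 0.2, 0.2)):
    """Weighted sum of literal, order, semantic, and statistical scores.

    synonym_groups: iterable of sets of interchangeable units (assumed input).
    association: dict mapping frozenset({x, y}) to a score in [0, 1] (assumed).
    weights: illustrative values only; they are not taken from the paper.
    """
    def semantic(x, y):
        same_group = any(x in g and y in g for g in synonym_groups)
        return 1.0 if x == y or same_group else 0.0

    def statistical(x, y):
        return 1.0 if x == y else association.get(frozenset((x, y)), 0.0)

    scores = (
        dice(a, b),
        lcs_ratio(a, b),
        best_match_average(a, b, semantic),
        best_match_average(a, b, statistical),
    )
    return sum(w * s for w, s in zip(weights, scores))
```

With these illustrative weights, two strings containing the same units in a different order score lower than identical strings, because only the order-sensitive term changes. Two strings that share no units can still score above zero when a synonym group or an association entry links them.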
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.
