共查询到19条相似文献,搜索用时 171 毫秒
1.
汉语文献自动分词与标引研究综述 总被引:3,自引:0,他引:3
本文根据近年来国内发表的有关自动分词与标引的部分文章,对汉语文献自动分词方法和自动标引技术进行了分析和归纳,并提出了自动标引质量评价和标引模型评价指标的问题。 相似文献
2.
文章介绍了自动标引的基本原理和方法。讨论了基于知识库的网页自动标引、基于UCL的网页自动标引和基于遗传算法的网页自动标引方法,并对这三种网页自动标引方法进行了分析和比较。 相似文献
3.
[目的/意义]对2003年以来我国自动标引的研发现状进行总结并预测未来发展动向,以期为文献自动标引实践的发展提供借鉴和参考。[方法/过程]通过文献调研和相关案例回顾,系统梳理2003—2023年我国文献自动标引的系统研发及典型应用,具体从自动主题标引和自动分类标引两方面展开。[结果/结论]自动标引发展面临不少现实问题,今后自动标引研究及实践应聚焦于技术上重点突破中文自动分词的语言分析问题、研究和探索更高效的语料库智能学习机制、集成化开发多媒体信息自动标引方法、多方联动构建文献自动标引效果的评价体系与监测机制。 相似文献
4.
5.
[目的/意义]基于文本挖掘技术自动发现更具代表性的文献内容主题词,通过定位主题词在章节中的具体位置,并基于可视化技术进行主题标引,帮助读者直观高效发现文献主题间的潜在关系。[方法/过程]基于文本挖掘技术深入文献内容层挖掘主题词,并利用可视化工具直观呈现所获信息,在此基础上尝试构建可视化主题自动标引系统,并在格萨尔领域的多个主题中对该系统的自动标引效果进行验证。[结果/结论]研究结果显示,该标引方法在格萨尔领域实现了文献内容级的可视化主题自动标引,快速精准地定位到章节、段落和句子。标引相关信息获取过程直观可视,并且具有交互性,可提升用户体验和参与度。文章以《英雄格萨尔》为例完成系统验证,但该标引方法技术本身无领域限定,可应用于其他领域的文献。 相似文献
6.
7.
网络信息检索系统中信息自动标引方法的设计与实现 总被引:1,自引:0,他引:1
比较了目前主要使用的标引方法,根据网络信息的特点,提出了关键词标引和全文标引相结合的混合标引方法,并给出了具体实现方法,描绘了自动标引的流程图。最后给出了信息标引处理后数据检索方法。 相似文献
8.
9.
本文采用数据挖掘技术和情报语言学方法 ,构建了一个可以用于从因特网上提取信息、进行自动标引和自动分类的系统 ,提供了一种创建自动分类知识库的新方法 ;提出了一种用于主题抽取的位置加权算法 ,研制了一种改进汉语同义词识别性能的新方法 ,并在自动分类时运用了这种语义相似度识别算法。最后还对该系统性能进行了测试 相似文献
10.
[目的/意义]资源数字化时代文献服务向知识服务方向转变,高质量的文献自动标引是文献知识服务能力提升的基础和关键,针对目前英文科技文献自动标引准确率不高的问题,提出了基于语义感知的概念遴选优化方法。[方法/过程]基于知识组织系统的自动主题标引,采用自然语言处理中的神经网络词向量技术,对概念和英文文献内容语义进行表示并进行语义感知与评估,实现概念标引结果在语义层面的遴选。该方法采用基于知识组织系统与自然语言处理技术相结合的方法,弥补了在语义层面上的不足,从而进一步降低不相关概念的影响,提高概念标引结果的准确率。[结果/结论]实验结果表明,该方法具有较好的语义感知性能,在概念遴选上有效降低了不相关概念,大大提高了标引结果的文献相关性,为科技文献资源知识化服务建设和相关研究提供有价值的参考和支持。 相似文献
11.
《Information processing & management》2001,37(2):231-254
Does human intellectual indexing have a continuing role to play in the face of increasingly sophisticated automatic indexing techniques? In this two-part essay, a computer scientist and long-time TREC participant (Pérez-Carballo) and a practitioner and teacher of human cataloging and indexing (Anderson) pursue this question by reviewing the opinions and research of leading experts on both sides of this divide. We conclude that human analysis should be used on a much more selective basis, and we offer suggestions on how these two types of indexing might be allocated to best advantage. Part one of the essay critiques the comparative research, then explores the nature of human analysis of messages or texts and efforts to formulate rules to make human practice more rigorous and predictable. We find that research comparing human vs automatic approaches has done little to change strongly held beliefs, in large part because many associated variables have not been isolated or controlled.Part II focuses on current methods in automatic indexing, its gradual adoption by major indexing and abstracting services, and ways for allocating human and machine approaches. Overall, we conclude that both approaches to indexing have been found to be effective by researchers and searchers, each with particular advantages and disadvantages. However automatic indexing has the over-arching advantage of decreasing cost, as human indexing becomes ever more expensive. 相似文献
12.
《Information processing & management》2001,37(2):255-277
Does human intellectual indexing have a continuing role to play in the face of increasingly sophisticated automatic indexing techniques? In this two-part essay, a computer scientist and long-time TREC participant (Pérez-Carballo) and a practitioner and teacher of human cataloging and indexing (Anderson) pursue this question by reviewing the opinions and research of leading experts on both sides of this divide. We conclude that human analysis should be used on a much more selective basis, and we offer suggestions on how these two types indexing might be allocated to best advantage. Part I of the essay critiques the comparative research, then explores the nature of human analysis of messages or texts and efforts to formulate rules to make human practice more rigorous and predictable. We find that research comparing human versus automatic approaches has done little to change strongly held beliefs, in large part because many associated variables have not been isolated or controlled.Part II focuses on current methods in automatic indexing, its gradual adoption by major indexing and abstracting services, and ways for allocating human and machine approaches. Overall, we conclude that both approaches to indexing have been found to be effective by researchers and searchers, each with particular advantages and disadvantages. However, automatic indexing has the over-arching advantage of decreasing cost, as human indexing becomes ever more expensive. 相似文献
13.
14.
The profusion of online resources calls for tools and methods to help Internet users find precisely what they are looking for. Quality controlled gateway CISMeF provides such services for health resources. However, the human cost of maintaining and updating the catalogue are increasingly high. This paper presents the automatic indexing system currently developed in the CISMeF team to be used as such for preliminary indexing, or after human reviewing for the final indexing. The system architecture, using the INTEX platform for MeSH term extraction is detailed. The results of a first evaluation tend to indicate that the automatic indexing strategy is relevant, as it achieves a precision comparable to that of other existing operational systems. Moreover, the system presented in this paper retrieves keyword/qualifier pairs as opposed to single terms, therefore providing a significantly more precise indexing. Further development and tests will be carried out in order to improve the coverage of the dictionaries, and validate the efficiency of the system in the indexers’ everyday work. 相似文献
15.
16.
17.
18.
In image retrieval, most systems lack user-centred evaluation since they are assessed by some chosen ground truth dataset. The results reported through precision and recall assessed against the ground truth are thought of as being an acceptable surrogate for the judgment of real users. Much current research focuses on automatically assigning keywords to images for enhancing retrieval effectiveness. However, evaluation methods are usually based on system-level assessment, e.g. classification accuracy based on some chosen ground truth dataset. In this paper, we present a qualitative evaluation methodology for automatic image indexing systems. The automatic indexing task is formulated as one of image annotation, or automatic metadata generation for images. The evaluation is composed of two individual methods. First, the automatic indexing annotation results are assessed by human subjects. Second, the subjects are asked to annotate some chosen images as the test set whose annotations are used as ground truth. Then, the system is tested by the test set whose annotation results are judged against the ground truth. Only one of these methods is reported for most systems on which user-centred evaluation are conducted. We believe that both methods need to be considered for full evaluation. We also provide an example evaluation of our system based on this methodology. According to this study, our proposed evaluation methodology is able to provide deeper understanding of the system’s performance. 相似文献
19.
G S Dunham M G Pacak A W Pratt 《Journal of the American Society for Information Science》1978,29(2):81-90
A procedure for automated indexing of pathology diagnostic reports at the National Institutes of Health is described. Diagnostic statements in medical English are encoded by computer into the Systematized Nomenclature of Pathology (SNOP). SNOP is a structured indexing language constructed by pathologists for manual indexing. It is of interest that effective automatic encoding can be based upon an existing vocabulary and code designed for manual methods. Morphosyntactic analysis, a simple syntax analysis, matching of dictionary entries consisting of several words, and synonym substitutions are techniques utilized. 相似文献