Similar literature
 19 similar documents found (search time: 171 ms)
1.
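A Review of Research on Automatic Word Segmentation and Indexing of Chinese Documents   Total citations: 3, self-citations: 0, citations by others: 3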
湛述勇 《情报科学》1992,13(5):66-71
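Drawing on a selection of articles on automatic word segmentation and indexing published in China in recent years, this paper analyzes and summarizes automatic word segmentation methods and automatic indexing techniques for Chinese documents. It also raises the questions of how to evaluate automatic indexing quality and which indicators to use when evaluating indexing models.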

2.
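The article introduces the basic principles and methods of automatic indexing. It discusses three approaches to automatic web page indexing, based respectively on a knowledge base, on UCL, and on genetic algorithms, and then analyzes and compares the three.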

3.
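[Purpose/Significance] To summarize the state of research and development in automatic indexing in China since 2003 and to forecast future trends, in order to provide a reference for the development of automatic document indexing practice. [Method/Process] Through a literature survey and a review of relevant cases, the paper systematically reviews system development and typical applications of automatic document indexing in China from 2003 to 2023, covering both automatic subject indexing and automatic classification indexing. [Result/Conclusion] Automatic indexing faces a number of practical problems. Future research and practice should focus on four areas: technical breakthroughs on the linguistic-analysis problems of Chinese automatic word segmentation; more efficient mechanisms for intelligent learning from corpora; integrated development of automatic indexing methods for multimedia information; and an evaluation system and monitoring mechanism for automatic document indexing performance, built through multi-party collaboration.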

4.
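The article describes the current state of automatic indexing technology and applies it to the indexing of disclosed government information. To address the problems and shortcomings in government information disclosure work, it uses a statistical weighting algorithm that combines term-frequency statistics, position weighting, and term co-occurrence statistics to design and implement keyword-based automatic indexing of disclosed government information.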
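One way such a combined score might be computed is sketched below. This is a minimal illustration under stated assumptions, not the article's implementation: the function name `score_keywords`, the weights `title_weight` and `cooc_weight`, the use of the title as the only weighted position, and sentence-level co-occurrence counting are all hypothetical choices.

```python
from collections import Counter
from itertools import combinations


def score_keywords(title_tokens, body_sentences, title_weight=3.0, cooc_weight=0.5):
    """Score candidate keywords by term frequency, position, and co-occurrence.

    title_tokens: list of tokens from the title (assumed already segmented).
    body_sentences: list of token lists, one per sentence of the body.
    The weights are illustrative assumptions, not values from the article.
    """
    scores = Counter()

    # Term frequency with position weighting: a title occurrence counts more
    # than a body occurrence.
    for tok in title_tokens:
        scores[tok] += title_weight
    for sent in body_sentences:
        for tok in sent:
            scores[tok] += 1.0

    # Co-occurrence: for every pair of distinct terms sharing a sentence,
    # credit both terms once.
    cooc = Counter()
    for sent in body_sentences:
        for a, b in combinations(sorted(set(sent)), 2):
            cooc[a] += 1
            cooc[b] += 1
    for tok, count in cooc.items():
        scores[tok] += cooc_weight * count

    return scores


scores = score_keywords(
    ["government", "disclosure"],
    [["government", "budget"], ["budget", "report", "government"]],
)
ranked = [term for term, _ in scores.most_common()]
```

In this toy input, "government" ranks first because it appears in the title, recurs in the body, and co-occurs with the most other terms.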

5.
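[Purpose/Significance] To use text mining to automatically discover more representative subject terms in document content, locate each subject term's position within chapters, and perform subject indexing with visualization techniques, helping readers discover latent relationships among document subjects intuitively and efficiently. [Method/Process] Text mining is applied at the level of document content to extract subject terms, and visualization tools present the extracted information. On this basis, the authors attempt to build a visual automatic subject indexing system and verify its indexing performance on multiple subjects in the Gesar domain. [Result/Conclusion] The results show that the method achieves content-level visual automatic subject indexing in the Gesar domain, locating chapters, paragraphs, and sentences quickly and precisely. The process of obtaining indexing information is visual and interactive, which can improve user experience and engagement. The system was verified using 《英雄格萨尔》 (Hero Gesar) as an example, but the method itself is not domain-specific and can be applied to documents in other fields.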

6.
赵衍  张永娟  陈成材  陈恒 《情报杂志》2012,31(5):185-191
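Accuracy has long been a major difficulty in computer-based automatic term-assignment indexing, and many scholars have proposed methods for improving indexing accuracy from different angles. Based on a comparative study, the authors design a coordinated "before-during-after" control method to improve the accuracy of automatic term-assignment indexing. The method consists of three stages: pre-indexing preprocessing, control during indexing, and post-indexing feedback control. The paper systematically discusses the method's principles, characteristics, and implementation, and reports an empirical study on the innovative CBA (中国生命科学文摘, China life sciences abstracts) database that verifies the method's effectiveness in improving the accuracy of automatic term-assignment indexing.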

7.
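Design and Implementation of an Automatic Information Indexing Method for Network Information Retrieval Systems   Total citations: 1, self-citations: 0, citations by others: 1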
周晓红 《情报杂志》2005,24(12):41-43
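The paper compares the indexing methods mainly in use today and, based on the characteristics of network information, proposes a hybrid method that combines keyword indexing with full-text indexing. It gives a concrete implementation, presents a flowchart of the automatic indexing process, and finally describes how data are retrieved after indexing.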

8.
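The paper introduces the principles and methods of computer fuzzy retrieval as applied to automatic book indexing, and explores using computer technology to build a computer-assisted automatic book indexing system, so as to lay a foundation for computers to replace manual operation in the future.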

9.
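Using data mining techniques and methods from information linguistics, this paper builds a system that can extract information from the Internet and perform automatic indexing and automatic classification, and it offers a new way of creating a knowledge base for automatic classification. It proposes a position-weighting algorithm for subject extraction, develops a new method that improves the recognition of Chinese synonyms, and applies this semantic-similarity algorithm during automatic classification. Finally, the system's performance is tested.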

10.
孟旭阳  白海燕  梁冰  王莉 《情报杂志》2021,40(3):125-131,7
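[Purpose/Significance] In the era of digitized resources, document services are shifting toward knowledge services, and high-quality automatic document indexing is the foundation of, and key to, better document knowledge services. To address the low accuracy of current automatic indexing of English scientific and technical literature, the paper proposes a semantic-awareness-based method for optimizing concept selection. [Method/Process] Building on automatic subject indexing based on knowledge organization systems, the method uses neural word-vector techniques from natural language processing to represent the semantics of concepts and of English document content, then performs semantic awareness and evaluation to select concept indexing results at the semantic level. By combining knowledge organization systems with natural language processing, it compensates for weaknesses at the semantic level, further reducing the influence of irrelevant concepts and improving the accuracy of concept indexing results. [Result/Conclusion] Experiments show that the method has good semantic-awareness performance, effectively reduces irrelevant concepts during concept selection, and greatly improves the relevance of indexing results to the documents, offering a useful reference for knowledge-oriented services for scientific literature resources and for related research.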
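The idea of filtering candidate concepts by semantic closeness to the document can be sketched as follows. This is only an assumed simplification, not the authors' method: it represents a document as the mean of its word vectors, scores each candidate concept by cosine similarity, and keeps those above a cutoff. The names `select_concepts` and `threshold`, and the toy two-dimensional vectors, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_concepts(doc_word_vectors, concept_vectors, threshold=0.5):
    """Keep candidate concepts whose vector is close to the document vector.

    doc_word_vectors: word vectors for the document's words (assumed to come
        from some pretrained embedding model, which is not shown here).
    concept_vectors: dict mapping candidate concept name -> vector.
    threshold: hypothetical similarity cutoff.
    """
    doc_vec = mean_vector(doc_word_vectors)
    return [
        name
        for name, vec in concept_vectors.items()
        if cosine(doc_vec, vec) >= threshold
    ]


# Toy vectors chosen by hand purely for illustration.
kept = select_concepts(
    [[1.0, 0.0], [1.0, 1.0]],
    {"concept_a": [1.0, 0.0], "concept_b": [0.0, 1.0], "concept_c": [-1.0, 0.0]},
)
```

With these toy vectors only `concept_a` clears the cutoff. A real system would need to choose the threshold empirically.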

11.
Does human intellectual indexing have a continuing role to play in the face of increasingly sophisticated automatic indexing techniques? In this two-part essay, a computer scientist and long-time TREC participant (Pérez-Carballo) and a practitioner and teacher of human cataloging and indexing (Anderson) pursue this question by reviewing the opinions and research of leading experts on both sides of this divide. We conclude that human analysis should be used on a much more selective basis, and we offer suggestions on how these two types of indexing might be allocated to best advantage. Part I of the essay critiques the comparative research, then explores the nature of human analysis of messages or texts and efforts to formulate rules to make human practice more rigorous and predictable. We find that research comparing human versus automatic approaches has done little to change strongly held beliefs, in large part because many associated variables have not been isolated or controlled. Part II focuses on current methods in automatic indexing, its gradual adoption by major indexing and abstracting services, and ways for allocating human and machine approaches. Overall, we conclude that both approaches to indexing have been found to be effective by researchers and searchers, each with particular advantages and disadvantages. However, automatic indexing has the over-arching advantage of decreasing cost, as human indexing becomes ever more expensive.

12.
Does human intellectual indexing have a continuing role to play in the face of increasingly sophisticated automatic indexing techniques? In this two-part essay, a computer scientist and long-time TREC participant (Pérez-Carballo) and a practitioner and teacher of human cataloging and indexing (Anderson) pursue this question by reviewing the opinions and research of leading experts on both sides of this divide. We conclude that human analysis should be used on a much more selective basis, and we offer suggestions on how these two types of indexing might be allocated to best advantage. Part I of the essay critiques the comparative research, then explores the nature of human analysis of messages or texts and efforts to formulate rules to make human practice more rigorous and predictable. We find that research comparing human versus automatic approaches has done little to change strongly held beliefs, in large part because many associated variables have not been isolated or controlled. Part II focuses on current methods in automatic indexing, its gradual adoption by major indexing and abstracting services, and ways for allocating human and machine approaches. Overall, we conclude that both approaches to indexing have been found to be effective by researchers and searchers, each with particular advantages and disadvantages. However, automatic indexing has the over-arching advantage of decreasing cost, as human indexing becomes ever more expensive.

13.
14.
The profusion of online resources calls for tools and methods to help Internet users find precisely what they are looking for. The quality-controlled gateway CISMeF provides such services for health resources. However, the human cost of maintaining and updating the catalogue is increasingly high. This paper presents the automatic indexing system currently being developed by the CISMeF team, to be used either as is for preliminary indexing or, after human review, for final indexing. The system architecture, which uses the INTEX platform for MeSH term extraction, is detailed. The results of a first evaluation tend to indicate that the automatic indexing strategy is relevant, as it achieves a precision comparable to that of other existing operational systems. Moreover, the system presented in this paper retrieves keyword/qualifier pairs as opposed to single terms, therefore providing significantly more precise indexing. Further development and tests will be carried out in order to improve the coverage of the dictionaries and validate the efficiency of the system in the indexers’ everyday work.

15.
16.
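Selecting the Best Scheme for Automatic Web Page Indexing and Evaluating Indexing Performance   Total citations: 2, self-citations: 0, citations by others: 2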
仲云云  侯汉清  薛鹏军 《情报科学》2002,20(10):1108-1110
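This paper introduces three schemes for automatic web page indexing. By comparing manual and automatic indexing results for 50 web pages from “中国经济网” (China Economic Net), it selects the best scheme: weighting different parts of the full web page text and applying weighted term-frequency statistics. Finally, the scheme's automatic subject indexing and classification indexing are each evaluated in terms of the human-machine agreement rate.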
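A rough sketch of section-weighted term-frequency scoring, together with one possible agreement measure, is given below. The section names, the weights in `SECTION_WEIGHTS`, and the definition of the agreement rate (the share of manually assigned terms that the automatic indexer also produced) are assumptions made for illustration; the abstract does not state the paper's actual weights or formula.

```python
from collections import Counter

# Hypothetical weights: terms in more prominent parts of a page count more.
SECTION_WEIGHTS = {"title": 3.0, "heading": 2.0, "body": 1.0}


def weighted_term_scores(sections):
    """Sum section-weighted occurrences of each term.

    sections: dict mapping section name -> list of tokens in that section.
    Unknown section names fall back to a weight of 1.0.
    """
    scores = Counter()
    for section, tokens in sections.items():
        weight = SECTION_WEIGHTS.get(section, 1.0)
        for tok in tokens:
            scores[tok] += weight
    return scores


def top_terms(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ordered[:k]]


def agreement_rate(auto_terms, manual_terms):
    """Share of manually assigned terms also produced automatically.

    This definition is an assumption; other agreement measures exist.
    """
    manual = set(manual_terms)
    if not manual:
        return 0.0
    return len(set(auto_terms) & manual) / len(manual)


page = {"title": ["economy", "china"], "body": ["economy", "trade", "trade", "news"]}
auto = top_terms(weighted_term_scores(page), 2)
rate = agreement_rate(auto, ["economy", "trade"])
```

Here the title weight pushes "china" above "trade" even though "trade" occurs more often, so the automatic result matches only one of the two manual terms.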

17.
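Latent semantic indexing is an unsupervised learning method that can automatically learn data for morphological analysis from raw, unprocessed text. It improves learning by computing the semantic relatedness between words. This paper first briefly describes the concepts of morphological analysis and morphology learning and the morphology learning methods that appeared earlier. It then describes a morphology learning method based on this theory, followed by some improvements to and evaluations of the method, and closes with conclusions and an outlook.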
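To make the notion of semantic relatedness in a latent space concrete, here is a small dependency-free sketch of latent semantic indexing on a toy term-document matrix. It is not the method of the paper above. It approximates the truncated SVD by power iteration with deflation on the Gram matrix, a choice made only to avoid external libraries; a practical implementation would normally call a library SVD routine. All function names and the toy data are hypothetical.

```python
import math


def matvec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def top_eigenpairs(sym, k, iters=200):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        vec = [1.0 + 0.1 * i for i in range(n)]  # fixed, deterministic start
        for _step in range(iters):
            nxt = matvec(work, vec)
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
        lam = sum(a * b for a, b in zip(vec, matvec(work, vec)))
        pairs.append((lam, vec))
        # Deflate: remove the component just found.
        for r in range(n):
            for c in range(n):
                work[r][c] -= lam * vec[r] * vec[c]
    return pairs


def lsi_term_vectors(term_doc, k):
    """Rank-k latent coordinates for each term (rows of U_k * S_k)."""
    # Eigenvectors of A A^T are the left singular vectors of A, and its
    # eigenvalues are the squared singular values.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in term_doc] for r1 in term_doc]
    pairs = top_eigenpairs(gram, k)
    return [
        [math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
        for i in range(len(term_doc))
    ]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


# Rows: index, indexing, retrieval, gene, protein. Columns: four documents.
A = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vecs = lsi_term_vectors(A, 2)
sim_related = cosine(vecs[0], vecs[1])    # "index" vs "indexing"
sim_unrelated = cosine(vecs[0], vecs[3])  # "index" vs "gene"
```

In the raw matrix "index" and "indexing" never share a document, so their raw cosine similarity is zero. In the two-dimensional latent space they become nearly identical because both co-occur with "retrieval", while "index" and "gene" stay unrelated.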

18.
In image retrieval, most systems lack user-centred evaluation since they are assessed by some chosen ground truth dataset. The results reported through precision and recall assessed against the ground truth are thought of as being an acceptable surrogate for the judgment of real users. Much current research focuses on automatically assigning keywords to images for enhancing retrieval effectiveness. However, evaluation methods are usually based on system-level assessment, e.g. classification accuracy based on some chosen ground truth dataset. In this paper, we present a qualitative evaluation methodology for automatic image indexing systems. The automatic indexing task is formulated as one of image annotation, or automatic metadata generation for images. The evaluation is composed of two individual methods. First, the automatic indexing annotation results are assessed by human subjects. Second, the subjects are asked to annotate some chosen images as the test set whose annotations are used as ground truth. Then, the system is tested by the test set whose annotation results are judged against the ground truth. Only one of these methods is reported for most systems on which user-centred evaluation is conducted. We believe that both methods need to be considered for full evaluation. We also provide an example evaluation of our system based on this methodology. According to this study, our proposed evaluation methodology is able to provide deeper understanding of the system’s performance.
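For reference, precision and recall of a set of automatically assigned keywords against ground-truth annotations can be computed as in the short sketch below. The function name and the example keywords are illustrative and are not taken from the paper.

```python
def precision_recall(predicted, ground_truth):
    """Precision and recall of predicted keywords against ground-truth keywords.

    Both arguments are iterables of keywords; duplicates are ignored.
    Returns (precision, recall), using 0.0 when a denominator would be zero.
    """
    pred, truth = set(predicted), set(ground_truth)
    true_positives = len(pred & truth)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Two of the four predicted keywords are correct, and two of the three
# ground-truth keywords are found.
p, r = precision_recall(["sky", "sea", "tree", "car"], ["sky", "sea", "beach"])
```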

19.
A procedure for automated indexing of pathology diagnostic reports at the National Institutes of Health is described. Diagnostic statements in medical English are encoded by computer into the Systematized Nomenclature of Pathology (SNOP). SNOP is a structured indexing language constructed by pathologists for manual indexing. It is of interest that effective automatic encoding can be based upon an existing vocabulary and code designed for manual methods. Morphosyntactic analysis, a simple syntax analysis, matching of dictionary entries consisting of several words, and synonym substitutions are techniques utilized.
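Two of the techniques named above, synonym substitution and matching of multi-word dictionary entries, can be sketched with a greedy longest-match lookup. Everything in this example is hypothetical: the synonym table, the dictionary entries, and the placeholder codes (which are not real SNOP codes). It also omits the morphosyntactic and syntax analysis that the described procedure uses.

```python
# Hypothetical synonym table: variant form -> preferred form.
SYNONYMS = {"tumour": "tumor"}

# Hypothetical dictionary: tuple of words -> placeholder code (NOT real SNOP codes).
DICTIONARY = {
    ("chronic", "inflammation"): "CODE-001",
    ("tumor",): "CODE-002",
}


def encode(statement):
    """Encode a statement into codes using greedy longest-match lookup."""
    # Normalize case and replace synonyms with their preferred forms.
    tokens = [SYNONYMS.get(tok, tok) for tok in statement.lower().split()]
    max_len = max(len(entry) for entry in DICTIONARY)
    codes = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in DICTIONARY:
                codes.append(DICTIONARY[key])
                i += n
                break
        else:
            i += 1  # no entry starts at this token; skip it
    return codes


codes = encode("Chronic inflammation with tumour")
```

Trying the longest phrase first means a multi-word entry wins over any shorter entry that begins at the same token.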
