Similar literature
 Found 20 similar documents; search took 15 ms
1.
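Lü Meixiang (吕美香), Information Science (《情报科学》), 2012, (8): 1160-1166
Thesauri are among the most important knowledge organization tools in libraries and information retrieval. The Chinese Classified Thesaurus (《中国分类主题词表》) is a traditional thesaurus whose updating and maintenance have always been done by hand, which limits its use in digital libraries and networked information environments. This paper presents a statistics-based method that extracts keywords from the titles in metadata records and locates them in the thesaurus. The method has roughly three steps: extracting keywords from titles; determining the specificity of the extracted keywords; and placing highly specific technical terms in the thesaurus. Experiments on the Chinese Classified Thesaurus and on computer science and technology metadata provided by Shanghai Library show that the method is feasible. It can be applied to automatic indexing or cataloguing, and it has practical value and broad application prospects.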

2.
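Drawing on an analysis of theoretical research on ontologies and formal concept analysis, this paper proposes an ontology construction method for the meteorological disaster domain that centers on the document-term relation and uses formal concept analysis as its technical means. Because the domain lacks a knowledge base and a thesaurus, the method uses Chinese and English academic papers as its data source. The paper describes and justifies in detail the extraction and analysis of hierarchical relations among domain terms, covering the extraction and filtering of domain terms, the construction of the document-term matrix, the generation of the subject concept lattice, the analysis of term hierarchy, and the OWL description and visualization of the ontology. Finally, GATE Developer is used to validate the constructed ontology.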
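The step from a document-term matrix to a concept lattice can be illustrated with a minimal sketch of formal concept analysis. The documents and terms below are hypothetical toy data, not taken from the paper, and the brute-force enumeration is only practical for very small contexts; it is not the paper's implementation.

```python
from itertools import combinations

# Hypothetical document -> terms context (a tiny document-term matrix).
CONTEXT = {
    "doc1": {"typhoon", "rainstorm"},
    "doc2": {"typhoon", "storm surge"},
    "doc3": {"rainstorm", "flood"},
}


def common_terms(docs):
    """Terms shared by every document in `docs` (all terms if `docs` is empty)."""
    term_sets = [CONTEXT[d] for d in docs]
    if not term_sets:
        return set().union(*CONTEXT.values())
    return set.intersection(*term_sets)


def docs_with(terms):
    """Documents that contain every term in `terms`."""
    return {d for d, doc_terms in CONTEXT.items() if terms <= doc_terms}


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every subset of documents."""
    concepts = set()
    docs = sorted(CONTEXT)
    for size in range(len(docs) + 1):
        for subset in combinations(docs, size):
            intent = common_terms(subset)
            extent = docs_with(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts()
```

Each concept pairs a set of documents with the terms they all share; ordering the concepts by inclusion of their document sets yields the lattice from which broader and narrower term relations can be read.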

3.
4.
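Lei Xiao (雷晓), Chang Chun (常春), Liu Wei (刘伟), Information Science (《情报科学》), 2021, 39(1): 135-141
[Purpose/Significance] To keep a thesaurus's term coverage complete, new terms that have appeared in the domain but are not yet included must be added promptly. How to combine the temporal and document-frequency features of candidate words, and to explore the distribution of new terms from a time-series perspective so as to guide their selection, is a question worth studying. [Method/Process] The article studies the time series of candidate words' document frequencies. The series are preprocessed by frequency normalization and alignment to equal length, and the k-means clustering algorithm is then used to cluster candidate words by the trend of their time series, exploring how the trends of terms and non-terms differ and summarizing the trend features a new term should exhibit. [Result/Conclusion] The clustering study finds that new terms are generally on a growth trend. In an empirical test, candidate words in a growth state were selected and judged by experts; the method effectively picks out candidates that can be added to the thesaurus as new terms, with relatively high accuracy. [Innovation/Limitation] The innovation is that the selection of new thesaurus terms considers both temporal change and document frequency. Because of limits on the scale of data processing, the empirical study counted only the frequencies of paper keywords.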
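The pipeline in this abstract (normalize, align to equal length, cluster by trend, keep the growing cluster) can be sketched as follows. This is only an illustration under stated assumptions: the frequencies are invented toy data, normalization divides by the series maximum, shorter series are left-padded with zeros, and the initial centroids are a rising and a falling line. The abstract does not specify any of these choices.

```python
from typing import Dict, List, Tuple

# Hypothetical yearly document frequencies (toy data, not from the paper).
CANDIDATES: Dict[str, List[float]] = {
    "deep learning": [1, 2, 4, 8, 16],
    "knowledge graph": [1, 3, 6, 12],  # first observed one year later
    "card catalog": [16, 9, 5, 2, 1],
    "microfiche": [12, 8, 4, 2, 1],
}


def preprocess(series: List[float], length: int) -> List[float]:
    """Left-pad with zeros to equal length, then scale by the maximum (assumed scheme)."""
    padded = [0.0] * (length - len(series)) + list(series)
    peak = max(padded)
    return [v / peak if peak else 0.0 for v in padded]


def squared_distance(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], centroids: List[List[float]],
           iterations: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Plain Lloyd's algorithm with caller-supplied initial centroids."""
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: nearest centroid for each series.
        labels = [min(range(len(centroids)),
                      key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: mean of each cluster (empty clusters keep their centroid).
        updated = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


names = list(CANDIDATES)
length = max(len(s) for s in CANDIDATES.values())
points = [preprocess(CANDIDATES[n], length) for n in names]
rising = [i / (length - 1) for i in range(length)]
falling = [1 - v for v in rising]
labels, centroids = kmeans(points, [rising, falling])

# A cluster counts as "growing" when its centroid ends higher than it starts.
growing = {i for i, c in enumerate(centroids) if c[-1] > c[0]}
new_term_candidates = [n for n, label in zip(names, labels) if label in growing]
```

In the study the shortlisted words were then judged by experts; the sketch stops at the shortlist.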

5.
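Fan Wei (范炜), Information Science (《情报科学》), 2006, 24(7): 1073-1077
This paper combines thesauri with Ontology, the core of the Semantic Web, at the knowledge level of knowledge organization systems, and argues that the ability of thesauri to express knowledge concepts and semantic relations provides a resource-organization foundation for building the Semantic Web. At the description level, it uses SKOS to convert a thesaurus in a worked example, to represent the semantic relations among descriptors visually, and to index web pages by subject. Finally, it points out that SKOS gives knowledge organization systems a simple, flexible, extensible and machine-understandable mechanism for description and conversion.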

6.
We report on the design and construction of features of an automated query system which will assist pharmacologists who are not information specialists to access the Derwent Drug File (DDF) pharmacological database. Our approach was to first elucidate those search skills of the search intermediary which might prove tractable to automation. Modules were then produced which assist in the three important subtasks of search statement generation, namely vocabulary selection, the choice of context indicators and query reformulation. Vocabulary selection is facilitated by approximate string matching, morphological analysis, browsing and menu searching. The context of the study, such as treatment or metabolism, is determined using a system of advisory menus. The task of query reformulation is performed using user feedback on retrieved documents, thesaurus relations between document index terms and term postings data. Use is made of diverse information sources, including electronic forms of printed search aids, a thesaurus and a medical dictionary. The system will be of use both to semicasual users and experienced intermediaries. Many of the ideas developed should prove transportable to domains other than pharmacology: the techniques for thesaurus manipulation are designed for use with any hierarchical thesaurus.
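As a rough illustration of vocabulary selection by approximate string matching, the sketch below maps a misspelled user entry to controlled-vocabulary terms with Python's standard `difflib`. The three-term vocabulary, the `suggest_terms` helper and the 0.6 cutoff are assumptions made for the example; the abstract does not say which matching algorithm the DDF system used.

```python
from difflib import get_close_matches

# Hypothetical controlled vocabulary; a real thesaurus would be far larger.
VOCABULARY = ["aspirin", "insulin", "morphine"]


def suggest_terms(user_input, vocabulary=VOCABULARY, limit=3, cutoff=0.6):
    """Return up to `limit` vocabulary terms whose similarity ratio to the input is >= cutoff."""
    return get_close_matches(user_input.lower(), vocabulary, n=limit, cutoff=cutoff)


suggestions = suggest_terms("Asprin")  # misspelled drug name
```

The suggestions could then be shown in a menu so the searcher confirms the intended term before the query is built.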

7.
Authors and searchers usually express the same things in many different ways, which causes problems in free text searching of text databases. Thus, a switching tool connecting the different names of one concept is needed. This study tests the effectiveness of a thesaurus as a search-aid in free text searching of a full text database. A set of queries was searched against a large full text database of newspaper articles. The search-aid thesaurus constructed for the test contains the usual relationships of a thesaurus, namely equivalence, hierarchical, and associative relationships. Each query was searched in five distinct modes: basic search, synonym search, narrower term search, related term search, and union of all previous searches. The basic searches contained only terms included in the original query statements. In the synonym searches, the terms of the basic search were extended by disjunction of the synonyms given by the search-aid thesaurus without modifying the overall logic of the basic search. Likewise, the basic search was extended in turn with the narrower terms and with the related terms given by the search-aid thesaurus. The last search mode included the basic terms and all the terms used in the previous searches. The searches were analyzed in terms of relative recall and precision; relative recall was estimated by setting the recall of the union search to 100%. On average, the value of relative recall was 47.2% in the basic search, compared with 100% in the union search; the average value of precision decreased only from 62.5% in the basic search to 51.2% in the union search.
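The five search modes and the relative-recall measure can be made concrete with a small sketch. The thesaurus entry, documents and relevance judgments are invented toy data, and the query is a single term matched by simple OR logic; the study's queries and database were far richer.

```python
# Hypothetical search-aid thesaurus, documents and relevance judgments (toy data).
THESAURUS = {
    "car": {"synonyms": ["automobile"], "narrower": ["taxi"], "related": ["traffic"]},
}
DOCS = {
    1: "car prices rise",
    2: "automobile industry strike",
    3: "taxi drivers protest",
    4: "traffic jam downtown",
    5: "weather report",
}
RELEVANT = {1, 2, 3}


def expand(terms, relation):
    """Add the thesaurus terms of one relation type to the basic query terms."""
    expanded = set(terms)
    if relation is not None:
        for term in terms:
            expanded.update(THESAURUS.get(term, {}).get(relation, []))
    return expanded


def search(terms):
    """Free-text OR search: a document matches if it contains any query term."""
    return {doc_id for doc_id, text in DOCS.items() if terms & set(text.split())}


def evaluate(query_terms):
    """Return {mode: (relative recall, precision)} for the five search modes."""
    modes = {"basic": None, "synonym": "synonyms",
             "narrower": "narrower", "related": "related"}
    term_sets = {mode: expand(query_terms, rel) for mode, rel in modes.items()}
    # The union search uses the basic terms plus every term from the other modes.
    term_sets["union"] = set().union(*term_sets.values())
    results = {mode: search(terms) for mode, terms in term_sets.items()}
    union_hits = results["union"] & RELEVANT
    report = {}
    for mode, retrieved in results.items():
        hits = retrieved & RELEVANT
        recall = len(hits) / len(union_hits) if union_hits else 0.0
        precision = len(hits) / len(retrieved) if retrieved else 0.0
        report[mode] = (recall, precision)
    return report


report = evaluate({"car"})
```

Because recall is measured against the relevant documents found by the union search, the union mode scores 100% relative recall by construction, which is how the study defines the measure.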

8.
9.
10.
Decisions in thesaurus construction and use   Cited by: 1 (self-citations: 0, citations by others: 1)
A thesaurus and an ontology provide a set of structured terms, phrases, and metadata, often in a hierarchical arrangement, that may be used to index, search, and mine documents. We describe the decisions that should be made when including a term, deciding whether a term should be subdivided into its subclasses, or determining which of more than one set of possible subclasses should be used. Based on retrospective measurements or estimates of future performance when using thesaurus terms in document ordering, decisions are made so as to maximize performance. These decisions may be used in the automatic construction of a thesaurus. The evaluation of an existing thesaurus is described, consistent with the decision criteria developed here. These kinds of user-focused decision-theoretic techniques may be applied to other hierarchical applications, such as faceted classification systems used in information architecture or the use of hierarchical terms in “breadcrumb navigation”.

11.
Following a general discussion on the philosophy and design of information systems, with particular attention to the definition, needs and psychology of the ultimate user of systems providing on-line access to biomedical information, the role of the documentalist, the differences between document retrieval and true information retrieval and the operational characteristics of on-line systems which affect their cost and hence their design and acceptability, the authors make some tentative predictions as to the future demand for such information retrieval services and their probable organizational form. A brief report is then presented on the principal findings and conclusions of a user's study of the Excerpta Medica system, the key features and history of which are briefly described. Based on the conclusions of this study, particularly as regards the complexity of the average search question, the role of the search formulators in determining the results of computer searching, the importance of secondary concepts for retrieval and the optimal level of specificity of a computer thesaurus, some of the changes in the Excerpta Medica system which are in the planning stage and will be incorporated into the system's Mark II version are outlined, as are the principal features of the two systems currently offering on-line access to the Excerpta Medica database in Western Germany and the U.S.A. Finally, attention is given to the planned partial hierarchic structuring of the Excerpta Medica thesaurus (Malimet), a project which is to be based largely on frequency counts of the existing database and the elimination of over-specific terms by posting under broader concepts. The results of some of the initial steps in this direction (i.e. frequency counts of portions of the database and the structuring of some of the terms used in the cancer field) are presented by way of illustration.

12.
The paper analyses the relationship between the market extension of Knowledge-Intensive Business Services (KIBS) and their knowledge management strategies. The literature emphasizes the strong relationship between KIBS and their customers in terms of innovation processes and knowledge creation. We argue that the knowledge management strategies implemented by KIBS, in terms of knowledge codification, personalization, and knowledge creation, are related to their geographical market extension. A quantitative approach is developed based on more than 150 Italian KIBS specializing in design and communication. The paper enriches the research framework concerning KIBS by also emphasizing the role of partners other than customers in KIBS’ knowledge management strategies.

13.
14.
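The Chinese Thesaurus (《汉语主题词表》) is a milestone in the history of information retrieval languages in China. In the network era it will see new development and application. Addressing the current state of the Chinese Thesaurus, the article reviews the history of its compilation and revision, and notes that, as an information retrieval language tool, it has played an important role in information organization. The author proposes new development ideas and strategies for how it can contribute to knowledge organization, how to build a new type of thesaurus suited to the computer environment of the network, and how to advance toward a term system for the network environment.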

15.
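This paper briefly introduces how to use MultiTes 2007 Pro and, by creating a small thesaurus for information science, discusses the software's functions and features, the acquisition of information science subject terms, and the steps for creating a simple thesaurus. Finally, it briefly evaluates the strengths and weaknesses of MultiTes 2007 Pro.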

16.
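A comparative study of two knowledge organization systems: the thesaurus and Ontology   Cited by: 8 (self-citations: 0, citations by others: 8)
This paper lists several knowledge organization systems, discusses the theory behind two of them, the thesaurus and Ontology, and finally compares the two, setting out their differences and connections.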

17.
Information-systems are classified into two types, termed “Evidence-of-Existence” and “Presentation” of information. The objective of the evidence-type system lies in the domain of documentation and retrieval of information. The structure of this system-type is developed, with application of cybernetic concepts, as an isomorphic model in analogy to the system-structure of communication technology. The latter postulates three criteria of structuring: (1) Source-Channel-Sink, with input-output characteristics, (2) Filter-type communication-channel, (3) Reversible code. These criteria are applied to the structuring of information-systems of the evidence-of-existence type. For the purpose of two-way communication the information-systems have to be represented by closed-loop models. The selective-retrieval requirements necessitate the system-channel to be a filter of information. These information-filters are implemented by keyword-phrases, being identical with the codewords. They yield a uniquely decodable code which is totally reversible to adequately serve both the documentation and the retrieval of documents. It is proven that hierarchic information-systems, applying categorization or subject-heading objects of information, do not meet the mandatory code-requirements. The inherent coding-deficiencies of hierarchic systems generate intolerable retrieval ambiguities. The same critique applies to the thesaurus concept. The development of a novel species of thesaurus is suggested, realizing a kind of Linnéan encyclopedia of general human knowledge, presenting all relevant interrelations of objects of knowledge. Such a thesaurus would provide the much needed support for formulating efficient search queries. Other relevant features of communication technology, like the information-potential, should be isomorphically transformed into information-system models.

18.
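The application of descriptors in the network environment   Cited by: 1 (self-citations: 1, citations by others: 1)
Dai Jianbo (戴剑波), Information Science (《情报科学》), 2004, 22(4): 502-505
This paper describes three modes of applying descriptors in the network environment. First, direct indexing and retrieval with descriptors is very common on specialized websites and in gateway retrieval systems. Second, because descriptors have clearly defined concepts and display inter-term relationships well, they can provide query expansion in keyword-based search engines. Third, different organizations generally apply different thesauri or classification schemes to their materials, libraries and other information sources; retrieving this information through a single subject approach in the network environment is a hot research topic in the information community, and descriptors play an important role here.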

19.
Cost optimization continues to be a critical concern for many human resources departments. The key is to balance costs against business value. In particular, computer science organizations prefer to hire people who are expert in only one skill area and have superficial knowledge of other areas, which gives them the ability to collaborate across different aspects of a project. Community Question Answering networks provide good platforms for people and organizations to share knowledge and find experts. An important issue in expert finding is that an expert has to constantly update his knowledge after being saturated in his field of expertise to still be identified as an expert. A person who fails to maintain his expertise is likely to lose it. This work examines whether taking the concept of time into account improves the quality of expertise retrieval. We propose a new method for T-shaped expert finding that is based on temporal expert profiling. The proposed method takes the temporal property of expertise into account to mine the shape of expertise for each candidate expert based on his profile. To this end, for each candidate expert, we take snapshots of his expertise trees at regular time intervals and learn the relation between temporal changes in different expertise trees and the candidate's profile. Finally, we use a filtering technique, applied on top of the profiling method, to find the shape of expertise for candidate experts. Experimental results on a large test collection show the superiority of the proposed method in terms of quality of results in comparison with the state of the art.

20.
Knowledge acquisition and bilingual terminology extraction from multilingual corpora are challenging tasks for cross-language information retrieval. In this study, we propose a novel method for mining high quality translation knowledge from our constructed Persian–English comparable corpus, University of Tehran Persian–English Comparable Corpus (UTPECC). We extract translation knowledge based on a Term Association Network (TAN) constructed from term co-occurrences in the same language as well as term associations in different languages. We further propose a post-processing step that checks term translation validity by detecting mistranslated terms as outliers. Evaluation results on two different data sets show that translating queries using UTPECC and the proposed methods significantly outperforms simple dictionary-based methods. Moreover, the experimental results show that our methods are especially effective in translating Out-Of-Vocabulary terms and also in expanding query words based on their associated terms.
