Similar literature
 Found 20 similar documents; search took 203 ms
1.
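Application of language models in information retrieval   Total citations: 1, self-citations: 0, citations by others: 1
Retrieval methods based on language models have opened up a promising yet challenging direction for information retrieval. Compared with traditional retrieval models, language models not only rest on a sound theoretical foundation but are also highly flexible: simple transformations readily yield other classic retrieval models. In addition, a large body of experimental results shows that this approach outperforms other retrieval models, so it won favor with researchers as soon as it was proposed. However, current research on language-model methods concentrates on monolingual retrieval, and few studies address their application to cross-language retrieval. To address this gap, this paper gives a systematic introduction to language-model-based retrieval, extends the approach to cross-language retrieval, and presents two cross-language retrieval models: the statistical translation model and the cross-language relevance language model.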
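As a rough illustration of the language-model approach this abstract describes, the sketch below ranks documents by query likelihood under a unigram model. It is a minimal example, not the paper's method: Dirichlet smoothing, the value of `mu`, the function name `query_log_likelihood`, and the two toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, collection, mu=2000.0):
    """Log P(query | doc) under a unigram model with Dirichlet smoothing (assumed here)."""
    doc_tf = Counter(doc)
    coll_tf = Counter(collection)
    coll_len = len(collection)
    score = 0.0
    for term in query:
        p_coll = coll_tf[term] / coll_len
        if p_coll == 0.0:
            continue  # term absent from the whole collection: skipped in this sketch
        # Smoothed estimate: document count backed off to the collection model.
        p = (doc_tf[term] + mu * p_coll) / (len(doc) + mu)
        score += math.log(p)
    return score

# Toy data, invented for illustration only.
docs = [
    "language model for information retrieval".split(),
    "database comparison of library platforms".split(),
]
collection = [t for d in docs for t in d]
query = "language retrieval".split()

# Rank document indices by descending query likelihood.
ranked = sorted(
    range(len(docs)),
    key=lambda i: query_log_likelihood(query, docs[i], collection),
    reverse=True,
)
```

With these toy documents, the first document contains both query terms and is ranked ahead of the second. A cross-language variant would additionally need a translation probability between query terms and document terms, which this sketch does not attempt.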

2.
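Bai Hua (白华). Library Journal (《图书馆杂志》), 2006, 25(11): 3-6
The development of information retrieval languages faces new conditions of modernization and of the synthesis and integration of theories and methods. Researching new language models and improving the expressive and descriptive functions of retrieval languages is an important current research topic; at the same time, database retrieval languages, network retrieval languages, and ontology and Semantic Web retrieval languages need to be studied specifically. In all of these areas, the machine-understanding function of retrieval languages is increasingly important; to suit machine-processing environments, the demands on their logical reasoning capability keep rising, and knowledge modeling functions and systematic expressive capability are receiving growing attention.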

3.
Romani is a language of northern Indic origin spoken natively by an estimated 2.5 million people, primarily in Eurasia but also in North America. The history of publication patterns in Romani has not been well documented. Extracting data about this history based on available information in large bibliographic databases such as OCLC WorldCat has been hampered by unfortunate misapplication of certain language codes, making it all but impossible to filter search results efficiently using Romani language as a parameter. The author discusses how he was able to correct much of this inaccurate data in OCLC WorldCat.

4.
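Comprehensive evaluation of the EBSCO databases: a comparison with the ProQuest platform   Total citations: 2, self-citations: 0, citations by others: 2
EBSCO and ProQuest are foreign-language databases commonly used by libraries today. A comparison of their content and retrieval systems shows that EBSCO has a higher proportion of core journals, more journal titles, more full-text journals that are kept up to date, and a longer time span of full-text coverage; its search functions and technology are also somewhat more powerful, better suited to precise searching, and better reflect the distinctive character of a specialized database. As for database use, each library can calculate costs according to its own usage and should choose the platform with the higher-quality content and retrieval system and the stronger overall advantages.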

5.
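Based on an analysis of how four university libraries in western Guangdong use foreign-language database resources, the article proposes reaching commercial agreements with database vendors, jointly subscribing to the Elsevier database, and building a regional network sharing platform, among other measures, to ease problems such as tight library budgets, insufficient resources, and low utilization.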

6.
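Dong Min (董旻), Fang Shu (方曙). Library and Information Service (《图书情报工作》), 2007, 51(10): 25-28
Addressing the use of Deep Web information resources, the paper points out the significance of information extraction from them, and analyzes and compares the techniques used in the two main steps of the extraction process: handling query interfaces and extracting structured data. An extraction experiment on a patent database is carried out using keyword-based querying and construction of a document object model. Analysis of the experimental results verifies the accuracy of the extraction method and identifies shortcomings and ways to resolve them, with the aim of making full use of Deep Web information resources.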

7.
This study aims to investigate the variations in the search pattern of scholarly databases due to disciplinary differences among graduate students. It is a case study which employed the survey method to solicit the views of purposely selected postgraduate students from six colleges of Kwame Nkrumah University of Science and Technology (KNUST). The findings of the study indicate that differences exist in the extent as well as patterns of use of journal databases among graduate students, and that search patterns are dependent on the degree of scatter of topical content. Knowledge of the internal relationships among scholarly journals will ensure that relevant databases are selected for maximum exploitation. The study provides the domain analytic approach to database use in the Ghanaian situation which is woefully deficient in the literature.

8.
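Research on the information filtering mechanism of intelligent search engines   Total citations: 3, self-citations: 0, citations by others: 3
Intelligent search engines are the product of combining artificial intelligence with traditional search engine technology. In a network environment where information is replaced constantly, intelligent search engines offer information-processing mechanisms such as intelligent natural-language filtering, intelligent multi-document processing, and intelligent user services. To advance intelligent search engines, research on user modeling should be emphasized, the development and practice of multi-agent intelligent search engine systems strengthened, and research on key intelligent search engine technologies intensified.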

9.
The Internet has created new opportunities for librarians to present literature search results to clinicians. In order to take full advantage of these opportunities, libraries need to create locally maintained bibliographic databases. A simple method of creating a local bibliographic database and publishing it on the Web is described. The method uses off-the-shelf software and requires minimal programming. A hedge search strategy for outcome studies of clinical process interventions is created, and Ovid is used to search MEDLINE. The search results are saved and imported into EndNote libraries. The citations are modified, exported to a Microsoft Access database, and published on the Web. Clinicians can use a Web browser to search the database. The bibliographic database contains 13,803 MEDLINE citations of outcome studies. Most searches take between four and ten seconds and retrieve between ten and 100 citations. The entire cost of the software is under $900. Locally maintained bibliographic databases can be created easily and inexpensively. They significantly extend the evidence-based health care services that libraries can offer to clinicians.

10.
Background: Complementary and alternative medicine (CAM) has succeeded in establishing itself in the academic context of universities. In order to get information on CAM, clinicians, researchers and healthcare professionals as well as the lay public are increasingly turning to online portals and databases, which disseminate relevant resources. One specific type of online information retrieval system, namely the database, is reviewed in this article. Question: This overview aims at systematically retrieving and describing all databases covering the field of CAM. One of the requirements for inclusion was that the database would also have to be published in a medical journal. Data sources: The databases AMED, CAMbase, EMBASE, and MEDLINE/PubMed were searched between December 2008 and December 2009 for publications relevant to CAM databases. The authors’ specialist library was also searched for grey literature to be included. Study selection: All included databases were then visited online and information on the context, structure and volume of the database was extracted. Main results: Forty-five databases were included in this overview. Databases covered herbal therapies (n = 11), traditional Chinese medicine (n = 9) and some dealt with a vast number of CAM modalities (n = 9), amongst others. The amount of time the databases had been in existence ranged from 4 to 53 years. Countries of origin included the USA (n = 14), UK (n = 7) and Germany (n = 6), amongst others. The main language in 42 of 45 databases was English. Conclusions: Although this overview is quite comprehensive with respect to the field of CAM, certain CAM practices such as chiropractic, massage, reflexology, meditation or yoga may not have been covered adequately. A more detailed assessment of the quality of the included databases might give additional insights into the listed resources. The creation of a personalised meta-search engine is suggested, towards which this overview could be seen as a first step.

11.
12.
The number of Web users whose first language is not English continues to grow, as does the amount of content provided in languages other than English. This poses new challenges for actors on the Web, such as in which language(s) content should be offered, how search tools should deal with mono- and multilingual content, and how users can make the best use of navigation and search options, suited to their individual linguistic skills. How should these challenges be dealt with? Technological approaches to non-English (or in general, cross-language) Web search have made large progress; however, translation remains a hard problem. This precludes a low-cost but high-quality blanket all-language coverage of the whole Web. In this paper, we propose a user-centric approach to answering questions of where to best concentrate efforts and investments. Drawing on linguistic research, we describe data on the availability of content and access to it in first and second languages across the Web. We then present three studies that investigated the impact of the availability (or not) of first-language content and access forms on user behaviour and attitudes. The results indicate that non-English languages are under-represented on the Web and that this is partly due to content-creation, link-setting and link-following behaviour. They also show that user satisfaction is influenced both by the cognitive effort of searching and the availability of alternative information in that language. These findings suggest that more cross-language tools are desirable. However, they also indicate that context (such as user groups’ domain expertise or site type) should be considered when tradeoffs between information quality and multilinguality need to be taken into account.

13.
Spanish is one of the most widely spoken languages in the world and the various subject heading lists in the language reflect its geographic diversity. Catalogers assigning Spanish subject headings typically must rely on a variety of different sources in different formats. The lcsh-es.org database unites several of these sources in a single search interface to simplify the work of Spanish language subject catalogers and encourage collaboration. A look at current developments suggests that high-level international agreement on linked data technology and policy bode well for the future of multilingual subject authorities.

14.
Objective: We aimed to determine overlaps and optimal combination of multiple database retrieval and citation tracking for evidence synthesis, based on a previously conducted scoping review on facilitators and barriers to implementing nurse-led interventions in dementia care. Methods: In our 2019 scoping review, we performed a comprehensive literature search in eight databases (CENTRAL, CINAHL, Embase, Emcare, MEDLINE, Ovid Nursing Database, PsycINFO, and Web of Science Core Collection) and used citation tracking. We retrospectively analyzed the coverage and overlap of 10,527 retrieved studies published between 2015 and 2019. To analyze database overlap, we used cross tables and multiple correspondence analysis (MCA). Results: Of the retrieved studies, 6,944 were duplicates and 3,583 were unique references. Using our search strategies, considerable overlaps can be found in some databases, such as between MEDLINE and Web of Science Core Collection or between CINAHL, Emcare, and PsycINFO. Searching MEDLINE, CINAHL, and Web of Science Core Collection and using citation tracking were necessary to retrieve all included studies of our scoping review. Conclusions: Our results can contribute to enhancing future search practice related to database selection in dementia care research. However, due to limited generalizability, researchers and librarians should carefully choose databases based on the research question. More research on optimal database retrieval in dementia care research is required for the development of methodological standards.
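The cross-table part of the overlap analysis described in this abstract can be sketched in a few lines. The example below is illustrative only: it covers pairwise overlap and unique contribution, not multiple correspondence analysis, and the function names and record identifiers (`r1` to `r5`) are invented rather than taken from the study.

```python
from itertools import combinations

def overlap_table(results):
    """Map each pair of databases to the number of records both retrieved."""
    return {
        (a, b): len(results[a] & results[b])
        for a, b in combinations(sorted(results), 2)
    }

def unique_contribution(results):
    """For each database, the records that no other database retrieved."""
    unique = {}
    for name, records in results.items():
        others = set().union(*(r for n, r in results.items() if n != name))
        unique[name] = records - others
    return unique

# Hypothetical deduplicated record IDs per database, for illustration only.
results = {
    "MEDLINE": {"r1", "r2", "r3"},
    "CINAHL": {"r2", "r4"},
    "Web of Science": {"r1", "r3", "r5"},
}

pairs = overlap_table(results)
unique = unique_contribution(results)
```

In this toy data, MEDLINE and Web of Science share two records, while CINAHL and Web of Science each contribute one record found nowhere else. A database with an empty unique set is a candidate for omission only if the same holds for the studies finally included, which is the question the abstract's authors examine.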

15.
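A brief discussion of methods for distinguishing Chinese and Western-language books   Total citations: 1, self-citations: 0, citations by others: 1
Chen Li (陈利). Library Tribune (《图书馆论坛》), 2007, 27(2): 136-138, 102
With the successive release of multiple MARC formats, more and more libraries in China use multiple MARC formats to create cataloging data for various kinds of documents. Because cataloging data in multiple MARC formats coexist, and because cataloging staff at some CALIS member libraries are unclear about the principles for distinguishing Chinese from Western-language documents, the CALIS union catalog database contains cases of a single book with records in multiple MARC formats. The article analyzes in detail the causes of this phenomenon, points out several easily confused situations in distinguishing Chinese and Western-language books, and proposes steps and methods for distinguishing them correctly.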

16.
The availability of web search engines offers opportunities in addition to those provided by bibliographic databases for identifying academic literature, but their usefulness for retrieving research is uncertain. A rigorous literature search was undertaken to investigate whether web search engines might replace bibliographic databases, using empirical research in health and social care as a case study. Eight databases and five web search engines were searched between 20 July and 6 August 2015. Sixteen unique studies which compared at least one database with at least one web search engine were examined, as well as drawing lessons from the authors’ own search process. Web search engines were limited in that the searcher cannot be certain that the principles of Boolean logic apply and they were more limited than bibliographic databases in their functions, such as exporting abstracts. Recommendations are made for improving the rigour and quality of reporting studies of academic literature searching.

17.
Comprehensive yet efficient search methods are essential for any systematic or scoping review. This article outlines the stages of development of a systematic search methodology for a scoping review within the library and information science (LIS) literature. The effectiveness of the database search strategies (LISTA, LISA, ERIC, Scopus, Web of Science) and supplemental search techniques are measured through a retrospective analysis of performance metrics. Findings show that for research topics limited to the library setting, it may be more effective to search fewer databases (LISTA and Scopus only) for peer reviewed journal articles and allot more time to alternate search techniques such as web searching to identify non-journal literature. The article provides an evidence-based, methodological approach to developing a systematic search plan, unique to LIS researchers, that accounts for time and resource needs.

18.
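To meet the growing demand for patent retrieval, the National High Technology Research and Development Program (863 Program) launched the research and development of a generic chemical structure database system. Such a system involves two key technologies: (1) the computer representation of generic chemical structures, and (2) retrieval algorithms for generic chemical structures. This paper mainly discusses the computer representation. Generic chemical structures in original chemical patent documents are expressed in natural language that follows certain conventions. To store and retrieve this information in a computer system, generic chemical structures expressed in natural language must be converted into an unambiguous formal language that computers can accept. This process is called the indexing of generic chemical structures. The fragment-based formal languages generally used internationally for this indexing were developed in the 1970s and 1980s; they differ greatly from the graphical natural language used by chemists, and indexing with them is slow and costly. This paper introduces a graphical formal language for indexing generic chemical structures, developed on the basis of the drawing functions of ISIS/Draw. Its main features are that it is close to the graphical natural language chemists use every day and that its rules are simple and easy to master, which improves indexing efficiency and reduces the cost of implementing a generic chemical structure database system.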

19.
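Building on word segmentation, indexing, and structured query language technologies, the paper proposes an information retrieval system based on an XML document database. The system model consists mainly of a word segmentation module, an indexing module, and a query module.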
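The three-module layout named in this abstract (segmentation, indexing, query) can be pictured with a small inverted-index sketch. This is only a guess at the general shape, not the system the paper proposes: the whitespace tokenizer stands in for a real word segmenter (Chinese text would need one), the query is a plain conjunctive keyword match rather than a structured query language, and the sample XML documents are invented.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def tokenize(text):
    """Segmentation module (placeholder): lowercase and split on whitespace."""
    return text.lower().split()

def build_index(xml_docs):
    """Indexing module: map each token to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, xml_text in xml_docs.items():
        root = ET.fromstring(xml_text)
        text = " ".join(root.itertext())  # all text nodes, markup stripped
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Query module: return IDs of documents containing every query token."""
    postings = [index.get(token, set()) for token in tokenize(query)]
    return set.intersection(*postings) if postings else set()

# Hypothetical XML documents, for illustration only.
xml_docs = {
    "d1": "<doc><title>XML retrieval</title><body>index and query modules</body></doc>",
    "d2": "<doc><title>Search engines</title><body>query logs</body></doc>",
}
index = build_index(xml_docs)
hits = search(index, "XML query")
```

Here `hits` contains only `d1`, since `d2` lacks the token "xml". Keeping the tokenizer as a separate function means a proper segmenter can be swapped in without touching the index or query code.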

20.
Transaction logs from online search engines are valuable for two reasons: First, they provide insight into human information-seeking behavior. Second, log data can be used to train user models, which can then be applied to improve retrieval systems. This article presents a study of logs from PubMed®, the public gateway to the MEDLINE® database of bibliographic records from the medical and biomedical primary literature. Unlike most previous studies on general Web search, our work examines user activities with a highly-specialized search engine. We encode user actions as string sequences and model these sequences using n-gram language models. The models are evaluated in terms of perplexity and in a sequence prediction task. They help us better understand how PubMed users search for information and provide an enabler for improving users’ search experience.
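To make the idea of modelling action sequences with n-gram language models concrete, here is a minimal bigram sketch with perplexity evaluation. The action codes (`Q` for a query, `R` for viewing a record, `N` for next page), the training sessions, and the add-one smoothing are all assumptions for illustration; the abstract does not specify the encoding or the smoothing the authors used.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count bigrams over action strings, with start and end markers."""
    vocab = {a for s in sequences for a in s} | {"</s>"}
    contexts, bigrams = Counter(), Counter()
    for s in sequences:
        padded = ["<s>"] + list(s) + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    return vocab, contexts, bigrams

def prob(model, prev, cur):
    """P(cur | prev) with add-one smoothing (an assumption of this sketch)."""
    vocab, contexts, bigrams = model
    return (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))

def perplexity(model, sequences):
    """Exponentiated average negative log-probability per predicted symbol."""
    log_sum, n = 0.0, 0
    for s in sequences:
        padded = ["<s>"] + list(s) + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            log_sum += math.log(prob(model, prev, cur))
            n += 1
    return math.exp(-log_sum / n)

# Hypothetical sessions: Q = query, R = view record, N = next page.
model = train_bigram(["QRQR", "QNR", "QRR"])
seen = perplexity(model, ["QRQR"])
odd = perplexity(model, ["NNN"])
```

A session resembling the training data (`QRQR`) gets lower perplexity than an unusual one (`NNN`), which is the sense in which perplexity measures how well the model anticipates user behavior.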

