Similar literature
20 similar documents found
1.
Collection selection is a crucial function, central to the effectiveness and efficiency of a federated information retrieval system. A variety of solutions have been proposed for collection selection, adapting proven techniques used in centralised retrieval. This paper defines a new approach to collection selection that models the topical distribution in each collection. We describe an extended version of latent Dirichlet allocation that uses a hierarchical hyperprior to enable the different topical distributions found in each collection to be modelled. Under the model, resources are ranked based on the topical relationship between query and collection. By modelling collections in a low-dimensional topic space, we can implicitly smooth their term-based characterisation with appropriate terms from topically related samples, thereby dealing with the problem of missing vocabulary within the samples. An important advantage of adopting this hierarchical model over current approaches is that the model generalises well to unseen documents given small samples of each collection. The latent structure of each collection can therefore be estimated well despite imperfect information for each collection, such as sampled documents obtained through query-based sampling. Experiments demonstrate that this new, fully integrated topical model is more robust than current state-of-the-art collection selection algorithms.
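
As a rough illustration of ranking collections by the topical relationship between query and collection, the sketch below scores each collection by the log-likelihood of the query under that collection's topic mixture. It is not the hierarchical model described in the abstract: the topic-word probabilities (`PHI`), the per-collection topic mixtures (`THETA`), and all names are hypothetical values assumed to have been estimated elsewhere.

```python
import math

# Hypothetical topic-word probabilities phi[topic][word] (assumed already estimated).
PHI = {
    "medicine": {"drug": 0.5, "trial": 0.4, "court": 0.1},
    "law": {"drug": 0.1, "trial": 0.3, "court": 0.6},
}

# Hypothetical per-collection topic mixtures theta[collection][topic].
THETA = {
    "health_library": {"medicine": 0.9, "law": 0.1},
    "legal_archive": {"medicine": 0.2, "law": 0.8},
}


def query_log_likelihood(query_terms, theta_c, phi=PHI):
    """Return log P(query | collection) with P(w | c) = sum_k theta_c[k] * phi[k][w]."""
    score = 0.0
    for term in query_terms:
        p = sum(theta_c[topic] * phi[topic].get(term, 0.0) for topic in theta_c)
        if p == 0.0:
            # The term is unknown to every topic, so the query cannot be generated.
            return float("-inf")
        score += math.log(p)
    return score


def rank_collections(query_terms, thetas=THETA):
    """Return collection names ordered from best to worst match for the query."""
    scores = {name: query_log_likelihood(query_terms, theta)
              for name, theta in thetas.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Because a word contributes through every topic the collection covers, a collection can still score well on a term that was missing from its sampled documents, which is the smoothing effect the abstract refers to.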

2.
Peer-to-peer (P2P) networks integrate autonomous computing resources without requiring a central coordinating authority, which makes them a potentially robust and scalable model for providing federated search capability to large-scale networks of text-based digital libraries. However, peer-to-peer networks have so far provided very limited support for full-text federated search with relevance-based document ranking. This paper provides solutions to full-text federated search of text-based digital libraries in hierarchical peer-to-peer networks. Existing approaches to full-text search are adapted and new methods are developed for the problems of resource representation, resource selection, and result merging according to the unique characteristics of hierarchical peer-to-peer networks. Experimental results demonstrate that the proposed approaches offer a better combination of accuracy and efficiency than more common alternatives for federated search of text-based digital libraries in peer-to-peer networks.

3.
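A comparative study of Chinese and foreign metasearch engines   Total citations: 1 (self-citations: 0; citations by others: 1)
Representative Chinese and foreign metasearch engines are selected and compared in terms of retrieval performance, search results, and related aspects. On this basis, the paper identifies the main problems of current metasearch engines and offers some suggestions for their development.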

4.
This research investigates the structural change made by Naver's online news section and how it has impacted the overall traffic flow of Korea's online news. This paper examined 45 websites in 2008 and 2010, and the total number of pages viewed within these sites was considered in the analysis. Social network analysis was applied to study the relationships between the news sites. The analysis through degree centrality and Bonacich power shows that there has been a shift in market leadership. In 2007, Naver, the top search engine in Korea, stepped down from its leading position after it started to provide news services. Daum, the second largest search engine, has taken over the central position as the most influential news site. Based on the results of this study, practical implications for online service markets and theoretical implications for online services are presented.
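
For readers unfamiliar with degree centrality, the following minimal sketch computes the normalised version (a node's number of distinct neighbours divided by n - 1) for an undirected link graph. The edge-list input format and the function name are assumptions made for illustration, and the study's own data and its Bonacich power measure are not reproduced here.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalised degree centrality for an undirected graph given as (a, b) pairs."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-links
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    # Each node can be linked to at most n - 1 other nodes.
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbours.items()}
```

A site that exchanges traffic with every other site in the sample would score 1.0, while a site linked to only one of many sites would score close to 0.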

5.
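This paper introduces the current state of search engines, analyzes their shortcomings on the basis of a survey of their use, and looks ahead to their future development.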

6.
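Strategies for hospital libraries in supporting evidence-based medicine   Total citations: 1 (self-citations: 0; citations by others: 1)
The paper briefly describes the concept of evidence-based medicine, its current development in China, and the demands it places on hospital libraries, and then proposes strategies for hospital libraries to respond as evidence-based medicine is put into practice.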

7.
Bing and Google customize their results to target people with different geographic locations and languages but, despite the importance of search engines for web users and webometric research, the extent and nature of these differences are unknown. This study compares the results of seventeen random queries submitted automatically to Bing for thirteen different English geographic search markets at monthly intervals. Search market choice alters a small majority of the top 10 results but less than a third of the complete sets of results. Variation in the top 10 results over a month was about the same as variation between search markets, but variation over time was greater for the complete result sets. Most worryingly for users, there were almost no ubiquitous authoritative results: only one URL was always returned in the top 10 for all search markets and points in time, and Wikipedia was almost completely absent from the most common top 10 results. Most importantly for webometrics, results from at least three different search markets should be combined to give more reliable and comprehensive results, even for queries that return fewer than the maximum number of URLs.
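
One simple way to quantify how much two search markets agree on their top results is the share of top-k URLs they have in common. The sketch below is an illustrative measure only; the abstract does not state which statistic the study used, and the function name and inputs are hypothetical.

```python
def top_k_overlap(results_a, results_b, k=10):
    """Share of the top-k URLs that two ranked result lists have in common."""
    top_a, top_b = set(results_a[:k]), set(results_b[:k])
    if not top_a and not top_b:
        return 1.0  # two empty result lists agree trivially
    return len(top_a & top_b) / max(len(top_a), len(top_b))
```

Applied to the same query in two markets, a value of 1.0 means the top-k sets are identical and 0.0 means they share no URL.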

8.
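Choosing and using search engines: tips and techniques   Total citations: 3 (self-citations: 0; citations by others: 3)
This paper discusses the types and characteristics of search engines and, drawing on retrieval examples, summarizes techniques for choosing and using them.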

9.
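An analysis of search engine marketing models and their development trends   Total citations: 3 (self-citations: 0; citations by others: 3)
Cai Hong (蔡红). 《图书馆论坛》, 2006, 26(1): 95-97, 149
After a brief review of the development of search engines, the paper analyzes several marketing models currently popular among search engines and their influence on search engines, and describes development trends in terms of technology, content, services, and marketing.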

10.
It is known that users of internet search engines often enter queries with misspellings in one or more search terms. Several web search engines make suggestions for correcting misspelled words, but the methods used are proprietary and unpublished to our knowledge. Here we describe the methodology we have developed to perform spelling correction for the PubMed search engine. Our approach is based on the noisy channel model for spelling correction and makes use of statistics harvested from user logs to estimate the probabilities of different types of edits that lead to misspellings. The unique problems encountered in correcting search engine queries are discussed and our solutions are outlined.
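
The noisy channel idea can be sketched as choosing the candidate c that maximises P(c) * P(typed | c). In the sketch below, the term counts and the single per-edit error probability are hypothetical placeholders: the abstract describes estimating separate probabilities for different edit types from user logs, which this simplified uniform error model does not do.

```python
import math

# Hypothetical term frequencies standing in for statistics from query logs.
TERM_COUNTS = {"cancer": 900, "career": 50, "cancel": 50}

# Assumed probability of a single edit; a uniform stand-in for per-edit-type estimates.
PER_EDIT_PROB = 0.01


def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def correct(word, counts=TERM_COUNTS, max_edits=2):
    """Return the vocabulary term maximising log P(c) + log P(word | c)."""
    total = sum(counts.values())
    best, best_score = word, float("-inf")
    for candidate, count in counts.items():
        distance = edit_distance(word, candidate)
        if distance > max_edits:
            continue
        score = math.log(count / total) + distance * math.log(PER_EDIT_PROB)
        if score > best_score:
            best, best_score = candidate, score
    return best  # falls back to the input when no candidate is close enough
```

The prior favours frequent terms while the channel term penalises each additional edit, so a common word one edit away usually beats a rare word at the same distance.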

11.
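A study of the architecture of an evidence-based medicine digital library   Total citations: 1 (self-citations: 0; citations by others: 1)
The medical model is gradually shifting from traditional experience-based medicine to modern evidence-based medicine. This shift places higher demands on the organization of medical information, and digital library technology based on networks and resource integration is well suited to the requirements of modern evidence-based medicine. This study of a subject digital library oriented to evidence-based medicine discusses the characteristics and basic technologies of such a library, and focuses on the architecture of an evidence-based medicine digital library.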

12.
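A functional comparison of the Google Scholar search engine and a cross-database retrieval system   Total citations: 1 (self-citations: 0; citations by others: 1)
Xu Fang (徐芳). 《图书馆学研究》, 2008, (2): 72-73, 95
This article introduces two methods for the integrated use of digital resources, the Google Chinese scholarly search engine and the Cross-Search cross-database retrieval system, and compares their respective functions.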

13.
BACKGROUND: Cochrane-style systematic reviews increasingly require the participation of librarians. Guidelines on the appropriate search strategy to use for systematic reviews have been proposed. However, research evidence supporting these recommendations is limited. OBJECTIVE: This study investigates the effectiveness of various systematic search methods used to uncover randomized controlled trials (RCTs) for systematic reviews. Effectiveness is defined as the proportion of relevant material uncovered for the systematic review using extended systematic review search methods. The following extended systematic search methods are evaluated: searching subject-specific or specialized databases (including trial registries), hand searching, scanning reference lists, and communicating personally. METHODS: Two systematic review projects were prospectively monitored regarding the method used to identify items as well as the type of items retrieved. The proportion of RCTs identified by each systematic search method was calculated. RESULTS: The extended systematic search methods uncovered 29.2% of all items retrieved for the systematic reviews. The search of specialized databases was the most effective method, followed by scanning of reference lists, communicating personally, and hand searching. Although the number of items identified through hand searching was small, these unique items would otherwise have been missed. CONCLUSIONS: Extended systematic search methods are effective tools for uncovering material for the systematic review. The quality of the items uncovered has yet to be assessed and will be key in evaluating the value of the systematic search methods.

14.
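Starting from the current use of classification schemes in search engines, this paper offers some useful suggestions on the proposals put forward by the library and information science community for improving the classification systems used in search engines.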

15.
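The paper sets out the definition and core ideas of evidence-based medicine, the basic process of its practice, and the significance of applying it clinically, and argues that evidence-based medicine information services are necessary in medical college libraries. It also discusses the competencies that medical college librarians should have and proposes innovative measures for these libraries to deliver evidence-based medicine services.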

16.
Transaction logs of NAVER, a major Korean Web search engine, were analyzed to track the information-seeking behavior of Korean Web users. These transaction logs include more than 40 million queries collected over 1 week. This study examines current transaction log analysis methodologies and proposes a method for log cleaning, session definition, and query classification. A term definition method, which is necessary for Korean transaction log analysis, is also discussed. The results of this study show that users behave in a simple way: they type in short queries with a few query terms, seldom use advanced features, and view few result pages. Users also behave in a passive way: they seldom change search environments set by the system. It is of interest that users tend to change their queries totally rather than adding or deleting terms to modify the previous queries. The results of this study might contribute to the development of more efficient and effective Web search engines and services.

17.
Multilingual information retrieval is generally understood to mean the retrieval of relevant information in multiple target languages in response to a user query in a single source language. In a multilingual federated search environment, different information sources contain documents in different languages. A general search strategy in multilingual federated search environments is to translate the user query to each language of the information sources and run a monolingual search in each information source. It is then necessary to obtain a single ranked document list by merging the individual ranked lists from the information sources that are in different languages. This is known as the results merging problem for multilingual information retrieval. Previous research has shown that the simple approach of normalizing source-specific document scores is not effective. On the other hand, a more effective merging method was proposed to download and translate all retrieved documents into the source language and generate the final ranked list by running a monolingual search in the search client. The latter method is more effective but is associated with a large amount of online communication and computation costs. This paper proposes an effective and efficient approach for the results merging task of multilingual ranked lists. In particular, it downloads only a small number of documents from the individual ranked lists of each user query to calculate comparable document scores by utilizing both the query-based translation method and the document-based translation method. Then, query-specific and source-specific transformation models can be trained for individual ranked lists by using the information of these downloaded documents. These transformation models are used to estimate comparable document scores for all retrieved documents, and thus the documents can be sorted into a final ranked list. This merging approach is efficient as only a subset of the retrieved documents is downloaded and translated online. Furthermore, an extensive set of experiments on the Cross-Language Evaluation Forum (CLEF) data has demonstrated the effectiveness of the query-specific and source-specific results merging algorithm against other alternatives. The new research in this paper proposes different variants of the query-specific and source-specific results merging algorithm with different transformation models. This paper also provides thorough experimental results as well as detailed analysis. All of the work substantially extends the preliminary research in (Si and Callan, in: Peters (ed.) Results of the cross-language evaluation forum-CLEF 2005, 2005).
Hao Yuan
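
The merging idea in the abstract above can be sketched as follows: for each source, fit a transformation from source-specific scores to comparable scores using the few downloaded documents, then apply it to every retrieved document and sort. The sketch assumes a simple least-squares linear transformation, which is only one possible model; the data layout and all function names are hypothetical and not taken from the paper.

```python
def fit_linear(pairs):
    """Least-squares fit of comparable_score = slope * source_score + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    if var_x == 0:
        return 0.0, mean_y  # degenerate case: all training scores are identical
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def merge_ranked_lists(ranked_lists, training_pairs):
    """Merge per-source lists of (doc_id, source_score) into one ranked list.

    training_pairs[source] holds (source_score, comparable_score) pairs for the
    small set of documents that were downloaded and scored in a comparable way.
    """
    merged = []
    for source, docs in ranked_lists.items():
        slope, intercept = fit_linear(training_pairs[source])
        for doc_id, source_score in docs:
            merged.append((slope * source_score + intercept, doc_id))
    merged.sort(reverse=True)  # highest estimated comparable score first
    return [doc_id for _, doc_id in merged]
```

Only the training documents need to be downloaded and translated; every other document is placed in the final list through its estimated comparable score.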

18.
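In view of the unsatisfactory state of use of EBM (evidence-based medicine) web resources, this paper proposes a knowledge-service-oriented organization of EBM web resources. It introduces the concept of EBM knowledge services, explains knowledge mining and the organization workflow for EBM web resources, proposes an improved EBM metadata scheme based on two existing schemes, and gives an example of providing knowledge services through the organization of EBM web resources, with the aim of laying a foundation for making full use of these resources.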

19.
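A study of the features of the Google Scholar search engine   Total citations: 3 (self-citations: 0; citations by others: 3)
This paper analyzes features of Google Scholar such as its sources, search strategies, and display of search results, evaluates its retrieval strengths and weaknesses mainly from the perspective of a citation database, and discusses its prospects.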

20.
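As the volume of information on the Internet keeps growing, traditional information retrieval techniques can hardly meet users' demanding requirements for query quality. To help users locate the information they want quickly and accurately within search results, search engines with integrated document clustering have emerged. This paper discusses the application of document clustering in search engines, introduces several algorithms, analyzes Vivisimo as a representative clustering search engine, and predicts development trends in search engine clustering technology.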
