Found 20 similar documents; search time: 46 ms
1.
《The Serials Librarian》2013,64(3):101-108
Abstract Librarians and patrons can access many thousands of electronic journals today. Librarians who incorporate records for these journals into their OPACs will maximize the journals' access and use, but accurately tracking them can be difficult. This paper describes a commercial solution to the problem. Serials Solutions combines aggregator title lists with CONSER MARC records to improve access to these titles. This paper reviews the benefits and challenges associated with this process, and identifies several areas in which aggregators and OPAC vendors can improve the services they provide, to facilitate this process.
2.
《Cataloging & classification quarterly》2013,51(3):109-137
ABSTRACT Recently, the Library of Congress adopted the pinyin Romanization system for transcribing Chinese data in its bibliographic records. In its canonical form, pinyin aggregates Chinese “words” into single linguistic units, but pinyin entries could be constructed following either a monosyllabic or a polysyllabic pattern. Although the former is easier and less costly to implement, the latter method is potentially more beneficial for end-users, as it reduces ambiguity and generates a much larger variety of indexable terms. The current study investigates whether following the polysyllabic method improves retrieval efficiency and effectiveness in item-specific searching within online bibliographic databases. Analysis of the results revealed that aggregation of monosyllables does improve efficiency significantly (p < .05), especially during keyword searches, while effectiveness remains mainly unaffected.
3.
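FRBR and the Development of OPACs   Total citations: 10, self-citations: 1, citations by others: 10
This paper introduces the three-group, four-level model of IFLA's Functional Requirements for Bibliographic Records (FRBR) and traces the application of this concept in OPACs. Through case analysis, it proposes directions for the development of traditional OPACs and offers several suggestions for applying FRBR to OPACs in China.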
4.
Jacques Savoy 《Information Retrieval》2007,10(6):509-529
This paper reports on the underlying IR problems encountered when indexing and searching with the Bulgarian language. For
this language we propose a general light stemmer and demonstrate that it can be quite effective, producing significantly better
MAP (around +34%) than an approach not applying stemming. We implement the GL2 model derived from the Divergence from Randomness paradigm and find its retrieval effectiveness better than other probabilistic, vector-space and language models. The resulting
MAP is found to be about 50% better than the classical tf-idf approach. Moreover, increasing the query size enhances the MAP by around 10% (from T to TD). In order to compare the retrieval
effectiveness of our suggested stopword list and the light stemmer developed for the Bulgarian language, we conduct a set
of experiments on another stopword list and also a more complex and aggressive stemmer. Results tend to indicate that there
is no statistically significant difference between these variants and our suggested approach. This paper evaluates other indexing
strategies such as 4-gram indexing and indexing based on the automatic decompounding of compound words. Finally, we analyze
certain queries to discover why we obtained poor results when indexing Bulgarian documents using the suggested word-based
approach.
5.
This paper investigates the impact of three approaches to XML retrieval: using Zettair, a full-text information retrieval system; using eXist, a native XML database; and using a hybrid system that takes full article answers from Zettair and uses eXist to extract elements from those articles. For the content-only topics, we undertake a preliminary analysis of the INEX 2003 relevance assessments in order to identify the types of highly relevant document components. Further analysis identifies two complementary sub-cases of relevance assessments (General and Specific) and two categories of topics (Broad and Narrow). We develop a novel retrieval module that for a content-only topic utilises the information from the resulting answer list of a native XML database and dynamically determines the preferable units of retrieval, which we call Coherent Retrieval Elements. The results of our experiments show that, when each of the three systems is evaluated against different retrieval scenarios (such as different cases of relevance assessments, different topic categories and different choices of evaluation metrics), the XML retrieval systems exhibit varying behaviour and the best performance can be reached for different values of the retrieval parameters. In the case of INEX 2003 relevance assessments for the content-only topics, our newly developed hybrid XML retrieval system is substantially more effective than either Zettair or eXist, and yields robust and very effective XML retrieval.
6.
Norbert Fuhr 《Information Retrieval》2008,11(3):251-265
The classical Probability Ranking Principle (PRP) forms the theoretical basis for probabilistic Information Retrieval (IR)
models, which have dominated IR theory for about 20 years. However, the assumptions underlying the PRP often do not hold,
and its view is too narrow for interactive information retrieval (IIR). In this article, a new theoretical framework for interactive
retrieval is proposed: The basic idea is that during IIR, a user moves between situations. In each situation, the system presents
to the user a list of choices, about which s/he has to decide, and the first positive decision moves the user to a new situation.
Each choice is associated with a number of cost and probability parameters. Based on these parameters, an optimum ordering
of the choices can be derived: the PRP for IIR. The relationship of this rule to the classical PRP is described, and issues
of further research are pointed out.
7.
《Cataloging & classification quarterly》2013,51(4):7-13
ABSTRACT The author presents the results of the December 1998 CONSER “Survey on Providing Access to Serial Titles within Aggregator Databases.” Major findings include that 71% of respondents wanted full-text serial titles incorporated into the online catalog and nearly 75% were interested in acquiring record sets. Also included are an analysis of the numerous survey comments received, strategies toward creating the necessary records and integrating them into OPACs, examples of aggregator analytic records, and other background information on the work of the Program for Cooperative Cataloging's Task Group on Journals in Aggregator Databases.
8.
Features for image retrieval: an experimental comparison   Total citations: 6, self-citations: 0, citations by others: 6
9.
《Cataloging & classification quarterly》2013,51(4):43-61
ABSTRACT This article illustrates the collaboration of three entities (the library, its online system, and the authority service vendor) to achieve online authority control in a medium-sized academic library efficiently and cost-effectively. The authority service vendor checks and revises (as needed) the headings of the library's new bibliographic records, providing new authority records for them. Through a separate notification service, the vendor also provides revised authority records for the existing authority file, alerting the library when in-house online maintenance is needed. In this online environment, authority records and bibliographic records are maintained automatically by the vendor authority service. Thus, using various authority heading reports which can be generated by the local online system in conjunction with the notification service, the library concentrates its authority work on the elimination or correction of obsolete headings. The authority vendor services are effective and reduce the cost of maintaining the online catalog. Ultimately, accurate standardized headings enhance the ease and effectiveness of online searching and retrieval for all who consult the library's catalog.
10.
Massimo Melucci 《Information Retrieval》1999,1(1-2):91-114
This paper assesses the retrieval effectiveness of automatically constructed inter-document hypertext links in Information Retrieval (IR). The objectives of the experiments described are to obtain evidence concerning the usefulness of querying and browsing automatically constructed IR hypertexts. Links are built by using IR techniques, as these enable rapid, automatic production of hypertexts from a document collection for accessing the collection itself. These tests are carried out in a laboratory environment and through simulation of link browsing. Results of experiments show that browsing has little impact on the retrieval of relevant documents if used in place of querying or relevance feedback methods, though it may be practical if used in combination with them.
11.
Arabic documents that are available only in print continue to be ubiquitous and they can be scanned and subsequently OCR’ed
to ease their retrieval. This paper explores the effect of context-based OCR correction on the effectiveness of retrieving
Arabic OCR documents using different index terms. Different OCR correction techniques based on language modeling with different
correction abilities were tested on real OCR and synthetic OCR degradation. Results show that the reduction of word error
rates needs to pass a certain limit to get a noticeable effect on retrieval. If only moderate error reduction is available,
then using short character n-gram for retrieval without error correction is not a bad strategy. Word-based correction in conjunction
with language modeling had a statistically significant impact on retrieval even for character 3-grams, which are known to
be among the best index terms for OCR degraded Arabic text. Further, using a sufficiently large language model for correction
can minimize the need for morphologically sensitive error correction.
Kareem Darwish
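The abstract above names short character n-grams as index terms that tolerate OCR errors. The sketch below is only an illustration of that general idea, not the paper's implementation: the function names `char_ngrams` and `shared_fraction` are hypothetical, and the English word stands in for Arabic text purely for readability (the code works on any Unicode string).

```python
def char_ngrams(token: str, n: int = 3) -> list[str]:
    """Return the overlapping character n-grams of a token.

    Tokens no longer than n are returned whole so that short words
    still produce one index term.
    """
    if len(token) <= n:
        return [token]
    return [token[i:i + n] for i in range(len(token) - n + 1)]


def shared_fraction(clean: str, degraded: str, n: int = 3) -> float:
    """Fraction of the clean token's n-grams that survive in the degraded token."""
    clean_grams = char_ngrams(clean, n)
    degraded_grams = set(char_ngrams(degraded, n))
    return sum(g in degraded_grams for g in clean_grams) / len(clean_grams)


# One misrecognized character ("i" read as "j") breaks the whole-word match,
# but several trigrams still match.
print(char_ngrams("retrieval"))
print(shared_fraction("retrieval", "retrjeval"))
```

A single character error destroys an exact word match but leaves most n-grams intact, which is one plausible reading of why n-gram indexing without correction "is not a bad strategy" when only moderate error reduction is available.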
12.
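[Purpose/Significance] To address the needs of the new generation of users of information retrieval systems, this paper proposes a theoretical model of a gamified information retrieval system that aims to stimulate users' interest in the retrieval system, support their information retrieval and interaction, and encourage continued use. [Method/Process] Building on foundational gamification theory, related frameworks, and the mechanisms of information retrieval systems, different game elements are combined. Taking into account the relationships between game elements and rules, modules with specific functions are designed so that game elements can be applied in a non-game context. [Results/Conclusions] To construct the theoretical model of a gamified information retrieval system, 20 game elements are identified and combined by function into 12 types of game modules, comprising 5 simple modules and 7 composite modules, which give the information retrieval system game functionality. The proposed construction approach and theoretical model address gaps in current research on gamified information retrieval and provide a theoretical framework for developing gamified information retrieval systems and for subsequent related research.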
13.
Multilingual information retrieval is generally understood to mean the retrieval of relevant information in multiple target
languages in response to a user query in a single source language. In a multilingual federated search environment, different
information sources contain documents in different languages. A general search strategy in multilingual federated search environments
is to translate the user query to each language of the information sources and run a monolingual search in each information
source. It is then necessary to obtain a single ranked document list by merging the individual ranked lists from the information
sources that are in different languages. This is known as the results merging problem for multilingual information retrieval.
Previous research has shown that the simple approach of normalizing source-specific document scores is not effective. On the
other hand, a more effective merging method was proposed to download and translate all retrieved documents into the source
language and generate the final ranked list by running a monolingual search in the search client. The latter method is more
effective but is associated with a large amount of online communication and computation costs. This paper proposes an effective
and efficient approach for the results merging task of multilingual ranked lists. Particularly, it downloads only a small
number of documents from the individual ranked lists of each user query to calculate comparable document scores by utilizing
both the query-based translation method and the document-based translation method. Then, query-specific and source-specific
transformation models can be trained for individual ranked lists by using the information of these downloaded documents. These
transformation models are used to estimate comparable document scores for all retrieved documents and thus the documents can
be sorted into a final ranked list. This merging approach is efficient as only a subset of the retrieved documents are downloaded
and translated online. Furthermore, an extensive set of experiments on the Cross-Language Evaluation Forum (CLEF) data has demonstrated the effectiveness of the query-specific and source-specific results merging algorithm against other
alternatives. The new research in this paper proposes different variants of the query-specific and source-specific results
merging algorithm with different transformation models. This paper also provides thorough experimental results as well as
detailed analysis. All of the work substantially extends the preliminary research in (Si and Callan, in: Peters (ed.) Results
of the cross-language evaluation forum-CLEF 2005, 2005).
Hao Yuan
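The merging procedure described in the abstract above (download a few documents per ranked list, train a transformation model on them, then map every source-specific score to a comparable score) can be sketched as follows. This is a minimal illustration under stated assumptions: a simple least-squares line is used as the transformation model, the comparable scores of the downloaded documents are taken as given, and all names and numbers (`fit_linear`, `merge_ranked_lists`, the toy scores) are hypothetical rather than taken from the paper.

```python
def fit_linear(xs, ys):
    """Least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("need at least two distinct source scores")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return slope, mean_y - slope * mean_x


def merge_ranked_lists(sources):
    """Merge per-source ranked lists into one list ordered by comparable score.

    sources maps a source name to a pair:
      ranked  - list of (doc_id, source_specific_score)
      sampled - dict doc_id -> comparable score, known only for the few
                documents that were downloaded (assumed given here)
    """
    merged = []
    for ranked, sampled in sources.values():
        xs = [score for doc, score in ranked if doc in sampled]
        ys = [sampled[doc] for doc, _ in ranked if doc in sampled]
        slope, intercept = fit_linear(xs, ys)
        # Estimate a comparable score for every retrieved document.
        merged.extend((doc, slope * score + intercept) for doc, score in ranked)
    return sorted(merged, key=lambda pair: pair[1], reverse=True)


# Toy data: two sources whose raw scores live on different scales.
SOURCES = {
    "fr": ([("f1", 10.0), ("f2", 8.0), ("f3", 6.0)], {"f1": 0.9, "f3": 0.5}),
    "de": ([("d1", 0.8), ("d2", 0.4), ("d3", 0.2)], {"d1": 0.8, "d3": 0.2}),
}
MERGED = merge_ranked_lists(SOURCES)
print([doc for doc, _ in MERGED])
```

Only two documents per source are "downloaded" in this toy example, which mirrors the efficiency argument in the abstract: the remaining documents receive estimated scores without being fetched or translated.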
14.
15.
Content-only queries in hierarchically structured documents should retrieve the most specific document nodes which are exhaustive
to the information need. For this problem, we investigate two methods of augmentation, which both yield high retrieval quality.
As retrieval effectiveness, we consider the ratio of retrieval quality and response time; thus, fast approximations to the
'correct' retrieval result may yield higher effectiveness. We present a classification scheme for algorithms addressing this
issue, and adopt known algorithms from standard document retrieval for XML retrieval. As a new strategy, we propose incremental-interruptible retrieval, which allows for instant presentation of the top ranking documents. We develop a new algorithm implementing this strategy
and evaluate the different methods with the INEX collection.
16.
Information Retrieval systems typically sort the result with respect to document retrieval status values (RSV). According to the Probability Ranking Principle, this ranking ensures optimum retrieval quality if the RSVs are monotonically increasing with the probabilities of relevance (as is the case, e.g., for probabilistic IR models). However, advanced applications like filtering or distributed retrieval require estimates of the actual probability of relevance. The relationship between the RSV of a document and its probability of relevance can be described by a normalisation function which maps the retrieval status value onto the probability of relevance (mapping functions). In this paper, we explore the use of linear and logistic mapping functions for different retrieval methods. In a series of upper-bound experiments, we compare the approximation quality of the different mapping functions. We also investigate the effect on the resulting retrieval quality in distributed retrieval (only merging, without resource selection). These experiments show that good estimates of the actual probability of relevance can be achieved, and that the logistic model outperforms the linear one. Retrieval quality for distributed retrieval is only slightly improved by using the logistic function.
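To make the two kinds of mapping function in the abstract above concrete, here is a minimal sketch that fits a linear and a logistic mapping from RSVs to relevance on a toy sample. The data, the clipping of the linear map to [0, 1], and the plain gradient-ascent fitting are assumptions made for illustration; the paper's own estimation procedure is not described in the abstract.

```python
import math


def fit_linear(rsvs, labels):
    """Least-squares line through (RSV, relevance) pairs: (slope, intercept)."""
    n = len(rsvs)
    mean_x, mean_y = sum(rsvs) / n, sum(labels) / n
    var_x = sum((x - mean_x) ** 2 for x in rsvs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(rsvs, labels)) / var_x
    return slope, mean_y - slope * mean_x


def linear_map(rsv, slope, intercept):
    """Linear mapping, clipped because a raw line can leave [0, 1]."""
    return min(1.0, max(0.0, slope * rsv + intercept))


def logistic_map(rsv, a, b):
    """Logistic mapping: always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(a + b * rsv)))


def fit_logistic(rsvs, labels, lr=0.5, steps=2000):
    """Fit (a, b) by gradient ascent on the mean log-likelihood."""
    a = b = 0.0
    n = len(rsvs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(rsvs, labels):
            err = y - logistic_map(x, a, b)
            grad_a += err
            grad_b += err * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


# Hypothetical training sample: RSVs with binary relevance judgments.
RSVS = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
LABELS = [0, 0, 0, 1, 0, 1, 1, 1]
LIN = fit_linear(RSVS, LABELS)
LOG = fit_logistic(RSVS, LABELS)
for rsv in (0.1, 0.5, 0.9):
    print(rsv, linear_map(rsv, *LIN), logistic_map(rsv, *LOG))
```

On this sample the fitted line falls outside [0, 1] at both ends of the RSV range and has to be clipped, whereas the logistic curve stays a valid probability everywhere. That is one intuition for why a logistic model might approximate probabilities of relevance better, though the sketch does not reproduce the paper's experiments.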
17.
Query languages for XML such as XPath or XQuery support Boolean retrieval: a query result is a (possibly restructured) subset of XML elements or entire documents that satisfy the search conditions of the query. This search paradigm works for highly schematic XML data collections such as electronic catalogs. However, for searching information in open environments such as the Web or intranets of large corporations, ranked retrieval is more appropriate: a query result is a ranked list of XML elements in descending order of (estimated) relevance. Web search engines, which are based on the ranked retrieval paradigm, do not, however, consider the additional information and rich annotations provided by the structure of XML documents and their element names. This article presents the XXL search engine that supports relevance ranking on XML data. XXL is particularly geared for path queries with wildcards that can span multiple XML collections and contain both exact-match as well as semantic-similarity search conditions. In addition, ontological information and suitable index structures are used to improve the search efficiency and effectiveness. XXL is fully implemented as a suite of Java classes and servlets. Experiments in the context of the INEX benchmark demonstrate the efficiency of the XXL search engine and underline its effectiveness for ranked retrieval.
18.
This paper presents four novel techniques for open-vocabulary spoken document retrieval: a method to detect slots that possibly contain a query feature; a method to estimate occurrence probabilities; a technique that we call collection-wide probability re-estimation and a weighting scheme which takes advantage of the fact that long query features are detected more reliably. These four techniques have been evaluated using the TREC-6 spoken document retrieval test collection to determine the improvements in retrieval effectiveness with respect to a baseline retrieval method. Results show that the retrieval effectiveness can be improved considerably despite the large number of speech recognition errors.
19.
Objective: The National Library of Medicine (NLM) inaugurated a “publication type” concept to facilitate searches for systematic reviews (SRs). On the other hand, clinical queries (CQs) are validated search strategies designed to retrieve scientifically sound, clinically relevant original and review articles from biomedical literature databases. We compared the retrieval performance of the SR publication type (SR[pt]) against the most sensitive CQ for systematic review articles (CQrs) in PubMed. Methods: We ran date-limited searches of SR[pt] and CQrs to compare the relative yield of articles and SRs, focusing on the differences in retrieval of SRs by SR[pt] but not CQrs (SR[pt] NOT CQrs) and CQrs NOT SR[pt]. Random samples of articles retrieved in each of these comparisons were examined for SRs until a consistent pattern became evident. Results: For SR[pt] NOT CQrs, the yield was relatively low in quantity but rich in quality, with 79% of the articles being SRs. For CQrs NOT SR[pt], the yield was high in quantity but low in quality, with only 8% being SRs. For CQrs AND SR[pt], the quality was highest, with 92% being SRs. Conclusions: We found that SR[pt] had high precision and specificity for SRs but low recall (sensitivity), whereas CQrs had much higher recall. SR[pt] OR CQrs added valid SRs to the CQrs yield at low cost (i.e., added few non-SRs). For searches that are intended to be exhaustive for SRs, SR[pt] can be added to existing sensitive search filters.
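The trade-off reported above (a precise but low-recall filter, a sensitive but noisy filter, and their OR combination) can be worked through with set arithmetic. The article IDs below are invented for illustration and do not reproduce the study's figures; `precision_recall` is a hypothetical helper name.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple[float, float]:
    """Precision = hits / retrieved, recall (sensitivity) = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: articles 1-10 are true systematic reviews.
RELEVANT = set(range(1, 11))
# A precise but narrow filter (stands in for SR[pt]).
NARROW = {1, 2, 3, 4, 11}
# A sensitive but noisy filter (stands in for CQrs).
BROAD = {3, 4, 5, 6, 7, 8, 9, 10} | set(range(20, 40))

print(precision_recall(NARROW, RELEVANT))          # high precision, low recall
print(precision_recall(BROAD, RELEVANT))           # low precision, high recall
print(precision_recall(NARROW | BROAD, RELEVANT))  # OR: recall rises, few extra non-SRs
```

In this toy setup the OR combination recovers the reviews that the broad filter misses while adding only one further non-review, which parallels the abstract's point that the narrow filter can be added to a sensitive one at low cost.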