20 similar documents found; search time: 31 ms
1.
Most current machine learning methods for building search engines are based on the assumption that there is a target evaluation metric that evaluates the quality of the search engine with respect to an end user, and that the engine should be trained to optimize for that metric. Treating the target evaluation metric as a given, many different approaches (e.g. LambdaRank, SoftRank, RankingSVM) have been proposed for optimizing retrieval metrics. Target metrics used in optimization act as bottlenecks that summarize the training data, and it is known that some evaluation metrics are more informative than others. In this paper, we consider the effect of the target evaluation metric on learning to rank. In particular, we question the current assumption that retrieval systems should be designed to directly optimize for a metric that is assumed to evaluate user satisfaction. We show that even if user satisfaction can be measured by a metric X, optimizing the engine on a training set for a more informative metric Y may result in better test performance according to X (as compared to optimizing the engine directly for X on the training set). We analyze the situations in which there is a significant difference between the two cases, in terms of the amount of available training data and the number of dimensions of the feature space.
2.
XML retrieval is a departure from standard document retrieval in which each individual XML element, ranging from italicized words or phrases to full-blown articles, is a retrievable unit. The distribution of XML element lengths is unlike what we usually observe in standard document collections, prompting us to revisit the issue of document length normalization. We perform a comparative analysis of arbitrary elements versus relevant elements, and show the importance of element length as a parameter for XML retrieval. Within the language modeling framework, we investigate a range of techniques that deal with length either directly or indirectly. We observe a length bias introduced by the amount of smoothing, and show the importance of extreme length bias for XML retrieval. We also show that simply removing shorter elements from the index (by introducing a cut-off value) does not create an appropriate element length normalization. Even after restricting the minimal size of XML elements occurring in the index, the importance of an extreme explicit length bias remains.
3.
Information Retrieval systems typically sort the results with respect to document retrieval status values (RSVs). According to the Probability Ranking Principle, this ranking ensures optimum retrieval quality if the RSVs are monotonically increasing with the probabilities of relevance (as, e.g., for probabilistic IR models). However, advanced applications like filtering or distributed retrieval require estimates of the actual probability of relevance. The relationship between the RSV of a document and its probability of relevance can be described by a normalisation function (a mapping function) which maps the retrieval status value onto the probability of relevance. In this paper, we explore the use of linear and logistic mapping functions for different retrieval methods. In a series of upper-bound experiments, we compare the approximation quality of the different mapping functions. We also investigate the effect on the resulting retrieval quality in distributed retrieval (only merging, without resource selection). These experiments show that good estimates of the actual probability of relevance can be achieved, and that the logistic model outperforms the linear one. Retrieval quality for distributed retrieval is only slightly improved by using the logistic function.
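The logistic mapping described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it fits a two-parameter logistic function P(rel | RSV) = 1 / (1 + exp(-(a + b * RSV))) by gradient ascent on a small set of hypothetical (RSV, relevance) pairs. The toy data, learning rate, and epoch count are all invented for illustration.

```python
import math

def logistic_map(rsv, a, b):
    """Map a retrieval status value onto an estimated probability of relevance."""
    return 1.0 / (1.0 + math.exp(-(a + b * rsv)))

def fit_logistic(rsvs, labels, lr=0.1, epochs=2000):
    """Fit the mapping parameters (a, b) by gradient ascent on the log-likelihood."""
    a, b = 0.0, 0.0
    n = len(rsvs)
    for _ in range(epochs):
        ga = gb = 0.0
        for x, y in zip(rsvs, labels):
            err = y - logistic_map(x, a, b)  # derivative of the log-likelihood
            ga += err
            gb += err * x
        a += lr * ga / n
        b += lr * gb / n
    return a, b

# Hypothetical training data: (RSV, relevant?) pairs from one retrieval run.
rsvs = [0.2, 0.5, 1.1, 1.4, 2.0, 2.3, 3.1, 3.5]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
a, b = fit_logistic(rsvs, labels)
probs = [logistic_map(x, a, b) for x in rsvs]
```

Because the fitted outputs are actual probabilities rather than system-specific scores, they can be compared across engines, which is what merging in distributed retrieval requires.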
4.
Operational multimodal information retrieval systems have to deal with increasingly complex document collections and queries that are composed of a large set of textual and non-textual modalities such as ratings, prices, timestamps, geographical coordinates, etc. The resulting combinatorial explosion of modality combinations makes it intractable to treat each modality individually and to obtain suitable training data. As a consequence, instead of finding and training new models for each individual modality or combination of modalities, it is crucial to establish unified models and fuse their outputs in a robust way. Since the most popular weighting schemes for textual retrieval have in the past generalized well to many retrieval tasks, we demonstrate how they can be adapted to be used with non-textual modalities, which is a first step towards finding such a unified model. We demonstrate that the popular weighting scheme BM25 is suitable for multimodal IR systems and analyze the underlying assumptions of the BM25 formula with respect to merging modalities under the so-called raw-score merging hypothesis, which requires no training. We establish a multimodal baseline for two multimodal test collections, show how modalities differ in their contribution to relevance, and discuss the difficulty of treating modalities with overlapping information. Our experiments demonstrate that our multimodal baseline with no training achieves a significantly higher retrieval effectiveness than using just the textual modality for the Social Book Search 2016 collection, and lies in the range of a trained multimodal approach using the optimal linear combination of the modality scores.
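As a rough sketch of the idea, the snippet below implements the standard BM25 weighting over term-frequency dictionaries and merges per-modality scores by simple summation, in the spirit of the raw-score merging hypothesis. This is not the paper's system: the helper names, the parameter defaults, and the treatment of each modality as a bag of terms are illustrative assumptions.

```python
import math

def bm25_weight(tf, dl, avgdl, df, N, k1=1.2, b=0.75):
    """Standard BM25 term weight: saturating tf, length normalisation, idf."""
    idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
    return idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))

def bm25_score(query_terms, doc, collection):
    """Score one document (a term -> frequency dict) for a bag-of-terms query."""
    N = len(collection)
    avgdl = sum(sum(d.values()) for d in collection) / N
    dl = sum(doc.values())
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in collection if t in d)
        if t in doc and df:
            score += bm25_weight(doc[t], dl, avgdl, df, N)
    return score

def raw_score_merge(scores_per_modality):
    """Raw-score merging hypothesis: sum unnormalised modality scores, no training."""
    return sum(scores_per_modality)
```

Summation requires the per-modality scores to live on comparable scales, which is exactly the assumption of the BM25 formula that the paper analyses.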
5.
Employing effective methods of sentence retrieval is essential for many tasks in Information Retrieval, such as summarization, novelty detection and question answering. The best performing sentence retrieval techniques attempt to perform matching directly between the sentences and the query. However, in this paper, we posit that the local context of a sentence can provide crucial additional evidence to further improve sentence retrieval. Using a language modeling framework, we propose a novel reformulation of the sentence retrieval problem that extends previous approaches so that the local context is seamlessly incorporated within the retrieval models. In a series of comprehensive experiments, we show that localized smoothing and the prior importance of a sentence can improve retrieval effectiveness. The proposed models significantly and substantially outperform the state of the art and other competitive sentence retrieval baselines on recall-oriented measures, while remaining competitive on precision-oriented measures. This research demonstrates that local context plays an important role in estimating the relevance of a sentence, and that existing sentence retrieval language models can be extended to utilize this evidence effectively.
6.
Document length is widely recognized as an important factor for adjusting retrieval systems. Many models tend to favor the retrieval of either short or long documents and, thus, a length-based correction needs to be applied to avoid any length bias. In Language Modeling for Information Retrieval, smoothing methods are applied to move probability mass from document terms to unseen words, which is often dependent upon document length. In this article, we perform an in-depth study of this behavior, characterized by the document length retrieval trends, of three popular smoothing methods across a number of factors, and of its impact on the length of documents retrieved and on retrieval performance. First, we theoretically analyze the Jelinek–Mercer, Dirichlet prior and two-stage smoothing strategies, and then conduct an empirical analysis. In our analysis we show how Dirichlet prior smoothing caters for document length more appropriately than Jelinek–Mercer smoothing, which leads to its superior retrieval performance. In a follow-up analysis, we posit that length-based priors can be used to offset any bias in the length retrieval trends stemming from the retrieval formula derived by the smoothing technique. We show that the performance of Jelinek–Mercer smoothing can be significantly improved by using such a prior, which provides a natural and simple alternative to decouple the query and document modeling roles of smoothing. With the analysis of retrieval behavior conducted in this article, it is possible to understand why Dirichlet prior smoothing performs better than Jelinek–Mercer, and why the performance of the Jelinek–Mercer method is improved by including a length-based prior.
Leif Azzopardi
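The two smoothing strategies contrasted in this abstract can be written down directly. The sketch below is not taken from the article: it scores a query under the query-likelihood model with Jelinek–Mercer and Dirichlet prior smoothing, with conventional parameter defaults rather than the article's settings. Note that the effective Dirichlet interpolation weight mu / (|d| + mu) shrinks as documents grow, which is the length-dependent behaviour the article analyses.

```python
import math

def jm_score(query, doc, coll_prob, lam=0.5):
    """Jelinek-Mercer: fixed linear interpolation with the collection model."""
    dl = sum(doc.values())
    score = 0.0
    for t in query:
        p = (1 - lam) * doc.get(t, 0) / dl + lam * coll_prob[t]
        score += math.log(p)
    return score

def dirichlet_score(query, doc, coll_prob, mu=2000):
    """Dirichlet prior: the amount of smoothing shrinks as documents grow."""
    dl = sum(doc.values())
    score = 0.0
    for t in query:
        p = (doc.get(t, 0) + mu * coll_prob[t]) / (dl + mu)
        score += math.log(p)
    return score
```

Under Jelinek–Mercer the interpolation weight lam is the same for every document, while under the Dirichlet prior a short document is smoothed much more heavily than a long one, which is one way to see why the two methods exhibit different length retrieval trends.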
7.
Topological models of distributed information retrieval systems
This paper constructs three topological models for distributed information retrieval systems: the retrieval topology, the pseudo-metric topology, and the similarity topology. It proves that the retrieval topology and the similarity topology possess certain distributed characteristics, thereby demonstrating that these two topological models are reasonable in the distributed setting.
8.
Qiang Wu, Christopher J. C. Burges, Krysta M. Svore, Jianfeng Gao. Information Retrieval, 2010, 13(3): 254-270
We present a new ranking algorithm that combines the strengths of two previous methods: boosted tree classification, and LambdaRank, which has been shown to be empirically optimal for a widely used information retrieval measure. Our algorithm is based on boosted regression trees, although the ideas apply to any weak learners, and it is significantly faster in both the training and test phases than the state of the art, for comparable accuracy. We also show how to find the optimal linear combination of any two rankers, and we use this method to solve the line search problem exactly during boosting. In addition, we show that starting with a previously trained model, and boosting using its residuals, furnishes an effective technique for model adaptation, and we give significantly improved results for a particularly pressing problem in web search: training rankers for markets for which only small amounts of labeled data are available, given a ranker trained on much more data from a larger market.
9.
Computational modelling of music similarity is an increasingly important part of personalisation and optimisation in music information retrieval and of research in music perception and cognition. The use of relative similarity ratings is a new and promising approach to modelling similarity that avoids well-known problems with absolute ratings. In this article, we use relative ratings from the MagnaTagATune dataset with new and existing variants of state-of-the-art algorithms and provide the first comprehensive and rigorous evaluation of this approach. We compare metric learning based on support vector machines (SVMs) and metric-learning-to-rank (MLR), including a diagonal and a novel weighted variant, and relative distance learning with neural networks (RDNN). We further evaluate the effectiveness of different high- and low-level audio features and genre data, as well as dimensionality reduction methods, weighting of similarity ratings, and different sampling methods. Our results show that music similarity measures learnt on relative ratings can be significantly better than a standard Euclidean metric, depending on the choice of learning algorithm, feature sets and application scenario. MLR and SVM outperform DMLR and RDNN, while MLR with weighted ratings leads to no further performance gain. Timbral and music-structural features are most effective, and all features jointly are significantly better than any other combination of feature sets. Sharing audio clips (but not the similarity ratings) between test and training sets improves performance, in particular for the SVM-based methods, which is useful for some application scenarios. A testing framework has been implemented in Matlab and made publicly available at http://mi.soi.city.ac.uk/datasets/ir2012framework so that these results are reproducible.
10.
Document clustering offers the potential of supporting users in interactive retrieval, especially when users have problems specifying their information need precisely. In this paper, we present a theoretic foundation for optimum document clustering. The key idea is to base cluster analysis and evaluation on a set of queries, by defining documents as being similar if they are relevant to the same queries. Three components are essential within our optimum clustering framework, OCF: (1) a set of queries, (2) a probabilistic retrieval method, and (3) a document similarity metric. After introducing an appropriate validity measure, we define optimum clustering with respect to the estimates of the relevance probability for the query-document pairs under consideration. Moreover, we show that well-known clustering methods are implicitly based on the three components, but that they use heuristic design decisions for some of them. We argue that with our framework more targeted research for developing better document clustering methods becomes possible. Experimental results demonstrate the potential of our considerations.
11.
As retrieval systems become more complex, learning to rank approaches are being developed to automatically tune their parameters. Using online learning to rank, retrieval systems can learn directly from implicit feedback inferred from user interactions. In such an online setting, algorithms must obtain feedback for effective learning while simultaneously utilizing what has already been learned to produce high quality results. We formulate this challenge as an exploration–exploitation dilemma and propose two methods for addressing it. By adding mechanisms for balancing exploration and exploitation during learning, each method extends a state-of-the-art learning to rank method, one based on listwise learning and the other on pairwise learning. Using a recently developed simulation framework that allows assessment of online performance, we empirically evaluate both methods. Our results show that balancing exploration and exploitation can substantially and significantly improve the online retrieval performance of both listwise and pairwise approaches. In addition, the results demonstrate that such a balance affects the two approaches in different ways, especially when user feedback is noisy, yielding new insights relevant to making online learning to rank effective in practice.
12.
On Collection Size and Retrieval Effectiveness
The relationship between collection size and retrieval effectiveness is particularly important in the context of Web search. We investigate it first analytically and then experimentally, using samples and subsets of test collections. Different retrieval systems vary in how the score assigned to an individual document in a sample collection relates to the score it receives in the full collection; we identify four cases. We apply signal detection (SD) theory to retrieval from samples, taking into account the four cases and using a variety of shapes for relevant and irrelevant distributions. We note that the SD model subsumes several earlier hypotheses about the causes of the decreased precision in samples. We also discuss other models which contribute to an understanding of the phenomenon, particularly relating to the effects of discreteness. Different models provide complementary insights. Extensive use is made of test data, some from official submissions to the TREC-6 VLC track and some new, to illustrate the effects and test hypotheses. We empirically confirm predictions, based on SD theory, that P@n should decline when moving to a sample collection and that average precision and R-precision should remain constant. SD theory suggests the use of recall-fallout plots as operating characteristic (OC) curves. We plot OC curves of this type for a real retrieval system and query set and show that curves for sample collections are similar but not identical to the curve for the full collection.
13.
Content-oriented XML retrieval approaches aim at a more focused retrieval strategy: instead of retrieving whole documents, document components that are exhaustive to the information need while at the same time being as specific as possible should be retrieved. In this article, we show that the evaluation methods developed for standard retrieval must be modified in order to deal with the structure of XML documents. More precisely, the size and overlap of document components must be taken into account. For this purpose, we propose a new effectiveness metric based on the definition of a concept space defined upon the notions of exhaustiveness and specificity of a search result. We compare the results of this new metric with those obtained with the official metric used in INEX, the evaluation initiative for content-oriented XML retrieval.
Gabriella Kazai
14.
When speaking of information retrieval, we often mean text retrieval. But there exist many other forms of information retrieval applications. A typical example is collaborative filtering, which suggests interesting items to a user by taking into account other users' preferences or tastes. Due to the uniqueness of the problem, it has been modeled and studied differently in the past, mainly from the preference prediction and machine learning viewpoints. A few attempts have been made to bring collaborative filtering back to information (text) retrieval modeling, and new collaborative filtering techniques have subsequently been derived from them. In this paper, we show that from the algorithmic viewpoint, there is an even closer relationship between collaborative filtering and text retrieval. Specifically, major collaborative filtering algorithms, such as the memory-based ones, essentially calculate the dot product between the user vector (as the query vector in text retrieval) and the item rating vector (as the document vector in text retrieval). Thus, if we properly structure user preference data and employ the target user's ratings as query input, major text retrieval algorithms and systems can be directly used without any modification. In this regard, we propose a unified formulation under a common notational framework for memory-based collaborative filtering, and a technique to use any text retrieval weighting function with collaborative filtering preference data. Besides confirming the rationale of the framework, our preliminary experimental results have also demonstrated the effectiveness of the approach in using text retrieval models and systems to perform item ranking tasks in collaborative filtering.
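To make the dot-product correspondence concrete, here is a minimal sketch (not the paper's code) of memory-based collaborative filtering phrased as retrieval: the target user's ratings act as the query, and each unseen item's vector of ratings across users acts as the document. The function names and the simple similarity weighting are illustrative assumptions.

```python
def dot(u, v):
    """Sparse dot product between two dicts: the core text-retrieval operation."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(k, 0.0) for k, w in u.items())

def rank_items(target, profiles):
    """Rank items the target user has not rated, retrieval-style.

    target: item -> rating for the target user (the 'query')
    profiles: user -> (item -> rating) for all other users
    """
    # Weight each other user by similarity to the target (query-side weighting).
    sims = {u: dot(target, prof) for u, prof in profiles.items()}
    # Build each unseen item's rating vector across users (the 'documents').
    item_vectors = {}
    for u, prof in profiles.items():
        for item, r in prof.items():
            if item not in target:
                item_vectors.setdefault(item, {})[u] = r
    # Score = dot product between the query weights and the item's rating vector.
    scores = {i: dot(sims, vec) for i, vec in item_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Because everything reduces to sparse dot products, an off-the-shelf text retrieval engine can execute the same computation once the preference data is indexed appropriately, which is the paper's point.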
15.
The deployment of Web 2.0 technologies has led to rapid growth of various opinions and reviews on the web, such as reviews on products and opinions about people. Such content can be very useful to help people find interesting entities like products, businesses and people based on their individual preferences or tradeoffs. Most existing work on leveraging opinionated content has focused on integrating and summarizing opinions on entities to help users better digest all the opinions. In this paper, we propose a different way of leveraging opinionated content, by directly ranking entities based on a user's preferences. Our idea is to represent each entity with the text of all the reviews of that entity. Given a user's keyword query that expresses the desired features of an entity, we can then rank all the candidate entities based on how well opinions on these entities match the user's preferences. We study several methods for solving this problem, including both standard text retrieval models and some extensions of these models. Experiment results on ranking entities based on opinions in two different domains (hotels and cars) show that the proposed extensions are effective and lead to improvement of ranking accuracy over the standard text retrieval models for this task.
16.
Bing Bai, Jason Weston, David Grangier, Ronan Collobert, Kunihiko Sadamasa, Yanjun Qi, Olivier Chapelle, Kilian Weinberger. Information Retrieval, 2010, 13(3): 291-314
In this article we present Supervised Semantic Indexing, which defines a class of nonlinear (quadratic) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). However, unlike LSI, our models are trained from a supervised signal directly on the ranking task of interest, which we argue is the reason for our superior results. As the query and target texts are modeled separately, our approach is easily generalized to different retrieval tasks, such as cross-language retrieval or online advertising placement. Dealing with models over all pairs of word features is computationally challenging. We propose several improvements to our basic model to address this issue, including low-rank (but diagonal preserving) representations, correlated feature hashing and sparsification. We provide an empirical study of all these methods on retrieval tasks based on Wikipedia documents as well as on an Internet advertisement task. We obtain state-of-the-art performance while providing realistically scalable methods.
17.
Blog feed search aims to identify a blog feed of recurring interest to users on a given topic. A blog feed, the retrieval unit for blog feed search, comprises blog posts of diverse topics. This topical diversity of blog feeds often causes performance deterioration of blog feed search. To alleviate the problem, this paper proposes several approaches based on passage retrieval, widely regarded as effective for handling topical diversity at the document level in ad-hoc retrieval. We define global and local evidence for blog feed search, which correspond to the document-level and passage-level evidence for passage retrieval, respectively, and investigate their influence on blog feed search, in terms of both initial retrieval and pseudo-relevance feedback. For initial retrieval, we propose a retrieval framework to integrate global evidence with local evidence. For pseudo-relevance feedback, we gather feedback information from the local evidence of the top K ranked blog feeds to capture diverse and accurate information related to a given topic. Experimental results show that our approaches using local evidence consistently and significantly outperform traditional ones.
18.
To cope with the fact that, in the ad hoc retrieval setting, documents relevant to a query could contain very few (short) parts (passages) with query-related information, researchers proposed passage-based document ranking approaches. We show that several of these retrieval methods can be understood, and new ones can be derived, using the same probabilistic model. We use language-model estimates to instantiate specific retrieval algorithms, and in doing so present a novel passage language model that integrates information from the containing document to an extent controlled by the estimated document homogeneity. Several document-homogeneity measures that we present yield passage language models that are more effective than the standard passage model for basic document retrieval and for constructing and utilizing passage-based relevance models; these relevance models also outperform a document-based relevance model. Finally, we demonstrate the merits of using the document-homogeneity measures for integrating document-query and passage-query similarity information for document retrieval.
19.
Fernando Diaz. Information Retrieval, 2007, 10(6): 531-562
We adapt the cluster hypothesis for score-based information retrieval by claiming that closely related documents should have similar scores. Given a retrieval from an arbitrary system, we describe an algorithm which directly optimizes this objective by adjusting retrieval scores so that topically related documents receive similar scores. We refer to this process as score regularization. Because score regularization operates on retrieval scores, regardless of their origin, we can apply the technique to arbitrary initial retrieval rankings. Document rankings derived from regularized scores, when compared to rankings derived from unregularized scores, consistently and significantly result in improved performance for a variety of baseline retrieval algorithms. We also present several proofs demonstrating that regularization generalizes methods such as pseudo-relevance feedback, document expansion, and cluster-based retrieval. Because of these strong empirical and theoretical results, we argue for the adoption of score regularization as a general design principle or post-processing step for information retrieval systems.
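A minimal sketch of the idea, under the assumption that score regularization can be approximated by iterated neighbourhood averaging on a document-similarity graph (the paper's actual formulation is a global optimization): each document's score is repeatedly pulled toward the mean score of its topically related neighbours while staying anchored to its initial retrieval score. The alpha parameter, iteration count, and adjacency-list representation are all illustrative choices.

```python
def regularize_scores(scores, adjacency, alpha=0.5, iterations=10):
    """Smooth retrieval scores over a similarity graph.

    scores: initial retrieval scores, one per document
    adjacency: adjacency[i] lists the indices of documents related to document i
    alpha: how strongly scores are pulled toward their neighbours' mean
    """
    current = list(scores)
    for _ in range(iterations):
        updated = []
        for i, s0 in enumerate(scores):
            nbrs = adjacency[i]
            if nbrs:
                nbr_mean = sum(current[j] for j in nbrs) / len(nbrs)
            else:
                nbr_mean = current[i]  # isolated documents keep their score
            # Blend the neighbourhood mean with the anchored initial score.
            updated.append(alpha * nbr_mean + (1 - alpha) * s0)
        current = updated
    return current
```

A document scored low by the initial system but surrounded by highly scored neighbours gets pulled upward, which is the behaviour the cluster hypothesis predicts should help.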
20.
Jun Wang, Stephen Robertson, Arjen P. de Vries, Marcel J. T. Reinders. Information Retrieval, 2008, 11(6): 477-497
Collaborative filtering is concerned with making recommendations about items to users. Most formulations of the problem are specifically designed for predicting user ratings, assuming past data of explicit user ratings is available. However, in practice we may only have implicit evidence of user preference; furthermore, a better view of the task is that of generating a top-N list of items that the user is most likely to like. In this regard, we argue that collaborative filtering can be directly cast as a relevance ranking problem. We begin with the classic Probability Ranking Principle of information retrieval, proposing a probabilistic item ranking framework. Within the framework, we derive two different ranking models, showing that despite their common origin, different factorizations reflect two distinctive ways to approach item ranking. For the model estimations, we limit our discussion to implicit user preference data, and adopt an approximation method introduced in the classic text retrieval model (i.e. the Okapi BM25 formula) to effectively decouple frequency counts and presence/absence counts in the preference data. Furthermore, we extend the basic formula by applying Bayesian inference to estimate the probability of relevance (and non-relevance), which largely alleviates the data sparsity problem. Apart from the theoretical contribution, our experiments on real data sets demonstrate that the proposed methods perform significantly better than other strong baselines.