Similar literature
 20 similar documents found
1.
Modern information retrieval (IR) test collections have grown in size, but the available manpower for relevance assessments has more or less remained constant. Hence, how to reliably evaluate and compare IR systems using incomplete relevance data, where many documents exist that were never examined by the relevance assessors, is receiving a lot of attention. This article compares the robustness of IR metrics to incomplete relevance assessments, using four different sets of graded-relevance test collections with submitted runs—the TREC 2003 and 2004 robust track data and the NTCIR-6 Japanese and Chinese IR data from the crosslingual task. Following previous work, we artificially reduce the original relevance data to simulate IR evaluation environments with extremely incomplete relevance data. We then investigate the effect of this reduction on discriminative power, which we define as the proportion of system pairs with a statistically significant difference for a given probability of Type I Error, and on Kendall’s rank correlation, which reflects the overall resemblance of two system rankings according to two different metrics or two different relevance data sets. According to these experiments, Q′, nDCG′ and AP′ proposed by Sakai are superior to bpref proposed by Buckley and Voorhees and to Rank-Biased Precision proposed by Moffat and Zobel. We also point out some weaknesses of bpref and Rank-Biased Precision by examining their formal definitions.
Noriko Kando
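The rank-correlation comparison described in abstract 1 can be sketched as follows. This is a minimal illustration of Kendall's tau (the tau-a variant, without tie correction) between two system rankings; the system names and scores are invented for the example and do not come from the TREC or NTCIR data.

```python
from itertools import combinations

def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score assignments over the same systems."""
    systems = sorted(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # Positive product: both assignments order the pair the same way.
        sign = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical mean scores of four systems under full and reduced relevance data.
full = {"sysA": 0.31, "sysB": 0.28, "sysC": 0.25, "sysD": 0.19}
reduced = {"sysA": 0.22, "sysB": 0.24, "sysC": 0.17, "sysD": 0.12}

tau = kendall_tau(full, reduced)
print(round(tau, 4))
```

In this toy case only one of the six system pairs (sysA, sysB) swaps order, so the correlation stays well above zero. A metric that is robust to incomplete judgments would keep tau close to 1 as the relevance data shrink.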

2.
Recently, direct optimization of information retrieval (IR) measures has become a new trend in learning to rank. In this paper, we propose a general framework for direct optimization of IR measures, which enjoys several theoretical advantages. The general framework, which can be used to optimize most IR measures, addresses the task by approximating the IR measures and optimizing the approximated surrogate functions. Theoretical analysis shows that a high approximation accuracy can be achieved by the framework. We take average precision (AP) and normalized discounted cumulative gain (NDCG) as examples to demonstrate how to realize the proposed framework. Experiments on benchmark datasets show that the algorithms derived from our framework are very effective when compared to existing methods. The empirical results also agree well with the theoretical results obtained in the paper.
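One common way to approximate a rank-based measure, shown here only as a hedged sketch and not necessarily the construction used in the paper, is to replace the hard rank of each document with a sigmoid-smoothed rank. The gain function `2**g - 1`, the log2 discount, and the steepness parameter `alpha` are assumptions made for this example.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def dcg(gains_in_rank_order):
    return sum((2 ** g - 1) / math.log2(r + 1)
               for r, g in enumerate(gains_in_rank_order, start=1))

def ndcg(scores, gains):
    """Exact NDCG: sort by score, which is a non-differentiable step."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return dcg([gains[i] for i in order]) / dcg(sorted(gains, reverse=True))

def soft_rank(scores, i, alpha):
    """Smooth rank of document i: 1 + sum_j sigmoid(alpha * (s_j - s_i))."""
    return 1.0 + sum(sigmoid(alpha * (scores[j] - scores[i]))
                     for j in range(len(scores)) if j != i)

def approx_ndcg(scores, gains, alpha=10.0):
    """Surrogate NDCG that is smooth in the scores."""
    total = sum((2 ** g - 1) / math.log2(1.0 + soft_rank(scores, i, alpha))
                for i, g in enumerate(gains))
    return total / dcg(sorted(gains, reverse=True))

scores = [2.0, 1.0, 0.5]   # hypothetical model scores
gains = [1, 0, 2]          # hypothetical graded relevance labels
print(ndcg(scores, gains), approx_ndcg(scores, gains, alpha=100.0))
```

With a large `alpha` the smooth ranks approach the true ranks, so the surrogate approaches exact NDCG. A smaller `alpha` gives a smoother but less accurate objective.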

3.
Direct optimization of evaluation measures has become an important branch of learning to rank for information retrieval (IR). Since IR evaluation measures are difficult to optimize due to their non-continuity and non-differentiability, most direct optimization methods optimize some surrogate functions instead, which we call surrogate measures. A critical issue regarding these methods is whether the optimization of the surrogate measures can really lead to the optimization of the original IR evaluation measures. In this work, we perform formal analysis on this issue. We propose a concept named “tendency correlation” to describe the relationship between a surrogate measure and its corresponding IR evaluation measure. We show that when a surrogate measure has arbitrarily strong tendency correlation with an IR evaluation measure, the optimization of it will lead to the effective optimization of the original IR evaluation measure. Then, we analyze the tendency correlations of the surrogate measures optimized in a number of direct optimization methods. We prove that the surrogate measures in SoftRank and ApproxRank can have arbitrarily strong tendency correlation with the original IR evaluation measures, regardless of the data distribution, when some parameters are appropriately set. However, the surrogate measures in SVM-MAP, DORM-NDCG, PermuRank-MAP, and SVM-NDCG cannot have arbitrarily strong tendency correlation with the original IR evaluation measures on certain distributions of data. Therefore SoftRank and ApproxRank are theoretically sounder than SVM-MAP, DORM-NDCG, PermuRank-MAP, and SVM-NDCG, and are expected to result in better ranking performances. Our theoretical findings can explain the experimental results observed on public benchmark datasets.

4.
In modern information processing technology there is a significant tendency to combine microfilm with computer science techniques in order to automate information retrieval systems. Such an automated system is presented here. It consists of a central microprocessor-based computer with an external storage disk, a microfilm reader, a CRT terminal and the corresponding interfaces. The data handled by the system consist of a societies file and a documents file. The societies file has a hash organization and the documents file is structured as a linked stack.

5.
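Information retrieval courses play an important role in information literacy education at colleges and universities, and raising the level of information literacy education is also a teaching goal of these courses. This paper discusses the reform of information retrieval teaching oriented toward information literacy education from six aspects: constructing new teaching goals, reasonably adjusting the teaching content, integrating a variety of modern teaching methods, strengthening the connection with subject-specific courses, establishing an effective evaluation system, and improving the overall quality of the teaching staff.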

6.
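The future of web information retrieval   Total citations: 8 (self-citations: 0, citations by others: 8)
The future development of web information retrieval shows in the following aspects: web retrieval tools becoming both more comprehensive and more specialized; web retrieval tools becoming more intelligent; the polarization of retrieval languages; improved capability for retrieving non-text information; human participation in the information organization of retrieval tools; and the rise of fee-based web information retrieval tools.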

7.
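This paper introduces the online examination system of Capital Medical University, compares and analyzes students' examination results, and points out the advantages of the online examination system and the problems that need improvement.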

8.
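Instructional design for information retrieval in a networked environment   Total citations: 4 (self-citations: 0, citations by others: 4)
Based on an analysis of the current state of web information retrieval teaching, this paper discusses how to further reform that teaching under the new circumstances. It mainly tries new approaches to teaching content, teaching methods, and faculty development, in order to improve the quality of information retrieval teaching and meet the needs of the information age.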

9.
A standard approach to Information Retrieval (IR) is to model text as a bag of words. Alternatively, text can be modelled as a graph, whose vertices represent words, and whose edges represent relations between the words, defined on the basis of any meaningful statistical or linguistic relation. Given such a text graph, graph theoretic computations can be applied to measure various properties of the graph, and hence of the text. This work explores the usefulness of such graph-based text representations for IR. Specifically, we propose a principled graph-theoretic approach of (1) computing term weights and (2) integrating discourse aspects into retrieval. Given a text graph, whose vertices denote terms linked by co-occurrence and grammatical modification, we use graph ranking computations (e.g. PageRank; Page et al., The PageRank citation ranking: Bringing order to the Web, Technical report, Stanford Digital Library Technologies Project, 1998) to derive weights for each vertex, i.e. term weights, which we use to rank documents against queries. We reason that our graph-based term weights do not necessarily need to be normalised by document length (unlike existing term weights) because they are already scaled by their graph-ranking computation. This is a departure from existing IR ranking functions, and we experimentally show that it performs comparably to a tuned ranking baseline, such as BM25 (Robertson et al., NIST Special Publication 500-236: TREC-4, 1995). In addition, we integrate into ranking graph properties, such as the average path length, or clustering coefficient, which represent different aspects of the topology of the graph, and by extension of the document represented as a graph. Integrating such properties into ranking allows us to consider issues such as discourse coherence, flow and density during retrieval. We experimentally show that this type of ranking performs comparably to BM25, and can even outperform it, across different TREC (Voorhees and Harman, TREC: Experiment and evaluation in information retrieval, MIT Press, 2005) datasets and evaluation measures.
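The term-weighting idea in abstract 9 can be illustrated with a small sketch: build an undirected co-occurrence graph over a document's terms and run a PageRank-style power iteration to obtain one weight per term. The window size, damping factor, iteration count, and toy document are assumptions for illustration; the paper's own graphs also use grammatical modification, which is not modelled here.

```python
from collections import defaultdict

def cooccurrence_graph(tokens, window=1):
    """Link each term to the `window` terms that follow it (undirected)."""
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph

def pagerank(graph, damping=0.85, iterations=50):
    """Plain power iteration; every vertex has at least one neighbour here."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour m shares its rank equally among its own neighbours.
            incoming = sum(rank[m] / len(graph[m]) for m in graph[n])
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank

# Hypothetical, already tokenised document.
tokens = "graph based term weights rank documents graph based ranking".split()
weights = pagerank(cooccurrence_graph(tokens))
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {weight:.4f}")
```

A document could then be scored against a query by summing the weights of the query terms it contains. Whether to normalise by document length is the design question the abstract raises.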

10.
Over the last three decades, research in Information Retrieval (IR) has shown performance improvements when many sources of evidence are combined to produce a ranking of documents. Most current approaches assess document relevance by computing a single score which aggregates values of some attributes or criteria. They use analytic aggregation operators which either lead to a loss of valuable information, e.g., the min or lexicographic operators, or allow very bad scores on some criteria to be compensated with good ones, e.g., the weighted sum operator. Moreover, none of these approaches handles imprecision of criterion scores. In this paper, we propose a multiple criteria framework using a new aggregation mechanism based on decision rules identifying positive and negative reasons for judging whether a document should get a better ranking than another. The resulting procedure also handles imprecision in criteria design. Experimental results are reported showing that the suggested method performs better than standard aggregation operators.
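The two weaknesses attributed to standard aggregation operators in abstract 10 can be seen in a toy comparison. The criterion scores and equal weights below are invented for illustration, and the sketch shows only the baseline operators, not the decision-rule mechanism the paper proposes.

```python
def weighted_sum(scores, weights):
    return sum(w * s for w, s in zip(weights, scores))

def min_operator(scores):
    return min(scores)

# Hypothetical scores on three relevance criteria, each in [0, 1].
doc_a = [0.9, 0.9, 0.05]   # very bad on the third criterion
doc_b = [0.6, 0.6, 0.6]    # uniformly moderate
doc_c = [0.9, 0.9, 0.6]    # at least as good as doc_b on every criterion
weights = [1 / 3, 1 / 3, 1 / 3]

# Compensation: the weighted sum ranks doc_a above doc_b despite its bad score.
print(weighted_sum(doc_a, weights), weighted_sum(doc_b, weights))

# Information loss: min cannot tell doc_c from doc_b.
print(min_operator(doc_c), min_operator(doc_b))
```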

11.
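Organization and retrieval of web information   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper mainly describes the collection, organization, and retrieval of web information. Compared with traditional knowledge organization, web information organization is closer to the actual needs of web users. Search engines play an important role in the organization and control of web information.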

12.
We present a new ranking algorithm that combines the strengths of two previous methods: boosted tree classification, and LambdaRank, which has been shown to be empirically optimal for a widely used information retrieval measure. Our algorithm is based on boosted regression trees, although the ideas apply to any weak learners, and it is significantly faster in both the training and test phases than the state of the art, for comparable accuracy. We also show how to find the optimal linear combination for any two rankers, and we use this method to solve the line search problem exactly during boosting. In addition, we show that starting with a previously trained model, and boosting using its residuals, furnishes an effective technique for model adaptation, and we give significantly improved results for a particularly pressing problem in web search: training rankers for markets for which only small amounts of labeled data are available, given a ranker trained on much more data from a larger market.

13.
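An exploration of "relevance" in information retrieval   Total citations: 3 (self-citations: 0, citations by others: 3)
Starting from the dynamic, multidimensional meaning of "relevance", this paper discusses the factors that influence it, namely the information source, the retrieval system, the user, time, and the environment, and finally draws some implications of "relevance" for building information retrieval systems.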

14.
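Network information resources and their retrieval   Total citations: 6 (self-citations: 0, citations by others: 6)
Network information resources and their retrieval. Xu Yunwen (Foshan Library). The Internet is the world's largest network interconnection system, with the most users and the greatest influence. It already covers 154 countries and regions worldwide and links more than 48,000 computer networks and nearly 4 million hosts, with more than 40 million end users. It is projected that by 2000 the number of network users will exceed...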

15.
16.
There have been a number of linear, feature-based models proposed by the information retrieval community recently. Although each model is presented differently, they all share a common underlying framework. In this paper, we explore and discuss the theoretical issues of this framework, including a novel look at the parameter space. We then detail supervised training algorithms that directly maximize the evaluation metric under consideration, such as mean average precision. We present results that show training models in this way can lead to significantly better test set performance compared to other training methods that do not directly maximize the metric. Finally, we show that linear feature-based models can consistently and significantly outperform current state of the art retrieval models with the correct choice of features.
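Directly maximizing the evaluation metric, as abstract 16 describes, can be sketched for the simplest possible case: a linear model over two features with a single free weight, fitted by grid search on mean average precision (MAP). The grid search, the two-feature parametrization, and the toy data are assumptions for illustration; the paper's own training algorithms are not reproduced here.

```python
def average_precision(ranked_labels):
    """AP of one ranking; all documents are ranked, so every relevant one is seen."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(weight, queries):
    """MAP of the linear scorer weight * f1 + (1 - weight) * f2."""
    aps = []
    for docs in queries:
        ranked = sorted(docs, key=lambda d: -(weight * d[0] + (1 - weight) * d[1]))
        aps.append(average_precision([d[2] for d in ranked]))
    return sum(aps) / len(aps)

def fit_by_grid_search(queries, steps=20):
    """Pick the weight in [0, 1] with the highest training MAP."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda w: mean_average_precision(w, queries))

# Hypothetical training data: per query, a list of (feature1, feature2, relevant).
queries = [
    [(0.9, 0.1, 1), (0.2, 0.8, 0), (0.5, 0.4, 1)],
    [(0.7, 0.3, 1), (0.1, 0.9, 0)],
]
best_weight = fit_by_grid_search(queries)
print(best_weight, mean_average_precision(best_weight, queries))
```

Because MAP is piecewise constant in the weight, gradient-based training does not apply directly, which is why this sketch searches over candidate values instead.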

17.
Teaching and learning in information retrieval   Total citations: 1 (self-citations: 1, citations by others: 0)
A literature review of pedagogical methods for teaching and learning information retrieval is presented. From the analysis of the literature a taxonomy was built and it is used to structure the paper. Information Retrieval (IR) is presented from different points of view: technical levels, educational goals, teaching and learning methods, assessment and curricula. The review is organized around two levels of abstraction which form a taxonomy that deals with the different aspects of pedagogy as applied to information retrieval. The first level looks at the technical level of delivering information retrieval concepts, and at the educational goals as articulated by the two main subject domains where IR is delivered: computer science (CS) and library and information science (LIS). The second level focuses on pedagogical issues, such as teaching and learning methods, delivery modes (classroom, online or e-learning), use of IR systems for teaching, assessment and feedback, and curricula design. The survey, and its bibliography, provides an overview of the pedagogical research carried out in the field of IR. It also provides a guide for educators on approaches that can be applied to improving the student learning experiences.

18.
The application of visualization techniques to information retrieval (IR) has resulted in the development of innovative systems and interfaces that are now available for public use. Visualization tools have emerged in research environments and more recently on the Web to retrieve information. Questions arise in regard to the utility of Web-based IR visualization tools for assisting users not only in manipulating search output, but also in managing the information retrieval process. To understand how Web-based visualization tools enable visual information retrieval, this article reviews some of the human perceptual theory behind the graphical interface of information visualization systems, analyzes iconic representations and information density on visualization displays, and examines information retrieval tasks that have been used in visualization system user research. This article is timely since it addresses new technologies for Web information retrieval and discusses future information visualization user research directions.

19.
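Search agent tools and information retrieval   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper analyzes the shortcomings of the current state of web information searching, discusses the role of the search agent tool Copernic in information retrieval, and describes its use and functions in detail.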

20.
We review the history of modeling score distributions, focusing on the mixture of normal-exponential by investigating the theoretical as well as the empirical evidence supporting its use. We discuss previously suggested conditions which valid binary mixture models should satisfy, such as the Recall-Fallout Convexity Hypothesis, and formulate two new hypotheses considering the component distributions, individually as well as in pairs, under some limiting conditions of parameter values. From all the mixtures suggested in the past, the current theoretical argument points to the two gamma as the most-likely universal model, with the normal-exponential being a usable approximation. Beyond the theoretical contribution, we provide new experimental evidence showing vector space or geometric models, and BM25, as being ‘friendly’ to the normal-exponential, and that the non-convexity problem that the mixture possesses is practically not severe. Furthermore, we review recent non-binary mixture models, speculate on graded relevance, and consider methods such as logistic regression for score calibration.
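The normal-exponential mixture in abstract 20 can be sketched as a posterior probability of relevance given a retrieval score. The sketch assumes the usual reading of the model, with scores of relevant documents following a normal distribution and scores of non-relevant documents following an exponential one. All parameter values are invented for illustration.

```python
import math

def normal_pdf(s, mu, sigma):
    return math.exp(-0.5 * ((s - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def exponential_pdf(s, lam):
    return lam * math.exp(-lam * s) if s >= 0 else 0.0

def prob_relevant(s, prior_rel, mu, sigma, lam):
    """Posterior P(relevant | score s) under the two-component mixture."""
    rel = prior_rel * normal_pdf(s, mu, sigma)
    non = (1 - prior_rel) * exponential_pdf(s, lam)
    total = rel + non
    return rel / total if total > 0 else 0.0

# Hypothetical parameters: 10% of documents relevant, relevant scores ~ N(3, 1),
# non-relevant scores ~ Exp(rate 1).
params = dict(prior_rel=0.1, mu=3.0, sigma=1.0, lam=1.0)
for score in (0.0, 3.0, 4.0, 10.0):
    print(score, prob_relevant(score, **params))
```

With these parameters the posterior peaks near a score of 4 and then falls, because the exponential tail eventually outweighs the normal tail. This non-monotone behaviour appears to be what the abstract calls the non-convexity problem, though that link is an interpretation rather than something the abstract states.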
