
1.

A comparison of two methods for Boolean query relevancy feedback

G. Salton E. Voorhees 《Information processing & management》1984,20(5-6)

The relevance feedback process uses information derived from an initially retrieved set of documents to improve subsequent search formulations and retrieval output. In a Boolean query environment this implies that new query terms must be identified and Boolean operators must be chosen automatically to connect the various query terms. In this study two recently proposed automatic methods for relevance feedback of Boolean queries are evaluated and conclusions are drawn concerning the use of effective feedback methods in a Boolean query environment.

2.

Adaptive relevance feedback method of extended Boolean model using hierarchical clustering techniques

Jongpill Choi Minkoo Kim Vijay V. Raghavan 《Information processing & management》2006

The relevance feedback process uses information obtained from a user about a set of initially retrieved documents to improve subsequent search formulations and retrieval performance. In extended Boolean models, relevance feedback implies not only that new query terms must be identified and re-weighted, but also that the terms must be properly connected with Boolean AND/OR operators. Salton et al. proposed a relevance feedback method, called the DNF (disjunctive normal form) method, for a well-established extended Boolean model. However, this method focuses mainly on generating Boolean queries and does not address the re-weighting of query terms. It also has some problems in generating reformulated Boolean queries. In this study, we investigate the problems of the DNF method and propose a relevance feedback method that uses hierarchical clustering techniques to solve them. We also propose a neural network model in which the term weights used in extended Boolean queries can be adjusted by users’ relevance feedback.
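To make the role of term weights and AND/OR operators concrete, here is a minimal sketch of one widely used extended Boolean model, the p-norm model. The abstract does not say which extended Boolean model it builds on, so treating it as p-norm is an assumption; the function names `pnorm_or` and `pnorm_and` are hypothetical, and this is not the DNF method or the clustering method the entry describes.

```python
from typing import Sequence


def _check(doc_weights: Sequence[float], query_weights: Sequence[float]) -> float:
    """Validate inputs and return the sum of query weights (used only as a sanity check)."""
    if len(doc_weights) != len(query_weights):
        raise ValueError("doc_weights and query_weights must have the same length")
    total = sum(query_weights)
    if total <= 0:
        raise ValueError("at least one query weight must be positive")
    return total


def pnorm_or(doc_weights: Sequence[float], query_weights: Sequence[float], p: float = 2.0) -> float:
    """Similarity of a document to a weighted OR query under the p-norm model.

    doc_weights[i] is the document's weight for term i (in [0, 1]),
    query_weights[i] is the query's weight for the same term.
    """
    _check(doc_weights, query_weights)
    num = sum((a ** p) * (w ** p) for a, w in zip(query_weights, doc_weights))
    den = sum(a ** p for a in query_weights)
    return (num / den) ** (1.0 / p)


def pnorm_and(doc_weights: Sequence[float], query_weights: Sequence[float], p: float = 2.0) -> float:
    """Similarity of a document to a weighted AND query under the p-norm model."""
    _check(doc_weights, query_weights)
    num = sum((a ** p) * ((1.0 - w) ** p) for a, w in zip(query_weights, doc_weights))
    den = sum(a ** p for a in query_weights)
    return 1.0 - (num / den) ** (1.0 / p)
```

With p = 1 the two operators give the same score (a weighted average of the document's term weights), and as p grows the OR score approaches the maximum and the AND score the minimum of the term weights, which is the strict Boolean behaviour. Re-weighting under feedback then amounts to adjusting `query_weights`.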

3.

On the inclusiveness of systems for retrieval documents indexed by unweighted descriptors

Tadeusz Radecki 《Information processing & management》1981,17(5):227-237


4.

Mix and match: combining terms and operators for successful Web searches

《Information processing & management》2005,41(4):801-817

This paper presents a detailed analysis of the structure and components of queries written by experimental participants in a study that manipulated two factors found to affect end-user information retrieval performance: training in Boolean logic and the type of search interface. As reported previously, we found that both Boolean training and the use of an assisted interface improved the participants' ability to find correct responses to information requests. Here, we examine the impact of these training and interface manipulations on the Boolean operators and search terms that comprise the submitted queries. Our analysis shows that both Boolean training and the use of an assisted interface improved the participants' ability to correctly utilize various operators. An unexpected finding is that this training also had a positive impact on term selection. The terms and, to a lesser extent, the operators comprising a query were important factors affecting the participants' performance in query tasks. Our findings demonstrate that even small training interventions can improve the users' search performance and highlight the need for additional information retrieval research into how search interfaces can provide superior support to today's untrained users of the Web.

5.

A theoretical framework for defining similarity measures for Boolean search request formulations, including some experimental results

Tadeusz Radecki 《Information processing & management》1985,21(6):501-524

Clusters of queries submitted to a given information retrieval system can be used as a basis for an effective method of clustering documents. This indirect procedure of document clustering requires the availability of a similarity measure for queries. Research carried out along these lines has resulted in the development of some methodologies for estimating such query similarities, applicable both in the case of queries characterized by sets of weighted or unweighted index terms and in the case of queries represented by Boolean combinations of index terms. This paper reports the results of further research by the author into a methodology of the latter type, i.e. a methodology for determining the similarity between queries characterized by Boolean search request formulations. The novelty of the presented approach, as compared with the methodology introduced in an earlier paper by the author, is that some relations among index terms are now taken into account. A number of similarity measures for Boolean combinations of index terms are discussed here in some detail. The rationale behind these measures is outlined, and the conditions to be met for ensuring their equivalence are identified. Moreover, the results of an experiment concerning two of the similarity measures introduced are given.

6.

Term relevance weights in on-line information retrieval

G. Salton R.K. Waldstein 《Information processing & management》1978,14(1):29-35

Considerable evidence exists to show that the use of term relevance weights is beneficial in interactive information retrieval. Various term weighting systems are reviewed. An experiment is then described in which information retrieval users are asked to rank query terms in decreasing order of presumed importance prior to actual search and retrieval. The experimental design is examined, and various relevance ranking systems are evaluated, including fully automatic systems based on inverse document frequency parameters, human rankings performed by the user population, and combinations of the two.
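As an illustration of the fully automatic side of this comparison, the sketch below ranks query terms by inverse document frequency. It uses the textbook form idf = log(N / df); the abstract does not give the exact weighting used in the experiment, so that formula, the zero weight for unseen terms, and the names `idf_weights` and `rank_terms` are assumptions.

```python
import math
from typing import Dict, Iterable, List, Set


def idf_weights(query_terms: Iterable[str], documents: List[Set[str]]) -> Dict[str, float]:
    """Return log(N / df) for each query term, where df counts documents containing it."""
    n_docs = len(documents)
    weights: Dict[str, float] = {}
    for term in query_terms:
        df = sum(1 for doc in documents if term in doc)
        # Assumption: a term that occurs in no document gets weight 0.0.
        weights[term] = math.log(n_docs / df) if df else 0.0
    return weights


def rank_terms(weights: Dict[str, float]) -> List[str]:
    """Order terms by decreasing weight, breaking ties alphabetically."""
    return sorted(weights, key=lambda term: (-weights[term], term))
```

A ranking produced this way could be compared with, or combined with, the order in which users rank the same terms by presumed importance.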

7.

Reformulation of queries using similarity thesauri

《Information processing & management》2005,41(5):1163-1173

One of the major problems in information retrieval is the formulation of queries on the part of the user. This entails specifying a set of words or terms that express their informational need. However, it is well-known that two people can assign different terms to refer to the same concepts. The techniques that attempt to reduce this problem as much as possible generally start from a first search, and then study how the initial query can be modified to obtain better results. In general, the construction of the new query involves expanding the terms of the initial query and recalculating the importance of each term in the expanded query. Depending on the technique used to formulate the new query several strategies are distinguished. These strategies are based on the idea that if two terms are similar (with respect to any criterion), the documents in which both terms appear frequently will also be related. The technique we used in this study is known as query expansion using similarity thesauri.
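The expansion step described here can be sketched in a deliberately simplified form: each term is represented by its occurrence counts across documents, term-to-term similarity is the cosine of those vectors, and the query is extended with the terms most similar to it. The raw-count vectors, the cosine measure, and the names `term_vectors`, `cosine` and `expand_query` are assumptions made for illustration; the paper's actual similarity thesaurus and term re-weighting are not specified in the abstract.

```python
import math
from collections import defaultdict
from typing import Dict, List


def term_vectors(documents: List[List[str]]) -> Dict[str, Dict[int, int]]:
    """Map each term to {document id: occurrence count}."""
    vectors: Dict[str, Dict[int, int]] = defaultdict(dict)
    for doc_id, doc in enumerate(documents):
        for term in doc:
            vectors[term][doc_id] = vectors[term].get(doc_id, 0) + 1
    return dict(vectors)


def cosine(u: Dict[int, int], v: Dict[int, int]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(x * v.get(key, 0) for key, x in u.items())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def expand_query(query: List[str], documents: List[List[str]], k: int = 2) -> List[str]:
    """Append up to k terms whose summed similarity to the query terms is highest."""
    vectors = term_vectors(documents)
    scores: Dict[str, float] = {}
    for candidate, vec in vectors.items():
        if candidate in query:
            continue
        scores[candidate] = sum(cosine(vectors[q], vec) for q in query if q in vectors)
    best = sorted(scores, key=lambda term: (-scores[term], term))[:k]
    # Terms with zero similarity are never added, even if k allows it.
    return list(query) + [term for term in best if scores[term] > 0]
```

Scoring candidates against the whole query, rather than against each query term separately, is one common design choice for keeping expansions on topic; it is not claimed to be the paper's choice.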

8.

A knowledge-based semantic framework for query expansion

Jamal Abdul Nasir Iraklis Varlamis Samreen Ishfaq 《Information processing & management》2019,56(5):1605-1617

Searching for relevant material that satisfies a user’s information need within a large document collection is a critical activity for web search engines. Query expansion (QE) techniques are widely used by search engines to disambiguate the user’s information need and to improve information retrieval (IR) performance. Knowledge-based, corpus-based and relevance feedback techniques are the main QE techniques. They employ different approaches for expanding the user query with synonyms of the search terms (word synonymy) in order to bring in more relevant documents, and for filtering out documents that contain the search terms but with a different meaning than the user intended (also known as the word polysemy problem). This work surveys existing query expansion techniques, highlights their strengths and limitations, and introduces a new method that combines the power of knowledge-based or corpus-based techniques with that of relevance feedback. Experimental evaluation on three information retrieval benchmark datasets shows that applying knowledge-based or corpus-based query expansion techniques to the results of the relevance feedback step improves information retrieval performance, with knowledge-based techniques providing significantly better results than their simple relevance feedback alternatives on all sets.

9.

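A query modification scheme for Web search

Yang Jianlin (杨建林) Yan Ming (严明) 《Information Studies: Theory & Application (情报理论与实践)》2008,31(1):146-149

This paper analyzes query modification methods and users' information behavior during query modification. Drawing on the ideas of web page crawling and of giving equal weight to searching and browsing, and taking into account both the behavioral characteristics of users during Web search and the available sources of vocabulary for query modification, it proposes a new query modification scheme for Web search.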

10.

Improving the learning of Boolean queries by means of a multiobjective IQBE evolutionary algorithm

O. Cordón E. Herrera-Viedma M. Luque 《Information processing & management》2006

The Inductive Query By Example (IQBE) paradigm allows a system to automatically derive queries for a specific Information Retrieval System (IRS). Classic IRSs based on this paradigm [Smith, M., & Smith, M. (1997). The use of genetic programming to build Boolean queries for text retrieval through relevance feedback. Journal of Information Science, 23(6), 423–431] generate a single solution (Boolean query) in each run, that with the best fitness value, which is usually based on a weighted combination of the basic performance criteria, precision and recall.
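The single-objective fitness mentioned above, a weighted combination of precision and recall, can be sketched as follows. The nested-tuple query representation, the weights `alpha` and `beta`, and the function names are hypothetical and chosen only for illustration; this is not the multiobjective algorithm the paper proposes.

```python
from typing import Dict, Set, Tuple, Union

Query = Union[str, tuple]  # a term, or ("AND" | "OR" | "NOT", subquery, ...)


def evaluate(query: Query, index: Dict[str, Set[int]], all_docs: Set[int]) -> Set[int]:
    """Return the set of document ids retrieved by a Boolean query."""
    if isinstance(query, str):
        return set(index.get(query, set()))
    op, *args = query
    results = [evaluate(arg, index, all_docs) for arg in args]
    if op == "AND":
        return set.intersection(*results)
    if op == "OR":
        return set.union(*results)
    if op == "NOT":
        return all_docs - results[0]
    raise ValueError(f"unknown operator: {op}")


def precision_recall(retrieved: Set[int], relevant: Set[int]) -> Tuple[float, float]:
    """Precision and recall of a retrieved set against the known relevant set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def fitness(retrieved: Set[int], relevant: Set[int], alpha: float = 0.5, beta: float = 0.5) -> float:
    """Weighted combination alpha * precision + beta * recall."""
    precision, recall = precision_recall(retrieved, relevant)
    return alpha * precision + beta * recall
```

Collapsing the two criteria into one number forces a fixed trade-off through `alpha` and `beta`; a multiobjective approach would instead keep precision and recall as separate objectives.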

11.

Integrating textual and visual information for cross-language image retrieval: A trans-media dictionary approach

Wen-Cheng Lin Yih-Chen Chang Hsin-Hsi Chen 《Information processing & management》2007

This paper explores the integration of textual and visual information for cross-language image retrieval. An approach which automatically transforms textual queries into visual representations is proposed. First, we mine the relationships between text and images and employ the mined relationships to construct visual queries from textual ones. Then, the retrieval results of textual and visual queries are combined. To evaluate the proposed approach, we conduct English monolingual and Chinese–English cross-language retrieval experiments. The selection of suitable textual query terms to construct visual queries is the major issue. Experimental results show that the proposed approach improves retrieval performance, and use of nouns is appropriate to generate visual queries.

12.

Inter and intra-document contexts applied in polyrepresentation for best match IR

Mette Skov Birger Larsen Peter Ingwersen 《Information processing & management》2008

The principle of polyrepresentation offers a theoretical framework for handling multiple contexts in information retrieval (IR). This paper presents an empirical laboratory study of polyrepresentation in restricted mode of the information space with focus on inter and intra-document features. The Cystic Fibrosis test collection indexed in the best match system InQuery constitutes the experimental setting. Overlaps between five functionally and/or cognitively different document representations are identified. Supporting the principle of polyrepresentation, results show that in general overlaps generated by three or four representations of different nature have higher precision than those generated from two representations or the single fields. This result pertains to both structured and unstructured query mode in best match retrieval, however, with the latter query mode demonstrating higher performance. The retrieval overlaps containing search keys from the bibliographic references provide the best retrieval performance and minor MeSH terms the worst. It is concluded that a highly structured query language is necessary when implementing the principle of polyrepresentation in a best match IR system because the principle is inherently Boolean. Finally a re-ranking test shows promising results when search results are re-ranked according to precision obtained in the overlaps whilst re-ranking by citations seems less useful when integrated into polyrepresentative applications.

13.

Automated assistance in the formulation of search statements for bibliographic databases

Michael P. Oakes Malcolm J. Taylor 《Information processing & management》1998,34(6):645-668

We report on the design and construction of features of an automated query system which will assist pharmacologists who are not information specialists to access the Derwent Drug File (DDF) pharmacological database. Our approach was to first elucidate those search skills of the search intermediary which might prove tractable to automation. Modules were then produced which assist in the three important subtasks of search statement generation, namely vocabulary selection, the choice of context indicators and query reformulation. Vocabulary selection is facilitated by approximate string matching, morphological analysis, browsing and menu searching. The context of the study, such as treatment or metabolism, is determined using a system of advisory menus. The task of query reformulation is performed using user feedback on retrieved documents, thesaurus relations between document index terms and term postings data. Use is made of diverse information sources, including electronic forms of printed search aids, a thesaurus and a medical dictionary. The system will be of use both to semicasual users and experienced intermediaries. Many of the ideas developed should prove transportable to domains other than pharmacology: the techniques for thesaurus manipulation are designed for use with any hierarchical thesaurus.
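Vocabulary selection by approximate string matching can be sketched with Python's standard `difflib` module. The abstract does not say which matching algorithm the system uses, so `difflib` is a stand-in, and the vocabulary in the usage example is a made-up sample rather than the actual DDF thesaurus; `suggest_terms` is a hypothetical name.

```python
import difflib
from typing import Iterable, List


def suggest_terms(user_term: str, vocabulary: Iterable[str], limit: int = 3, cutoff: float = 0.6) -> List[str]:
    """Return up to `limit` controlled-vocabulary terms that closely match the user's input.

    Matching is case-insensitive; results keep the vocabulary's original casing
    and are ordered from most to least similar.
    """
    lowered = {term.lower(): term for term in vocabulary}
    matches = difflib.get_close_matches(user_term.lower(), list(lowered), n=limit, cutoff=cutoff)
    return [lowered[match] for match in matches]
```

For example, `suggest_terms("metabolsm", ["Metabolism", "Metastasis", "Treatment", "Toxicity"])` lists "Metabolism" first, so a misspelt entry can still be mapped to an index term before the search statement is built.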

14.

Noun phrases as building blocks for cross-language Search Assistance

《Information processing & management》2005,41(3):549-568

This paper presents a Foreign-Language Search Assistant that uses noun phrases as fundamental units for document translation and query formulation, translation and refinement. The system (a) supports the foreign-language document selection task providing a cross-language indicative summary based on noun phrase translations, and (b) supports query formulation and refinement using the information displayed in the cross-language document summaries. Our results challenge two implicit assumptions in most of cross-language Information Retrieval research: first, that once documents in the target language are found, Machine Translation is the optimal way of informing the user about their contents; and second, that in an interactive setting the optimal way of formulating and refining the query is helping the user to choose appropriate translations for the query terms.

15.

CIDER: Concept-based image diversification, exploration, and retrieval

Enamul Hoque Orland Hoeber Minglun Gong 《Information processing & management》2013

Many of the approaches to image retrieval on the Web have their basis in text retrieval. However, when searchers are asked to describe their image needs, the resulting query is often short and potentially ambiguous. The solution we propose is to perform automatic query expansion using Wikipedia as the source knowledge base, resulting in a diversification of the search results. The outcome is a broad range of images that represent the various possible interpretations of the query. In order to assist the searcher in finding images that match their specific intentions for the query, we have developed an image organization method that uses both the conceptual information associated with each image, and the visual features extracted from the images. This, coupled with a hierarchical organization of the concepts, provides an interactive interface that takes advantage of the searchers’ abilities to recognize relevant concepts, filter and focus the search results based on these concepts, and visually identify relevant images while navigating within the image space. In this paper, we outline the key features of our image retrieval system (CIDER), and present the results of a preliminary user evaluation. The results of this study illustrate the potential benefits that CIDER can provide for searchers conducting image retrieval tasks.

16.

Automatic query adjustment in document retrieval

Carlo Vernimb 《Information processing & management》1977,13(6):339-353


17.

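Personalized search fitted to user preferences (total citations: 2; self-citations: 0; citations by others: 2)

Sang Yanyan (桑艳艳) Liu Peigang (刘培刚) Li Yong (李勇) 《Information Science (情报科学)》2008,26(8)

From the perspective of user preferences, this article studies the optimization of personalized search. It proposes a query expansion algorithm based on a semantic association tree, together with a personalized search system architecture, built on that algorithm, that fits user preferences. The semantic association tree can flexibly and effectively control the query expansion model, and the personalized search system built on it can learn user preferences on its own. Experiments show that the method effectively improves the precision of text retrieval.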

18.

Automatic suggestion of phrasal-concept queries for literature search

Youngho Kim Jangwon Seo W. Bruce Croft David A. Smith 《Information processing & management》2014

Both general and domain-specific search engines have adopted query suggestion techniques to help users formulate effective queries. In the specific domain of literature search (e.g., finding academic papers), the initial queries are usually based on a draft paper or abstract, rather than short lists of keywords. In this paper, we investigate phrasal-concept query suggestions for literature search. These suggestions explicitly specify important phrasal concepts related to an initial detailed query. The merits of phrasal-concept query suggestions for this domain are their readability and retrieval effectiveness: (1) phrasal concepts are natural for academic authors because of their frequent use of terminology and subject-specific phrases and (2) academic papers describe their key ideas via these subject-specific phrases, and thus phrasal concepts can be used effectively to find those papers. We propose a novel phrasal-concept query suggestion technique that generates queries by identifying key phrasal-concepts from pseudo-labeled documents and combines them with related phrases. Our proposed technique is evaluated in terms of both user preference and retrieval effectiveness. We conduct user experiments to verify a preference for our approach, in comparison to baseline query suggestion methods, and demonstrate the effectiveness of the technique with retrieval experiments.

19.

A mathematical model of a weighted Boolean retrieval system

W.G. Waller Donald H. Kraft 《Information processing & management》1979,15(5):235-245

The use of weights to denote a query representation and/or the indexing of a document is analysed as a generalization of a Boolean retrieval system. Criteria are given for the functions used to evaluate the relevance of the records to a specific query, including self-consistency. Various mechanisms suggested in the literature for evaluating the relevance of records with regard to a given query are tested and found to be less than satisfactory. A new approach is suggested to avoid some of the perils of a weighted Boolean retrieval system.

20.

Indexing strategies for Swedish full text retrieval under different user scenarios

Per Ahlgren Jaana Kekäläinen 《Information processing & management》2007

This paper deals with Swedish full text retrieval and the problem of morphological variation of query terms in the document database. The effects of combination of indexing strategies with query terms on retrieval effectiveness were studied. Three of five tested combinations involved indexing strategies that used conflation, in the form of normalization. Further, two of these three combinations used indexing strategies that employed compound splitting. Normalization and compound splitting were performed by SWETWOL, a morphological analyzer for the Swedish language. A fourth combination attempted to group related terms by right hand truncation of query terms. The four combinations were compared to each other and to a baseline combination, where no attempt was made to counteract the problem of morphological variation of query terms in the document database. The five combinations were evaluated under six different user scenarios, where each scenario simulated a certain user type. The four alternative combinations outperformed the baseline, for each user scenario. The truncation combination had the best performance under each user scenario. The main conclusion of the paper is that normalization and right hand truncation (performed by a search expert) enhanced retrieval effectiveness in comparison to the baseline. The performance of the three combinations of indexing strategies with query terms based on normalization was not far below the performance of the truncation combination.
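Right hand truncation, as used in the fourth combination, can be sketched as prefix matching of a truncated query term against the index vocabulary. The function names and the tiny Swedish word list in the usage note are illustrative assumptions, not the study's data or implementation.

```python
from typing import Dict, Iterable, List


def truncate_match(stem: str, index_terms: Iterable[str]) -> List[str]:
    """Return, in sorted order, every index term that begins with the truncated stem."""
    return sorted(term for term in index_terms if term.startswith(stem))


def expand_truncated_query(stems: Iterable[str], index_terms: Iterable[str]) -> Dict[str, List[str]]:
    """Map each truncated query term to the index terms it would match."""
    terms = list(index_terms)
    return {stem: truncate_match(stem, terms) for stem in stems}
```

With the index terms "bil", "bilar", "bilen", "bild" and "båt", the stem "bil" matches the inflected forms of "bil" (car) but also "bild" (picture). Picking a truncation point that groups the variants without pulling in unrelated words is the kind of judgment the abstract attributes to a search expert.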