Similar Documents
20 similar documents found (search time: 31 ms)
1.
The results from a series of three experiments that used Text Retrieval Conference (TREC) data and TREC search topics are compared. These experiments each involved three novel user interfaces (one per experiment). User interfaces that made it easier for users to view text were found to improve recall in all three experiments. A distinction was found between a cluster of subjects (a majority of whom were search experts) who tended to read fewer documents more carefully (readers, or exclusives) and subjects who skimmed through more documents without reading them as carefully (skimmers, or inclusives). Skimmers were found to have significantly better recall overall. A major outcome from our experiments at TREC and with the TREC data is that hypertext interfaces to information retrieval (IR) tasks tend to increase recall. Our interpretation of this pattern of results across the three experiments is that increased interaction with the text (more pages viewed) generally improves recall. Findings from one of the experiments indicated that viewing a greater diversity of text on a single screen (i.e., not just more text per se, but more articles available at once) may also improve recall. In an experiment where a traditional (type-in) query interface was contrasted with a condition where queries were marked up on the text, the improvement in recall due to viewing more text was more pronounced with search novices. Our results demonstrate that markup and hypertext interfaces to text retrieval systems can benefit recall and can also benefit novices. The challenge now will be to find modified versions of hypertext interfaces that can improve precision as well as recall, and that can work with users who prefer different types of search strategy or have different types of training and experience.

2.
Arabic is a morphologically rich language that presents significant challenges to many natural language processing applications because a word often conveys complex meanings decomposable into several morphemes (i.e. prefix, stem, suffix). By segmenting words into morphemes, we can improve the extraction of English/Arabic translation pairs from parallel texts. This paper describes two algorithms, and their combination, for automatically extracting an English/Arabic bilingual dictionary from parallel texts found in the Internet archive, using an Arabic light stemmer as a preprocessing step. Before applying the Arabic light stemmer, the total system precision and recall were 88.6% and 81.5% respectively; after applying it to the Arabic documents, precision and recall increased to 91.6% and 82.6% respectively.

3.
This paper describes an automatic approach designed to improve the retrieval effectiveness of very short queries such as those used in web searching. The method is based on the observation that stemming, which is designed to maximize recall, often results in depressed precision. Our approach is based on pseudo-feedback and attempts to increase the number of relevant documents in the pseudo-relevant set by reranking those documents based on the presence of unstemmed query terms in the document text. The original experiments underlying this work were carried out using Smart 11.0 and the lnc.ltc weighting scheme on three sets of documents from the TREC collection with corresponding TREC (title only) topics as queries. (The average length of these queries after stoplisting ranges from 2.4 to 4.5 terms.) Results, evaluated in terms of P@20 and non-interpolated average precision, showed clearly that pseudo-feedback (PF) based on this approach was effective in increasing the number of relevant documents in the top ranks. Subsequent experiments, performed on the same data sets using Smart 13.0 and the improved Lnu.ltu weighting scheme, indicate that these results hold up even over the much higher baseline provided by the new weights. Query drift analysis presents a more detailed picture of the improvements produced by this process.
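The core reranking idea described above can be sketched in a few lines. This is an illustrative simplification, not the paper's Smart-based implementation: the document structure, the matching rule (exact whole-word presence of unstemmed terms), and the sample data are all invented for the example.

```python
# Sketch: promote pseudo-relevant documents that contain the original
# (unstemmed) query terms verbatim. Field names are illustrative.

def rerank_pseudo_relevant(docs, unstemmed_terms, top_k=10):
    """Rerank the top-k pseudo-relevant docs: documents containing more
    unstemmed query terms verbatim move toward the top."""
    head, tail = docs[:top_k], docs[top_k:]

    def unstemmed_matches(doc_text):
        words = set(doc_text.lower().split())
        return sum(1 for t in unstemmed_terms if t.lower() in words)

    # Stable sort keeps the original retrieval order among ties.
    head.sort(key=lambda d: -unstemmed_matches(d["text"]))
    return head + tail

docs = [
    {"id": 3, "text": "shoe manufacturing processes"},
    {"id": 2, "text": "the history of running as a sport"},
    {"id": 1, "text": "running shoes for marathon runners"},
]
reranked = rerank_pseudo_relevant(docs, ["running", "shoes"], top_k=3)
# doc 1 matches both unstemmed terms, doc 2 one, doc 3 none
```

Note that "shoe" in doc 3 does not match the unstemmed term "shoes", which is exactly the stemming-versus-precision distinction the paper exploits.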

4.
Search engines are essential for finding information on the World Wide Web. We conducted a study to see how effective eight search engines are. Expert searchers sought information on the Web for users who had legitimate needs for information, and these users assessed the relevance of the information retrieved. We calculated traditional information retrieval measures of recall and precision at varying numbers of retrieved documents and used these as the bases for statistical comparisons of retrieval effectiveness among the eight search engines. We also calculated the likelihood that a document retrieved by one search engine was retrieved by other search engines as well.
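The overlap statistic mentioned at the end of the abstract can be computed directly from a record of which engines retrieved each document. The following is a toy sketch with invented data, not the study's actual measurement procedure:

```python
# Toy computation: probability that a document retrieved by engine "A"
# was also retrieved by at least one other engine. Data is invented.

retrieved_by = {
    "doc1": {"A", "B"},
    "doc2": {"A"},
    "doc3": {"B", "C"},
    "doc4": {"C"},
}

docs_for_A = [d for d, engines in retrieved_by.items() if "A" in engines]
overlap_A = (sum(1 for d in docs_for_A if len(retrieved_by[d]) > 1)
             / len(docs_for_A))
# Here doc1 is shared and doc2 is unique to A, so overlap_A is 0.5.
```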

5.
Document filtering (DF) and document classification (DC) are often integrated together to classify suitable documents into suitable categories. A popular way to achieve integrated DF and DC is to associate each category with a threshold. A document d may be classified into a category c only if its degree of acceptance (DOA) with respect to c is higher than the threshold of c. Therefore, tuning a proper threshold for each category is essential. A threshold that is too high (low) may mislead the classifier to reject (accept) too many documents. Unfortunately, thresholding is often based on the classifier's DOA estimations, which cannot always be reliable, due to two common phenomena: (1) the DOA estimations made by the classifier cannot always be correct, and (2) not all documents may be classified without any controversy. Unreliable estimations are actually noises that may mislead the thresholding process. In this paper, we present an adaptive and parameter-free technique AS4T to sample reliable DOA estimations for thresholding. AS4T operates by adapting to the classifier's status, without needing to define any parameters. Experimental results show that, by helping to derive more proper thresholds, AS4T may guide various classifiers to achieve significantly better and more stable performance under different circumstances. The contributions are of practical significance for real-world integrated DF and DC.
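The basic threshold-per-category decision rule that AS4T tunes thresholds for can be illustrated as follows. The category names, DOA scores, and threshold values below are invented for the example; the sketch shows only the decision rule, not AS4T's sampling technique:

```python
# Illustrative sketch of integrated filtering + classification with
# per-category thresholds on the classifier's degree of acceptance (DOA).

def classify_with_thresholds(doa_scores, thresholds):
    """Accept a document into every category whose DOA exceeds that
    category's threshold; an empty result means the document is
    filtered out entirely."""
    return sorted(c for c, s in doa_scores.items()
                  if s > thresholds.get(c, float("inf")))

doa = {"sports": 0.82, "politics": 0.40, "tech": 0.15}
thresholds = {"sports": 0.60, "politics": 0.55, "tech": 0.50}
accepted = classify_with_thresholds(doa, thresholds)
# Only "sports" clears its threshold, so the document is classified
# into that single category.
```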

6.
Many Web sites have begun allowing users to submit items to a collection and tag them with keywords. The folksonomies built from these tags are an interesting topic that has seen little empirical research. This study compared the search information retrieval (IR) performance of folksonomies from social bookmarking Web sites against search engines and subject directories. Thirty-four participants created 103 queries for various information needs. Results from each IR system were collected and participants judged relevance. Folksonomy search results overlapped with those from the other systems, and documents found by both search engines and folksonomies were significantly more likely to be judged relevant than those returned by any single IR system type. The search engines in the study had the highest precision and recall, but the folksonomies fared surprisingly well. Del.icio.us was statistically indistinguishable from the directories in many cases. Overall the directories were more precise than the folksonomies but they had similar recall scores. Better query handling may enhance folksonomy IR performance further. The folksonomies studied were promising, and may be able to improve Web search performance.

7.
This paper presents a systematic analysis of twenty-four performance measures used in the complete spectrum of Machine Learning classification tasks, i.e., binary, multi-class, multi-labelled, and hierarchical. For each classification task, the study relates a set of changes in a confusion matrix to specific characteristics of data. Then the analysis concentrates on the type of changes to a confusion matrix that do not change a measure, and therefore preserve a classifier’s evaluation (measure invariance). The result is a measure invariance taxonomy with respect to all relevant label distribution changes in a classification problem. This formal analysis is supported by examples of applications where invariance properties of measures lead to a more reliable evaluation of classifiers. Text classification supplements the discussion with several case studies.
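A small concrete instance of measure invariance, using the standard binary-classification definitions (the specific counts below are invented for illustration): precision and recall never reference true negatives, so they are invariant to changes in the true-negative cell of the confusion matrix, while accuracy is not.

```python
# Invariance illustration: change only the true-negative count (tn)
# and observe which measures are affected.

def precision(tp, fp, fn, tn):
    return tp / (tp + fp)

def recall(tp, fp, fn, tn):
    return tp / (tp + fn)

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

m1 = dict(tp=30, fp=10, fn=20, tn=40)
m2 = dict(tp=30, fp=10, fn=20, tn=400)  # only tn changed

# Precision and recall are invariant to the change in tn...
assert precision(**m1) == precision(**m2) == 0.75
assert recall(**m1) == recall(**m2) == 0.6
# ...but accuracy is not: it rises as tn grows.
```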

8.
A better understanding of what motivates humans to perform certain actions is relevant for a range of research challenges including generating action sequences that implement goals (planning). A first step in this direction is the task of acquiring knowledge about human goals. In this work, we investigate whether Search Query Logs are a viable source for extracting expressions of human goals. For this purpose, we devise an algorithm that automatically identifies queries containing explicit goals such as find home to rent in Florida. Our algorithm achieves useful precision/recall values in evaluation. We apply the classification algorithm to two large Search Query Logs, recorded by AOL and Microsoft Research in 2006, and obtain a set of ∼110,000 queries containing explicit goals. To study the nature of human goals in Search Query Logs, we conduct qualitative, quantitative and comparative analyses. Our findings suggest that Search Query Logs (i) represent a viable source for extracting human goals, (ii) contain a great variety of human goals and (iii) contain human goals that can be employed to complement existing commonsense knowledge bases. Finally, we illustrate the potential of goal knowledge for addressing the following application scenario: to refine and extend commonsense knowledge with human goals from Search Query Logs. This work is relevant for (i) knowledge engineers interested in acquiring human goals from textual corpora and constructing knowledge bases of human goals and (ii) researchers interested in studying characteristics of human goals in Search Query Logs.
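A toy version of the explicit-goal identification step might look like the following. The surface patterns are invented placeholders chosen to match the abstract's own example query; the paper's actual classifier is more sophisticated than keyword patterns:

```python
# Hypothetical sketch: flag queries whose surface form contains an
# explicit goal expression. Patterns are illustrative, not the paper's.
import re

GOAL_PATTERNS = [
    r"^how to\b",
    r"^(find|get|buy|learn|become)\b",
    r"\bwant to\b",
]

def has_explicit_goal(query):
    q = query.lower().strip()
    return any(re.search(p, q) for p in GOAL_PATTERNS)

queries = [
    "find home to rent in Florida",   # the abstract's example
    "how to lose weight fast",
    "britney spears",                  # navigational, no explicit goal
]
goals = [q for q in queries if has_explicit_goal(q)]
```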

9.
Nowadays, it is a common practice for healthcare professionals to spread medical knowledge by posting health articles on social media. However, promoting users’ intention to share such articles is challenging because the extent of sharing intention varies in their eHealth literacy (high or low) and the content valence of the article that they are exposed to (positive or negative). This study investigates boundary conditions under which eHealth literacy and content valence help to increase users’ intention to share by introducing a moderating role of confirmation bias—a tendency to prefer information that conforms to their initial beliefs. A 2 (eHealth literacy: high vs. low) × 2 (content valence: positive vs. negative) between-subjects experiment was conducted in a sample of 80 participants. Levels of confirmation bias ranging from extreme negative bias to extreme positive bias among the participants were assessed during the experiment. Results suggested that: (1) users with a high level of eHealth literacy were more likely to share positive health articles when they had extreme confirmation bias; (2) users with a high level of eHealth literacy were more likely to share negative health articles when they had moderate confirmation bias or no confirmation bias; (3) users with a low level of eHealth literacy were more likely to share health articles regardless of positive or negative content valence when they had moderate positive confirmation bias. This study sheds new light on the role of confirmation bias in users’ health information sharing. Also, it offers implications for health information providers who want to increase the visibility of their online health articles: they need to consider readers’ eHealth literacy and confirmation bias when deciding the content valence of the articles.  相似文献   

10.
One of the best known measures of information retrieval (IR) performance is the F-score, the harmonic mean of precision and recall. In this article we show that the curve of the F-score as a function of the number of retrieved items is always of the same shape: a fast concave increase to a maximum, followed by a slow decrease. In other words, there exists a single maximum, referred to as the tipping point, where the retrieval situation is ‘ideal’ in terms of the F-score. The tipping point thus indicates the optimal number of items to be retrieved, with more or less items resulting in a lower F-score. This empirical result is found in IR and link prediction experiments and can be partially explained theoretically, expanding on earlier results by Egghe. We discuss the implications and argue that, when comparing F-scores, one should compare the F-score curves’ tipping points.
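The curve shape described above is easy to reproduce numerically: compute F1 at each cutoff k of a ranked list and take the argmax as the tipping point. The relevance judgements below are invented for illustration:

```python
# Sketch of the F-score curve and its tipping point for a fixed ranking.

def f_score_curve(relevant_flags, total_relevant):
    """F1 at each cutoff k = 1..n for a ranked list of 0/1 relevance flags."""
    curve, hits = [], 0
    for k, rel in enumerate(relevant_flags, start=1):
        hits += rel
        p = hits / k                 # precision at cutoff k
        r = hits / total_relevant    # recall at cutoff k
        curve.append(2 * p * r / (p + r) if p + r else 0.0)
    return curve

flags = [1, 1, 0, 1, 0, 0, 0, 0]    # relevance of ranks 1..8 (invented)
curve = f_score_curve(flags, total_relevant=4)
tipping_point = max(range(len(curve)), key=curve.__getitem__) + 1
# The curve rises to a single maximum (here at k = 4) and then decays,
# matching the shape the article describes.
```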

11.
Summarisation is traditionally used to produce summaries of the textual contents of documents. In this paper, it is argued that summarisation methods can also be applied to the logical structure of XML documents. Structure summarisation selects the most important elements of the logical structure and ensures that the user’s attention is focused towards sections, subsections, etc. that are believed to be of particular interest. Structure summaries are shown to users as hierarchical tables of contents. This paper discusses methods for structure summarisation that use various features of XML elements in order to select the document portions on which a user’s attention should be focused. An evaluation methodology for structure summarisation is also introduced, and summarisation results using various summariser versions are presented and compared to one another. We show that data sets used in information retrieval evaluation can be used effectively in order to produce high quality (query independent) structure summaries. We also discuss the choice and effectiveness of particular summariser features with respect to several evaluation measures.

12.
This paper presents a study of relevance feedback (RF) in a cross-language information retrieval environment. We have performed an experiment in which Portuguese speakers are asked to judge the relevance of English documents, documents hand-translated to Portuguese, and documents automatically translated to Portuguese. The goals of the experiment were to answer two questions: (i) how well can native Portuguese searchers recognise relevant documents written in English, compared to documents that are hand translated and automatically translated to Portuguese; and (ii) what is the impact of misjudged documents on the performance improvement that can be achieved by relevance feedback. Surprisingly, the results show that machine translation is as effective as hand translation in aiding users to assess relevance in the experiment. In addition, the impact of misjudged documents on the performance of RF is overall just moderate, and varies greatly for different query topics.

13.
The increasing number of documents that have to be indexed in different environments, particularly on the Web, and the lack of scalability of a single centralised index lead to the use of distributed information retrieval systems to effectively search for and locate the required information. In this study, we present several improvements over the two main bottlenecks in a distributed information retrieval system (the network and the brokers). We extend a simulation network model in order to represent a switched network. The new simulation model is validated by comparing the estimated response times with those obtained using a real system. We show that the use of a switched network reduces the saturation of the interconnection network, especially in a replicated system, and some improvements may be achieved using multicast messages and faster connections with the brokers. We also demonstrate that reducing the partial results sets will improve the response time of a distributed system by 53%, with a negligible probability of changing the system’s precision and recall values. Finally, we present a simple hierarchical distributed broker model that will reduce the response times for a distributed system by 55%.

14.
In this paper, we propose a new algorithm, which incorporates the relationships of concept-based thesauri into document categorization using the k-NN classifier (k-NN). k-NN is one of the most popular document categorization methods because it shows relatively good performance in spite of its simplicity. However, it significantly degrades precision when ambiguity arises, i.e., when there exists more than one candidate category to which a document can be assigned. To remedy this drawback, we employ concept-based thesauri in the categorization. Employing the thesaurus entails structuring categories into hierarchies, since their structure needs to conform to that of the thesaurus in order to capture relationships between categories. By referencing the various relationships in the thesaurus corresponding to the structured categories, k-NN can be markedly improved, removing the ambiguity. In this paper, we first perform the document categorization by using k-NN and then employ the relationships to reduce the ambiguity. Experimental results show that this method improves the precision of k-NN by up to 13.86% without compromising its recall.
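For context, the baseline k-NN categorization step that the paper builds on can be sketched as below. This is a minimal majority-vote k-NN over cosine similarity of term-count vectors, with invented training data; the thesaurus-based disambiguation step that is the paper's contribution is not shown:

```python
# Minimal k-NN document categorization sketch (baseline only).
from collections import Counter
import math

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_categorize(train, doc, k=3):
    """Majority vote over the k training documents most similar to doc."""
    vec = vectorize(doc)
    scored = sorted(train, key=lambda ex: -cosine(vectorize(ex[0]), vec))
    votes = Counter(cat for _, cat in scored[:k])
    return votes.most_common(1)[0][0]

train = [
    ("stock market prices fall", "finance"),
    ("bank interest rates rise", "finance"),
    ("team wins the championship game", "sports"),
    ("player scores in final match", "sports"),
]
label = knn_categorize(train, "market prices and interest rates", k=3)
```

Ambiguity in the paper's sense arises when the vote among the k neighbours is split across categories; that is the case the thesaurus relationships are used to resolve.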

15.
This study theorized and validated a model of knowledge sharing continuance in a special type of online community, the online question answering (Q&A) community, in which knowledge exchange is reflected mainly by asking and answering specific questions. We created a model that integrated knowledge sharing factors and knowledge self-efficacy into the expectation confirmation theory. The hypotheses derived from this model were empirically validated using an online survey conducted among users of a famous online Q&A community in China, “Yahoo! Answers China”. The results suggested that users’ intention to continue sharing knowledge (i.e., answering questions) was directly influenced by users’ ex-post feelings, consisting of two dimensions: satisfaction and knowledge self-efficacy. Based on the obtained results, we also found that knowledge self-efficacy and confirmation mediated the relationship between benefits and satisfaction.

16.
A qualitative study of user information needs is reported, based on a purposive sample of users and potential users of the Vaughan Williams Memorial Library, a small specialist folk music library in North London. The study set out to establish what the users’ (both existing and potential) information needs are, so that the library’s online service may take them into account with its design. The information needs framework proposed by Nicholas [Nicholas, D. (2000) Assessing information needs: tools, techniques and concepts for the internet age. London: ASLIB] is used as an analytical tool to achieve this end. The demographics of the users were examined in order to establish four user groups: Performer, Academic, Professional and Enthusiast. Important information needs were found to be based on social interaction, and key resources of the library were its staff, the concentration of the collection and the library’s social nature. A collection of broad design requirements are proposed based on the analysis and this study also provides some insights into the issue of musical relevance, which are discussed.

17.
Diversification of web search results aims to promote documents with diverse content (i.e., covering different aspects of a query) to the top-ranked positions, to satisfy more users, enhance fairness and reduce bias. In this work, we focus on the explicit diversification methods, which assume that the query aspects are known at the diversification time, and leverage supervised learning methods to improve their performance in three different frameworks with different features and goals. First, in the LTRDiv framework, we focus on applying typical learning to rank (LTR) algorithms to obtain a ranking where each top-ranked document covers as many aspects as possible. We argue that such rankings optimize various diversification metrics (under certain assumptions), and hence, are likely to achieve diversity in practice. Second, in the AspectRanker framework, we apply LTR for ranking the aspects of a query with the goal of more accurately setting the aspect importance values for diversification. As features, we exploit several pre- and post-retrieval query performance predictors (QPPs) to estimate how well a given aspect is covered among the candidate documents. Finally, in the LmDiv framework, we cast the diversification problem into an alternative fusion task, namely, the supervised merging of rankings per query aspect. We again use QPPs computed over the candidate set for each aspect, and optimize an objective function that is tailored for the diversification goal. We conduct thorough comparative experiments using both the basic systems (based on the well-known BM25 matching function) and the best-performing systems (with more sophisticated retrieval methods) from previous TREC campaigns. Our findings reveal that the proposed frameworks, especially AspectRanker and LmDiv, outperform both non-diversified rankings and two strong diversification baselines (i.e., xQuAD and its variant) in terms of various effectiveness metrics.

18.
19.
The study focuses on which users to target, and why and how to inspire their participation, by applying a combination of von Hippel's lead user and user innovation toolkit theories with Rogers’ innovation diffusion theory. After an investigation of a social networking website, this study finds that individuals with a large number of hits are highly active users of new functions. Moreover, they are likely to use toolkits to customize their personal uses and respond to others’ problems. Therefore, they garner appreciation from others in return, achieve higher ranks in the top hit parade, and obtain better-expected benefits from the website's incentive compensation. This study also evaluates the toolkits’ efficacy in the Web 2.0 context and finds that they are not equivalent. This research offers insights useful for web service providers to target innovative users and create an environment using web toolkits to induce user-generated innovation and achieve a better effect of innovation communication.

20.
The goal of the study presented in this article is to investigate to what extent the classification of a web page by a single genre matches the users’ perspective. The extent of agreement on a single genre label for a web page can help understand whether there is a need for a different classification scheme that overrides the single-genre labelling. My hypothesis is that a single genre label does not account for the users’ perspective. In order to test this hypothesis, I submitted a restricted number of web pages (25 web pages) to a large number of web users (135 subjects) asking them to assign only a single genre label to each of the web pages. Users could choose from a list of 21 genre labels, or select one of the two ‘escape’ options, i.e. ‘Add a label’ and ‘I don’t know’. The rationale was to observe the level of agreement on a single genre label per web page, and draw some conclusions about the appropriateness of limiting the assignment to only a single label when doing genre classification of web pages. Results show that users largely disagree on the label to be assigned to a web page.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号