Similar documents
 20 similar documents found (search time: 0 ms)
1.
Re-ranking the search results in order to promote novel ones has traditionally been regarded as an intuitive diversification strategy. In this paper, we challenge this common intuition and thoroughly investigate the actual role of novelty for search result diversification, based upon the framework provided by the diversity task of the TREC 2009 and 2010 Web tracks. Our results show that existing diversification approaches based solely on novelty cannot consistently improve over a standard, non-diversified baseline ranking. Moreover, when deployed as an additional component by the current state-of-the-art diversification approaches, our results show that novelty does not bring significant improvements, while adding considerable efficiency overheads. Finally, through a comprehensive analysis with simulated rankings of various quality, we demonstrate that, although inherently limited by the performance of the initial ranking, novelty plays a role in breaking ties between similarly diverse results.

2.
We study the problem of web search result diversification in the case where intent-based relevance scores are available. A diversified search result will hopefully satisfy the information need of users who may have different intents. In this context, we first analyze the properties of an intent-based metric, ERR-IA, to measure relevance and diversity altogether. We argue that this is a better metric than some previously proposed intent-aware metrics and show that it has a better correlation with abandonment rate. We then propose an algorithm to rerank web search results based on optimizing an objective function corresponding to this metric and evaluate it on shopping-related queries.
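The abstract names ERR-IA without defining it. The following is a minimal sketch of how such a score could be computed, assuming the commonly used cascade form of Expected Reciprocal Rank (a result with grade g stops the user with probability (2^g - 1) / 2^g_max) averaged over intents by their probabilities. The function names, the grade scale, and the example data are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List


def stop_probability(grade: int, max_grade: int) -> float:
    """Assumed gain mapping: probability that a result of this grade satisfies the user."""
    return (2 ** grade - 1) / (2 ** max_grade)


def err(grades: List[int], max_grade: int) -> float:
    """Expected Reciprocal Rank of one ranking for a single intent."""
    score = 0.0
    p_continue = 1.0  # probability the user is still unsatisfied at this rank
    for rank, grade in enumerate(grades, start=1):
        r = stop_probability(grade, max_grade)
        score += p_continue * r / rank
        p_continue *= 1.0 - r
    return score


def err_ia(
    intent_probs: Dict[str, float],
    grades_by_intent: Dict[str, List[int]],
    max_grade: int,
) -> float:
    """Intent-aware ERR: per-intent ERR weighted by the intent probability."""
    return sum(
        prob * err(grades_by_intent[intent], max_grade)
        for intent, prob in intent_probs.items()
    )


# Hypothetical query with two equally likely intents and binary grades.
example_score = err_ia(
    {"buy": 0.5, "review": 0.5},
    {"buy": [1, 0], "review": [0, 1]},
    max_grade=1,
)
```

In this hypothetical example the first result serves only the "buy" intent and the second only the "review" intent, so the ranking earns credit for both intents, with the second discounted by its rank.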

3.
4.
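Museums come in many types, and the museum buildings that result often differ greatly. For a museum's architectural design to meet the specific needs of the museum being built, museum buildings need to be classified, and the characteristics of each category and the differences between them analyzed category by category, so that the design can be properly targeted.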

5.
Product reviews have become an important resource for customers before they make purchase decisions. However, the abundance of reviews makes it difficult for customers to digest them and make informed choices. In our study, we aim to help customers who want to quickly capture the main idea of a lengthy product review before they read the details. In contrast with existing work on review analysis and document summarization, we aim to retrieve a set of real-world user questions to summarize a review. In this way, users would know what questions a given review can address and they may further read the review only if they have similar questions about the product. Specifically, we design a two-stage approach which consists of question selection and question diversification. In the question selection phase, we first employ probabilistic retrieval models to locate candidate questions that are relevant to a given review. A Recurrent Neural Network Encoder–Decoder is utilized to measure the “answerability” of questions to a review. We then design a set function to re-rank the questions with the goal of rewarding diversity in the final question set. The set function satisfies submodularity and monotonicity, which results in an efficient greedy algorithm of submodular optimization. Evaluation on product reviews from two categories shows that the proposed approach is effective for discovering meaningful questions that are representative of individual reviews.
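The greedy step mentioned in the abstract can be sketched as follows. The paper's actual set function is not given here, so this sketch substitutes a simple coverage function (the number of distinct aspects covered by the selected questions), which is monotone and submodular. The aspect labels, question ids, and function names are hypothetical.

```python
from typing import Dict, List, Set


def coverage(selected: List[str], aspects_of: Dict[str, Set[str]]) -> int:
    """Stand-in set function: how many distinct aspects the selected questions cover."""
    covered: Set[str] = set()
    for question in selected:
        covered |= aspects_of[question]
    return len(covered)


def greedy_select(aspects_of: Dict[str, Set[str]], k: int) -> List[str]:
    """Pick up to k questions, each time adding the one with the largest marginal gain."""
    selected: List[str] = []
    remaining = sorted(aspects_of)  # sorted so that ties break deterministically
    while remaining and len(selected) < k:
        base = coverage(selected, aspects_of)
        best = max(
            remaining,
            key=lambda q: coverage(selected + [q], aspects_of) - base,
        )
        if coverage(selected + [best], aspects_of) == base:
            break  # no remaining question adds anything new
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidate questions and the review aspects each one touches.
candidates = {
    "q1": {"battery", "price"},
    "q2": {"battery"},
    "q3": {"screen"},
}
chosen = greedy_select(candidates, k=2)
```

For a monotone submodular function under a cardinality constraint, this greedy procedure is known to reach at least a (1 - 1/e) fraction of the optimal value, which is why the two properties named in the abstract matter.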

6.
We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w, d) from the collection such that w ∈ W and d ∈ D. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.
Ingmar Weber
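To make the core problem concrete, here is a minimal baseline sketch that answers such a query on an ordinary inverted index. It is not the data structure the abstract proposes: it scans every posting list in the word range, which is the cost the paper aims to avoid. The index contents and function name are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Set, Tuple


def completions(
    index: Dict[str, List[int]],
    vocab: List[str],
    lo: str,
    hi: str,
    docs: Set[int],
) -> List[Tuple[str, int]]:
    """Return all (w, d) pairs with lo <= w <= hi and d in docs.

    `vocab` must be the sorted list of the keys of `index`.
    """
    start = bisect_left(vocab, lo)
    end = bisect_right(vocab, hi)
    return [
        (word, doc)
        for word in vocab[start:end]
        for doc in index[word]
        if doc in docs
    ]


# Hypothetical inverted index: word -> sorted list of document ids.
index = {"search": [1, 2], "seat": [2, 3], "set": [1], "tree": [3]}
vocab = sorted(index)

# Words starting with the prefix "sea", restricted to documents 2 and 3.
pairs = completions(index, vocab, "sea", "sea\uffff", {2, 3})
```

Here the prefix typed so far is turned into the alphabetical range W by appending a high sentinel character to form the upper bound, and D is the set of documents matching the earlier part of the query.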

7.
8.
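Retrieval Strategies and Methods for Network Information Resources   Total citations: 6, self-citations: 1, citations by others: 6
Many websites in China and abroad have set up dedicated columns on retrieval strategies and methods. This article gathers a number of representative websites so that network users can systematically learn methods and techniques for network information retrieval.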

9.
10.
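A Functional Comparison of the Google Scholar Search Engine and a Cross-Database Retrieval System   Total citations: 1, self-citations: 0, citations by others: 1
徐芳 《图书馆学研究》 2008, (2): 72-73, 95
The article introduces two approaches to the integrated use of digital resources, the Google Chinese scholarly search engine and the Cross-Search cross-database retrieval system, and compares their respective functions.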

11.
The internet is an important source of medical knowledge for everyone, from laypeople to medical professionals. We investigate how these two extremes, in terms of user groups, have distinct needs and exhibit significantly different search behaviour. We make use of query logs in order to study various aspects of these two kinds of users. The logs from America Online, Health on the Net, Turning Research Into Practice and American Roentgen Ray Society (ARRS) GoldMiner were divided into three sets: (1) laypeople, (2) medical professionals (such as physicians or nurses) searching for health content and (3) users not seeking health advice. Several analyses are made focusing on discovering how users search and what they are most interested in. One outcome of our analysis is a classifier, which we built, to infer user expertise. We show the results and analyse the feature set used to infer expertise. We conclude that medical experts are more persistent, interacting more with the search engine. Also, our study reveals that, contrary to what is stated in much of the literature, the main focus of users, both laypeople and professionals, is on disease rather than symptoms. The results of this article, especially through the classifier built, could be used to detect specific user groups and then adapt search results to the user group.

12.
Simulation and analysis have shown that selective search can reduce the cost of large-scale distributed information retrieval. By partitioning the collection into small topical shards, and then using a resource ranking algorithm to choose a subset of shards to search for each query, fewer postings are evaluated. In this paper we extend the study of selective search into new areas using a fine-grained simulation, examining the difference in efficiency when term-based and sample-based resource selection algorithms are used; measuring the effect of two policies for assigning index shards to machines; and exploring the benefits of index-spreading and mirroring as the number of deployed machines is varied. Results obtained for two large datasets and four large query logs confirm that selective search is significantly more efficient than conventional distributed search architectures and can handle higher query rates. Furthermore, we demonstrate that selective search can be tuned to avoid bottlenecks, and thus maximize usage of the underlying computer hardware.

13.
14.
This paper describes an ongoing improvement effort directed at increasing the quality of mediated searches at the Sladen Library and Center for Health Information Resources. The project is the result of an analysis of literature statistics for mediated searching for 1997. The improvement project utilizes Deming's Plan-Do-Check-Act or PDCA cycle. Henry Ford Health System encourages use of the PDCA methodology for improvement projects. A key component of this improvement effort was the introduction of a productivity standard that each searcher is required to meet. The library has global productivity goals, but this is the first time that individual searchers have been held to a quantitative performance standard. The outcome of the Literature Search Improvement Project has been favorable.

15.
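Performance Evaluation of Search Engines   Total citations: 7, self-citations: 0, citations by others: 7
In a network environment where information is growing rapidly, finding useful information has become increasingly difficult; search engines emerged in response and have steadily grown. The paper builds a user-oriented analytic hierarchy model and uses it to briefly discuss the evaluation of search engines.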

16.
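An Exploration of Diversified Manuscript Review for Scientific and Technical Academic Journals   Total citations: 2, self-citations: 0, citations by others: 2
卢正升 《编辑学报》 2007, 19(5): 321-323
The traditional mechanism and customary practice of the "three-review system" can no longer meet the demands of the times. Diversified review methods need to be introduced into the existing review mechanism, which is rigorous but somewhat rigid. While upholding the rigor of the "three-review system", editors and reviewers should pay more attention to flexibility in how it is carried out.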

17.
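Rounding of Statistical Results   Total citations: 2, self-citations: 0, citations by others: 2
郝拉娣  于化东 《编辑学报》 2005, 17(4): 255-255
Using statistical results ("mean ± standard deviation", "mean ± standard error") to express experimental results that carry random error is a major advance in scientific writing. Statistical results must not only be expressed accurately; the rounding of their numerical values must not be neglected either.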

18.
This study analyzed observations and interviews of 31 participants, who were divided into six age groups, to understand the influence of end-user goals and experience on Internet search approaches. Users who lacked experience approached the Internet similarly no matter what the age group. Children and older adults were more likely to lack online search experience than other users. In addition, children and older adults were more homogeneous than other users in that they had a narrow range of situational goals, whereas users in other groups had a wide range of situational goals. The study has implications for user services and research in end-user searching. An understanding of the influence of age, experience, and goals on Internet search patterns might guide how, how much, and in what format information should be presented in the future. Knowledge gained from this study can also form the basis of hypotheses for larger studies.

19.
20.
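Application of Stored Procedures in Information Retrieval   Total citations: 2, self-citations: 0, citations by others: 2
Taking the journal check-in and retrieval software developed by the author as an example, the paper explains the role of stored procedures and triggers in information retrieval and how to create them.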
