期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Brynn M. Evans Ed H. Chi 《Information processing & management》2010

Search engine researchers typically depict search as the solitary activity of an individual searcher. In contrast, results from our critical-incident survey of 150 users on Amazon’s Mechanical Turk service suggest that social interactions play an important role throughout the search process. A second survey of also 150 users, focused instead on difficulties encountered during searches, suggests similar conclusions. These social interactions range from highly coordinated collaborations with shared goals to loosely coordinated collaborations in which only advice is sought. Our main contribution is that we have integrated models from previous work in sensemaking and information-seeking behavior to present a canonical social model of user activities before, during, and after a search episode, suggesting where in the search process both explicitly and implicitly shared information may be valuable to individual searchers. 相似文献

2.

Examining the effectiveness of real-time query expansion

Ryen W. White Gary Marchionini 《Information processing & management》2007

Interactive query expansion (IQE) (c.f. [Efthimiadis, E. N. (1996). Query expansion. Annual Review of Information Systems and Technology, 31, 121–187]) is a potentially useful technique to help searchers formulate improved query statements, and ultimately retrieve better search results. However, IQE is seldom used in operational settings. Two possible explanations for this are that IQE is generally not integrated into searchers’ established information-seeking behaviors (e.g., examining lists of documents), and it may not be offered at a time in the search when it is needed most (i.e., during the initial query formulation). These challenges can be addressed by coupling IQE more closely with familiar search activities, rather than as a separate functionality that searchers must learn. In this article we introduce and evaluate a variant of IQE known as Real-Time Query Expansion (RTQE). As a searcher enters their query in a text box at the interface, RTQE provides a list of suggested additional query terms, in effect offering query expansion options while the query is formulated. To investigate how the technique is used – and when it may be useful – we conducted a user study comparing three search interfaces: a baseline interface with no query expansion support; an interface that provides expansion options during query entry, and a third interface that provides options after queries have been submitted to a search system. The results show that offering RTQE leads to better quality initial queries, more engagement in the search, and an increase in the uptake of query expansion. However, the results also imply that care must be taken when implementing RTQE interactively. Our findings have broad implications for how IQE should be offered, and form part of our research on the development of techniques to support the increased use of query expansion. 相似文献

3.

搜索引擎Cloaking技术研究

李村合刘竞《情报科学》2005,23(6):905-907,916

Cloaking指的是显示给Google和其他搜索引擎的页面与显示给普通浏览者的页面不同的技术。这种技术的目的通常是想操纵搜索引擎的排名结果，当普通浏览者在搜索引擎结果中查找资料的时候就会受到误导。本文介绍了Cloaking的定义和出现原因，并对Cloaking技术进行了分析和评价。相似文献

4.

Determining the informational,navigational, and transactional intent of Web queries

Bernard J. Jansen Danielle L. Booth Amanda Spink 《Information processing & management》2008

In this paper, we define and present a comprehensive classification of user intent for Web searching. The classification consists of three hierarchical levels of informational, navigational, and transactional intent. After deriving attributes of each, we then developed a software application that automatically classified queries using a Web search engine log of over a million and a half queries submitted by several hundred thousand users. Our findings show that more than 80% of Web queries are informational in nature, with about 10% each being navigational and transactional. In order to validate the accuracy of our algorithm, we manually coded 400 queries and compared the results from this manual classification to the results determined by the automated method. This comparison showed that the automatic classification has an accuracy of 74%. Of the remaining 25% of the queries, the user intent is vague or multi-faceted, pointing to the need for probabilistic classification. We discuss how search engines can use knowledge of user intent to provide more targeted and relevant results in Web searching. 相似文献

5.

A study of results overlap and uniqueness among major Web search engines

Amanda Spink Bernard J. Jansen Chris Blakely Sherry Koshman 《Information processing & management》2006

The performance and capabilities of Web search engines is an important and significant area of research. Millions of people world wide use Web search engines very day. This paper reports the results of a major study examining the overlap among results retrieved by multiple Web search engines for a large set of more than 10,000 queries. Previous smaller studies have discussed a lack of overlap in results returned by Web search engines for the same queries. The goal of the current study was to conduct a large-scale study to measure the overlap of search results on the first result page (both non-sponsored and sponsored) across the four most popular Web search engines, at specific points in time using a large number of queries. The Web search engines included in the study were MSN Search, Google, Yahoo! and Ask Jeeves. Our study then compares these results with the first page results retrieved for the same queries by the metasearch engine Dogpile.com. Two sets of randomly selected user-entered queries, one set was 10,316 queries and the other 12,570 queries, from Infospace’s Dogpile.com search engine (the first set was from Dogpile, the second was from across the Infospace Network of search properties were submitted to the four single Web search engines). Findings show that the percent of total results unique to only one of the four Web search engines was 84.9%, shared by two of the three Web search engines was 11.4%, shared by three of the Web search engines was 2.6%, and shared by all four Web search engines was 1.1%. This small degree of overlap shows the significant difference in the way major Web search engines retrieve and rank results in response to given queries. Results point to the value of metasearch engines in Web retrieval to overcome the biases of individual search engines. 相似文献

6.

Real time search on the web: Queries,topics, and economic value

Bernard J. Jansen Zhe Liu Courtney Weaver Gerry Campbell Matthew Gregg 《Information processing & management》2011

Real time search is an increasingly important area of information seeking on the Web. In this research, we analyze 1,005,296 user interactions with a real time search engine over a 190 day period. Using query log analysis, we investigate searching behavior, categorize search topics, and measure the economic value of this real time search stream. We examine aggregate usage of the search engine, including number of users, queries, and terms. We then classify queries into subject categories using the Google Directory topical hierarchy. We next estimate the economic value of the real time search traffic using the Google AdWords keyword advertising platform. Results shows that 30% of the queries were unique (used only once in the entire dataset), which is low compared to traditional Web searching. Also, 60% of the search traffic comes from the search engine’s application program interface, indicating that real time search is heavily leveraged by other applications. There are many repeated queries over time via these application program interfaces, perhaps indicating both long term interest in a topic and the polling nature of real time queries. Concerning search topics, the most used terms dealt with technology, entertainment, and politics, reflecting both the temporal nature of the queries and, perhaps, an early adopter user-based. However, 36% of the queries indicate some geographical affinity, pointing to a location-based aspect to real time search. In terms of economic value, we calculate this real time search stream to be worth approximately US $33,000,000 (US $33 M) on the online advertising market at the time of the study. We discuss the implications for search engines and content providers as real time content increasingly enters the main stream as an information source. 相似文献

7.

User-assisted query translation for interactive cross-language information retrieval

Douglas W. Oard Daqing He Jianqiang Wang 《Information processing & management》2008

相似文献

8.

An implicit feedback approach for interactive information retrieval

Ryen W. White Joemon M. Jose Ian Ruthven 《Information processing & management》2006

Searchers can face problems finding the information they seek. One reason for this is that they may have difficulty devising queries to express their information needs. In this article, we describe an approach that uses unobtrusive monitoring of interaction to proactively support searchers. The approach chooses terms to better represent information needs by monitoring searcher interaction with different representations of top-ranked documents. Information needs are dynamic and can change as a searcher views information. The approach we propose gathers evidence on potential changes in these needs and uses this evidence to choose new retrieval strategies. We present an evaluation of how well our technique estimates information needs, how well it estimates changes in these needs and the appropriateness of the interface support it offers. The results are presented and the avenues for future research identified. 相似文献

9.

Exploring features for the automatic identification of user goals in web search

Mauro Rojas Herrera Edleno Silva de Moura Marco Cristo Thomaz Philippe Silva Altigran Soares da Silva 《Information processing & management》2010

Queries submitted to search engines can be classified according to the user goals into three distinct categories: navigational, informational, and transactional. Such classification may be useful, for instance, as additional information for advertisement selection algorithms and for search engine ranking functions, among other possible applications. This paper presents a study about the impact of using several features extracted from the document collection and query logs on the task of automatically identifying the users’ goals behind their queries. We propose the use of new features not previously reported in literature and study their impact on the quality of the query classification task. Further, we study the impact of each feature on different web collections, showing that the choice of the best set of features may change according to the target collection. 相似文献

10.

How doctors search: A study of query behaviour and the impact on search results

Marianne Lykke Susan Price Lois Delcambre 《Information processing & management》2012

Professional, workplace searching is different from general searching, because it is typically limited to specific facets and targeted to a single answer. We have developed the semantic component (SC) model, which is a search feature that allows searchers to structure and specify the search to context-specific aspects of the main topic of the documents. We have tested the model in an interactive searching study with family doctors with the purpose to explore doctors’ querying behaviour, how they applied the means for specifying a search, and how these features contributed to the search outcome. In general, the doctors were capable of exploiting system features and search tactics during the searching. Most searchers produced well-structured queries that contained appropriate search facets. When searches failed it was not due to query structure or query length. Failures were mostly caused by the well-known vocabulary problem. The problem was exacerbated by using certain filters as Boolean filters. The best working queries were structured into 2–3 main facets out of 3–5 possible search facets, and expressed with terms reflecting the focal view of the search task. The findings at the same time support and extend previous results about query structure and exhaustivity showing the importance of selecting central search facets and express them from the perspective of search task. The SC model was applied in the highest performing queries except one. The findings suggest that the model might be a helpful feature to structure queries into central, appropriate facets, and in returning highly relevant documents. 相似文献

11.

Beyond actions: Exploring the discovery of tactics from user logs

《Information processing & management》2016,52(6):1200-1226

Search log analysis has become a common practice to gain insights into user search behaviour: it helps gain an understanding of user needs and preferences, as well as an insight into how well a system supports such needs. Currently, log analysis is typically focused on low-level user actions, i.e. logged events such as issued queries and clicked results, and often only a selection of such events are logged and analysed. However, types of logged events may differ widely from interface to interface, making comparison between systems difficult. Further, the interpretation of the meaning of and subsequent analysis of a selection of events may lead to conclusions out of context— e.g. the statistics of observed query reformulations may be influenced by the existence of a relevance feedback component. Alternatively, in lab studies user activities can be analysed at a higher level, such as search tactics and strategies, abstracted away from detailed interface implementation. Unfortunately, until now the required manual codings that map logged events to higher-level interpretations have prevented large-scale use of this type of analysis. In this paper, we propose a new method for analysing search logs by (semi-)automatically identifying user search tactics from logged events, allowing large-scale analysis that is comparable across search systems. In addition, as the resulting analysis is at a tactical level we reduce potential issues surrounding the need for interpretation of low-level user actions for log analysis. We validate the efficiency and effectiveness of the proposed tactic identification method using logs of two reference search systems of different natures: a product search system and a video search system. With the identified tactics, we perform a series of novel log analyses in terms of entropy rate of user search tactic sequences, demonstrating how this type of analysis allows comparisons of user search behaviours across systems of different nature and design. This analysis provides insights not achievable with traditional log analysis. 相似文献

12.

The effectiveness of Web search engines for retrieving relevant ecommerce links

Bernard J. Jansen Paulo R. Molina 《Information processing & management》2006

Ecommerce is developing into a fast-growing channel for new business, so a strong presence in this domain could prove essential to the success of numerous commercial organizations. However, there is little research examining ecommerce at the individual customer level, particularly on the success of everyday ecommerce searches. This is critical for the continued success of online commerce. The purpose of this research is to evaluate the effectiveness of search engines in the retrieval of relevant ecommerce links. The study examines the effectiveness of five different types of search engines in response to ecommerce queries by comparing the engines’ quality of ecommerce links using topical relevancy ratings. This research employs 100 ecommerce queries, five major search engines, and more than 3540 Web links. The findings indicate that links retrieved using an ecommerce search engine are significantly better than those obtained from most other engines types but do not significantly differ from links obtained from a Web directory service. We discuss the implications for Web system design and ecommerce marketing campaigns. 相似文献

13.

Bid keyword suggestion in sponsored search based on competitiveness and relevance

Ying Zhang Weinan Zhang Bin Gao Xiaojie Yuan Tie-Yan Liu 《Information processing & management》2014

In sponsored search, many advertisers have not achieved their expected performances while the search engine also has a large room to improve their revenue. Specifically, due to the improper keyword bidding, many advertisers cannot survive the competitive ad auctions to get their desired ad impressions; meanwhile, a significant portion of search queries have no ads displayed in their search result pages, even if many of them have commercial values. We propose recommending a group of relevant yet less-competitive keywords to an advertiser. Hence, the advertiser can get the chance to win some (originally empty) ad slots and accumulate a number of impressions. At the same time, the revenue of the search engine can also be boosted since many empty ad shots are filled. Mathematically, we model the problem as a mixed integer programming problem, which maximizes the advertiser revenue and the relevance of the recommended keywords, while minimizing the keyword competitiveness, subject to the bid and budget constraints. By solving the problem, we can offer an optimal group of keywords and their optimal bid prices to an advertiser. Simulation results have shown the proposed method is highly effective in increasing ad impressions, expected clicks, advertiser revenue, and search engine revenue. 相似文献

14.

Computing controversy: Formal model and algorithms for detecting controversy on Wikipedia and in search queries

Kazimierz Zielinski Radoslaw Nielek Adam Wierzbicki Adam Jatowt 《Information processing & management》2018,54(1):14-36

Controversy is a complex concept that has been attracting attention of scholars from diverse fields. In the era of Internet and social media, detecting controversy and controversial concepts by the means of automatic methods is especially important. Web searchers could be alerted when the contents they consume are controversial or when they attempt to acquire information on disputed topics. Presenting users with the indications and explanations of the controversy should offer them chance to see the “wider picture” rather than letting them obtain one-sided views. In this work we first introduce a formal model of controversy as the basis of computational approaches to detecting controversial concepts. Then we propose a classification based method for automatic detection of controversial articles and categories in Wikipedia. Next, we demonstrate how to use the obtained results for the estimation of the controversy level of search queries. The proposed method can be incorporated into search engines as a component responsible for detection of queries related to controversial topics. The method is independent of the search engine’s retrieval and search results recommendation algorithms, and is therefore unaffected by a possible filter bubble.Our approach can be also applied in Wikipedia or other knowledge bases for supporting the detection of controversy and content maintenance. Finally, we believe that our results could be useful for social science researchers for understanding the complex nature of controversy and in fostering their studies. 相似文献

15.

Cost-effectiveness comparison of manual and on-line retrospective bibliographic searching.

D R Elchesen 《Journal of the American Society for Information Science》1978,29(2):56-66

A study to compare the cost effectiveness of retrospective manual and on-line bibliographic searching is described. Forty search queries were processed against seven abstracting-indexing publications and the corresponding SDC/ORBIT data bases. Equivalent periods of coverage and searcher skill levels were used for both search models. Separate task times were measured for question analysis, searching, photocopying, shelving, and output distribution. Component costs were calculated for labor, information, reproduction, equipment, physical space, and telecommunications. Results indicate that on-line searching is generally faster, less costly, and more effective than manual searching. However, for certain query/information-source combinations, manual searching may offer some advantages in precision and turn-around time. The results of a number of related studies are reviewed. 相似文献

16.

Social summarization in collaborative web search

Oisı´n Boydell Barry Smyth 《Information processing & management》2010

A critical challenge for Web search engines concerns how they present relevant results to searchers. The traditional approach is to produce a ranked list of results with title and summary (snippet) information, and these snippets are usually chosen based on the current query. Snippets play a vital sensemaking role, helping searchers to efficiently make sense of a collection of search results, as well as determine the likely relevance of individual results. Recently researchers have begun to explore how snippets might also be adapted based on searcher preferences as a way to better highlight relevant results to the searcher. In this paper we focus on the role of snippets in collaborative web search and describe a technique for summarizing search results that harnesses the collaborative search behaviour of communities of like-minded searchers to produce snippets that are more focused on the preferences of the searchers. We go on to show how this so-called social summarization technique can generate summaries that are significantly better adapted to searcher preferences and describe a novel personalized search interface that combines result recommendation with social summarization. 相似文献

17.

Using searcher simulations to redesign a polyrepresentative implicit feedback interface

Ryen W. White 《Information processing & management》2006

Information seeking is traditionally conducted in environments where search results are represented at the user interface by a minimal amount of meta-information such as titles and query-based summaries. The goal of this form of presentation is to give searchers sufficient context to help them make informed interaction decisions without overloading them cognitively. The principle of polyrepresentation [Ingwersen, P. (1996). Cognitive perspectives of information retrieval interaction: elements of a cognitive IR theory. Journal of Documentation 52, 3–50] suggests that information retrieval (IR) systems should provide and use different cognitive structures during acts of communication to reduce the uncertainty associated with interactive IR. In previous work we have created content-rich search interfaces that implement an aspect of polyrepresentative theory, and are capable of displaying multiple representations of the retrieved documents simultaneously at the results interface. Searcher interaction with content-rich interfaces was used as implicit relevance feedback (IRF) to construct modified queries. These interfaces have been shown to be successful in experimentation with human subjects but we do not know whether the information was presented in a way that makes good use of the display space, or positioned most useful components in easily accessible locations, for use in IRF. In this article we use simulations of searcher interaction behaviour as design tools to determine the most rational interface design for when IRF is employed. This research forms part of the iterative design of interfaces to proactively support searchers. 相似文献

18.

The effects of perceived chronic pressure and time constraint on information search behaviors and experience

《Information processing & management》2019,56(5):1667-1679

In this paper, we explore the effects of individual pressure level and time constraint on searchers' behaviors and their assessment of search experience within the framework of interactive information retrieval. A user experiment was conducted in which 40 participants individually searched for information in a laboratory setting under two conditions: with time constraint (TC) and with no time constraint (NTC). Participants filled in a Perceived Stress Scale questionnaire to measure their chronic pressure value (subjective stress), and their pressure value was recorded as their individual characteristic. The results showed that the more chronic pressure the searcher has, the more search efforts they devote, including more time in searching and more time to complete the search tasks, especially when there was no time constraint. Time constraint and searchers’ pressure value had a significant effect on users’ numbers of scrolling actions per minute. The results indicate that when given a time constraint, searchers with higher-pressure values tend to lower their reading or scanning speed, while searchers with lower-pressure values tend to accelerate their reading or scanning speed. The results suggested different people would react to the time condition change in different ways, especially people with higher pressure. Therefore, it is necessary to examine users’ search behaviors in person-in-situation frameworks to analyze the effects of contextual factors on users. This study contributes to our knowledge of how contextual factors and individual characteristics affect searchers’ behaviors and have implications for the design of IIR systems. 相似文献

19.

Automatic new topic identification using multiple linear regression

Seda Ozmutlu 《Information processing & management》2006

The purpose of this study is to provide automatic new topic identification of search engine query logs, and estimate the effect of statistical characteristics of search engine queries on new topic identification. By applying multiple linear regression and multi-factor ANOVA on a sample data log from the Excite search engine, we demonstrated that the statistical characteristics of Web search queries, such as time interval, search pattern and position of a query in a user session, are effective on shifting to a new topic. Multiple linear regression is also a successful tool for estimating topic shifts and continuations. The findings of this study provide statistical proof for the relationship between the non-semantic characteristics of Web search queries and the occurrence of topic shifts and continuations. 相似文献

20.

An analysis of Web searching by European AlltheWeb.com users

《Information processing & management》2005,41(2):361-381

The Web has become a worldwide source of information and a mainstream business tool. It is changing the way people conduct the daily business of their lives. As these changes are occurring, we need to understand what Web searching trends are emerging within the various global regions. What are the regional differences and trends in Web searching, if any? What is the effectiveness of Web search engines as providers of information? As part of a body of research studying these questions, we have analyzed two data sets collected from queries by mainly European users submitted to AlltheWeb.com on 6 February 2001 and 28 May 2002. AlltheWeb.com is a major and highly rated European search engine. Each data set contains approximately a million queries submitted by over 200,000 users and spans a 24-h period. This longitudinal benchmark study shows that European Web searching is evolving in certain directions. There was some decline in query length, with extremely simple queries. European search topics are broadening, with a notable percentage decline in sexual and pornographic searching. The majority of Web searchers view fewer than five Web documents, spending only seconds on a Web document. Approximately 50% of the Web documents viewed by these European users were topically relevant. We discuss the implications for Web information systems and information content providers. 相似文献