
1.

Model-driven formative evaluation of exploratory search: A study under a sensemaking framework

Yan Qu George W. Furnas 《Information processing & management》2008

The evaluation of exploratory search relies on the ongoing paradigm shift from focusing on the search algorithm to focusing on the interactive process. This paper proposes a model-driven formative evaluation approach, in which the goal is not the evaluation of a specific system, per se, but the exploration of new design possibilities. This paper gives an example of this approach where a model of sensemaking was used to inform the evaluation of a basic exploratory search system(s) in the context of a sensemaking task. The model suggested that, rather than just looking at simple search performance measures, we should examine closely the interwoven, interactive processes of both representation construction and information seeking. Participants were asked to make sense of an unfamiliar topic using an augmented query-based search system. The processes of representation construction and information seeking were captured and analyzed using data from experiment notes, interviews, and a system log. The data analysis revealed users’ sources of ideas for structuring representations and a tightly coupled relationship between search and representation construction in their exploratory searches. For example, users strategically used search to find useful structure ideas instead of just accumulating information facts. Implications for improving current search systems and designing new systems are discussed.

2.

Effectiveness of additional representations for the search result presentation on the web

Hideo Joho Joemon M. Jose 《Information processing & management》2008

The presentation of search results on the web has been dominated by the textual form of document representation. On the other hand, the document’s visual aspects, such as the layout, colour scheme, or presence of images, have been studied only in a limited context with regard to their effectiveness in search result presentation. This article presents a comparative evaluation of textual and visual forms of document representation as additional components of document surrogates. A total of 24 people were recruited for our task-based user study. The experimental results suggest that an increased level of document representation available in the search results can facilitate users’ interaction with a search interface. The results also suggest that the two forms of additional representations are likely beneficial to users’ information searching process in different contexts.

3.

Using searcher simulations to redesign a polyrepresentative implicit feedback interface

Ryen W. White 《Information processing & management》2006

Information seeking is traditionally conducted in environments where search results are represented at the user interface by a minimal amount of meta-information such as titles and query-based summaries. The goal of this form of presentation is to give searchers sufficient context to help them make informed interaction decisions without overloading them cognitively. The principle of polyrepresentation [Ingwersen, P. (1996). Cognitive perspectives of information retrieval interaction: elements of a cognitive IR theory. Journal of Documentation 52, 3–50] suggests that information retrieval (IR) systems should provide and use different cognitive structures during acts of communication to reduce the uncertainty associated with interactive IR. In previous work we have created content-rich search interfaces that implement an aspect of polyrepresentative theory and are capable of displaying multiple representations of the retrieved documents simultaneously at the results interface. Searcher interaction with content-rich interfaces was used as implicit relevance feedback (IRF) to construct modified queries. These interfaces have been shown to be successful in experimentation with human subjects, but we do not know whether the information was presented in a way that makes good use of the display space, or whether the most useful components were positioned in easily accessible locations, for use in IRF. In this article we use simulations of searcher interaction behaviour as design tools to determine the most rational interface design for when IRF is employed. This research forms part of the iterative design of interfaces to proactively support searchers.

4.

A socio-cognitive framework for designing interactive IR systems: Lessons from the Neanderthals

Charles Cole 《Information processing & management》2008

The article analyzes user–IR system interaction from the broad, socio-cognitive perspective of lessons we can learn about human brain evolution when we compare the Neanderthal brain to the human brain before and after a small human brain mutation is hypothesized to have occurred 35,000–75,000 years ago. The enhanced working memory mutation enabled modern humans (i) to decode unfamiliar environmental stimuli with greater focusing power on adaptive solutions to environmental changes and problems, and (ii) to encode environmental stimuli in more efficient, generative knowledge structures. A sociological theory of these evolving, more efficient encoding knowledge structures is given. These new knowledge structures instilled in humans not only the ability to adapt to and survive novelty and/or changing conditions in the environment, but they also instilled an imperative to do so. Present day IR systems ignore the encoding imperative in their design framework. To correct for this lacuna, we propose the evolutionary-based socio-cognitive framework model for designing interactive IR systems. A case study is given to illustrate the functioning of the model.

5.

Search on surfaces: Exploring the potential of interactive tabletops for collaborative search tasks

Meredith Ringel Morris Danyel Fisher Daniel Wigdor 《Information processing & management》2010

Collaborative information seeking often takes place in co-located settings; such opportunities may be planned (business colleagues meeting in a conference room or students working together in a library) or spontaneous (family members gathered in their living room or friends meeting at a café). Surface computing technologies (i.e., interactive tabletops) hold great potential for enhancing collaborative information seeking activities. Such devices provide engaging direct manipulation interactions, facilitate awareness of collaborators’ activities, and afford spatial organization of content. However, current tabletop technologies also present several challenges that creators of collaborative information seeking systems must account for in their designs. In this article, we explore the design space for collaborative search systems on interactive tabletops, discussing the benefits and challenges of creating search applications for these devices. We discuss how features of our tabletop search prototypes TeamSearch, FourBySix Search, Cambiera, and WeSearch illustrate different aspects of this design space.

6.

Soft-constrained inference for Named Entity Recognition

E. Fersini E. Messina G. Felici D. Roth 《Information processing & management》2014

Much of the valuable information supporting decision-making processes originates in text-based documents. Although these documents can be effectively searched and ranked by modern search engines, actionable knowledge needs to be extracted and transformed into a structured form before being used in a decision process. In this paper we describe how the discovery of semantic information embedded in natural language documents can be viewed as an optimization problem aimed at assigning a sequence of labels (hidden states) to a set of interdependent variables (textual tokens). Dependencies among variables are efficiently modeled through Conditional Random Fields, an undirected graphical model able to represent the distribution of labels given a set of observations. The Markov property of these models prevents them from taking into account long-range dependencies among variables, which are indeed relevant in Natural Language Processing. In order to overcome this limitation we propose an inference method based on an Integer Programming formulation of the problem, where long-distance dependencies are included through non-deterministic soft constraints.

7.

Bipolar queries in textual information retrieval: A new perspective

Sławomir Zadrożny Janusz Kacprzyk Guy De Tré 《Information processing & management》2012

A new concept of a bipolar query against collections of textual documents, i.e. in the context of information retrieval (IR), is introduced using recent developments in bipolar information modeling and bipolar database queries. Specifically, a particular approach to bipolar queries with an explicit “and possibly” type of aggregation operator is used. An effective and efficient processing of such bipolar queries using standard IR data structures is briefly discussed. The bipolar queries proposed combine the flexibility provided by fuzzy logic with a more sophisticated representation of user preferences and intentions. This combination can make the search of vast resources of textual documents, notably those available via the Internet, more intelligent.

8.

Dealing with textual noise for robust and effective BERT re-ranking

《Information processing & management》2023,60(1):103135

Pre-trained language models (PLMs), such as BERT, have been successfully employed in two-phase ranking pipelines for information retrieval (IR). Meanwhile, recent studies have reported that the BERT model is vulnerable to imperceptible textual perturbations on quite a few natural language processing (NLP) tasks. As for IR tasks, the currently established BERT re-ranker is mainly trained on large-scale and relatively clean datasets, such as MS MARCO, whereas noisy text is more common in real-world scenarios, such as web search. In addition, the impact of within-document textual noise (perturbations) on retrieval effectiveness remains to be investigated, especially on the ranking quality of the BERT re-ranker, considering its contextualized nature. To bridge this gap, we carry out exploratory experiments on the MS MARCO dataset in this work to examine whether the BERT re-ranker can still perform well when ranking text with noise. Unfortunately, we observe non-negligible effectiveness degradation of the BERT re-ranker over a total of ten different types of synthetic within-document textual noise. Furthermore, to address the effectiveness losses over textual noise, we propose a novel noise-tolerant model, De-Ranker, which is learned by minimizing the distance between the noisy text and its original clean version. Our evaluation on the MS MARCO and TREC 2019–2020 DL datasets demonstrates that De-Ranker can deal with synthetic textual noise more effectively, with a 3%–4% performance improvement over the vanilla BERT re-ranker. Meanwhile, extensive zero-shot transfer experiments on a total of 18 widely-used IR datasets show that De-Ranker can not only tackle natural noise in real-world text, but also achieve a 1.32% improvement on average in terms of cross-domain generalization ability on the BEIR benchmark.

9.

Ontology-based affective models to organize artworks in the social semantic web

《Information processing & management》2016,52(1):139-162

In this paper, we focus on applying sentiment analysis to resources from online art collections by exploiting, as an information source, tags intended as textual traces that visitors leave to comment on artworks on social platforms. We present a framework where methods and tools from a set of disciplines, ranging from the Semantic and Social Web to Natural Language Processing, provide us with the building blocks for creating a semantic social space to organize artworks according to an ontology of emotions. The ontology is inspired by Plutchik’s circumplex model, a well-founded psychological model of human emotions. Users can be involved in the creation of the emotional space through a graphical interactive interface. The development of such a semantic space enables new ways of accessing and exploring art collections. The affective categorization model and the emotion detection output are encoded into W3C ontology languages. This gives us the twofold advantage of enabling tractable reasoning on detected emotions and related artworks, and of fostering the interoperability and integration of tools developed in the Semantic Web and Linked Data community. The proposal has been evaluated against a real-world case study, a dataset of tagged multimedia artworks from the ArsMeteo Italian online collection, and validated through a user study.

10.

Deep Learning-based Extraction of Algorithmic Metadata in Full-Text Scholarly Documents

《Information processing & management》2020,57(6):102269

Advances in search engines for traditional text documents have enabled the effective retrieval of massive textual information in a resource-efficient manner. However, such conventional search methodologies often suffer from poor retrieval accuracy, especially when documents exhibit unique properties that call for specialized and deeper semantic extraction. Recently, AlgorithmSeer, a search engine for algorithms, has been proposed that extracts pseudo-codes and shallow textual metadata from scientific publications and treats them as traditional documents so that the conventional search engine methodology can be applied. However, such a system fails to facilitate user search queries that seek to identify algorithm-specific information, such as the datasets on which algorithms operate, the performance of algorithms, and runtime complexity. In this paper, a set of enhancements to the previously proposed algorithm search engine is presented. Specifically, we propose a set of methods to automatically identify and extract algorithmic pseudo-codes and the sentences that convey related algorithmic metadata using a set of machine-learning techniques. In an experiment with over 93,000 text lines, we introduce 60 novel features, comprising content-based, font-style-based and structure-based feature groups, to extract algorithmic pseudo-codes. Our proposed pseudo-code extraction method achieves a 93.32% F1-score, outperforming the state-of-the-art techniques by 28%. Additionally, we propose a method to extract algorithm-related sentences using deep neural networks and achieve an accuracy of 78.5%, outperforming a rule-based model and a support vector machine model by 28% and 16%, respectively.
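
For readers unfamiliar with the F1-score cited in this abstract: it is the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition, not the paper's evaluation code; the function name `f1_score` and the example counts are assumptions made purely for illustration.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Standard F1: harmonic mean of precision and recall.

    Illustrative helper only (hypothetical name, not from the paper).
    Returns 0.0 when precision and recall are both undefined or zero.
    """
    predicted = true_positives + false_positives   # items the extractor labelled positive
    actual = true_positives + false_negatives      # items that are truly positive
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 lines correctly flagged as pseudo-code,
# 2 flagged wrongly, 2 missed.
example = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```

With these made-up counts, precision and recall are both 0.8, so the F1-score is also approximately 0.8.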

11.

Examples of applying the cloud note tool Wiz in sci-tech novelty search

李金永《现代情报》2015,35(7):93-97

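As the technological environment continues to evolve, the methods libraries use for sci-tech novelty search need to innovate. Cloud notes, built on cloud computing and aimed at knowledge management, match the needs of library sci-tech novelty search. This paper describes four aspects of applying the cloud note tool Wiz in library sci-tech novelty search: cloud-based redesign of the novelty search workflow, cloud management of novelty search archives, cloud collaboration among novelty search participants, and cloud-based training in novelty search knowledge. Drawing on practice in these areas, it analyzes the value of Wiz in library sci-tech novelty search and summarizes the experience gained, in the hope of prompting further discussion.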

12.

Multitasking during Web search sessions

Amanda Spink Minsoo Park Bernard J. Jansen Jan Pedersen 《Information processing & management》2006

A user’s single session with a Web search engine or information retrieval (IR) system may consist of seeking information on single or multiple topics, and of switching between tasks (multitasking information behavior). Most Web search sessions consist of two queries of approximately two words. However, some Web search sessions consist of three or more queries. We present findings from two studies: first, a study of two-query search sessions on the AltaVista Web search engine, and second, a study of three or more query search sessions on the same engine. We examine the degree of multitasking search and information task switching during these two sets of AltaVista Web search sessions. A sample of two-query and three or more query sessions was filtered from AltaVista transaction logs from 2002 and qualitatively analyzed. Sessions ranged in duration from less than a minute to a few hours. Findings include: (1) 81% of two-query sessions included multiple topics, (2) 91.3% of three or more query sessions included multiple topics, (3) there is a broad variety of topics in multitasking search sessions, and (4) three or more query sessions sometimes contained frequent topic changes. Multitasking is found to be a growing element in Web searching. This paper proposes an approach to interactive information retrieval (IR) contextually within a multitasking framework. The implications of our findings for Web design and further research are discussed.

13.

Seeking and implementing automated assistance during the search process

《Information processing & management》2005,41(4):909-928

Searchers seldom make use of the advanced searching features that could improve the quality of the search process because they do not know these features exist, do not understand how to use them, or do not believe they are effective or efficient. Information retrieval systems offering automated assistance could greatly improve search effectiveness by suggesting or implementing assistance automatically. A critical issue in designing such systems is determining when the system should intervene in the search process. In this paper, we report the results of an empirical study analyzing when during the search process users seek automated searching assistance from the system and when they implement the assistance. We designed a fully functional, automated assistance application and conducted a study with 30 subjects interacting with the system. The study used a 2G TREC document collection and TREC topics. Approximately 50% of the subjects sought assistance, and over 80% of those implemented that assistance. Results from the evaluation indicate that users are willing to accept automated assistance during the search process, especially after viewing results and locating relevant documents. We discuss implications for interactive information retrieval system design and directions for future research.

14.

Examining the impact of domain and cognitive complexity on query formulation and reformulation

Barbara M. Wildemuth Diane Kelly Emma Boettcher Erin Moore Gergana Dimitrova 《Information processing & management》2018,54(3):433-450

The purpose of this analysis was to evaluate an existing set of search tasks in terms of their effectiveness as part of a “shared infrastructure” for conducting interactive IR research. Twenty search tasks that varied in their cognitive complexity and domain were assigned to 47 study participants; the 3,101 moves used to complete those tasks were then analyzed in terms of frequency of each type of move and the sequential patterns they formed. The cognitive complexity of the tasks influenced the number of moves used to complete the tasks, with the most complex (i.e., Create) tasks requiring more moves than tasks at other levels of complexity. Across the four domains, the Commerce tasks elicited more search moves per search. When sequences of moves were analyzed, seven patterns were identified; some of these patterns were associated with particular task characteristics. The findings suggest that search tasks can be designed to elicit particular types of search behaviors and, thus, allow researchers to focus attention on particular aspects of IR interactions.

15.

A usage study of retrieval modalities for video shot retrieval

Alan F. Smeaton Paul Browne 《Information processing & management》2006

As an information medium, video offers many possible retrieval and browsing modalities, far more than text, image or audio. Some of these, like searching the text of the spoken dialogue, are well developed; others, like keyframe browsing tools, are in their infancy; and others are not yet technically achievable. For those browsing and retrieval modalities which we cannot yet achieve, we can only speculate as to how useful they will actually be. In our work we have created a system to support multiple modalities for video browsing and retrieval, including text search through the spoken dialogue, image matching against shot keyframes and object matching against segmented video objects. For the last of these, automatic segmentation and tracking of video objects is a computationally demanding problem which is not yet solved for generic natural video material; when it is, it is expected to open up possibilities for user interaction with objects in video, including searching and browsing. In this paper we achieve object segmentation by working in a closed domain of animated cartoons. We describe an interactive user experiment on a medium-sized corpus of video where we were able to measure users’ use of video objects versus other modes of retrieval during multiple-iteration searching. Results of this experiment show that although object searching is used far less than text searching in the first iteration of a user’s search, it is a popular and useful search type once an initial set of relevant shots has been found.

16.

An integrated model for textual social media data with spatio-temporal dimensions

《Information processing & management》2020,57(5):102219

GPS-enabled devices and social media popularity have created an unprecedented opportunity for researchers to collect, explore, and analyze text data with fine-grained spatial and temporal metadata. In this sense, text, time and space are different domains with their own representation scales and methods. This poses the challenge of how to detect relevant patterns that may only arise from the combination of text with spatio-temporal elements. In particular, spatio-temporal textual data representation has relied on feature embedding techniques. This can limit a model’s expressiveness for representing certain patterns extracted from the sequence structure of textual data. To deal with the aforementioned problems, we propose an Acceptor recurrent neural network model that jointly models spatio-temporal textual data. Our goal is to focus on representing the mutual influence and relationships that can exist between written language and the time and place where it was produced. We represent space, time, and text as tuples, and use pairs of elements to predict a third one. This results in three predictive tasks that are trained simultaneously. We conduct experiments on two social media datasets and on a crime dataset; we use Mean Reciprocal Rank as the evaluation metric. Our experiments show that our model outperforms state-of-the-art methods, with improvements ranging from 5.5% to 24.7% for location and time prediction.
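
Mean Reciprocal Rank, the metric named in this abstract, averages the reciprocal of the rank at which the correct item first appears in each ranked prediction list. The sketch below follows that textbook definition and is not the authors' evaluation code; the function name, the convention of scoring a missing target as 0, and the sample rankings are all assumptions for illustration.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the correct item over all queries.

    Illustrative helper (hypothetical name). A query whose target does
    not appear in its ranking contributes 0 to the average.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for ranking, target in zip(rankings, targets):
        for position, candidate in enumerate(ranking, start=1):
            if candidate == target:
                total += 1.0 / position  # rank is 1-based
                break
    return total / len(rankings)


# Hypothetical location predictions for three posts.
sample_rankings = [
    ["cell_a", "cell_b", "cell_c"],  # correct cell ranked 1st -> 1.0
    ["cell_x", "cell_y", "cell_z"],  # correct cell ranked 2nd -> 0.5
    ["cell_p", "cell_q"],            # correct cell absent     -> 0.0
]
sample_targets = ["cell_a", "cell_y", "cell_m"]
sample_mrr = mean_reciprocal_rank(sample_rankings, sample_targets)
```

For these made-up rankings the reciprocal ranks are 1, 0.5, and 0, giving an MRR of 0.5.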

17.

Whose story is it anyway? Automatic extraction of accounts from news articles

Hao Zhang Frank Boons Riza Batista-Navarro 《Information processing & management》2019,56(5):1837-1848

Narratives are comprised of stories that provide insight into social processes. To facilitate the analysis of narratives in a more efficient manner, natural language processing (NLP) methods have been employed in order to automatically extract information from textual sources, e.g., newspaper articles. Existing work on automatic narrative extraction, however, has ignored the nested character of narratives. In this work, we argue that a narrative may contain multiple accounts given by different actors. Each individual account provides insight into the beliefs and desires underpinning an actor’s actions. We present a pipeline for automatically extracting accounts, consisting of NLP methods for: (1) named entity recognition, (2) event extraction, and (3) attribution extraction. Machine learning-based models for named entity recognition were trained based on a state-of-the-art neural network architecture for sequence labelling. For event extraction, we developed a hybrid approach combining the use of semantic role labelling tools, the FrameNet repository of semantic frames, and a lexicon of event nouns. Meanwhile, attribution extraction was addressed with the aid of a dependency parser and Levin’s verb classes. To facilitate the development and evaluation of these methods, we constructed a new corpus of news articles, in which named entities, events and attributions have been manually marked up following a novel annotation scheme that covers over 20 event types relating to socio-economic phenomena. Evaluation results show that relative to a baseline method underpinned solely by semantic role labelling tools, our event extraction approach improves recall by 12.22–14.20 percentage points (reaching as high as 92.60% on one data set). Meanwhile, the use of Levin’s verb classes in attribution extraction obtains optimal performance in terms of F-score, outperforming a baseline method by 7.64–11.96 percentage points. Our proposed approach was applied to news articles focused on industrial regeneration cases. This facilitated the generation of accounts of events that are attributed to specific actors.

18.

Towards data abstraction in networked information retrieval systems

《Information processing & management》1999,35(2):101-119

Networked information retrieval aims at the interoperability of heterogeneous information retrieval (IR) systems. In this paper, we show how differences concerning search operators and database schemas can be handled by applying data abstraction concepts in combination with uncertain inference. Different data types with vague predicates are required to allow for queries referring to arbitrary attributes of documents. Physical data independence separates search operators from access paths, thus solving text search problems related to noun phrases, compound words and proper nouns. Projection and inheritance on attributes support the creation of unified views on a set of IR databases. Uncertain inference allows for query processing even on incompatible database schemas.

19.

Online video channel management: An integrative decision support system framework

《International Journal of Information Management》2021


20.

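A preliminary study of applying ontology in the knowledge library (cited 13 times in total; self-citations: 0; citations by others: 13)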

邓凯 吴家春 王洪伟 《情报科学》2003,21(1):106-108

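This paper discusses the concept and characteristics of the knowledge library and briefly introduces ontology, an idea that has been widely applied in artificial intelligence and information systems in recent years. It proposes using ontological theory and methods to build models for partitioning, classifying, and organizing knowledge in the knowledge library. Such models facilitate topic-oriented storage and intelligent retrieval of knowledge, help users make efficient use of library resources, and promote knowledge innovation and application.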