1.

Satisfying information needs with multi-document summaries Total citations: 2, self-citations: 0, citations by others: 2

Sanda Harabagiu Andrew Hickl Finley Lacatusu 《Information processing & management》2007,43(6):1619

Generating summaries that meet the information needs of a user relies on (1) several forms of question decomposition; (2) different summarization approaches; and (3) textual inference for combining the summarization strategies. This novel framework for summarization has the advantage of producing highly responsive summaries, as indicated by the evaluation results.

2.

Effects of answer weight boosting in strategy-driven question answering

Hyo-Jung Oh Sung Hyon Myaeng 《Information processing & management》2012,48(1):83-93

With the advances in natural language processing (NLP) techniques and the need to deliver more fine-grained information or answers than a set of documents, various QA techniques have been developed corresponding to different question and answer types. A comprehensive QA system must be able to incorporate individual QA techniques as they are developed and integrate their functionality to maximize the system’s overall capability in handling increasingly diverse types of questions. To this end, a new QA method was developed to learn strategies for determining module invocation sequences and boosting answer weights for different types of questions. In this article, we examine the roles and effects of the answer verification and weight boosting method, which is the core of the automatically generated strategy-driven QA framework, in comparison with a strategy-less, straightforward answer-merging approach and a strategy-driven approach with manually constructed strategies.

3.

Addressing ontology-based question answering with collections of user queries

Óscar Ferrández Rubén Izquierdo Sergio Ferrández José Luis Vicedo 《Information processing & management》2009

This paper presents QACID, an ontology-based Question Answering system applied to the CInema Domain. This system allows users to retrieve information from formal ontologies by using as input queries formulated in natural language. The original characteristic of QACID is the strategy used to fill the gap between users’ expressiveness and formal knowledge representation. This approach is based on collections of user queries and offers a simple adaptability to deal with multilingual capabilities, inter-domain portability and changes in user information requirements. All these capabilities permit developing Question Answering applications for actual users. This system has been developed and tested on the Spanish language and using an ontology modelling the cinema domain. The performance level achieved enables the use of the system in real environments.

4.

Textual aggregation approaches in OLAP context: A survey

《International Journal of Information Management》2017,37(6):684-692

In the last decade, OnLine Analytical Processing (OLAP) has taken an increasingly important role as a research field. Solutions, techniques and tools have been provided for both databases and data warehouses to focus mainly on numerical data. However, these solutions are not suitable for textual data. Therefore, there has recently been a huge need for new tools and approaches that treat and manipulate textual data and aggregate it as well. Textual aggregation techniques emerge as a key tool to perform textual data analysis in OLAP for decision support systems. This paper aims at providing a structured and comprehensive overview of the literature in the field of OLAP Textual Aggregation. We provide a new classification framework in which the existing textual aggregation approaches are grouped into two main classes, namely approaches based on cube structure and approaches based on text mining. We also discuss and synthesize the potential of textual similarity metrics, and we provide a recent classification of them.

5.

Combining evidence with a probabilistic framework for answer ranking and answer merging in question answering

Jeongwoo Ko Luo Si Eric Nyberg 《Information processing & management》2010

Question answering (QA) aims at finding exact answers to a user’s question from a large collection of documents. Most QA systems combine information retrieval with extraction techniques to identify a set of likely candidates and then utilize some ranking strategy to generate the final answers. This ranking process can be challenging, as it entails identifying the relevant answers amongst many irrelevant ones. This is more challenging in multi-strategy QA, in which multiple answering agents are used to extract answer candidates. As answer candidates come from different agents with different score distributions, how to merge answer candidates plays an important role in answer ranking. In this paper, we propose a unified probabilistic framework which combines multiple evidence to address challenges in answer ranking and answer merging. The hypotheses of the paper are that: (1) the framework effectively combines multiple evidence for identifying answer relevance and their correlation in answer ranking, (2) the framework supports answer merging on answer candidates returned by multiple extraction techniques, (3) the framework can support list questions as well as factoid questions, (4) the framework can be easily applied to a different QA system, and (5) the framework significantly improves performance of a QA system. An extensive set of experiments was done to support our hypotheses and demonstrate the effectiveness of the framework. All of the work substantially extends the preliminary research in Ko et al. (2007a), “A probabilistic framework for answer selection in question answering”, in Proceedings of NAACL/HLT.

6.

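A preliminary study of batch-processing methods for text and image data in Word

Wang Yulin Xiong Jun 《科技广场》2005,(10):88-90

Drawing on Word macros, fields, and the mail merge feature, this paper introduces two methods for batch-processing text and image data within the Word document window.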

7.

《Information processing & management》2022,59(2):102811

Open data is becoming ubiquitous as governments, companies, and even individuals have the option to offer more or less unrestricted access to their non-sensitive data. The benefits of open data, such as accessibility and transparency, have motivated and enabled a large number of research studies and applications in both academia and industry. However, each open data source offers only a single perspective, and its potential inherent limitations (e.g., demographic biases) may lead to poor decisions and misjudgments. This paper discusses how to create and use multiple digital lenses empowered by open data, including census data (macro lens), search logs (meso lens), and social data (micro lens), to investigate general real-world events. To reveal the unique angles and perspectives brought by each open lens, we summarize and compare the underpinning open data from eleven dimensions, such as utility, data volume, dynamic variability, and demographic fairness. Then, we propose an easy-to-use and generalized open data driven framework, which automatically retrieves multi-source data, extracts features, and trains machine learning models for the event specified by answering what, when, and where questions. With little manual effort, the framework’s generalization and automation capabilities guarantee an instant investigation of general events and phenomena, such as disasters, sports events, and political activities. We also conduct two case studies, i.e., the COVID-19 pandemic and Great American Eclipse (see Appendix), to demonstrate its feasibility and effectiveness at different time granularities.

8.

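Psychological analysis of university freshmen and coping strategies

Yang Xinyu 《黑龙江科技信息》2008,(4):160-160

Because of differences in individual circumstances, university freshmen develop a variety of psychological problems in their studies and daily life after enrolling. Drawing on many years of experience as a student counselor, the author proposes a series of coping methods and measures for these problems, to help students overcome psychological distress and settle quickly into their new role.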

9.

Shengli Wu Sally McClean 《Information processing & management》2006

The data fusion technique has been investigated by many researchers and has been used in implementing several information retrieval systems. However, the results from data fusion vary in different situations. To find out under which condition data fusion may lead to performance improvement is an important issue. In this paper, we present an analysis of the behaviour of several well-known methods such as CombSum and CombMNZ for fusion of multiple information retrieval results. Based on this analysis, we predict the performance of the data fusion methods. Experiments are conducted with three groups of results submitted to TREC 6, TREC 2001, and TREC 2004. The experiments show that the prediction of the performance of data fusion is quite accurate, and it can be used in situations very different from the training examples. Compared with previous work, our result is more accurate and in a better position for applications, since a varying number of component systems can be supported, whereas only two were used previously.
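
The two fusion methods named in the abstract can be sketched as follows. This is a minimal illustration of the textbook definitions (CombSum sums a document's scores across systems; CombMNZ multiplies that sum by the number of systems that retrieved the document), not the authors' experimental code. The min-max normalization step and all function names are assumptions made for the example.

```python
from collections import defaultdict


def min_max_normalize(run):
    """Scale one system's scores into [0, 1] (assumed normalization step)."""
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        # A run with identical scores carries no ranking signal; map to 1.0.
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}


def comb_fuse(runs, method="combsum"):
    """Fuse several {doc_id: score} runs with CombSum or CombMNZ."""
    totals = defaultdict(float)  # sum of normalized scores per document
    hits = defaultdict(int)      # number of systems that retrieved the document
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            totals[doc] += score
            hits[doc] += 1
    if method == "combmnz":
        fused = {doc: totals[doc] * hits[doc] for doc in totals}
    else:
        fused = dict(totals)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

Calling `comb_fuse(runs, "combmnz")` on the same input as `comb_fuse(runs)` shows how CombMNZ additionally rewards documents retrieved by several systems.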

10.

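Qualia from a naturalistic perspective: Edelman's new philosophical interpretation based on brain science

Chen Si 《科学技术与辩证法》2013,(6):26-30

The ontological status of qualia, their form of existence, and their relationship to neural activity in the brain are the subject of much debate in contemporary Western philosophy of mind. Edelman places these questions within a naturalistic perspective and affirms the ontological status of qualia from the standpoint of brain science. He takes the whole-part relationship as the naturalistic theoretical basis for the existence of qualia, proposes on that basis the “dynamic core hypothesis” to explain their form of existence, and uses the theory of “entailment” to show that qualia and neural activity in the brain are linked by an unconventional kind of causal relation.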

11.

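Design and implementation of an automatic question answering system

Wang Zhenghua Han Yongguo 《人天科学研究》2014,(9):111-113

Automatic question answering systems build natural language knowledge and applications on top of search engines and, compared with traditional search engines that rely on keyword matching, can better satisfy users' retrieval needs. This paper introduces a model of an automatic question answering system for computer operating systems, describes the development process, and designs and implements an automatic question answering system for the computer operating systems domain. Practice shows that the system answers users' questions fairly accurately.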

12.

Examining the Authority and Ranking Effects as the result list depth used in data fusion is varied

Anselm Spoerri 《Information processing & management》2007

The Authority and Ranking Effects play a key role in data fusion. The former refers to the fact that the potential relevance of a document increases exponentially as the number of systems retrieving it increases and the latter to the phenomenon that documents higher up in ranked lists and found by more systems are more likely to be relevant. Data fusion methods commonly use all the documents returned by the different retrieval systems being compared. Yet, as documents further down in the result lists are considered, a document’s probability of being relevant decreases significantly and a major source of noise is introduced. This paper presents a systematic examination of the Authority and Ranking Effects as the number of documents in the result lists, called the list depth, is varied. Using TREC 3, 7, 8, 12 and 13 data, it is shown that the Authority and Ranking Effects are present at all list depths. However, if the systems in the same TREC track retrieve a large number of relevant documents, then the Ranking Effect only begins to emerge as more systems have found the same document and/or the list depth increases. It is also shown that the Authority and Ranking Effects are not an artifact of how the TREC test collections have been constructed.

13.

Unsupervised Latent Dirichlet Allocation for supervised question classification

Saeedeh Momtazi 《Information processing & management》2018,54(3):380-393

Question answering systems assist users in satisfying their information needs more precisely by providing focused responses to their questions. Among the various systems developed for such a purpose, community-based question answering has recently received researchers’ attention due to the large amount of user-generated questions and answers in social question-and-answer platforms. Reusing such data sources requires an accurate information retrieval component enhanced by a question classifier. The question classification gives the system information about question categories, so that it can focus on questions and answers from categories relevant to the input question. In this paper, we propose a new method based on unsupervised Latent Dirichlet Allocation for classifying questions in community-based question answering. Our method first uses unsupervised topic modeling to extract topics from a large amount of unlabeled data. The learned topics are then used in the training phase to find their association with the available category labels in the training data. The category mixture of topics is finally used to predict the label of unseen data.

14.

《Information processing & management》2023,60(4):103367

Many existing biomedical extractive question answering methods are based on pre-trained models, but they do not take full advantage of the hidden layer knowledge of pre-trained models and do not consider span overlap between answers when predicting. To address these issues, we propose a new question answering model, called ALBERT with Dynamic Routing and Answer Voting (ADRAV). ADRAV can reasonably utilize hidden layer knowledge through dynamic routing, and consider span similarity between answers through answer voting. To improve the performance of the model, we also carry out pre-fine-tuning, and add a dynamic parameter adjustment mechanism in the process of pre-fine-tuning. Experimental results show that our model achieves significant performance improvement with fewer parameters on BioASQ 4b, 5b, 6b, 9b, and outperforms SOTA baselines on BioASQ 4b, 6b.

15.

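Safety hazards in computer rooms and countermeasures

Liu Shengda 《科技广场》2006,(4):16-17

Safe production bears on the safety of people's lives and property and on the overall situation of reform and opening up, economic development, and social stability. Governments at all levels have consistently attached great importance to safe production. For a computer room, safe production means safety management, that is, safety management comes first, so every computer room administrator should take ensuring the safe and reliable operation of the equipment as the working goal. This paper surveys safety hazards in computer rooms and proposes countermeasures.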

16.

Assigning appropriate weights for the linear combination data fusion method in information retrieval

Shengli Wu Yaxin Bi Xiaoqin Zeng Lixin Han 《Information processing & management》2009

In data fusion, the linear combination method is a very flexible method since different weights can be assigned to different systems. However, it remains an open question which weighting schema should be used. In some previous investigations and experiments, a simple weighting schema was used: for a system, its weight is assigned as its average performance over a group of training queries. However, it is not clear if this weighting schema is good or not. In some other investigations, different numerical optimisation methods were used to search for appropriate weights for the component systems. One major problem with those numerical optimisation methods is their low efficiency. It might not be feasible to use them in some situations, for example in dynamic environments where system weights need to be updated from time to time for reasonably good performance. In this paper, we investigate the weighting issue by extensive experiments. The key point is to try to find the relation between performances of component systems and their corresponding weights which can lead to good fusion performance. We demonstrate that a series of power functions of average performance, which can be implemented as efficiently as the simple weighting schema, is more effective than the simple weighting schema for the linear data fusion method. Some other features of the power function weighting schema and the linear combination method are also investigated. The observations obtained from this study can be used directly in fusion applications of component retrieval results. The observations are also very useful for optimisation methods to choose better starting points and therefore to obtain more effective weights more quickly.
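
As a rough illustration of the weighting schemas contrasted in the abstract, the sketch below weights each system by a power of its average training performance and then forms the weighted sum of scores. The abstract does not give the exponents studied, so `exponent` is a free parameter here, and the function names and the dictionary-based run format are assumptions made for the example. Setting `exponent=1.0` recovers the simple schema.

```python
def power_weights(avg_performance, exponent=2.0):
    """Weight each system by (average training performance) ** exponent.

    exponent=1.0 is the simple schema described in the abstract; the
    default of 2.0 is an arbitrary illustrative choice.
    """
    return {name: perf ** exponent for name, perf in avg_performance.items()}


def linear_combination(runs, weights):
    """Fuse {system: {doc_id: score}} runs as a weighted sum of scores."""
    fused = {}
    for name, run in runs.items():
        for doc, score in run.items():
            fused[doc] = fused.get(doc, 0.0) + weights[name] * score
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

Scores are assumed to be already normalized to a common range before fusion; raising the exponent widens the gap between strong and weak systems.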

17.

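A study of the textual function of code-switching: an analysis of cohesive devices in the lyrics of 《挥着翅膀的女孩》

Jiao Yanan Jin Pengsun 《中国科技纵横》2011,(14):161-161

Code-switching is a common phenomenon in the verbal communication of bilingual and multilingual speakers. Starting from the cohesive devices of the textual function in Halliday's systemic functional linguistics, this paper analyzes code-switching in the lyrics of 《挥着翅膀的女孩》, with the aim of exploring the feasibility of applying the textual function of systemic functional linguistics to the study of code-switching in sociolinguistics.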

18.

M.T. Martín-Valdivia M.C. Díaz-Galiano A. Montejo-Raez L.A. Ureña-López 《Information processing & management》2008

Nowadays, access to information requires managing multimedia databases effectively, and so multi-modal retrieval techniques (particularly image retrieval) have become an active research direction. In the past few years, many content-based image retrieval (CBIR) systems have been developed. However, despite the progress achieved in CBIR, the retrieval accuracy of current systems is still limited and often worse than that of purely textual information retrieval systems. In this paper, we propose to combine content-based and text-based approaches to multi-modal retrieval in order to achieve better results and overcome the shortcomings of these techniques when they are taken separately. For this purpose, we use a medical collection that includes both images and non-structured text. We retrieve images from a CBIR system and textual information through a traditional information retrieval system. Then, we combine the results obtained from both systems in order to improve the final performance. Furthermore, we use the information gain (IG) measure to reduce and improve the textual information included in multi-modal information retrieval systems. We have carried out several experiments that combine this reduction technique with a visual and textual information merger. The results obtained are highly promising and show the benefit obtained when textual information is managed to improve conventional multi-modal systems.
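
The abstract does not say how the information gain measure is applied to the collection, so the sketch below shows only the textbook definition: the reduction in class entropy obtained by knowing whether a term is present. Representing documents as sets of terms, the class labels, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """IG(term) = H(labels) - H(labels | term present or absent).

    docs is a list of term sets, aligned with labels.
    """
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]
    n = len(labels)
    # Weighted entropy of the two partitions; empty partitions contribute nothing.
    conditional = sum(len(part) / n * entropy(part)
                      for part in (with_term, without_term) if part)
    return entropy(labels) - conditional
```

Terms (or text fields) with low information gain would be candidates for removal under a reduction scheme of this kind; the selection threshold is left to the application.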

19.

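Application of multi-source remote sensing data to vegetation identification and extraction Total citations: 4, self-citations: 0, citations by others: 4

Gao Xiaolan Wang Xiaoqin 《资源科学》2008,30(1):153-158

Different types of remote sensing data each have their own distinctive strengths; used together, they complement one another and improve the accuracy of ground-object identification. Taking Zhangpu County, Fujian Province, as the study area, this paper uses multi-source remote sensing data including SPOT5, ASTER, and CBERS to study methods for identifying and extracting vegetation, and establishes a thematic information extraction workflow based on multi-source remote sensing data. An expert library for the automatic extraction of different vegetation themes is first designed and used for thematic extraction from each single remote sensing dataset; decision-level fusion of the vegetation information is then performed on the basis of expert knowledge. The advantage of the information provided by multi-source remote sensing data is that the spectral information and temporal characteristics of different sensors complement one another: using the characteristics of different vegetation types in the different datasets together with expert knowledge, membership functions are established, the class of each pixel is determined, and thematic extraction of the different vegetation types in the study area is completed. The results show that, compared with results from single-sensor data, the combined use of multi-source remote sensing data substantially improves the accuracy of vegetation extraction.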

20.

Automatic ranking of information retrieval systems using data fusion

Rabia Nuray Fazli Can 《Information processing & management》2006

Measuring the effectiveness of information retrieval (IR) systems is essential for research and development and for monitoring search quality in dynamic environments. In this study, we employ new methods for automatic ranking of retrieval systems. In these methods, we merge the retrieval results of multiple systems using various data fusion algorithms, use the top-ranked documents in the merged result as the “(pseudo) relevant documents,” and employ these documents to evaluate and rank the systems. Experiments using Text REtrieval Conference (TREC) data provide statistically significant strong correlations with human-based assessments of the same systems. We hypothesize that the selection of systems that would return documents different from the majority could eliminate the ordinary systems from data fusion and provide better discrimination among the documents and systems. This could improve the effectiveness of automatic ranking. Based on this intuition, we introduce a new method for the selection of systems to be used for data fusion. For this purpose, we use the bias concept that measures the deviation of a system from the norm or majority and employ the systems with higher bias in the data fusion process. This approach provides even higher correlations with the human-based results. We demonstrate that our approach outperforms the previously proposed automatic ranking methods.
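
The merge-then-evaluate procedure described in the abstract can be sketched as follows. The abstract mentions "various data fusion algorithms" without naming them here, so a Borda-style count is used as a stand-in; the precision-at-depth measure, the parameter values, and the function names are likewise assumptions. The bias-based selection of systems is not modelled.

```python
def borda_merge(rankings):
    """Borda-style fusion (assumed): rank r in a list of length n earns n - r points."""
    points = {}
    for ranking in rankings.values():
        n = len(ranking)
        for r, doc in enumerate(ranking):
            points[doc] = points.get(doc, 0) + (n - r)
    # Most points first; ties broken by document id for determinism.
    return sorted(points, key=lambda d: (-points[d], d))


def rank_systems(rankings, pseudo_k=2, eval_depth=3):
    """Rank systems by precision against pseudo-relevant documents.

    The top pseudo_k documents of the merged list are treated as relevant,
    and each system is scored by precision at eval_depth.
    """
    pseudo_relevant = set(borda_merge(rankings)[:pseudo_k])
    scores = {
        name: sum(doc in pseudo_relevant for doc in ranking[:eval_depth]) / eval_depth
        for name, ranking in rankings.items()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

No human relevance judgments enter the computation: a system scores well only insofar as it agrees with the fused list, which is why the choice of systems admitted to the fusion matters.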