首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Satisfying information needs with multi-document summaries   总被引:2,自引:0,他引:2  
Generating summaries that meet the information needs of a user relies on (1) several forms of question decomposition; (2) different summarization approaches; and (3) textual inference for combining the summarization strategies. This novel framework for summarization has the advantage of producing highly responsive summaries, as indicated by the evaluation results.  相似文献   

2.
With the advances in natural language processing (NLP) techniques and the need to deliver more fine-grained information or answers than a set of documents, various QA techniques have been developed corresponding to different question and answer types. A comprehensive QA system must be able to incorporate individual QA techniques as they are developed and integrate their functionality to maximize the system’s overall capability in handling increasingly diverse types of questions. To this end, a new QA method was developed to learn strategies for determining module invocation sequences and boosting answer weights for different types of questions. In this article, we examine the roles and effects of the answer verification and weight boosting method, which is the main core of the automatically generated strategy-driven QA framework, in comparison with a strategy-less, straightforward answer-merging approach and a strategy-driven but with manually constructed strategies.  相似文献   

3.
This paper presents QACID an ontology-based Question Answering system applied to the CInema Domain. This system allows users to retrieve information from formal ontologies by using as input queries formulated in natural language. The original characteristic of QACID is the strategy used to fill the gap between users’ expressiveness and formal knowledge representation. This approach is based on collections of user queries and offers a simple adaptability to deal with multilingual capabilities, inter-domain portability and changes in user information requirements. All these capabilities permit developing Question Answering applications for actual users. This system has been developed and tested on the Spanish language and using an ontology modelling the cinema domain. The performance level achieved enables the use of the system in real environments.  相似文献   

4.
In the last decade, OnLine Analytical Processing (OLAP) has taken an increasingly important role as a research field. Solutions, techniques and tools have been provided for both databases and data warehouses to focus mainly on numerical data. however these solutions are not suitable for textual data. Therefore recently, there has been a huge need for new tools and approaches that treat and manipulate textual data and aggregate it as well. Textual aggregation techniques emerge as a key tool to perform textual data analysis in OLAP for decision support systems. This paper aims at providing a structured and comprehensive overview of the literature in the field of OLAP Textual Aggregation. We provide a new classification framework in which the existing textual aggregation approaches are grouped into two main classes, namely approaches based on cube structure and approaches based on text mining. We discuss and synthesize also the potential of textual similarity metrics, and we provide a recent classification of them.  相似文献   

5.
Question answering (QA) aims at finding exact answers to a user’s question from a large collection of documents. Most QA systems combine information retrieval with extraction techniques to identify a set of likely candidates and then utilize some ranking strategy to generate the final answers. This ranking process can be challenging, as it entails identifying the relevant answers amongst many irrelevant ones. This is more challenging in multi-strategy QA, in which multiple answering agents are used to extract answer candidates. As answer candidates come from different agents with different score distributions, how to merge answer candidates plays an important role in answer ranking. In this paper, we propose a unified probabilistic framework which combines multiple evidence to address challenges in answer ranking and answer merging. The hypotheses of the paper are that: (1) the framework effectively combines multiple evidence for identifying answer relevance and their correlation in answer ranking, (2) the framework supports answer merging on answer candidates returned by multiple extraction techniques, (3) the framework can support list questions as well as factoid questions, (4) the framework can be easily applied to a different QA system, and (5) the framework significantly improves performance of a QA system. An extensive set of experiments was done to support our hypotheses and demonstrate the effectiveness of the framework. All of the work substantially extends the preliminary research in Ko et al. (2007a). A probabilistic framework for answer selection in question answering. In: Proceedings of NAACL/HLT.  相似文献   

6.
王玉林  熊军 《科技广场》2005,(10):88-90
本文结合Word宏命令、域以及邮件合并功能,介绍了两种在Word文档窗口内实现图文数据批量处理的方法.  相似文献   

7.
    
Open data is becoming ubiquitous as governments, companies, and even individuals have the option to offer more or less unrestricted access to their non-sensitive data. The benefits of open data, such as accessibility and transparency, have motivated and enabled a large number of research studies and applications in both academia and industry. However, each open data only offers a single perspective, and its potential inherent limitations (e.g., demographic biases) may lead to poor decisions and misjudgments. This paper discusses how to create and use multiple digital lenses empowered by open data, including census data (macro lens), search logs (meso lens), and social data (micro lens), to investigate general real-world events. To reveal the unique angles and perspectives brought by each open lens, we summarize and compare the underpinning open data from eleven dimensions, such as utility, data volume, dynamic variability, and demographic fairness. Then, we propose an easy-to-use and generalized open data driven framework, which automatically retrieves multi-source data, extracts features, and trains machine learning models for the event specified by answering what, when, and where questions. With low labor efforts, the framework’s generalization and automation capabilities guarantee an instant investigation of general events and phenomena, such as disasters, sports events, and political activities. We also conduct two case studies, i.e., the COVID-19 pandemic and Great American Eclipse (see Appendix), to demonstrate its feasibility and effectiveness at different time granularities.  相似文献   

8.
大学新生入校后由于个人状况的差异,在学习和生活上会产生各式各样的心理问题.针对这些心理问题结合自己在辅导员岗位上多年的工作经验,提出了一系列的应对方法与措施,以帮助学生走出心理阴影,快速进入角色.  相似文献   

9.
    
The data fusion technique has been investigated by many researchers and has been used in implementing several information retrieval systems. However, the results from data fusion vary in different situations. To find out under which condition data fusion may lead to performance improvement is an important issue. In this paper, we present an analysis of the behaviour of several well-known methods such as CombSum and CombMNZ for fusion of multiple information retrieval results. Based on this analysis, we predict the performance of the data fusion methods. Experiments are conducted with three groups of results submitted to TREC 6, TREC 2001, and TREC 2004. The experiments show that the prediction of the performance of data fusion is quite accurate, and it can be used in situations very different from the training examples. Compared with previous work, our result is more accurate and in a better position for applications since various number of component systems can be supported while only two was used previously.  相似文献   

10.
关于感受质的本体论地位、存在形式及其与大脑神经活动之间的关系在当代西方心灵哲学中存在许多争论。埃德尔曼把上述问题集中放置于自然主义视域中,从脑科学的维度肯定了感受质的本体论地位。他把整体一部分关系作为感受质存在的自然主义理论根据,并在此基础上提出“动态核心假说”来说明感受质的存在形式,用“蕴涵关系”理论来说明感受质与大脑神经活动之间具有一种非常规的因果关联性。  相似文献   

11.
自动问答系统在搜索引擎的基础上融入了自然语言的知识与应用,与传统的依靠关键字匹配的搜索引擎相比,能够更好地满足用户的检索需求。介绍了计算机操作系统自动问答系统模型,阐述了具体开发过程,设计并实现了基于计算机操作系统领域的自动问答系统,实践表明该系统能够较为准确地回答用户问题。  相似文献   

12.
The Authority and Ranking Effects play a key role in data fusion. The former refers to the fact that the potential relevance of a document increases exponentially as the number of systems retrieving it increases and the latter to the phenomena that documents higher up in ranked lists and found by more systems are more likely to be relevant. Data fusion methods commonly use all the documents returned by the different retrieval systems being compared. Yet, as documents further down in the result lists are considered, a document’s probability of being relevant decreases significantly and a major source of noise is introduced. This paper presents a systematic examination of the Authority and Ranking Effects as the number of documents in the result lists, called the list depth, is varied. Using TREC 3, 7, 8, 12 and 13 data, it is shown that the Authority and Ranking Effects are present at all list depths. However, if the systems in the same TREC track retrieve a large number of relevant documents, then the Ranking Effect only begins to emerge as more systems have found the same document and/or the list depth increases. It is also shown that the Authority and Ranking Effects are not an artifact of how the TREC test collections have been constructed.  相似文献   

13.
Question answering systems assist users in satisfying their information needs more precisely by providing focused responses to their questions. Among the various systems developed for such a purpose, community-based question answering has recently received researchers’ attention due to the large amount of user-generated questions and answers in social question-and-answer platforms. Reusing such data sources requires an accurate information retrieval component enhanced by a question classifier. The question classification gives the system the possibility to have information about question categories to focus on questions and answers from relevant categories to the input question. In this paper, we propose a new method based on unsupervised Latent Dirichlet Allocation for classifying questions in community-based question answering. Our method first uses unsupervised topic modeling to extract topics from a large amount of unlabeled data. The learned topics are then used in the training phase to find their association with the available category labels in the training data. The category mixture of topics is finally used to predict the label of unseen data.  相似文献   

14.
    
Many existing biomedical extractive question answering methods are based on pre-trained models, which do not take full advantage of the hidden layer knowledge of pretrained models and do not consider span overlap between answers when predicting. To address these issues, we propose a new question answering model, called ALBERT with Dynamic Routing and Answer Voting (ADRAV). The ADRAV can reasonably utilize hidden layer knowledge through dynamic routing, and consider span similarity between answers through answer voting. To improve the performance of the model, we also carry out pre-fine-tuning, and add a dynamic parameter adjustment mechanism in the process of pre-fine-tuning. Experimental results show that our model achieves significant performance improvement with fewer parameters on BioASQ 4b, 5b, 6b, 9b, and outperforms SOTA baselines on BioASQ 4b, 6b.  相似文献   

15.
安全生产关系人民群众生命财产安全,关系改革开放、经济发展和社会稳定的大局。各级政府一直高度重视安全生产工作。对于机房来说,安全生产就是安全管理,也就是说安全管理第一,因此每个机房管理者应以保障机房设备安全可靠运行为工作目标。本文对计算机机房安全隐患进行综述,并提出应对措施。  相似文献   

16.
In data fusion, the linear combination method is a very flexible method since different weights can be assigned to different systems. However, it remains an open question which weighting schema should be used. In some previous investigations and experiments, a simple weighting schema was used: for a system, its weight is assigned as its average performance over a group of training queries. However, it is not clear if this weighting schema is good or not. In some other investigations, different numerical optimisation methods were used to search for appropriate weights for the component systems. One major problem with those numerical optimisation methods is their low efficiency. It might not be feasible to use them in some situations, for example in some dynamic environments, system weights need to be updated from time to time for reasonably good performance. In this paper, we investigate the weighting issue by extensive experiments. The key point is to try to find the relation between performances of component systems and their corresponding weights which can lead to good fusion performance. We demonstrate that a series of power functions of average performance, which can be implemented as efficiently as the simple weighting schema, is more effective than the simple weighting schema for the linear data fusion method. Some other features of the power function weighting schema and the linear combination method are also investigated. The observations obtained from this study can be used directly in fusion applications of component retrieval results. The observations are also very useful for optimisation methods to choose better starting points and therefore to obtain more effective weights more quickly.  相似文献   

17.
语码转换是双语或多语者在语言交际中的一种普遍现象。本文从韩礼得的系统功能语言学的语篇功能中衔接手段出发,分析《挥着翅膀的女孩》歌词语篇中的语码转换语现象,旨在探讨系统功能语言学中语篇功能在社会语言学语码转换现象研究领域的可行性。  相似文献   

18.
    
Nowadays, access to information requires managing multimedia databases effectively, and so, multi-modal retrieval techniques (particularly images retrieval) have become an active research direction. In the past few years, a lot of content-based image retrieval (CBIR) systems have been developed. However, despite the progress achieved in the CBIR, the retrieval accuracy of current systems is still limited and often worse than only textual information retrieval systems. In this paper, we propose to combine content-based and text-based approaches to multi-modal retrieval in order to achieve better results and overcome the lacks of these techniques when they are taken separately. For this purpose, we use a medical collection that includes both images and non-structured text. We retrieve images from a CBIR system and textual information through a traditional information retrieval system. Then, we combine the results obtained from both systems in order to improve the final performance. Furthermore, we use the information gain (IG) measure to reduce and improve the textual information included in multi-modal information retrieval systems. We have carried out several experiments that combine this reduction technique with a visual and textual information merger. The results obtained are highly promising and show the profit obtained when textual information is managed to improve conventional multi-modal systems.  相似文献   

19.
多源遥感数据在植被识别和提取中的应用   总被引:4,自引:0,他引:4       下载免费PDF全文
高晓岚  汪小钦 《资源科学》2008,30(1):153-158
不同类型的遥感数据有着自己独特的优势,如果综合应用,可以实现信息的互补,提高地物的识别精度。本文以福建省漳浦县为研究区域,利用SPOT5、ASTER和CBERS等多源遥感数据对植被的识别和提取方法进行研究,建立了基于多源遥感数据的专题信息提取流程。首先设计了基于不同植被专题信息自动提取的专家库,对单一遥感数据进行专题提取,然后基于专家知识进行决策级植被信息融合。多源遥感数据所提供的信息的优越性在于可以将不同传感器的光谱信息和时相特征进行互补,利用不同植被在不同遥感数据上的特征和专家知识,建立隶属度函数,判剐每个像元的归属,完成研究区不同植被类型的专题提取。结果表明,与单一传感器数据的结果相比,综合利用多源遥感数据能较大程度地提高植被的提取精度。  相似文献   

20.
Measuring effectiveness of information retrieval (IR) systems is essential for research and development and for monitoring search quality in dynamic environments. In this study, we employ new methods for automatic ranking of retrieval systems. In these methods, we merge the retrieval results of multiple systems using various data fusion algorithms, use the top-ranked documents in the merged result as the “(pseudo) relevant documents,” and employ these documents to evaluate and rank the systems. Experiments using Text REtrieval Conference (TREC) data provide statistically significant strong correlations with human-based assessments of the same systems. We hypothesize that the selection of systems that would return documents different from the majority could eliminate the ordinary systems from data fusion and provide better discrimination among the documents and systems. This could improve the effectiveness of automatic ranking. Based on this intuition, we introduce a new method for the selection of systems to be used for data fusion. For this purpose, we use the bias concept that measures the deviation of a system from the norm or majority and employ the systems with higher bias in the data fusion process. This approach provides even higher correlations with the human-based results. We demonstrate that our approach outperforms the previously proposed automatic ranking methods.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号