Similar Documents
Found 20 similar documents; search took 15 ms
1.
Machine reading comprehension (MRC) is a challenging task in the field of artificial intelligence. Most existing MRC works contain a semantic matching module, either explicitly or intrinsically, to determine whether a piece of context answers a question. However, there is scant work that systematically evaluates different paradigms of using semantic matching in MRC. In this paper, we conduct a systematic empirical study on semantic matching. We formulate a two-stage framework, consisting of a semantic matching model and a reading model, based on pre-trained language models. We compare and analyze the effectiveness and efficiency of semantic matching modules with different setups on four types of MRC datasets. We verify that applying semantic matching before a reading model improves both the effectiveness and efficiency of MRC. Compared with answering questions by extracting information from concise context, we observe that semantic matching yields larger improvements for answering questions with noisy and adversarial context. Matching coarse-grained context (e.g., paragraphs) to questions is more effective than matching fine-grained context (e.g., sentences and spans). We also find that semantic matching helps in answering who/where/when/what/how/which questions, whereas it decreases MRC performance on why questions. This may imply that semantic matching helps to answer questions whose necessary information can be retrieved from a single sentence. The above observations demonstrate the advantages and disadvantages of using semantic matching in different scenarios.
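The two-stage setup evaluated above (semantic matching before reading) can be sketched as follows. This is a minimal illustration, not the paper's system: a bag-of-words overlap score stands in for the pre-trained semantic matching model, and a candidate-lookup reader stands in for the reading model.

```python
def match_score(question, paragraph):
    """Toy semantic matcher: fraction of question words found in the paragraph."""
    q_words = set(question.lower().split())
    p_words = set(paragraph.lower().split())
    return len(q_words & p_words) / max(len(q_words), 1)

def two_stage_mrc(question, paragraphs, answer_candidates, threshold=0.3):
    """Stage 1: keep only paragraphs the matcher deems relevant.
    Stage 2: 'read' the surviving paragraphs for a known candidate answer."""
    relevant = [p for p in paragraphs if match_score(question, p) >= threshold]
    for p in relevant:
        for cand in answer_candidates:
            if cand.lower() in p.lower():
                return cand, len(relevant)
    return None, len(relevant)

paragraphs = [
    "The Eiffel Tower was completed in 1889 in Paris.",
    "Bananas are rich in potassium and easy to grow.",
]
answer, n_read = two_stage_mrc(
    "When was the Eiffel Tower completed?",
    paragraphs,
    answer_candidates=["1889", "potassium"],
)
```

The efficiency gain the study reports comes from the reader only processing `n_read` filtered paragraphs instead of all of them.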

2.
Question answering has become one of the most popular information retrieval applications. Although most question-answering systems try to improve the user experience and the technology used in finding relevant results, many difficulties remain because of the continuous increase in the amount of web content. Question classification (QC) plays an important role in question-answering systems, and one of the major tasks in enhancing the classification process is the identification of question types. A broad range of QC approaches has been proposed to help solve the classification problem; most of them are based on bags-of-words or dictionaries. In this research, we present an analysis of different types of questions based on their grammatical structure. We identify distinct patterns and use machine learning algorithms to classify them. We propose a framework for question classification using a grammar-based approach (GQCC), which exploits the structure of the questions. Our findings indicate that using syntactic categories related to different domain-specific types of Common Nouns, Numeral Numbers and Proper Nouns enables the machine learning algorithms to better differentiate between question types. The paper presents a wide range of experiments; the results show that GQCC with the J48 classifier outperforms other classification methods, reaching 90.1% accuracy.
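A minimal sketch of pattern-driven question-type classification in the spirit of the framework above. The regular-expression patterns and class labels here are invented simplifications, not the paper's actual GQCC grammar rules or J48 features.

```python
import re

# Each pattern maps a question's surface structure to an (assumed) type label.
PATTERNS = [
    (re.compile(r"^who\b", re.I), "HUMAN"),
    (re.compile(r"^where\b", re.I), "LOCATION"),
    (re.compile(r"^when\b", re.I), "TIME"),
    (re.compile(r"^how (many|much)\b", re.I), "NUMERIC"),
    (re.compile(r"^(what|which)\b", re.I), "ENTITY"),
]

def classify_question(question):
    """Return the type of the first matching pattern, or OTHER."""
    for pattern, label in PATTERNS:
        if pattern.search(question.strip()):
            return label
    return "OTHER"
```

A real grammar-based system would additionally inspect parse-tree categories (the common-noun/numeral/proper-noun distinctions mentioned above) rather than only the leading wh-word.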

3.
This paper presents QACID, an ontology-based Question Answering system applied to the CInema Domain. The system allows users to retrieve information from formal ontologies using queries formulated in natural language as input. The original characteristic of QACID is the strategy used to bridge the gap between users' expressiveness and formal knowledge representation. This approach is based on collections of user queries and adapts easily to multilingual settings, inter-domain portability and changes in user information requirements. These capabilities make it possible to develop Question Answering applications for real users. The system has been developed and tested for Spanish, using an ontology modelling the cinema domain. The performance level achieved enables the use of the system in real environments.
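The collection-of-queries strategy can be illustrated with a toy sketch: each collected natural-language pattern is linked to a formal query template over the ontology. The pattern and the SPARQL-like template below are assumptions for illustration, not QACID's actual resources.

```python
import re

# Hypothetical pattern collection: regex -> formal query template.
QUERY_PATTERNS = [
    (re.compile(r"^who directed (?P<film>.+?)\??$", re.I),
     'SELECT ?d WHERE {{ ?f rdfs:label "{film}" . ?f cine:director ?d }}'),
]

def nl_to_formal(query):
    """Map a natural-language query to a formal query via the collected patterns."""
    for pattern, template in QUERY_PATTERNS:
        m = pattern.match(query.strip())
        if m:
            return template.format(**m.groupdict())
    return None
```

Adding a language or a domain then amounts to adding pattern/template pairs, which is the adaptability the abstract emphasizes.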

4.
Reading Comprehension tests are commonly used to assess the degree to which people comprehend what they read. We therefore work with the hypothesis that it is reasonable to use these tests to assess the degree to which a machine “comprehends” what it is reading. In this work, we evaluate Question Answering systems using Reading Comprehension tests from university entrance exams. This article analyses the datasets generated, the kinds of inferences required, the methodology followed in three evaluation campaigns, the approaches presented by participants, and the current results. In addition, we study the evolution of systems and the main lessons learned in this evaluation process. We also show that current technologies are unable to pass university entrance exams, because these tests require a deep understanding of texts as well as the ability to detect that phrases with different words have similar meanings. Future directions focused on these ideas seem more promising than feeding systems massive amounts of training data, which is what has allowed systems to obtain outstanding results on Reading Comprehension tests with more straightforward questions. We believe this study helps to increase the knowledge of how to develop better Question Answering systems.

5.
Recently, reinforcement learning (RL)-based methods have achieved remarkable progress in both effectiveness and interpretability for complex question answering over knowledge bases (KBQA). However, existing RL-based methods share a common limitation: the agent is usually misled by aimless exploration, as well as by sparse and delayed rewards, leading to a large number of spurious relation paths. To address this issue, a new adaptive reinforcement learning (ARL) framework is proposed to learn a better and more interpretable model for complex KBQA. First, instead of using a random-walk agent, an adaptive path generator is developed with three atomic operations to sequentially generate the relation paths until the agent reaches the target entity. Second, a semantic policy network is presented with both character-level and sentence-level information to better guide the agent. Finally, a new reward function is introduced that considers both the relation paths and the target entity to alleviate sparse and delayed rewards. The empirical results on five benchmark datasets show that our model is more effective than state-of-the-art approaches. Compared with the strong baseline model SRN, the proposed model achieves a performance improvement of 23.7% on MetaQA-3 in terms of Hits@1.
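The reward design described above (combining the relation path with the target entity, rather than a sparse terminal reward alone) can be sketched as follows; the path-scoring heuristic and the weighting `alpha` are invented for illustration.

```python
def path_relevance(path, question_words):
    """Toy path score: fraction of relations sharing a word with the question."""
    hits = sum(any(w in rel for w in question_words) for rel in path)
    return hits / len(path) if path else 0.0

def reward(reached_target, path, question_words, alpha=0.5):
    """Blend the terminal (target-entity) reward with a dense path-quality term,
    so the agent receives a learning signal even before reaching the target."""
    terminal = 1.0 if reached_target else 0.0
    return alpha * terminal + (1 - alpha) * path_relevance(path, question_words)
```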

6.
Question categorization, which suggests one of a set of predefined categories for a user's question according to the question's topic or content, is a useful technique in user-interactive question answering systems. In this paper, we propose an automatic method for question categorization in a user-interactive question answering system. The method comprises four steps: feature space construction, topic-wise word identification and weighting, semantic mapping, and similarity calculation. We first construct the feature space based on all accumulated questions and calculate the feature vector of each predefined category, which contains certain accumulated questions. When a new question is posted, the semantic pattern of the question is used to identify and weight its important words. The question is then semantically mapped into the constructed feature space to enrich its representation. Finally, the similarity between the question and each category is calculated based on their feature vectors, and the category with the highest similarity is assigned to the question. The experimental results show that our proposed method achieves good categorization precision and outperforms traditional categorization methods on the selected test questions.
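The four-step pipeline above can be reduced to a toy skeleton: bag-of-words feature vectors, per-category centroids built from accumulated questions, and cosine similarity for assignment. Topic-wise word weighting and semantic mapping are omitted for brevity, so this is only the outline of the method.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words feature vector of a question."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def categorize(question, categories):
    """categories: dict mapping category name -> list of accumulated questions.
    Each category is represented by the sum (centroid) of its question vectors."""
    q_vec = vectorize(question)
    centroids = {
        name: sum((vectorize(q) for q in qs), Counter())
        for name, qs in categories.items()
    }
    return max(centroids, key=lambda name: cosine(q_vec, centroids[name]))

categories = {
    "travel": ["which airline is cheapest", "best hotel in rome"],
    "cooking": ["how long to boil pasta", "how to bake bread"],
}
best = categorize("how long should i bake bread", categories)
```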

7.
This article addresses the issue of extracting the contexts and answers of questions from posts in online discussion forums. In previous work, general-purpose graphical models have been employed without any customization to this specific extraction problem. In this article, we instead propose a unified approach to context and answer extraction by customizing the structural support vector machine method. The customization enables our proposal to explore various relations among the sentences of posts and the complex structures of threads. We design new inference algorithms to find or approximate the most violated constraint by exploiting the specific structure of forum threads, which enables us to efficiently find the global optimum of the customized optimization problem. We also optimize practical performance measures by varying the loss functions. Experimental results show that our methods are both promising and flexible.

8.
9.
Using lexical chains for keyword extraction   (cited by 9: 0 self-citations, 9 by others)
Keywords can be considered condensed versions of documents and short forms of their summaries. In this paper, the problem of automatically extracting keywords from documents is treated as a supervised learning task. A lexical chain holds a set of semantically related words of a text, and it can be said that a lexical chain represents the semantic content of a portion of the text. Although lexical chains have been extensively used in text summarization, their usage for the keyword extraction problem has not been fully investigated. In this paper, a keyword extraction technique that uses lexical chains is described, and encouraging results are obtained.
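A toy sketch of lexical-chain-based keyword extraction. Real systems link words through WordNet relations; the small hand-made `RELATED` table below stands in for that resource, so the chains and scores are purely illustrative.

```python
from collections import defaultdict

# Assumed stand-in for a lexical resource: word -> the concept/chain it joins.
RELATED = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "dog": "animal", "puppy": "animal",
}

def build_chains(words):
    """Group the text's words into chains of semantically related members."""
    chains = defaultdict(list)
    for w in words:
        concept = RELATED.get(w.lower())
        if concept:
            chains[concept].append(w.lower())
    return chains

def extract_keywords(text, top_n=1):
    """Score each chain by its length; take one representative per strong chain."""
    chains = build_chains(text.split())
    ranked = sorted(chains.items(), key=lambda kv: len(kv[1]), reverse=True)
    return [members[0] for _, members in ranked[:top_n]]

keywords = extract_keywords("the car hit a truck while an automobile passed the dog")
```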

10.
This paper describes a state-of-the-art supervised, knowledge-intensive approach to the automatic identification of semantic relations between nominals in English sentences. The system employs a combination of rich and varied sets of new and previously used lexical, syntactic, and semantic features extracted from various knowledge sources such as WordNet and additional annotated corpora. The system ranked first at the third most popular SemEval 2007 task – Classification of Semantic Relations between Nominals – achieving an F-measure of 72.4% and an accuracy of 76.3%. We also show that some semantic relations are better suited to WordNet-based models than others. Additionally, we make a distinction between out-of-context (regular) examples and those that require sentence context for relation identification, and show that contextual data are important for the performance of a noun–noun semantic parser. Finally, learning curves show that the task difficulty varies across relations and that our learned WordNet-based representation is highly accurate, so the performance results suggest an upper bound on what this representation can do.

11.
Entity linking (EL), the task of automatically matching mentions in text to concepts in a target knowledge base, remains under-explored when it comes to the food domain, despite its many potential applications, e.g., finding the nutritional value of ingredients in databases. In this paper, we describe the creation of new resources supporting the development of EL methods applied to the food domain: the E.Care Knowledge Base (E.Care KB) which contains 664 food concepts and the E.Care dataset, a corpus of 468 cooking recipes where ingredient names have been manually linked to corresponding concepts in the E.Care KB. We developed and evaluated different methods for EL, namely, deep learning-based approaches underpinned by Siamese networks trained under a few-shot learning setting, traditional machine learning-based approaches underpinned by support vector machines (SVMs) and unsupervised approaches based on string matching algorithms. Combining the strengths of each of these approaches, we built a hybrid model for food EL that balances the trade-offs between performance and inference speed. Specifically, our hybrid model obtains 89.40% accuracy and links mentions at an average speed of 0.24 seconds per mention, whereas our best deep learning-based model, SVM model and unsupervised model obtain accuracies of 86.99%, 87.19% and 87.43% at inference speeds of 0.007, 0.66 and 0.02 seconds per mention, respectively.
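The performance/speed trade-off behind the hybrid model can be illustrated with a cascade: a fast exact string match against the KB is tried first, falling back to a slower fuzzy matcher (standing in here for the Siamese network) only when the fast path fails. The KB entries and the `difflib` cutoff are assumptions for illustration.

```python
import difflib

# Made-up KB entries: surface form -> concept identifier.
KB = {"tomato": "ecare:Tomato", "olive oil": "ecare:OliveOil", "basil": "ecare:Basil"}

def link_entity(mention):
    mention = mention.lower().strip()
    if mention in KB:                      # fast path: exact match
        return KB[mention], "exact"
    # slow path: fuzzy string similarity as a stand-in for the learned model
    close = difflib.get_close_matches(mention, KB.keys(), n=1, cutoff=0.6)
    if close:
        return KB[close[0]], "fuzzy"
    return None, "unlinked"
```

Routing most mentions through the cheap exact path is what lets a cascade approach the fast model's speed while recovering some of the slow model's accuracy.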

12.
Recently, question series have become a focus of research in question answering. These series consist of individual factoid, list, and “other” questions organized around a central topic, and represent abstractions of user–system dialogs. Existing evaluation methodologies have yet to catch up with this richer task model, as they fail to take into account contextual dependencies and different user behaviors. This paper presents a novel simulation-based methodology for evaluating answers to question series that addresses some of these shortcomings. Using this methodology, we examine two different behavior models: a “QA-styled” user and an “IR-styled” user. Results suggest that an off-the-shelf document retrieval system is competitive with state-of-the-art QA systems on this task. Advantages and limitations of evaluations based on user simulations are also discussed.

13.
Stance detection identifies a person's evaluation of a subject and is a crucial component of many downstream applications. In practice, stance detection requires training a machine learning model on an annotated dataset and applying the model to another to predict the stances of text snippets. This cross-dataset model generalization poses three central questions, which we investigate using stance classification models on 7 publicly available English Twitter datasets ranging from 297 to 48,284 instances. (1) Are stance classification models generalizable across datasets? We construct single-dataset models and train/test them dataset-against-dataset, finding that models do not generalize well (avg F1=0.33). (2) Can we improve generalizability by aggregating datasets? We find that a multi-dataset model built on the aggregation of datasets has improved performance (avg F1=0.69). (3) Given a model built on multiple datasets, how much additional data is required to fine-tune it? We find it challenging to ascertain a minimum number of data points due to the lack of pattern in performance. Investigating possible reasons for the choppy model performance, we find that texts are not easily differentiable by stance, nor are annotations consistent within and across datasets. Our observations emphasize the need for an aggregated dataset as well as consistent labels for the generalizability of models.
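The dataset-against-dataset protocol in question (1) and the aggregation in question (2) can be mimicked with a deliberately tiny example. The keyword-voting classifier and the two two-instance datasets below are toys that merely illustrate why single-dataset training can fail to transfer.

```python
def train(dataset):
    """'Train' by remembering which words co-occur with each stance label."""
    lexicon = {}
    for text, label in dataset:
        for word in text.lower().split():
            lexicon.setdefault(word, []).append(label)
    return lexicon

def predict(lexicon, text):
    """Predict by majority vote of the labels seen with the text's words."""
    votes = [l for w in text.lower().split() for l in lexicon.get(w, [])]
    return max(set(votes), key=votes.count) if votes else "none"

def accuracy(model, dataset):
    return sum(predict(model, t) == y for t, y in dataset) / len(dataset)

# Two made-up "datasets" about disjoint targets.
d1 = [("ban guns now", "favor"), ("guns save lives", "against")]
d2 = [("vaccines are safe", "favor"), ("vaccines cause harm", "against")]

in_domain = accuracy(train(d1), d1)        # train/test on the same dataset
cross = accuracy(train(d1), d2)            # train on d1, test on d2
aggregated = accuracy(train(d1 + d2), d2)  # multi-dataset training
```

The toy reproduces the qualitative pattern in the abstract: strong in-domain performance, a collapse across datasets, and recovery once the datasets are aggregated.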

14.
Is Cross-Language answer finding harder than Monolingual answer finding for users? In this paper we provide initial quantitative and qualitative evidence to answer this question.

15.
16.
In this paper we study the problem of classifying textual web reports. We focus specifically on situations in which structured information extracted from the reports is used for classification. We present an experimental classification system based on the use of third-party linguistic analyzers, our previous work on web information extraction, and fuzzy inductive logic programming (fuzzy ILP). A detailed study of the so-called ‘Fuzzy ILP Classifier’ is the main contribution of the paper. The study includes formal models, a prototype implementation, extensive evaluation experiments, and a comparison of the classifier with alternatives such as decision trees, support vector machines and neural networks.

17.
This paper describes how questions can be characterized for question answering (QA) along different facets, and focuses on questions that cannot be answered directly but can be divided into simpler ones that can be answered directly using existing QA capabilities. Since the individual answers are composed to generate the final answer, we call this process compositional QA. The goal of the proposed QA method is to answer a composite question by dividing it into atomic ones, instead of developing an entirely new method tailored to the new question type. A question is analyzed automatically to determine its class, and its sub-questions are sent to the relevant QA modules. Answers returned from the individual QA modules are composed based on a predetermined plan corresponding to the question type. The experimental results on 615 questions show that the compositional QA approach outperforms the simple routing method by about 17%. Considering the 115 composite questions only, the F-score almost tripled from the baseline.
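The divide-route-compose loop described above can be sketched as follows; the factoid module, question class, and composition plan are invented placeholders for the paper's actual QA modules and predetermined plans.

```python
def factoid_qa(question):
    """Placeholder atomic QA module backed by a tiny made-up knowledge table."""
    kb = {"capital of france": "Paris", "capital of spain": "Madrid"}
    return kb.get(question.lower().strip("? "), "unknown")

# Predetermined composition plans, keyed by (assumed) question type.
PLANS = {
    "comparison": lambda answers: " and ".join(answers),
}

def compositional_qa(question_type, sub_questions):
    """Route each atomic sub-question to the QA module, then compose per plan."""
    answers = [factoid_qa(q) for q in sub_questions]
    return PLANS[question_type](answers)

combined = compositional_qa(
    "comparison", ["capital of France?", "capital of Spain?"]
)
```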

18.
Extractive summarization of academic articles in the natural sciences and medicine has long attracted attention. However, most existing extractive summarization models process academic articles with sentence classification models, which struggle to produce comprehensive summaries. To address this issue, we explore a new view of extractive summarization of academic articles in the natural sciences and medicine by treating it as a question-answering process. We propose a novel framework, MRC-Sum, in which extractive summarization for these articles is cast as an MRC (Machine Reading Comprehension) task. To instantiate MRC-Sum, article-summary pairs in the summarization datasets are first reconstructed into (Question, Answer, Context) triples for the MRC task. Several questions are designed to cover the main aspects (e.g., Background, Method, Result, Conclusion) of the articles. A novel strategy is proposed to handle the non-existence of ground-truth answer spans. MRC-Sum is then trained on the reconstructed datasets and large-scale pre-trained models. During inference, the answer spans for the four predefined questions given by MRC-Sum are concatenated to form the final summary of each article. Experiments on three publicly available benchmarks, i.e., the Covid, PubMed, and arXiv datasets, demonstrate the effectiveness of MRC-Sum. Specifically, MRC-Sum outperforms advanced extractive summarization baselines on the Covid dataset and achieves competitive results on the PubMed and arXiv datasets. We also propose a novel metric, COMPREHS, to automatically evaluate the comprehensiveness of system summaries for academic articles in the natural sciences and medicine. Extensive experiments verify the reliability of the proposed metric, and the COMPREHS results show that MRC-Sum generates more comprehensive summaries than the baseline models.
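The dataset-reconstruction step can be sketched as follows: each aspect of the summary becomes one (Question, Answer, Context) triple. The aspect questions are paraphrases assumed for illustration, and each summary sentence is mapped to an aspect directly rather than by the paper's span-finding strategy.

```python
# Hypothetical aspect questions covering Background/Method/Result/Conclusion.
ASPECT_QUESTIONS = {
    "background": "What is the background of this work?",
    "method": "What method does this work propose?",
    "result": "What are the main results?",
    "conclusion": "What does this work conclude?",
}

def build_mrc_triples(article, summary_by_aspect):
    """summary_by_aspect: dict mapping aspect name -> summary sentence.
    Returns (Question, Answer, Context) triples for MRC-style training."""
    return [
        (ASPECT_QUESTIONS[aspect], answer, article)
        for aspect, answer in summary_by_aspect.items()
        if aspect in ASPECT_QUESTIONS
    ]

triples = build_mrc_triples(
    "We study X. We propose Y. Y beats Z.",
    {"method": "We propose Y.", "result": "Y beats Z."},
)
```

At inference time the process runs in reverse: the model answers each aspect question against the article, and the answer spans are concatenated into the summary.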

19.
Question answering (QA) aims at finding exact answers to a user's question in a large collection of documents. Most QA systems combine information retrieval with extraction techniques to identify a set of likely candidates and then utilize some ranking strategy to generate the final answers. This ranking process can be challenging, as it entails identifying the relevant answers amongst many irrelevant ones. It is even more challenging in multi-strategy QA, in which multiple answering agents are used to extract answer candidates. As answer candidates come from different agents with different score distributions, how to merge answer candidates plays an important role in answer ranking. In this paper, we propose a unified probabilistic framework which combines multiple evidence to address the challenges in answer ranking and answer merging. The hypotheses of the paper are that: (1) the framework effectively combines multiple evidence for identifying answer relevance and answer correlation in answer ranking; (2) the framework supports answer merging on answer candidates returned by multiple extraction techniques; (3) the framework can support list questions as well as factoid questions; (4) the framework can easily be applied to a different QA system; and (5) the framework significantly improves the performance of a QA system. An extensive set of experiments was done to support our hypotheses and demonstrate the effectiveness of the framework. All of this work substantially extends the preliminary research in Ko et al. (2007a), “A probabilistic framework for answer selection in question answering,” in Proceedings of NAACL/HLT.
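One simple way to illustrate the merging problem (agents scoring on incomparable scales) is per-agent score normalization followed by summation over identical answer strings. This is a stand-in sketch of the general idea, not the paper's probabilistic framework.

```python
from collections import defaultdict

def merge_answers(agent_candidates):
    """agent_candidates: list of {answer: raw_score} dicts, one per agent.
    Normalize each agent's scores to sum to 1, then merge identical answers."""
    merged = defaultdict(float)
    for candidates in agent_candidates:
        total = sum(candidates.values())
        for answer, score in candidates.items():
            merged[answer] += score / total if total else 0.0
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

ranked = merge_answers([
    {"Paris": 8.0, "Lyon": 2.0},        # agent 1, scores on a 0-10 scale
    {"Paris": 0.6, "Marseille": 0.4},   # agent 2, scores already in [0, 1]
])
```

Without the per-agent normalization, agent 1's larger raw scale would dominate the ranking regardless of which agent is actually more reliable.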

20.
In this paper, we propose a generative model, the Topic-based User Interest (TUI) model, to capture user interest in User-Interactive Question Answering (UIQA) systems. Specifically, our method models user interest in UIQA systems with a latent topic method, extracting interests for users by mining the questions they asked, the categories they participated in, and the relevant answer providers. We apply the TUI model to question recommendation, which automatically recommends to a given user questions he or she might be interested in. A data collection from Yahoo! Answers is used to evaluate the performance of the proposed model in question recommendation, and the experimental results show the effectiveness of our model.
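The interest-based ranking can be sketched with explicit topic distributions: a user's interest is a distribution over latent topics, each candidate question has one too, and questions are ranked by the dot product of the two. The topic vectors below are made-up stand-ins for distributions a topic model like TUI would infer from asked questions, categories, and answer providers.

```python
def interest_score(user_topics, question_topics):
    """Dot product between the user's and the question's topic distributions."""
    return sum(user_topics.get(t, 0.0) * p for t, p in question_topics.items())

def recommend(user_topics, questions, top_n=1):
    """Rank candidate questions by how well they match the user's interests."""
    ranked = sorted(questions.items(),
                    key=lambda kv: interest_score(user_topics, kv[1]),
                    reverse=True)
    return [q for q, _ in ranked[:top_n]]

# Assumed, hand-written topic distributions for illustration.
user = {"sports": 0.7, "tech": 0.3}
questions = {
    "Which phone has the best camera?": {"tech": 0.9, "sports": 0.1},
    "Who will win the league this year?": {"sports": 0.9, "tech": 0.1},
}
top = recommend(user, questions)
```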


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号