Similar Literature
20 similar documents found (search time: 0 ms)
1.
Effective learning schemes such as fine-tuning, zero-shot, and few-shot learning have been widely used to obtain considerable performance with only a handful of annotated training data. In this paper, we present a unified benchmark to facilitate zero-shot text classification in Turkish. For this purpose, we evaluate three methods, namely Natural Language Inference (NLI), Next Sentence Prediction, and our proposed model based on Masked Language Modeling and pre-trained word embeddings, on nine Turkish datasets covering three main categories: topic, sentiment, and emotion. We use pre-trained Turkish monolingual and multilingual transformer models, namely BERT, ConvBERT, DistilBERT and mBERT. The results show that ConvBERT with the NLI method yields the best result (79%) and outperforms the previously used multilingual XLM-RoBERTa model by 19.6%. The study contributes to the literature by applying transformer models not previously attempted for Turkish and by showing that monolingual models outperform multilingual models in zero-shot text classification.
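As a rough illustration of the NLI-based zero-shot method, the sketch below frames each candidate label as a hypothesis to be entailed by the input text. The checkpoint name is a placeholder assumption, not the model used in the paper; any Turkish BERT fine-tuned on an NLI corpus would fit.

```python
# A sketch of NLI-based zero-shot classification, assuming a hypothetical
# Turkish NLI checkpoint (not the exact model from the paper).
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="some-org/turkish-bert-nli",  # placeholder checkpoint name
)

result = classifier(
    "Takım dün akşamki maçı son dakikada kazandı.",   # "The team won last night's match in the final minute."
    candidate_labels=["spor", "ekonomi", "siyaset"],  # sports, economy, politics
)
print(result["labels"][0], result["scores"][0])       # top label and its score
```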

2.
Zero-shot object classification aims to recognize objects of unseen classes whose supervised data are unavailable in the training stage. Recent zero-shot learning (ZSL) methods usually generate new supervised data for unseen classes by designing various deep generative networks. In this paper, we propose an end-to-end deep generative ZSL approach that trains the data generation module and the object classification module jointly, rather than separately as in the majority of existing generation-based ZSL methods. Because the ZSL assumption makes unseen data unavailable in the training stage, the distribution of generated unseen data shifts towards the distribution of seen data, which in turn causes the projection domain shift problem. We therefore design a novel meta-learning optimization model to improve the proposed generation-based ZSL approach, in which the parameter initialization and the parameter update algorithm are meta-learned to assist model convergence. We evaluate the proposed approach on five standard ZSL datasets. The proposed joint training strategy increases average accuracy by 2.7% and 23.0% on the standard and generalized ZSL tasks, respectively, and the meta-learning optimization further improves accuracy by 5.0% and 2.1% on the two tasks. Experimental results demonstrate that the proposed approach has significant superiority in various ZSL tasks.
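A minimal sketch of the joint-training idea: an attribute-conditioned feature generator and a classifier updated together in a single optimization step. Architecture and dimensions are illustrative assumptions, and the paper's meta-learned initialization and update rule are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureGenerator(nn.Module):
    """Generates synthetic features for a class from its attribute vector + noise."""
    def __init__(self, attr_dim, noise_dim, feat_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(attr_dim + noise_dim, 1024), nn.LeakyReLU(),
            nn.Linear(1024, feat_dim), nn.ReLU(),
        )

    def forward(self, attrs, noise):
        return self.net(torch.cat([attrs, noise], dim=-1))

# Joint step: generator and classifier share one optimizer and are updated
# together, instead of training the generator first and the classifier after.
attr_dim, noise_dim, feat_dim, n_classes = 85, 64, 2048, 50
gen = FeatureGenerator(attr_dim, noise_dim, feat_dim)
clf = nn.Linear(feat_dim, n_classes)
opt = torch.optim.Adam(list(gen.parameters()) + list(clf.parameters()), lr=1e-4)

attrs = torch.randn(32, attr_dim)            # stand-in class attribute vectors
labels = torch.randint(0, n_classes, (32,))  # matching class indices
fake_feats = gen(attrs, torch.randn(32, noise_dim))
loss = F.cross_entropy(clf(fake_feats), labels)
opt.zero_grad()
loss.backward()
opt.step()
```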

3.
Knowledge representation learning (KRL) transforms a knowledge graph (KG) from symbol space to vector space. However, KRL under the open world assumption (OWA) is deeply trapped in a label-shortage dilemma, because labeling is difficult or costly. To address this problem, we propose KRL_MLCCL, a Multi-Label Classification method for KRL based on Contrastive Learning (CL). Specifically, (1) we formalize the problem of finding true knowledge graph object (KGO) matchings (KGOMs) under the OWA in the original KGOM sample space (KGOMSS), i.e., multi-label classification with one known true matching (the positive example); (2) we solve the problem in a new KGOMSS, generated by augmenting the true matching following the idea of CL, i.e., multi-label classification with multiple known true matchings; (3) we score the true matchings with a Hermitian inner product and softmax, and minimize a negative log-likelihood loss to establish a preliminary KRL_MLCCL model; (4) we migrate the learned model back to the original KGOMSS to solve the true-matching problem. We design and apply a positive-example augmentation strategy from CL that gives KRL_MLCCL the ability to migrate back, pulling KGOs in true matchings close and pushing KGOs in false matchings away, which helps KRL out of the label-shortage dilemma faced in modeling. We also propose a negative-example noise-filtering algorithm to enhance this ability. The open world entity prediction (OWEP) experiment on the FB15K-237-OWE dataset shows that KRL_MLCCL improves on the best state-of-the-art baseline by 3% in Hits@10 and 1.32% in MRR. The OWEP experiments on the KG also show that KRL_MLCCL has better back-migration ability.
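The Hermitian scoring plus softmax/NLL step can be sketched as follows. This assumes a ComplEx-style complex embedding split into real and imaginary halves, which may differ in detail from the paper's exact formulation.

```python
import torch

def hermitian_score(h, r, t):
    """Re(<h * r, conj(t)>) for complex embeddings stored as
    [real half | imaginary half] vectors (ComplEx-style scoring)."""
    h_re, h_im = h.chunk(2, dim=-1)
    r_re, r_im = r.chunk(2, dim=-1)
    t_re, t_im = t.chunk(2, dim=-1)
    return ((h_re * r_re - h_im * r_im) * t_re
            + (h_re * r_im + h_im * r_re) * t_im).sum(-1)

# Softmax over candidate tails, negative log-likelihood on the true matching.
dim, n_cand = 200, 128
h = torch.randn(1, 2 * dim)
r = torch.randn(1, 2 * dim)
candidates = torch.randn(n_cand, 2 * dim)    # tail candidates; index 0 is true
logits = hermitian_score(h, r, candidates)   # shape (n_cand,)
loss = -torch.log_softmax(logits, dim=0)[0]  # NLL of the known true matching
```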

4.
Hate speech is an increasingly important societal issue in the era of digital communication. Hateful expressions often make use of figurative language and, although they represent, in some sense, the dark side of language, they are also often prime examples of creative language use. While hate speech is a global phenomenon, current studies on automatic hate speech detection are typically framed in a monolingual setting. In this work, we explore hate speech detection in low-resource languages by transferring knowledge from a resource-rich language, English, in a zero-shot learning fashion. We experiment with traditional and recent neural architectures, and propose two joint-learning models that use different multilingual language representations to transfer knowledge between pairs of languages. We also evaluate the impact of additional knowledge by incorporating information from a multilingual lexicon of abusive words. The results show that our joint-learning models achieve the best performance on most languages. However, a simple approach that uses machine translation and a pre-trained English language model achieves robust performance. In contrast, Multilingual BERT fails to obtain good performance in cross-lingual hate speech detection. We also found experimentally that external knowledge from a multilingual abusive lexicon improves the models' performance, specifically in detecting the positive class. The results of our experimental evaluation highlight a number of challenges in this particular task. One of the main challenges concerns current benchmarks for hate speech detection, in particular how bias related to the topical focus of the datasets influences classification performance. The insufficient ability of current multilingual language models to transfer knowledge between languages in the specific task of hate speech detection also remains an open problem. However, our experimental evaluation and qualitative analysis show how the explicit integration of linguistic knowledge from a structured abusive language lexicon helps to alleviate this issue.

5.
Stance detection aims to distinguish whether a text's author supports, opposes, or maintains a neutral stance towards a given target. In most real-world scenarios, stance detection needs to work in a zero-shot manner, i.e., predicting stances for unseen targets without labeled data. One critical challenge of zero-shot stance detection is the absence of contextual information about the targets. Current works mostly concentrate on introducing external knowledge to supplement information about targets, but the noisy schema-linking process hinders their performance in practice. To combat this issue, we argue that previous studies have ignored the extensive target-related information contained in the unlabeled data during the training phase, and we propose a simple yet efficient Multi-Perspective Contrastive Learning Framework for zero-shot stance detection. Our framework leverages information not only from labeled data but also from extensive unlabeled data. To this end, we design target-oriented contrastive learning and label-oriented contrastive learning to capture more comprehensive target representations and more distinguishable stance features. We conduct extensive experiments on three widely adopted datasets (from 4870 to 33,090 instances), namely SemEval-2016, WT-WT, and VAST. Our framework achieves 53.6%, 77.1%, and 72.4% macro-average F1 scores on these datasets, showing 2.71% and 0.25% improvements over state-of-the-art baselines on SemEval-2016 and WT-WT, and comparable results on the more challenging VAST dataset.
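Both contrastive perspectives can be built on a standard in-batch InfoNCE objective. The sketch below shows that objective under the assumption that positives are paired by a shared target (target-oriented view) or a shared stance label (label-oriented view); the paper's exact loss may differ.

```python
import torch
import torch.nn.functional as F

def info_nce(anchors, positives, temperature=0.07):
    """Contrastive loss: pull each anchor toward its positive,
    push it away from all other in-batch examples."""
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(a.size(0))  # diagonal entries = matching pairs
    return F.cross_entropy(logits, targets)

# Target-oriented view: two encodings of texts sharing the same target;
# a label-oriented view would instead pair texts sharing the same stance.
z1, z2 = torch.randn(16, 256), torch.randn(16, 256)  # stand-in encodings
loss = info_nce(z1, z2)
```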

6.
We study the selection of transfer languages for automatic abusive language detection. Instead of preparing a dataset for every language, we demonstrate the effectiveness of cross-lingual transfer learning for zero-shot abusive language detection. In this way we can use existing data from higher-resource languages to build better detection systems for low-resource languages. Our datasets come from seven languages belonging to three language families. We measure the distance between the languages using several language similarity measures, in particular by quantifying features from the World Atlas of Language Structures (WALS). We show that there is a correlation between linguistic similarity and classifier performance. This finding allows us to choose an optimal transfer language for zero-shot abusive language detection.
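One simple way to operationalize this is to encode each language as a vector of WALS feature values, define similarity as the share of agreeing features, and correlate that similarity with downstream F1. In the sketch below, both the feature vectors and the F1 numbers are placeholders, not values from the paper.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical binary WALS feature codings; real vectors would be drawn
# from the World Atlas of Language Structures.
wals = {
    "en": np.array([1, 0, 1, 1, 0]),
    "de": np.array([1, 0, 1, 0, 0]),
    "sv": np.array([1, 0, 0, 1, 0]),
    "fi": np.array([0, 1, 1, 0, 1]),
}

def similarity(a, b):
    """Share of WALS features on which two languages agree."""
    return float((wals[a] == wals[b]).mean())

# Correlate similarity to the transfer language (English) with zero-shot F1.
# The F1 numbers are placeholders, not results from the paper.
targets, f1 = ["de", "sv", "fi"], [0.71, 0.66, 0.58]
sims = [similarity("en", t) for t in targets]
r, p = pearsonr(sims, f1)
print(f"Pearson r = {r:.2f}")
```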

7.
Both node classification and link prediction are popular supervised learning tasks on graph data, but previous works seldom integrate them to capture their complementary information. In this paper, we propose a Multi-Task and Multi-Graph Convolutional Network (MTGCN) to jointly conduct node classification and link prediction in a unified framework. Specifically, MTGCN consists of multiple multi-task learning modules, each of which learns the complementary information between node classification and link prediction. In particular, each module uses different inputs to output representations of the graph data. Moreover, the parameters of one module initialize the parameters of the next, so that useful information in the former module can be propagated to the latter. As a result, the information is augmented to guarantee the quality of the representations by exploiting the complex structure inherent in the graph data. Experimental results on six datasets show that MTGCN outperforms the comparison methods on both node classification and link prediction.
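The core idea of sharing one encoder between the two tasks can be sketched as follows. This is a minimal two-head GCN, not the full MTGCN with its multiple modules and cross-module parameter initialization; all shapes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedGCNLayer(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, x, adj_norm):
        return F.relu(self.lin(adj_norm @ x))  # propagate, then transform

class MultiTaskGCN(nn.Module):
    """Shared encoder with two heads, so node classification and link
    prediction can exchange complementary signal through shared weights."""
    def __init__(self, d_in, d_hid, n_classes):
        super().__init__()
        self.enc = SharedGCNLayer(d_in, d_hid)
        self.cls_head = nn.Linear(d_hid, n_classes)

    def forward(self, x, adj_norm, edge_pairs):
        h = self.enc(x, adj_norm)
        node_logits = self.cls_head(h)                                # node classification
        link_scores = (h[edge_pairs[0]] * h[edge_pairs[1]]).sum(-1)   # link prediction
        return node_logits, link_scores

x = torch.randn(6, 16)                       # stand-in node features
adj = torch.eye(6)                           # stand-in normalized adjacency
edge_pairs = torch.tensor([[0, 1], [2, 3]])  # candidate links (src row, dst row)
node_logits, link_scores = MultiTaskGCN(16, 32, 4)(x, adj, edge_pairs)
# Joint loss would combine cross-entropy on labeled nodes with BCE on edges.
```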

8.
Knowledge graph representation learning (KGRL) aims to infer missing links between target entities based on existing triples. Graph neural networks (GNNs) have recently been introduced as one of the latest architectures serving the KGRL task, using aggregation of neighborhood information. However, current GNN-based methods have fundamental limitations both in modelling multi-hop distant neighbors and in selecting relation-specific neighborhood information from vast numbers of neighbors. In this study, we propose a new relation-specific graph transformation network (RGTN) for the KGRL task. Specifically, RGTN is the first model that transforms a relation-based graph into a new path-based graph by generating useful paths that connect heterogeneous relations and multi-hop neighbors. Unlike existing GNN-based methods, our approach adaptively selects the most useful paths for each specific relation and effectively builds path-based connections between unconnected distant entities. The transformed graph structure opens a new way to model arbitrary lengths of multi-hop neighbors, which leads to more effective embedding learning. To verify the effectiveness of the proposed model, we conduct extensive experiments on three standard benchmark datasets: WN18RR, FB15k-237 and YAGO3-10-DR. Experimental results show that RGTN achieves promising results and even outperforms other state-of-the-art models on the KGRL task (compared to other state-of-the-art GNN-based methods, our model achieves a 2.5% improvement in H@10 on WN18RR, a 1.2% improvement in H@10 on FB15k-237 and a 6% improvement in H@10 on YAGO3-10-DR).

9.
Text classification is an important research topic in natural language processing (NLP), and Graph Neural Networks (GNNs) have recently been applied to this task. However, in existing graph-based models, text graphs constructed by rules are not real graph data and introduce massive noise. More importantly, with a fixed corpus-level graph structure, these models cannot sufficiently exploit the labeled and unlabeled information of nodes. Meanwhile, contrastive learning has been developed as an effective method in the graph domain to fully utilize node information. Therefore, we propose a new graph-based model for text classification named CGA2TC, which introduces contrastive learning with an adaptive augmentation strategy to obtain more robust node representations. First, we explore word co-occurrence and document-word relationships to construct a text graph. Then, we design an adaptive augmentation strategy for the noisy text graph to generate two contrastive views that effectively solve the noise problem and preserve the essential structure. Specifically, we design noise-based and centrality-based augmentation strategies on the topological structure of the text graph to disturb unimportant connections and thus highlight the relatively important edges. For labeled nodes, we take nodes with the same label as multiple positive samples and assign them to the anchor node, while we employ consistency training on unlabeled nodes to constrain model predictions. Finally, to reduce the resource consumption of contrastive learning, we adopt a random sampling method to select some nodes for computing the contrastive loss. Experimental results on several benchmark datasets demonstrate the effectiveness of CGA2TC on the text classification task.
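A rough sketch of the centrality-based half of the augmentation strategy: drop edges with a probability that shrinks as endpoint centrality grows, so important structure survives in both views. The drop rate and centrality measure are illustrative assumptions; the noise-based half and the contrastive objective itself are omitted.

```python
import random
import networkx as nx

def centrality_view(g, base_drop=0.3):
    """Augmented view that preferentially drops edges whose endpoints
    have low degree centrality, preserving the important structure."""
    cent = nx.degree_centrality(g)
    max_c = max(cent.values()) or 1.0
    view = g.copy()
    for u, v in list(g.edges()):
        importance = (cent[u] + cent[v]) / (2 * max_c)   # in [0, 1]
        if random.random() < base_drop * (1 - importance):
            view.remove_edge(u, v)
    return view

g = nx.karate_club_graph()                               # stand-in text graph
view1, view2 = centrality_view(g), centrality_view(g)    # two contrastive views
```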

10.
Graph-based multi-view clustering aims to take advantage of graph information from multiple views to provide clustering solutions. The consistency constraint across views is the key to multi-view graph clustering. Most existing studies generate fusion graphs and constrain multi-view consistency through a clustering loss. We argue that local pair-view consistency can achieve fine-grained modeling of the consensus information across views. To this end, we propose a novel Contrastive and Attentive Graph Learning framework for multi-view clustering (CAGL). Specifically, we design contrastive fine-grained modeling in multi-view graph learning that maximizes the similarity of view pairs to guarantee the consistency of multiple views. Meanwhile, an attention-weighted refined fusion graph module based on attention networks dynamically captures the capacity differences among views and further facilitates the mutual reinforcement of single views and the fusion view. Besides, CAGL can learn a representation specialized for clustering via a self-training clustering module. Finally, we develop a joint optimization objective to balance every module and iteratively optimize CAGL in a graph encoder-decoder framework. Experimental results on six benchmarks of different modalities and sizes demonstrate that CAGL outperforms state-of-the-art baselines.

11.
Question classification (QC) involves classifying a given question based on its expected answer type and is an important task in Question Answering (QA) systems. Existing approaches to question classification fine-tune models on the full training dataset; developing large labelled datasets is expensive and time-consuming. Hence, there is a need for approaches that achieve comparable or state-of-the-art performance using limited training instances. In this paper, we propose an approach that uses data augmentation as a tool to generate additional training instances. We evaluate the proposed approach on two question classification datasets, TREC and ICHI. Experimental results show that our approach reduces the requirement for labelled instances (a) by up to 81.7% while achieving a new state-of-the-art accuracy of 98.11% on the TREC dataset, and (b) by up to 75% while achieving 67.9% on the ICHI dataset.
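A toy sketch of one common augmentation scheme, random synonym replacement; the synonym table is a placeholder, and the paper's actual augmentation method may differ (e.g., back-translation or EDA-style operations).

```python
import random

# Hypothetical synonym table; a real system might use WordNet or
# back-translation to generate the variants.
SYNONYMS = {
    "largest": ["biggest", "greatest"],
    "city": ["metropolis", "town"],
}

def augment(question, n_variants=2):
    """Generate extra training questions by random synonym replacement,
    reducing the number of labelled instances needed."""
    tokens = question.split()
    variants = []
    for _ in range(n_variants):
        new = [random.choice(SYNONYMS[t])
               if t in SYNONYMS and random.random() < 0.5 else t
               for t in tokens]
        variants.append(" ".join(new))
    return variants

print(augment("what is the largest city in europe ?"))
```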

12.
Graph neural networks (GNNs) have emerged as a new state of the art for learning knowledge graph representations. Although they have shown impressive performance in recent studies, how to aggregate neighborhood features efficiently and effectively remains poorly addressed. To tackle this challenge, we propose the simplifying heterogeneous graph neural network (SHGNet), a generic framework that discards two standard operations in GNNs: the transformation matrix and the nonlinear activation. In particular, SHGNet adopts only the essential component of neighborhood aggregation in GNNs and incorporates relation features into feature propagation. Furthermore, to capture complex structures, SHGNet utilizes a hierarchical aggregation architecture, including node aggregation and relation weighting. Thus, the proposed model can treat each relation differently and selectively aggregate informative features. SHGNet has been evaluated on link prediction tasks over three real-world benchmark datasets. The experimental results show that SHGNet significantly improves efficiency while maintaining superior performance, outperforming all existing models in 3 out of 4 metrics on NELL-995 and in 4 out of 4 metrics on the FB15k-237 dataset.
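Under the assumption that "discarding the transformation matrix and nonlinearity" follows the SGC-style simplification, a propagation step with per-relation weighting might look like the sketch below; the paper's hierarchical aggregation is richer than this.

```python
import torch

def shg_propagate(x, edges, edge_rel, rel_weight, n_layers=2):
    """Simplified heterogeneous propagation: no transformation matrix and
    no nonlinearity, only relation-weighted neighborhood averaging."""
    src, dst = edges  # each of shape (E,)
    for _ in range(n_layers):
        msg = x[src] * rel_weight[edge_rel].unsqueeze(-1)   # relation weighting
        agg = torch.zeros_like(x).index_add_(0, dst, msg)   # node aggregation
        deg = torch.zeros(x.size(0)).index_add_(
            0, dst, torch.ones_like(dst, dtype=torch.float))
        x = agg / deg.clamp(min=1).unsqueeze(-1)            # mean, no activation
    return x

n_nodes, d, n_rel = 5, 8, 3
x = torch.randn(n_nodes, d)                     # stand-in entity embeddings
edges = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
edge_rel = torch.tensor([0, 1, 2, 0])           # relation id per edge
rel_weight = torch.ones(n_rel, requires_grad=True)  # learnable relation weights
out = shg_propagate(x, edges, edge_rel, rel_weight)
```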

13.
14.
We digitized three years of Dutch election manifestos annotated by the Dutch political scientist Isaac Lipschits. We used these data to train a classifier that can automatically label new, unseen election manifestos with themes. Having the manifestos in a uniform XML format, with all paragraphs annotated with their themes, has advantages both for electronic publishing of the data and for diachronic comparative data analysis. The data we created will be made publicly available through a search interface, so it will be possible to query the data and filter them by theme and party. We optimized the Lipschits classifier on the task of classifying election manifestos using models trained on earlier years: we built a classifier suited to classifying election manifestos from 2002 onwards using data from the 1980s and 1990s. We evaluated the results by having a domain expert manually assess a sample of the classified data. We found that our automatic classifier obtains the same precision as a human classifier on unseen data. Its recall could be improved by extending the set of themes with newly emerged themes. Thus, when using old political texts to classify new texts, work is needed to link and expand the set of themes to newer topics.
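A minimal sketch of such a train-on-old, predict-on-new theme classifier, assuming a TF-IDF plus linear SVM pipeline; the paper does not specify this exact model, and the texts and theme labels below are placeholders, not Lipschits data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Train on paragraphs from earlier manifestos, predict themes for newer ones.
old_paragraphs = [
    "belastingen moeten omlaag voor werkende gezinnen",   # placeholder text
    "goed onderwijs voor ieder kind in elke wijk",        # placeholder text
]
old_themes = ["economy", "education"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(old_paragraphs, old_themes)
print(clf.predict(["meer geld naar scholen en leraren"]))
```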

15.
Researchers have been aware that emotion is not one-hot in emotion-relevant classification tasks, and multiple emotions can coexist in a given sentence. Recently, several works have leveraged a distribution label or a grayscale label of emotions in the classification model, which enriches the one-hot label with additional information, such as the intensity of other emotions and the correlations between emotions. Such an approach has proven effective in alleviating the overfitting problem and improving model robustness by introducing a distribution-learning component into the objective function. However, the effect of distribution learning cannot fully unfold, as it can reduce the model's discriminative ability within similar emotion categories; for example, "Sad" and "Fear" are both negative emotions. To address this problem, we proposed a novel emotion extension scheme in prior work (Li, Chen, Xie, Li, and Tao, 2021). The prior work incorporated fine-grained emotion concepts to build an extended label space, where a mapping function between coarse-grained emotion categories and fine-grained emotion concepts was identified. For example, sentences labeled "Joy" can convey various emotions such as enjoy, free, and leisure. The model can further benefit from the extended space by extracting dependencies among fine-grained emotions when yielding predictions in the original label space. The prior work showed that it is more apt to apply distribution learning in the extended label space than in the original space. In this paper, a novel sparse connection method, Leaky Dropout, is proposed to refine the dependency-extraction step, which further improves classification performance. In addition to the multiclass emotion classification task, we experimented extensively on sentiment analysis and multilabel emotion prediction tasks to investigate the effectiveness and generality of the label extension scheme.
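The distribution-learning component of such an objective can be sketched as a weighted mix of one-hot cross-entropy and a KL term against the soft emotion distribution. The mixing weight and label spaces below are illustrative assumptions, and the paper's extended label space and Leaky Dropout are not reproduced here.

```python
import torch
import torch.nn.functional as F

def distribution_loss(logits, soft_labels, hard_labels, alpha=0.5):
    """Combine one-hot cross-entropy with a distribution-learning term
    (KL divergence to a soft/grayscale emotion distribution)."""
    ce = F.cross_entropy(logits, hard_labels)
    kl = F.kl_div(F.log_softmax(logits, dim=-1), soft_labels,
                  reduction="batchmean")
    return alpha * ce + (1 - alpha) * kl

logits = torch.randn(4, 6)                   # 6 coarse emotion classes
hard = torch.tensor([0, 2, 5, 1])            # one-hot gold labels
soft = F.softmax(torch.randn(4, 6), dim=-1)  # stand-in grayscale labels
loss = distribution_loss(logits, soft, hard)
```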

16.
Using scientometric methods, this study analyzes the current state and research hotspots of six core journals in the soft science field (2008-2018). The study finds that soft science research in China currently takes the national innovation-driven development strategy as its main research direction, focusing on hot issues and key links in regional innovation systems and innovation chains; it also finds close collaboration among the participating researchers. Using descriptive analysis, the study offers a view of the current state of soft science research in China, provides an outlook on its future development, and serves as a reference for the flourishing of soft science in China.

17.
We study several machine learning algorithms for cross-language patent retrieval and classification. In comparison with most other studies involving machine learning for cross-language information retrieval, which basically use learning techniques for monolingual sub-tasks, our learning algorithms exploit bilingual training documents and learn a semantic representation from them. We study Japanese-English cross-language patent retrieval using Kernel Canonical Correlation Analysis (KCCA), a method for finding correlated linear relationships between two variables in kernel-defined feature spaces. The results are quite encouraging and are significantly better than those obtained by other state-of-the-art methods. We also investigate learning algorithms for cross-language document classification, based on KCCA and Support Vector Machines (SVM). In particular, we study two ways of combining KCCA and SVM and find that one particular combination, called SVM_2k, achieves better results than the other learning algorithms on both bilingual and monolingual test documents.
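The shared-space idea can be illustrated with plain linear CCA, a simplification of the paper's kernelized variant: paired document vectors from the two languages are projected into one semantic space, where cross-language retrieval reduces to nearest-neighbour search. The data below are synthetic stand-ins.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Synthetic paired corpora: 100 aligned document pairs with correlated views.
rng = np.random.default_rng(0)
X_ja = rng.normal(size=(100, 50))                               # "Japanese" doc vectors
X_en = X_ja @ rng.normal(size=(50, 40)) + 0.1 * rng.normal(size=(100, 40))  # "English" side

cca = CCA(n_components=10)
Z_ja, Z_en = cca.fit_transform(X_ja, X_en)  # projections into the shared space

# Cross-language retrieval: find the English document nearest to a Japanese query.
query = Z_ja[0]
nearest = np.argmin(np.linalg.norm(Z_en - query, axis=1))
print(nearest)  # ideally 0, the aligned translation
```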

18.
Transfer learning utilizes labeled data available from a related (source) domain to achieve effective knowledge transfer to the target domain. However, most state-of-the-art cross-domain classification methods treat documents as plain text and ignore the hyperlink (or citation) relationships among them. In this paper, we propose a novel cross-domain document classification approach called the Link-Bridged Topic model (LBT). LBT consists of two key steps. First, LBT uses an auxiliary link network to discover direct or indirect co-citation relationships among documents by embedding this background knowledge into a graph kernel. The mined co-citation relationships are leveraged to bridge the gap between domains. Second, LBT combines the content information and link structures into a unified latent topic model. The model is based on the assumption that documents of the source and target domains share some common topics from the point of view of both content information and link structure. By mapping both domains' data into the latent topic space, LBT encodes the knowledge about domain commonality and difference as shared topics with associated differential probabilities. The learned latent topics must be consistent with the source and target data, as well as with the content and link statistics. The shared topics then act as a bridge to facilitate knowledge transfer from the source to the target domain. Experiments on different types of datasets show that our algorithm significantly improves the generalization performance of cross-domain document classification.

19.
Question answering has become one of the most popular information retrieval applications. Although most question-answering systems try to improve the user experience and the technology used to find relevant results, many difficulties remain because of the continuous growth of web content. Question Classification (QC) plays an important role in question-answering systems, and one of the major tasks in enhancing the classification process is identifying question types. A broad range of QC approaches has been proposed to address the classification problem; most are based on bags of words or dictionaries. In this research, we present an analysis of different types of questions based on their grammatical structure. We identify different patterns and use machine learning algorithms to classify them. We propose a framework for question classification using a grammar-based approach (GQCC) that exploits the structure of the questions. Our findings indicate that using syntactic categories related to different domain-specific types of Common Nouns, Numeral Numbers and Proper Nouns enables the machine learning algorithms to better differentiate between question types. The paper presents a wide range of experiments; the results show that GQCC using the J48 classifier outperforms other classification methods with 90.1% accuracy.
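A simplified stand-in for the grammar-based pipeline: extract a few grammatical features per question (wh-word, presence of numerals and proper nouns) and feed them to a decision tree. Here sklearn's DecisionTreeClassifier substitutes for J48, the Weka implementation of C4.5 used in the paper; the questions, labels, and feature heuristics are illustrative assumptions.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

def grammar_features(question):
    """Crude grammatical features; a real system would use a POS tagger."""
    tokens = question.split()
    return {
        "wh_word": tokens[0].lower(),
        "has_numeral": any(t.isdigit() for t in tokens),
        "has_proper_noun": any(t[0].isupper() for t in tokens[1:]),
    }

X = [grammar_features(q) for q in [
    "Who wrote Hamlet ?", "When did World War 2 end ?", "What is entropy ?"]]
y = ["HUMAN", "NUMERIC", "DEFINITION"]  # placeholder answer-type labels

clf = make_pipeline(DictVectorizer(), DecisionTreeClassifier()).fit(X, y)
print(clf.predict([grammar_features("Who painted Guernica ?")]))
```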

20.
Recently, sentiment classification has received considerable attention within the natural language processing research community. However, since most recent work on sentiment classification has been done in English, there are not enough sentiment resources in other languages, and manually constructing reliable sentiment resources is a difficult and time-consuming task. Cross-lingual sentiment classification aims to utilize annotated sentiment resources in one language (typically English) for sentiment classification of text documents in another language. Most existing research relies on automatic machine translation services to project information directly from one language to another. However, differing term distributions between original and translated documents, and translation errors, are the two main problems with using machine translation alone. To overcome these problems, we propose a novel learning model based on active learning and semi-supervised co-training that incorporates unlabelled data from the target language into the learning process in a bi-view framework. The model enriches the training data by iteratively adding the most confident automatically-labelled examples, as well as a few of the most informative manually-labelled examples drawn from the unlabelled data. Further, the model considers the density of the unlabelled data in order to select more representative unlabelled examples and avoid outlier selection in active learning. The proposed model was applied to book review datasets in three different languages. Experiments showed that our model effectively improves cross-lingual sentiment classification performance and reduces labelling effort in comparison with several baseline methods.
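The density-aware selection step of the active learner can be sketched as density-weighted uncertainty sampling; the scoring function and data below are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def select_informative(probs, X, k=5):
    """Density-weighted uncertainty sampling: prefer unlabelled examples the
    classifier is unsure about AND that sit in dense regions (not outliers)."""
    uncertainty = 1.0 - probs.max(axis=1)       # low max-probability = unsure
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    sims = (X @ X.T) / (norms @ norms.T)        # cosine similarity matrix
    density = sims.mean(axis=1)                 # average similarity to the pool
    return np.argsort(-uncertainty * density)[:k]  # top-k for manual labelling

rng = np.random.default_rng(0)
probs = rng.dirichlet([1, 1], size=100)  # stand-in classifier probabilities
X = rng.normal(size=(100, 32))           # stand-in document vectors
query_idx = select_informative(probs, X)
print(query_idx)
```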
