Similar articles
20 similar articles found.
1.
    
Stance detection aims to determine whether the author of a text supports, opposes, or maintains a neutral stance towards a given target. In most real-world scenarios, stance detection must work in a zero-shot manner, i.e., predicting stances for unseen targets without labeled data. One critical challenge of zero-shot stance detection is the absence of contextual information about the targets. Current works mostly concentrate on introducing external knowledge to supplement information about targets, but the noisy schema-linking process hinders their performance in practice. To combat this issue, we argue that previous studies have ignored the extensive target-related information contained in the unlabeled data during the training phase, and propose a simple yet efficient Multi-Perspective Contrastive Learning Framework for zero-shot stance detection. Our framework is capable of leveraging information not only from labeled data but also from extensive unlabeled data. To this end, we design target-oriented contrastive learning and label-oriented contrastive learning to capture more comprehensive target representations and more distinguishable stance features. We conduct extensive experiments on three widely adopted datasets (ranging from 4,870 to 33,090 instances), namely SemEval-2016, WT-WT, and VAST. Our framework achieves 53.6%, 77.1%, and 72.4% macro-average F1 scores on these three datasets, showing 2.71% and 0.25% improvements over state-of-the-art baselines on SemEval-2016 and WT-WT, and comparable results on the more challenging VAST dataset.
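To make the label-oriented contrastive objective concrete, here is a minimal sketch assuming a standard supervised-contrastive formulation, where samples sharing a stance label act as positives; the function name, temperature, and dimensions are our illustrative choices, not the authors' released code.

```python
# Minimal sketch of label-oriented contrastive learning (assumed
# supervised-contrastive form): same-label samples are pulled together.
import torch
import torch.nn.functional as F

def label_contrastive_loss(features, labels, temperature=0.07):
    """features: (B, D) encoder outputs; labels: (B,) stance labels."""
    z = F.normalize(features, dim=1)
    sim = z @ z.t() / temperature                      # (B, B) similarities
    mask = labels.unsqueeze(0) == labels.unsqueeze(1)  # positives share a label
    mask.fill_diagonal_(False)                         # exclude self-pairs
    logits = sim - torch.eye(len(z), device=z.device) * 1e9  # mask self
    log_prob = F.log_softmax(logits, dim=1)
    pos_counts = mask.sum(1).clamp(min=1)
    loss = -(log_prob * mask).sum(1) / pos_counts      # average over positives
    return loss[mask.sum(1) > 0].mean()

feats = torch.randn(4, 16)                 # toy batch: 4 samples, 2 classes
labels = torch.tensor([0, 1, 0, 1])
print(label_contrastive_loss(feats, labels))
```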

2.
    
Knowledge representation learning (KRL) transforms a knowledge graph (KG) from symbol space to vector space. However, KRL under the open world assumption (OWA) is deeply trapped in a shortage of labels, because labeling is difficult or costly. To address this problem, we propose KRL_MLCCL, a Multi-Label Classification based on Contrastive Learning (CL) knowledge representation learning method. Specifically, (1) we formalize the problem of finding true knowledge graph object (KGO) matchings (KGOMs) under the OWA in the original KGOM sample space (KGOMSS) as multi-label classification with one known true matching (positive example); (2) we solve the problem in a new KGOMSS, generated by augmenting the true matching following the idea of CL (multi-label classification with multiple known true matchings); (3) we score the true matchings based on the Hermitian inner product and softmax, and minimize a negative log-likelihood loss to establish the KRL_MLCCL model; (4) we migrate the learned model back to the original KGOMSS to solve the true matching problem. We design a positive-example augmentation scheme in the spirit of CL that gives KRL_MLCCL this back-migration ability, "pulling KGOs in true matchings close and pushing KGOs in false matchings away", which helps KRL out of the label-shortage dilemma faced in modeling. We also propose a negative-example noise filtering algorithm to enhance this ability. An open world entity prediction (OWEP) experiment on the FB15K-237-OWE dataset shows that KRL_MLCCL improves on the state-of-the-art baselines by 3% in Hits@10 and 1.32% in MRR. The OWEP experiments also show that KRL_MLCCL has better back-migration ability.
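The scoring in step (3) can be illustrated with a ComplEx-style Hermitian inner product over complex-valued embeddings, followed by softmax and negative log-likelihood; the dimensions and names below are our assumptions, not the paper's exact code.

```python
# Hermitian inner product Re(<h * r, conj(t)>) for complex embeddings
# stored as concatenated (real, imag) halves.
import torch
import torch.nn.functional as F

def hermitian_score(h, r, t):
    h_re, h_im = h.chunk(2, dim=-1)
    r_re, r_im = r.chunk(2, dim=-1)
    t_re, t_im = t.chunk(2, dim=-1)
    return ((h_re * r_re - h_im * r_im) * t_re
            + (h_re * r_im + h_im * r_re) * t_im).sum(-1)

h = torch.randn(8, 100)                   # 8 queries; 50 real + 50 imag dims
r = torch.randn(8, 100)
candidates = torch.randn(8, 32, 100)      # 32 candidate tails per query
scores = hermitian_score(h.unsqueeze(1), r.unsqueeze(1), candidates)  # (8, 32)
target = torch.zeros(8, dtype=torch.long) # assume the true tail sits at index 0
loss = F.cross_entropy(scores, target)    # softmax + negative log-likelihood
```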

3.
    
Knowledge graph representation learning (KGRL) aims to infer missing links between target entities based on existing triples. Graph neural networks (GNNs), one of the latest trendy architectures, have recently been introduced to the KGRL task to aggregate neighborhood information. However, current GNN-based methods have fundamental limitations both in modelling multi-hop distant neighbors and in selecting relation-specific neighborhood information from vast numbers of neighbors. In this study, we propose a new relation-specific graph transformation network (RGTN) for the KGRL task. Specifically, the proposed RGTN is the first model that transforms a relation-based graph into a new path-based graph by generating useful paths that connect heterogeneous relations and multi-hop neighbors. Unlike existing GNN-based methods, our approach adaptively selects the most useful paths for each specific relation and effectively builds path-based connections between unconnected distant entities. The transformed graph structure opens a new way to model arbitrary lengths of multi-hop neighbors, which leads to more effective embedding learning. To verify the effectiveness of the proposed model, we conduct extensive experiments on three standard benchmark datasets, i.e., WN18RR, FB15k-237, and YAGO3-10-DR. Experimental results show that RGTN achieves promising results and even outperforms other state-of-the-art models on the KGRL task (e.g., compared with other state-of-the-art GNN-based methods, our model achieves improvements of 2.5% in H@10 on WN18RR, 1.2% on FB15k-237, and 6% on YAGO3-10-DR).
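One plausible reading of the relation-to-path transformation, in the style of graph transformer networks: softly select a relation-specific adjacency matrix per hop and chain hops by matrix multiplication, so entries of the result encode multi-hop relation paths. This is our sketch under that assumption, not the released RGTN.

```python
# Compose per-relation adjacency matrices into a path-based graph.
import torch
import torch.nn as nn

class PathComposer(nn.Module):
    def __init__(self, num_relations, hops=2):
        super().__init__()
        # one relation-selection weight vector per hop
        self.weights = nn.Parameter(torch.randn(hops, num_relations))

    def forward(self, adjs):
        """adjs: (R, N, N) stack of per-relation adjacency matrices."""
        path_adj = None
        for w in self.weights:
            # soft selection over relations -> one mixed adjacency per hop
            mixed = torch.einsum('r,rij->ij', torch.softmax(w, 0), adjs)
            path_adj = mixed if path_adj is None else path_adj @ mixed
        return path_adj  # (N, N): edges encode multi-hop relation paths

adjs = torch.rand(4, 10, 10)          # 4 relations, 10 entities
print(PathComposer(num_relations=4)(adjs).shape)
```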

4.
    
Graph-based multi-view clustering aims to take advantage of graph information from multiple views to provide clustering solutions, and the consistency constraint across views is the key to multi-view graph clustering. Most existing studies generate fusion graphs and constrain multi-view consistency through a clustering loss. We argue that local pair-view consistency enables fine-grained modeling of the consensus information across views. To this end, we propose a novel Contrastive and Attentive Graph Learning framework for multi-view clustering (CAGL). Specifically, we design contrastive fine-grained modeling in multi-view graph learning that maximizes the similarity between pairs of views to guarantee multi-view consistency. Meanwhile, we design an attention-weighted refined fusion graph module based on attention networks that dynamically captures the differing capacities of the views and further facilitates the mutual reinforcement of each single view and the fusion view. Besides, CAGL can learn a representation specialized for clustering via a self-training clustering module. Finally, we develop a joint optimization objective to balance all modules and iteratively optimize CAGL within a graph encoder-decoder framework. Experimental results on six benchmarks across different modalities and sizes demonstrate that CAGL outperforms state-of-the-art baselines.
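A minimal sketch of attention-weighted graph fusion as we understand it: each view's affinity matrix receives a learned weight reflecting its capacity, and the weighted sum forms the fusion graph; the summary statistic and network shapes below are our assumptions.

```python
# Attention-weighted fusion of view-specific affinity graphs.
import torch
import torch.nn as nn

class AttFusion(nn.Module):
    def __init__(self, num_nodes, hidden=16):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(num_nodes, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 1))

    def forward(self, graphs):
        """graphs: (V, N, N) view-specific affinity matrices."""
        summaries = graphs.mean(dim=1)                 # (V, N) per-view profile
        att = torch.softmax(self.score(summaries), 0)  # (V, 1) view weights
        return (att.unsqueeze(-1) * graphs).sum(0)     # fused (N, N) graph

views = torch.rand(3, 50, 50)                          # 3 views, 50 samples
fused = AttFusion(num_nodes=50)(views)
```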

5.
    
Hate speech is an increasingly important societal issue in the era of digital communication. Hateful expressions often make use of figurative language and, although they represent, in some sense, the dark side of language, they are also often prime examples of creative language use. While hate speech is a global phenomenon, current studies on automatic hate speech detection are typically framed in a monolingual setting. In this work, we explore hate speech detection in low-resource languages by transferring knowledge from a resource-rich language, English, in a zero-shot learning fashion. We experiment with traditional and recent neural architectures, and propose two joint-learning models that use different multilingual language representations to transfer knowledge between pairs of languages. We also evaluate the impact of additional knowledge in our experiments by incorporating information from a multilingual lexicon of abusive words. The results show that our joint-learning models achieve the best performance on most languages. However, a simple approach that uses machine translation and a pre-trained English language model achieves robust performance. In contrast, Multilingual BERT fails to obtain good performance in cross-lingual hate speech detection. We also found experimentally that the external knowledge from a multilingual abusive lexicon improves the models' performance, specifically in detecting the positive class. The results of our experimental evaluation highlight a number of challenges and issues in this particular task. One of the main challenges concerns current benchmarks for hate speech detection, in particular how bias related to the topical focus of the datasets influences classification performance. The insufficient ability of current multilingual language models to transfer knowledge between languages in the specific hate speech detection task also remains an open problem. However, our experimental evaluation and qualitative analysis show how the explicit integration of linguistic knowledge from a structured abusive language lexicon helps to alleviate this issue.

6.
Zero-shot object classification aims to recognize objects of unseen classes whose supervised data are unavailable at the training stage. Recent zero-shot learning (ZSL) methods usually generate new supervised data for unseen classes by designing various deep generative networks. In this paper, we propose an end-to-end deep generative ZSL approach that trains the data generation module and the object classification module jointly, rather than separately as in the majority of existing generation-based ZSL methods. Because of the ZSL assumption that unseen data are unavailable at the training stage, the distribution of generated unseen data shifts towards the distribution of seen data, which subsequently causes the projection domain shift problem. We therefore further design a novel meta-learning optimization model to improve the proposed generation-based ZSL approach, in which the parameter initialization and the parameter update algorithm are meta-learned to assist model convergence. We evaluate the proposed approach on five standard ZSL datasets. The joint training strategy increases average accuracy by 2.7% and 23.0% on the standard and generalized ZSL tasks respectively, and the meta-learning optimization further improves accuracy by 5.0% and 2.1% on the two tasks. Experimental results demonstrate that the proposed approach is significantly superior on various ZSL tasks.
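Meta-learning the parameter initialization and update is typically done MAML-style: adapt on a support set, evaluate on a query set, and backpropagate through the adaptation into the initialization. A toy sketch under that assumption (plain SGD for clarity, not the paper's exact optimizer):

```python
# MAML-style meta step: inner adaptation, outer update of the initialization.
import torch

def inner_update(params, loss_fn, batch, lr=0.01):
    loss = loss_fn(params, batch)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    return [p - lr * g for p, g in zip(params, grads)]   # task-adapted params

def meta_step(params, loss_fn, support, query, meta_lr=0.001):
    adapted = inner_update(params, loss_fn, support)     # adapt on support set
    meta_loss = loss_fn(adapted, query)                  # evaluate adaptation
    meta_grads = torch.autograd.grad(meta_loss, params)  # grads w.r.t. init
    with torch.no_grad():
        for p, g in zip(params, meta_grads):
            p -= meta_lr * g                             # update initialization
    return meta_loss.item()

# Toy regression task: fit y = w * x
w = [torch.tensor([0.0], requires_grad=True)]
loss_fn = lambda ps, b: ((ps[0] * b[0] - b[1]) ** 2).mean()
support = (torch.tensor([1.0]), torch.tensor([2.0]))
query = (torch.tensor([3.0]), torch.tensor([6.0]))
print(meta_step(w, loss_fn, support, query))
```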

7.
    
Text classification is an important research topic in natural language processing (NLP), and graph neural networks (GNNs) have recently been applied to this task. However, in existing graph-based models, text graphs constructed by rules are not real graph data and introduce massive noise. More importantly, with a fixed corpus-level graph structure, these models cannot sufficiently exploit the labeled and unlabeled information of nodes. Meanwhile, contrastive learning has been developed as an effective method in the graph domain to fully utilize node information. Therefore, we propose a new graph-based model for text classification named CGA2TC, which introduces contrastive learning with an adaptive augmentation strategy to obtain more robust node representations. First, we explore word co-occurrence and document-word relationships to construct a text graph. Then, we design an adaptive augmentation strategy for the noisy text graph to generate two contrastive views that effectively address the noise problem while preserving essential structure. Specifically, we design noise-based and centrality-based augmentation strategies on the topological structure of the text graph to disturb unimportant connections and thus highlight the relatively important edges. For labeled nodes, we take nodes with the same label as multiple positive samples and assign them to the anchor node, while we employ consistency training on unlabeled nodes to constrain model predictions. Finally, to reduce the resource consumption of contrastive learning, we adopt a random sampling method to select a subset of nodes for computing the contrastive loss. Experimental results on several benchmark datasets demonstrate the effectiveness of CGA2TC on the text classification task.
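The centrality-based augmentation can be pictured as dropping edges with probability inversely related to their endpoints' centrality, so unimportant connections are disturbed while important ones survive; a sketch assuming degree centrality on an undirected graph (our illustration, not the paper's code):

```python
# Centrality-based edge dropping to build a contrastive view of a graph.
import networkx as nx
import random

def centrality_augment(graph, max_drop=0.4):
    cent = nx.degree_centrality(graph)
    view = graph.copy()
    for u, v in list(view.edges()):
        importance = (cent[u] + cent[v]) / 2
        drop_p = max_drop * (1 - importance)  # unimportant edges drop more often
        if random.random() < drop_p:
            view.remove_edge(u, v)
    return view

g = nx.karate_club_graph()
view1, view2 = centrality_augment(g), centrality_augment(g)  # two contrastive views
print(g.number_of_edges(), view1.number_of_edges(), view2.number_of_edges())
```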

8.
    
Graph neural networks (GNNs) have emerged as the new state of the art for learning knowledge graph representations. Although they have shown impressive performance in recent studies, how to efficiently and effectively aggregate neighboring features remains poorly addressed. To tackle this challenge, we propose the simplifying heterogeneous graph neural network (SHGNet), a generic framework that discards two standard operations in GNNs: the transformation matrix and the nonlinear activation. In particular, SHGNet adopts only the essential component of neighborhood aggregation and incorporates relation features into feature propagation. Furthermore, to capture complex structures, SHGNet utilizes a hierarchical aggregation architecture comprising node aggregation and relation weighting, so the proposed model can treat each relation differently and selectively aggregate informative features. SHGNet has been evaluated on link prediction tasks over three real-world benchmark datasets. The experimental results show that SHGNet significantly improves efficiency while maintaining superior performance, outperforming all existing models in 3 out of 4 metrics on NELL-995 and in 4 out of 4 metrics on FB15k-237.
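A minimal sketch of parameter-light aggregation in the spirit described: no transformation matrix, no nonlinearity, relation features added to the propagated messages and relations weighted; the exact weighting scheme here is our simplification, not SHGNet's.

```python
# Relation-aware mean aggregation without weight matrices or activations.
import torch

def shg_aggregate(x, edges, edge_rel, rel_emb, rel_weight):
    """x: (N, D) node features; edges: (2, E) (src, dst) index pairs;
    edge_rel: (E,) relation id per edge; rel_emb: (R, D); rel_weight: (R,)."""
    src, dst = edges
    # incorporate relation features, then scale by the relation's weight
    msg = (x[src] + rel_emb[edge_rel]) * rel_weight[edge_rel].unsqueeze(1)
    out = torch.zeros_like(x)
    out.index_add_(0, dst, msg)                # sum messages into target nodes
    deg = torch.bincount(dst, minlength=len(x)).clamp(min=1).unsqueeze(1)
    return out / deg                           # mean over incoming messages

x = torch.randn(5, 8)                          # 5 nodes, 8-d features
edges = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
edge_rel = torch.tensor([0, 1, 0, 1])          # 2 relation types
out = shg_aggregate(x, edges, edge_rel,
                    torch.randn(2, 8), torch.softmax(torch.randn(2), 0))
```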

9.
    
Effective learning schemes such as fine-tuning, zero-shot, and few-shot learning have been widely used to obtain considerable performance with only a handful of annotated training examples. In this paper, we present a unified benchmark for zero-shot text classification in Turkish. For this purpose, we evaluate three methods, namely Natural Language Inference (NLI), Next Sentence Prediction, and our proposed model based on Masked Language Modeling and pre-trained word embeddings, on nine Turkish datasets covering three main categories: topic, sentiment, and emotion. We use pre-trained Turkish monolingual and multilingual transformer models, namely BERT, ConvBERT, DistilBERT, and mBERT. The results show that ConvBERT with the NLI method yields the best results with 79%, outperforming the previously used multilingual XLM-RoBERTa model by 19.6%. The study contributes to the literature by applying different, previously unattempted transformer models to Turkish, and by showing that monolingual models improve zero-shot text classification performance over multilingual models.
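To show how the NLI method works in practice, here is a sketch using the Hugging Face zero-shot pipeline, which slots each candidate label into a hypothesis template and scores entailment; the checkpoint path is a placeholder, not the paper's model, and the Turkish template is our example.

```python
# NLI-based zero-shot classification with the transformers pipeline.
from transformers import pipeline

clf = pipeline("zero-shot-classification",
               model="path/to/turkish-nli-model")  # placeholder: any NLI-finetuned checkpoint
result = clf("Bu film gerçekten harikaydı!",       # "This film was really great!"
             candidate_labels=["olumlu", "olumsuz"],        # positive / negative
             hypothesis_template="Bu metin {} bir duygu içeriyor.")
print(result["labels"][0], result["scores"][0])    # top label and its score
```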

10.
    
We study the selection of transfer languages for automatic abusive language detection. Instead of preparing a dataset for every language, we demonstrate the effectiveness of cross-lingual transfer learning for zero-shot abusive language detection. In this way, we can use existing data from higher-resource languages to build better detection systems for low-resource languages. Our datasets cover seven languages from three language families. We measure the distance between the languages using several language similarity measures, in particular by quantifying the World Atlas of Language Structures. We show that there is a correlation between linguistic similarity and classifier performance, which allows us to choose an optimal transfer language for zero-shot abusive language detection.
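The core analysis can be reproduced in a few lines: correlate a linguistic distance (e.g., derived from WALS feature vectors) with zero-shot performance. The numbers below are made up purely to illustrate the computation; they are not results from the paper.

```python
# Correlate linguistic distance with zero-shot F1 across target languages.
from scipy.stats import spearmanr

# hypothetical WALS-based distances from the source language, and zero-shot F1
distance = [0.12, 0.35, 0.41, 0.58, 0.63, 0.77, 0.82]
f1_score = [0.71, 0.66, 0.62, 0.55, 0.57, 0.48, 0.44]

rho, p = spearmanr(distance, f1_score)
print(f"Spearman rho={rho:.2f}, p={p:.3f}")  # negative rho: closer language, better transfer
# The transfer language for a new target is then the candidate with the
# smallest distance, rather than defaulting to English.
```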

11.
    
Few-shot intent recognition aims to identify a user's intent from an utterance with limited training data. A considerable number of existing methods rely mainly on generic knowledge acquired on the base classes to identify novel classes, and typically ignore the characteristics of each meta-task itself, so they cannot make full use of the limited given samples when classifying unseen classes. To deal with these issues, we propose a Contrastive learning-based Task Adaptation model (CTA) for few-shot intent recognition. In detail, we leverage contrastive learning to achieve task adaptation and make full use of the limited samples of novel classes. First, a self-attention layer is employed in the task adaptation module to establish interactions between samples of different categories, so that the new representations are task-specific rather than relying entirely on the base classes. Then, contrastive loss functions and the semantics of label names are used, respectively, to reduce the similarity between sample representations of different categories while increasing it within the same category. Experimental results on the public OOS dataset verify the effectiveness of our proposal, which beats competitive baselines in terms of accuracy. Besides, we conduct cross-domain experiments on three datasets, i.e., OOS, SNIPS, and ATIS. We find that CTA gains clear accuracy improvements in all cross-domain experiments, indicating that it generalizes better than other competitive baselines in both cross-domain and single-domain settings.
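A schematic of the self-attention task-adaptation step as we read it: attending across all support samples of an episode makes each representation task-specific before classification; the module name, residual connection, and dimensions are illustrative assumptions.

```python
# Self-attention over an episode's support set for task adaptation.
import torch
import torch.nn as nn

class TaskAdapter(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, support):
        """support: (1, K, D) all support samples of one episode."""
        adapted, _ = self.attn(support, support, support)  # cross-category interaction
        return adapted + support                           # residual keeps original info

support = torch.randn(1, 10, 64)   # e.g., a 5-way 2-shot episode, 64-d utterances
adapted = TaskAdapter(dim=64)(support)
```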

12.
Using scientometric methods, we analyze the current state and research hotspots of six core journals in China's soft science field (2008-2018). The study finds that soft science research in China currently takes the national innovation-driven development strategy as its main research direction, focusing on hot issues and key links in regional innovation systems and innovation chains, and that the participating researchers collaborate closely. Through descriptive analysis, the study offers a view of the current state of research in China's soft science field, provides an outlook on its future development, and serves as a reference for the prosperous development of soft science in China.

13.
We study the selection of transfer languages for different natural language processing tasks, specifically sentiment analysis, named entity recognition, and dependency parsing. To select an optimal transfer language, we propose to utilize different linguistic similarity metrics to measure the distance between languages, and to choose the transfer language based on this information instead of relying on intuition. We demonstrate that linguistic similarity correlates with cross-lingual transfer performance for all of the proposed tasks. We also show a statistically significant difference between choosing the optimal language as the transfer source and defaulting to English. This allows us to select a more suitable transfer language to better leverage knowledge from high-resource languages and improve the performance of language applications lacking data. For the study, we used datasets from eight languages across three language families.

14.
    
Both node classification and link prediction are popular topics in supervised learning on graph data, but previous works seldom integrate them to capture their complementary information. In this paper, we propose a Multi-Task and Multi-Graph Convolutional Network (MTGCN) that jointly conducts node classification and link prediction in a unified framework. Specifically, MTGCN consists of multiple multi-task learning modules, each of which learns the complementary information between node classification and link prediction. Each module uses different inputs to output representations of the graph data, and the parameters of one module initialize the parameters of the next, so that the useful information learned in the former module can be propagated to the latter. As a result, the information is augmented to guarantee the quality of the representations by exploiting the complex structure inherent in the graph data. Experimental results on six datasets show that MTGCN outperforms the comparison methods in terms of both node classification and link prediction.
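A minimal sketch of what a joint node-classification plus link-prediction objective over shared embeddings can look like; the inner-product decoder, the weighting alpha, and all names are our assumptions, not MTGCN's exact formulation.

```python
# Joint loss: cross-entropy on labeled nodes + BCE on sampled edges.
import torch
import torch.nn.functional as F

def joint_loss(z, labels, label_mask, pos_edges, neg_edges, alpha=0.5):
    """z: (N, C) shared node embeddings (also used as class logits here)."""
    nc = F.cross_entropy(z[label_mask], labels[label_mask])   # node classification
    pos = (z[pos_edges[0]] * z[pos_edges[1]]).sum(1)          # inner-product decoder
    neg = (z[neg_edges[0]] * z[neg_edges[1]]).sum(1)
    lp = F.binary_cross_entropy_with_logits(
        torch.cat([pos, neg]),
        torch.cat([torch.ones_like(pos), torch.zeros_like(neg)]))
    return alpha * nc + (1 - alpha) * lp   # the two tasks share z and regularize each other

z = torch.randn(6, 4, requires_grad=True)
labels = torch.tensor([0, 1, 2, 0, 1, 2])
mask = torch.tensor([True, True, True, True, False, False])
pos = torch.tensor([[0, 1], [1, 2]])       # two observed edges
neg = torch.tensor([[0, 5], [3, 4]])       # two sampled non-edges
joint_loss(z, labels, mask, pos, neg).backward()
```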

15.
With the emergence and development of deep generative models such as variational auto-encoders (VAEs), research on topic modeling has successfully extended to a new area: neural topic modeling, which aims to learn disentangled topics that help us understand the data better. However, the original VAE framework has been shown to be limited in disentanglement performance, and it brings these inherent defects to neural topic models (NTMs). In this paper, we argue that the optimization objectives of contrastive learning are consistent with two important goals of well-disentangled topic learning, alignment and uniformity, and also with two key evaluation measures for topic models, topic coherence and topic diversity. We therefore reach the important conclusion that the alignment and uniformity of disentangled topic learning can be quantified by topic coherence and topic diversity. Accordingly, we propose the Contrastive Disentangled Neural Topic Model (CNTM). By representing both words and topics as low-dimensional vectors in the same embedding space, we apply contrastive learning to neural topic modeling to produce factorized and disentangled topics in an interpretable manner. We compare CNTM with strong baseline models on widely used metrics. Our model achieves the best topic coherence scores under the most general evaluation setting (100% of topics selected), with improvements of 25.0%, 10.9%, 24.6%, and 51.3% over the second-best models' reported scores on the 20 Newsgroups, Web Snippets, Tag My News, and Reuters datasets, respectively. Our method also obtains the second-best topic diversity scores on 20 Newsgroups and Web Snippets. The experimental results show that CNTM can effectively leverage the disentanglement ability of contrastive learning to address the inherent defect of neural topic modeling and obtain better topic quality.
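Topic diversity, one of the two measures the abstract ties to uniformity, is commonly computed as the proportion of unique words among the top-k words of all topics; the snippet below shows that standard metric (not CNTM's code), so a score of 1.0 means no topic shares a top word with another.

```python
# Standard topic diversity metric: unique top words / total top words.
def topic_diversity(topics, topk=25):
    """topics: list of word lists, each ranked by topic-word weight."""
    top_words = [w for t in topics for w in t[:topk]]
    return len(set(top_words)) / len(top_words)  # 1.0 = fully disentangled topics

topics = [["game", "team", "season", "player"],
          ["court", "law", "judge", "player"]]   # "player" appears twice
print(topic_diversity(topics, topk=4))           # 7 unique / 8 -> 0.875
```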

16.
    

17.
The majority of currently available entity alignment (EA) solutions rely primarily on structural information to align entities, which is biased and disregards additional multi-source information. To compensate for inadequate structural details, this article proposes SKEA, a simple but flexible framework for entity alignment with cross-modal supervision from supporting knowledge. We employ a relational aggregation network to exploit information about each entity and its neighbors. To overcome the limitations of relational features, two multi-modal encoding modules extract visual and textual information. In each iteration, SKEA generates a new set of potentially aligned entity pairs using the knowledge of the two reference modalities, which enhances the model's supervision. Notably, the supporting information used in our framework does not participate in the network's backpropagation, which considerably improves efficiency and differs markedly from earlier work. Experiments demonstrate that, compared with existing baselines, our framework can incorporate multi-aspect information efficiently and allows supervisory signals from other modalities to be transmitted to entities. A maximum performance improvement of 5.24% indicates the superiority of the proposed framework, especially for sparse KGs.
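One simple way to realize the iterative generation of candidate aligned pairs from a supporting modality is mutual-nearest-neighbor matching under, e.g., visual similarity; the sketch below is our simplification of that idea, not SKEA's exact procedure.

```python
# Propose pseudo-aligned pairs: entities that are each other's nearest neighbor.
import torch

def mutual_nn_pairs(emb_a, emb_b):
    """emb_a: (Na, D), emb_b: (Nb, D) L2-normalized modality embeddings."""
    sim = emb_a @ emb_b.t()
    a2b = sim.argmax(dim=1)             # best match in KG-B for each KG-A entity
    b2a = sim.argmax(dim=0)             # and vice versa
    return [(i, int(j)) for i, j in enumerate(a2b) if int(b2a[j]) == i]

emb_a = torch.nn.functional.normalize(torch.randn(100, 32), dim=1)
emb_b = torch.nn.functional.normalize(torch.randn(120, 32), dim=1)
pairs = mutual_nn_pairs(emb_a, emb_b)   # added to the supervision next iteration
```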

18.
[Objective] By analyzing the topical characteristics of zero-citation articles in China's library and information science (LIS) field, this study provides references and suggestions for researchers' writing and for the topic selection and publishing of scientific journals. [Methods] Taking 17 source journals of the Chinese Social Sciences Citation Index (2017-2018) as the data source, we use knowledge mapping to present the topical characteristics of zero-citation papers published between 2008 and 2012, and, by comparing them with the topics of highly cited papers from the same period, summarize the topical factors behind the zero-citation phenomenon in LIS. [Results] Compared with highly cited papers, the research topics of zero-citation papers are scattered and outdated; however, they neither deviate from the discipline's research areas nor show highly forward-looking characteristics. [Conclusions] Researchers are advised to innovate in their research topics while writing rigorously, and journals are advised to balance multiple principles in topic planning rather than single-mindedly pursuing novelty.

19.
    
Deep multi-view clustering (MVC) mines and employs the complex relationships among views to learn compact data clusters with deep neural networks in an unsupervised manner. Recent deep contrastive learning (CL) methods have shown promising performance in MVC by learning cluster-oriented deep feature representations, realized by contrasting positive and negative sample pairs. However, most existing deep contrastive MVC methods focus on only one side of contrastive learning, such as feature-level or cluster-level contrast, failing to integrate the two or to bring in further important aspects of contrast. Additionally, most of them work in a separate two-stage manner, i.e., first feature learning and then data clustering, so the two stages cannot mutually benefit each other. To fix these issues, in this paper we propose a novel joint contrastive triple-learning framework that learns multi-view discriminative feature representations for deep clustering. It is threefold: feature-level alignment-oriented CL, feature-level commonality-oriented CL, and cluster-level consistency-oriented CL. The former two submodules contrast the encoded feature representations of data samples at different feature levels, while the last contrasts the data samples in their cluster-level representations. Benefiting from the triple contrast, more discriminative representations of the views can be obtained. Meanwhile, a view weight learning module is designed to learn and exploit the quantitative complementary information across the learned discriminative features of each view. The contrastive triple-learning module, the view weight learning module, and the data clustering module with these fused features are thus performed jointly, so that the modules are mutually beneficial. Extensive experiments on several challenging multi-view datasets show the superiority of the proposed method over many state-of-the-art methods, with large improvements of 15.5% and 8.1% in accuracy on Caltech-4V and CCV. Owing to its promising performance on visual datasets, the proposed method can be applied to many practical visual applications such as visual recognition and analysis. The source code is provided at https://github.com/ShizheHu/Joint-Contrastive-Triple-learning.
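Cluster-level consistency-oriented contrast is commonly implemented by contrasting the columns of the soft cluster-assignment matrices from two views, so that cluster k in one view matches cluster k in the other; a minimal sketch under that common assumption (not the authors' code):

```python
# Cluster-level contrastive loss over two views' soft assignments.
import torch
import torch.nn.functional as F

def cluster_contrastive_loss(p1, p2, temperature=0.5):
    """p1, p2: (B, K) soft cluster assignments from two views."""
    c1 = F.normalize(p1.t(), dim=1)      # (K, B): one row per cluster
    c2 = F.normalize(p2.t(), dim=1)
    logits = c1 @ c2.t() / temperature
    target = torch.arange(len(logits))   # cluster k matches cluster k
    return (F.cross_entropy(logits, target)
            + F.cross_entropy(logits.t(), target)) / 2

p1 = torch.softmax(torch.randn(256, 10), dim=1)  # batch of 256, 10 clusters
p2 = torch.softmax(torch.randn(256, 10), dim=1)
print(cluster_contrastive_loss(p1, p2))
```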

20.
    
Text-enhanced and implicit reasoning methods have been proposed for answering questions over incomplete knowledge graphs (KGs), but prior studies either rely on external resources or lack the necessary interpretability. This article extends the line of reinforcement learning (RL) methods for better interpretability and dynamically augments the original KG action space with additional actions. To this end, we propose an RL framework with a dynamic completion mechanism, namely the Dynamic Completion Reasoning Network (DCRN). DCRN consists of an action space completion module and a policy network. The action space completion module exploits three sub-modules (a relation selector, a relation pruner, and a tail entity predictor) to enrich the options for decision making. The policy network calculates a probability distribution over the joint action space and selects promising next-step actions. We also employ a beam-search-based action selection strategy to alleviate delayed and sparse rewards. Extensive experiments conducted on WebQSP, CWQ, and MetaQA demonstrate the effectiveness of DCRN. Specifically, under the 50% KG setting, the Hits@1 improvements of DCRN on MetaQA-1H and MetaQA-3H are 2.94% and 1.18%, respectively. Moreover, under the 30% and 10% KG settings, DCRN surpasses all baselines by 0.9% and 1.5% on WebQSP, indicating robustness to sparse KGs.
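Beam-search-based action selection keeps the top-scoring partial reasoning paths at every hop instead of committing to a single action, which softens delayed and sparse rewards; a generic sketch of the technique (our illustration, not DCRN's implementation):

```python
# Generic beam search over an action space of scored successors.
import heapq
import math

def beam_search(start, expand, beam_width=3, steps=2):
    """expand(state) -> list of (log_prob, next_state) candidate actions."""
    beam = [(0.0, start)]
    for _ in range(steps):
        candidates = [(score + lp, nxt)
                      for score, state in beam
                      for lp, nxt in expand(state)]
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return beam  # top partial reasoning paths, kept at every hop

# toy expansion: each state branches into two actions with fixed probabilities
expand = lambda s: [(math.log(0.7), s + "A"), (math.log(0.3), s + "B")]
print(beam_search("e0->", expand))
```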
