Similar literature
Found 12 similar articles (search time: 15 ms)
1.
Human collaborative relationship inference is a meaningful task for online social networks and is called link prediction in network science. Real-world networks contain multiple types of interacting components and can be modeled naturally as heterogeneous information networks (HINs). Current link prediction algorithms for HINs fail to extract training samples effectively from snapshots of HINs; moreover, they underutilise the differences between nodes and between meta-paths. Therefore, we propose a meta-circuit machine (MCM) that can learn and fuse node and meta-path features efficiently, and we use these features to infer the collaborative relationships in question-and-answer and bibliographic networks. We first utilise meta-circuit random walks to obtain training samples; the basic idea is to perform biased meta-path random walks on the input network and the target network successively and then connect them. Then, a meta-circuit recurrent neural network (mcRNN) is designed for link prediction, which represents each node and meta-path by a dense vector and leverages an RNN to fuse the features of node sequences. Experiments on two real-world networks demonstrate the effectiveness of our framework. This study promotes the investigation of potential evolutionary mechanisms for collaborative relationships and offers practical guidance for designing more effective recommendation systems for online social networks.
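
The meta-path random walk mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' meta-circuit machine: the toy network, the node names, and the `meta_path_walk` helper are all hypothetical, and the walk chooses uniformly among type-matching neighbours rather than using the biased sampling the abstract refers to.

```python
import random

# Toy heterogeneous network (hypothetical data): every node has a type,
# and edges are stored as undirected adjacency lists.
NODE_TYPE = {
    "a1": "author", "a2": "author", "a3": "author",
    "p1": "paper", "p2": "paper",
    "v1": "venue",
}
EDGES = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "a3": ["p2"],
    "p1": ["a1", "a2", "v1"],
    "p2": ["a2", "a3", "v1"],
    "v1": ["p1", "p2"],
}


def meta_path_walk(start, meta_path, rng):
    """Walk from `start`, picking at each step a random neighbour whose
    type matches the next type in `meta_path`. Stops early at a dead end."""
    if NODE_TYPE[start] != meta_path[0]:
        raise ValueError("start node type must match the first meta-path type")
    walk = [start]
    for wanted_type in meta_path[1:]:
        candidates = [n for n in EDGES[walk[-1]] if NODE_TYPE[n] == wanted_type]
        if not candidates:
            break  # no neighbour of the required type
        walk.append(rng.choice(candidates))
    return walk


rng = random.Random(0)  # fixed seed so the sketch is repeatable
walk = meta_path_walk("a1", ["author", "paper", "author"], rng)
print(walk)
```

Each walk is a node sequence that respects the type pattern of the meta-path, which is the kind of sequence a recurrent model could consume as a training sample.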

2.
Aspect-based sentiment analysis aims to predict the sentiment polarities of specific targets in a given text. Recent research shows great interest in modeling the target and context with attention networks to obtain more effective feature representations for the sentiment classification task. However, using an average vector of the target to compute the attention scores for the context is unfair. Besides, the interaction mechanism is simple and thus needs further improvement. To solve these problems, this paper first proposes a coattention mechanism which models target-level and context-level attention alternately, so as to focus on the key words of targets and learn a more effective context representation. On this basis, we implement a Coattention-LSTM network which learns nonlinear representations of context and target simultaneously and can extract more effective sentiment features from the coattention mechanism. Further, a Coattention-MemNet network which adopts a multi-hop coattention mechanism is proposed to improve the sentiment classification result. Finally, we propose a new location-weighted function which uses location information to enhance the performance of the coattention mechanism. Extensive experiments on two public datasets demonstrate the effectiveness of all proposed methods, and our findings provide new insight for future developments in using attention mechanisms and deep neural networks for aspect-based sentiment analysis.

3.
Legal researchers, recruitment professionals, healthcare information professionals, and patent analysts all undertake work tasks where search forms a core part of their duties. In these instances, the search task is often complex and time-consuming and requires specialist expertise to identify relevant documents and insights within large domain-specific repositories and collections. Several studies have investigated the search practices of professionals such as these, but few have attempted to compare their professional practices directly, so it remains unclear to what extent insights and approaches from one domain can be applied to another. In this paper we describe the results of a survey of a purposive sample of 108 legal researchers, 64 recruitment professionals and 107 healthcare information professionals. Their responses are compared with results from a previous survey of 81 patent analysts. The survey investigated their search practices and preferences, the types of functionality they value, and their requirements for future information retrieval systems. The results reveal that these professions share many fundamental needs and face similar challenges: in particular, a continuing preference for formulating queries as Boolean expressions, the need to manage, organise and re-use search strategies and results, and an ambivalence toward the use of relevance ranking. The results stress the importance of recall and coverage for the healthcare and patent professionals, while precision and recency were more important to the legal and recruitment professionals. The results also highlight the need to ensure that search systems give confidence to the professional searcher, so trust, explainability and accountability remain significant challenges when developing such systems. The findings suggest that translational research between the different areas could benefit professionals across domains.

4.
5.
Automatic word spacing in Korean remains a significant task in natural language processing owing to the extremely complex word spacing rules involved. Most previous models remove all spaces in input sentences and insert new spaces in the modified input sentences. If input sentences include only a small number of spacing errors, the previous models often return sentences with even more spacing errors than the input sentences because they remove the correct spaces that were typed intentionally by the users. To reduce this problem, we propose an automatic word spacing model based on a neural network that effectively uses word spacing information from input sentences. The proposed model comprises a space insertion layer and a spacing-error correction layer. Using an approach similar to previous models, the space insertion layer inserts word spaces into input sentences from which all spaces have been removed. The spacing-error correction layer post-corrects the spacing errors of the space insertion model using word spacing typed by users. Because the two layers are tightly connected in the proposed model, the backpropagation flows are not blocked. As a result, the space insertion and error correction are performed simultaneously. In experiments, the proposed model outperformed all compared models on all measures on the same test data. In addition, it exhibited reliable performance (word-unit F1-measures of 94.17% to 97.87%) regardless of how many word spacing errors were present in the input sentences.
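
The "remove all spaces, then insert new ones" formulation described in this abstract is commonly handled as per-character labelling. The sketch below shows only that data representation, under the assumption of a binary label per character; the function names are hypothetical, and no neural model from the paper is reproduced.

```python
def to_chars_and_labels(sentence):
    """Strip spaces and return (chars, labels), where labels[i] is 1 when
    a space followed chars[i] in the original sentence and 0 otherwise."""
    chars, labels = [], []
    for ch in sentence:
        if ch == " ":
            if labels:           # ignore leading spaces
                labels[-1] = 1   # mark the preceding character
        else:
            chars.append(ch)
            labels.append(0)
    return chars, labels


def from_chars_and_labels(chars, labels):
    """Rebuild a sentence by inserting a space after every character
    whose label is 1."""
    pieces = []
    for ch, label in zip(chars, labels):
        pieces.append(ch)
        if label:
            pieces.append(" ")
    return "".join(pieces).rstrip()


original = "나는 학교에 간다"  # "I go to school"
chars, user_labels = to_chars_and_labels(original)
restored = from_chars_and_labels(chars, user_labels)
print(chars, user_labels, restored)
```

In this representation, `user_labels` carries the spacing the user actually typed, which is the information the abstract says earlier models discard when they strip every space before prediction.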

6.
Named Entity Recognition (NER) aims to automatically extract specific entities from unstructured text. Compared with English NER, Chinese NER is more challenging in recognizing entity boundaries because there are no explicit delimiters between Chinese characters. However, most previous research has focused on the semantic information of the Chinese language at the character level and ignored the importance of phonetic characteristics. To address these issues, we integrated phonetic features of Chinese characters with lexicon information to help disambiguate entity boundaries by fully exploring the potential of Chinese as a pictophonetic language. In addition, a novel multi-tagging-scheme learning method, based on the multi-task learning paradigm, was proposed to alleviate the data sparsity and error propagation problems of previous tagging schemes by separately annotating the segmentation information of entities and their corresponding entity types. Extensive experiments on four Chinese NER benchmark datasets (OntoNotes4.0, MSRA, Resume, and Weibo) show that our proposed method consistently outperforms the existing state-of-the-art baseline models. The ablation experiments further demonstrated that the phonetic features and the multi-tagging scheme each have a significant positive effect on Chinese NER performance.
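
The idea of annotating segmentation information and entity types separately can be pictured with a small sketch. It assumes conventional BIO tags of the form `B-PER` as the starting point; the abstract does not specify the paper's tagging schemes, so the `split_tags` helper and the example sentence are hypothetical illustrations only.

```python
def split_tags(tags):
    """Split joint tags such as 'B-PER' into two parallel sequences:
    segmentation tags ('B', 'I', 'O') and entity-type tags ('PER', 'O', ...)."""
    segmentation, entity_types = [], []
    for tag in tags:
        if tag == "O":
            segmentation.append("O")
            entity_types.append("O")
        else:
            prefix, _, entity_type = tag.partition("-")
            segmentation.append(prefix)
            entity_types.append(entity_type)
    return segmentation, entity_types


chars = list("张三在北京")  # "Zhang San is in Beijing"
joint = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
segmentation, entity_types = split_tags(joint)
for ch, seg, typ in zip(chars, segmentation, entity_types):
    print(ch, seg, typ)
```

With the two sequences separated, each becomes its own prediction target with a smaller label set, which is one way a multi-task setup could share an encoder across both tasks.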

7.
Tables in documents are a widely available and rich source of information, but they are not yet well utilised computationally because of the difficulty of automatically extracting their structure and data content. A plethora of systems has been proposed to solve the problem, but current methods offer low usability and accuracy and lack precision in detecting data from diverse layouts. We propose a component-based design and implementation of table processing concepts which can offer flexibility and re-usability as well as high performance on a wide range of table types. In this paper, we describe TEXUS, a fully automated table processing system that takes a PDF document and detects tables in a layout-independent manner. We introduce TEXUS's own table-processing-specific document model and its two-phase processing pipeline design. Through an extensive evaluation on a dataset comprising complex financial tables, we show the performance of the system on different table types.

8.
Health misinformation has become an unfortunate truism of social media platforms, where lies can spread faster than truth. Despite considerable work devoted to suppressing fake news, health misinformation, including low-quality health news, has persisted and even increased in recent years. One promising approach to fighting bad information is studying the temporal and sentiment effects of health news stories and how they are discussed and disseminated on social media platforms like Twitter. As part of the effort to find innovative ways to fight health misinformation, this study analyzes a dataset of more than 1600 objectively and independently reviewed health news stories published over a 10-year span and nearly 50,000 Twitter posts responding to them. Specifically, it examines the source credibility of health news circulated on Twitter and the temporal and sentiment features of the tweets containing or responding to the health news reports. The results show that health news stories rated low by experts are discussed more, persist longer, and produce stronger sentiments than highly rated ones in the tweetosphere. However, the highly rated stories retained fresh interest, in the form of new tweets, for a longer period. An in-depth understanding of the characteristics of health news distribution and discussion is the first step toward mitigating the surge of health misinformation. The findings provide insights into the mechanism of health information dissemination on social media and practical implications for fighting and mitigating health misinformation on digital media platforms.

9.
A new approach to narrative abstractive summarization (NATSUM) is presented in this paper. NATSUM is centered on generating a narrative, chronologically ordered summary about a target entity from several news documents related to the same topic. To achieve this, our system first creates a cross-document timeline where a time point contains all the event mentions that refer to the same event. This timeline is enriched with all the arguments of the events that are extracted from different documents. Secondly, using natural language generation techniques, one sentence is produced for each event using the arguments involved in the event. Specifically, a hybrid surface realization approach is used, based on over-generation and ranking techniques. The evaluation demonstrates that NATSUM performed better than extractive summarization approaches and competitive abstractive baselines, improving the F1-measure by at least 50% when a real scenario is simulated.

10.
Creativity is considered a human characteristic; creative endeavors, including automatic story generation, have been a major challenge for artificial intelligence. To understand how humans create and evaluate stories, we (1) construct a story dataset and (2) analyze the relationship between emotions and story interestingness. Given that understanding how to move readers emotionally is a crucial creative technique, we focus on the role of emotions in evaluating reader satisfaction. Although conventional research has highlighted emotions read from a text, we hypothesize that readers' emotions do not necessarily coincide with those of the characters. The story dataset created for this study describes situations surrounding two characters. Crowdsourced volunteers label stories with the emotions of the two characters and those of readers; we then empirically analyze the relationship between emotions and interestingness. The results show that a story's score has a stronger relationship to the readers' emotions than to the characters' emotions.

11.
Assigning papers to suitable reviewers is of great significance for ensuring the accuracy and fairness of peer review results. Over the past three decades, researchers have made substantial progress on the reviewer assignment problem (RAP). In this survey, we provide a comprehensive review of the primary research achievements on reviewer assignment algorithms from 1992 to 2022. Specifically, this survey first discusses the background and necessity of automatic reviewer assignment, and then systematically summarizes the existing research from three aspects, i.e., construction of the candidate reviewer database, computation of the matching degree between reviewers and papers, and reviewer assignment optimization algorithms, with objective comments on the advantages and disadvantages of the current algorithms. Afterwards, the evaluation metrics and datasets for reviewer assignment algorithms are summarized. To conclude, we outline potential research directions for RAP. Since there have been few comprehensive survey papers on reviewer assignment algorithms in the past ten years, this survey can serve as a valuable reference for related researchers and peer review organizers.

12.
Interest in real-time syndromic surveillance based on social media data has greatly increased in recent years. The ability to detect disease outbreaks earlier than traditional methods would be highly useful for public health officials. This paper describes a software system which is built upon recent developments in machine learning and data processing to achieve this goal. The system is built from reusable modules integrated into data processing pipelines that are easily deployable and configurable. It applies deep learning to the problem of classifying health-related tweets and is able to do so with high accuracy. It has the capability to detect illness outbreaks from Twitter data and then to build up and display information about these outbreaks, including relevant news articles, to provide situational awareness. It also provides nowcasting functionality, estimating current disease levels from previous clinical data combined with Twitter data. The preliminary results are promising, with the system being able to detect outbreaks of influenza-like illness symptoms which could then be confirmed by existing official sources. The nowcasting module shows that using social media data can improve prediction for multiple diseases over simply using traditional data sources.
