Similar literature
 20 similar documents found
1.
One of the most time-critical challenges for the Natural Language Processing (NLP) community is to combat the spread of fake news and misinformation. Existing approaches for misinformation detection use neural network models, statistical methods, linguistic traits, fact-checking strategies, etc. However, the menace of fake news seems to grow more vigorous with the advent of humongous and unusually creative language models. Relevant literature reveals that one major characteristic of the virality of fake news is the presence of an element of surprise in the story, which attracts immediate attention and provokes a strong emotional response in the reader. In this work, we leverage this idea and propose textual novelty detection and emotion prediction as the two tasks relating to automatic misinformation detection. We re-purpose textual entailment for novelty detection and use the models trained on large-scale datasets of entailment and emotion to classify fake information. Our results correlate with the idea as we achieve state-of-the-art (SOTA) performance (7.92%, 1.54%, 17.31% and 8.13% improvement in terms of accuracy) on four large-scale misinformation datasets. We hope that our current probe will motivate the community to explore further research on misinformation detection along this line. The source code is available on GitHub.

2.
Company movements and market changes often make news headlines, providing managers with important business intelligence (BI). While existing corporate analyses are often based on numerical financial figures, relatively little work has been done to reveal from textual news articles factors that represent BI. In this research, we developed BizPro, an intelligent system for extracting and categorizing BI factors from news articles. BizPro consists of novel text mining procedures and BI factor modeling and categorization. Expert guidance and human knowledge (with high inter-rater reliability) were used to inform system development and profiling of BI factors. We conducted a case study of using the system to profile BI factors of four major IT companies based on 6859 sentences extracted from 231 news articles published in major news sources. The results show that the chosen techniques used in BizPro, Naïve Bayes (NB) and Logistic Regression (LR), significantly outperformed a benchmark technique. NB was found to outperform LR in terms of precision, recall, F-measure, and area under ROC curve. This research contributes to developing a new system for profiling company BI factors from news articles, to providing new empirical findings to enhance understanding in BI factor extraction and categorization, and to addressing an important yet under-explored concern of BI analysis.

3.
4.
The presentation of search results on the web has been dominated by the textual form of document representation. On the other hand, the document’s visual aspects such as the layout, colour scheme, or presence of images have been studied only in a limited context with regard to their effectiveness in search result presentation. This article presents a comparative evaluation of textual and visual forms of document representation as additional components of document surrogates. A total of 24 people were recruited for our task-based user study. The experimental results suggest that an increased level of document representation available in the search results can facilitate users’ interaction with a search interface. The results also suggest that the two forms of additional representations are likely beneficial to users’ information searching process in different contexts.

5.
Using intelligent agent-based systems to support information processing for executives has not been significantly advanced in either theory or practice. Research into this field tends to focus more on technical aspects than on the social perspective. When executives are faced with increasing information availability and uncertainty in the business environment, using intelligent agent-based systems to enhance executives’ information processing capability appears to be both an opportunity and a necessity. This study examines UK executives’ perceptions of intelligent agent-based systems for information scanning, filtering, interpretation and alerting. The study follows a deductive research design, i.e. hypothesis formulation and testing from the user’s perspective. Qualitative data was collected through focus groups and interviews with executives in the UK. The study produces rich evidence that challenges preconceptions about executives' use of agent-based information processing systems. The findings develop insight into executives’ behavior in information processing, which has implications for intelligent system developers and organizational information processing practice.

6.
In this paper we focus on the problem of question ranking in community question answering (cQA) forums in Arabic. We address the task with machine learning algorithms using advanced Arabic text representations. The latter are obtained by applying tree kernels to constituency parse trees combined with textual similarities, including word embeddings. Our two main contributions are: (i) an Arabic language processing pipeline based on UIMA (from segmentation to constituency parsing) built on top of Farasa, a state-of-the-art Arabic language processing toolkit; and (ii) the application of long short-term memory neural networks to identify the best text fragments in questions to be used in our tree-kernel-based ranker. Our thorough experimentation on a recently released cQA dataset shows that the Arabic linguistic processing provided by Farasa produces strong results and that neural networks combined with tree kernels further boost the performance in terms of both efficiency and accuracy. Our approach also enables an implicit comparison between different processing pipelines as our tests on Farasa and Stanford parsers demonstrate.

7.
People increasingly use Social Media (SM) platforms such as Twitter and Facebook during disasters and emergencies to post situational updates including reports of injured or dead people, infrastructure damage, requests for urgent needs, and the like. Information on SM comes in many forms, such as textual messages, images, and videos. Several studies have shown the utility of SM information for disaster response and management, which encouraged humanitarian organizations to start incorporating SM data sources into their workflows. However, several challenges prevent these organizations from using SM data for response efforts. These challenges include near-real-time information processing, information overload, information extraction, summarization, and verification of both textual and visual content. We highlight various applications and opportunities of SM multimodal data, latest advancements, current challenges, and future directions for crisis informatics and other related research fields.

8.
Term weighting for document ranking and retrieval has been an important research topic in information retrieval for decades. We propose a novel term weighting method based on a hypothesis that a term’s role in accumulated retrieval sessions in the past affects its general importance regardless. It utilizes availability of past retrieval results consisting of the queries that contain a particular term, retrieved documents, and their relevance judgments. A term’s evidential weight, as we propose in this paper, depends on the degree to which the mean frequency values for the relevant and non-relevant document distributions in the past are different. More precisely, it takes into account the rankings and similarity values of the relevant and non-relevant documents. Our experimental result using standard test collections shows that the proposed term weighting scheme improves on conventional TF*IDF and language model based schemes. It indicates that evidential term weights bring in a new aspect of term importance and complement the collection statistics based on TF*IDF. We also show how the proposed term weighting scheme based on the notion of evidential weights is related to the well-known weighting schemes based on language modeling and probabilistic models.
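
For orientation, the conventional TF*IDF baseline mentioned in this abstract can be sketched as follows. This is only an illustrative sketch: the abstract does not give the formula for evidential weights, so the code covers the baseline alone, and the function name `tf_idf`, the length-normalized term frequency, and the unsmoothed logarithmic IDF are assumptions rather than the authors' exact variant.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical toy collection, not data from the paper.
docs = [
    ["term", "weighting", "retrieval"],
    ["term", "ranking"],
    ["language", "model", "term"],
]
weights = tf_idf(docs)
```

In this toy collection "term" occurs in every document, so its IDF and therefore its weight are zero; that dependence on collection statistics alone is the aspect the abstract says evidential weights are meant to complement.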

9.
Annemarie Jutel 《Endeavour》2021,45(1-2):100764
One common contemporary usage of the term “diagnostic uncertainty” is to refer to cases for which a diagnosis is not, or cannot be, applied to the presenting case. This is a paradoxical usage, as the absence of diagnosis is often as close to a certainty as a human judgement can be. What makes this sociologically interesting is that it represents an “epistemic defence,” or a means of accounting for a failure of medicine’s explanatory system. This system is based on diagnosis, or the classification of individual complaints into recognizable diagnostic categories. Diagnosis is pivotal to medicine’s epistemic setting, for it purports to explain illness via diagnosis, and yet is not always able to do so. This essay reviews this paradoxical use, and juxtaposes it to historical explanations for non-diagnosable illnesses. It demonstrates how representing non-diagnosis as uncertainty protects the epistemic setting by positioning the failure to locate a diagnosis in the individual, rather than in the medical paradigm.

10.
This paper presents QACID, an ontology-based Question Answering system applied to the CInema Domain. This system allows users to retrieve information from formal ontologies by using as input queries formulated in natural language. The original characteristic of QACID is the strategy used to fill the gap between users’ expressiveness and formal knowledge representation. This approach is based on collections of user queries and offers a simple adaptability to deal with multilingual capabilities, inter-domain portability and changes in user information requirements. All these capabilities permit developing Question Answering applications for actual users. This system has been developed and tested on the Spanish language, using an ontology modelling the cinema domain. The performance level achieved enables the use of the system in real environments.

11.
The phenomenal spread of fake news online necessitates further research into fake news perception. We stress human factors in misinformation management. This study extends prior research on fake news and media consumption to examine how people perceive fake news. The objective is to understand how news categories and sources influence individuals' perceptions of fake news. Participants (N = 1008) were randomly allocated to six groups in which they evaluated the believability of news from three categories (misinformation, conspiracy, and correction news) coupled with six online news sources whose background (official media, commercial media, and social media) and expertise level varied (the presence or absence of a professional editorial team). Our findings indicated people could distinguish media sources, which have a significant effect on fake news perception. People believed most in conspiracy news and then misinformation included in correction news, demonstrating the backfire of correction news. The significant interaction effects indicate people are more sensitive to misinformation news and show more skepticism toward misinformation on social media. The findings support news literacy in that users are capable of leveraging credible sources in navigating online news. Meanwhile, challenges of processing correction news require design measures to promote truth-telling news.

12.
This paper reports the results of a preliminary study of interpersonal information seeking interactions between a user and a human information source. The study showed that users specify their information needs (uncertainty) largely in terms of what they know (certainty) during the interaction. The articulated certainty and uncertainty in the interaction can be classified as utterances focusing on either the topic (what the user is talking about) or comment (how that topic fits in with the user's situation or problem). We suggest that user studies in information seeking research should conceptually realign from an emphasis on users' uncertainty- and topic-based matching to the inclusion of users' certainty and comment dimensions in order to develop a more linguistically robust, multi-dimensional approach to matching for information retrieval.

13.
《Endeavour》2020,44(3):100732
This paper aims to show how the specific ethics of scientific undertaking tightly underlies epistemic reflection upon the nature of linguistic work and its outcome. The relationship between linguistics and ethics seems evident at the level of the narrative, i.e. the language in which the basic linguistic findings are established. The article is intended as an introduction to an interplay of linguistics, epistemology and the ethics of linguistic work. The departure point for the argument is the CONTAINER perception of language by linguists, which produces the well-established distinction between internalist and externalist positions. The paper, however, invites the reader to reconsider the tension between internalists and externalists and instead argues for a more general opposition, i.e. between the non-transcendental naturalists (naturalists) and transcendental naturalists (extra-naturalists). The polarity is seen as underpinning the present-day debates, while concurrently transversing the traditionally recognised dichotomies. The distinction promises to be productive both at the level of substantive assessment of linguistic research and at the level of epistemic qualification of the outcome of a linguistic study. Sharp and uncompromising as the naturalist vs extra-naturalist dichotomy seems to be, the paper offers ways to bridge the gap between the apparently exclusive philosophies. The proposed solution, while seemingly only aesthetic, ultimately touches an ethical dimension as it centres on the appropriate construction of the narrative of linguistic fact-finding, which promotes approximative rather than definitive statements in the scholarly discourse. The desired effect is an ethical consensus underlying the work of a linguist.

14.
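刘天明 (Liu Tianming) 《现代情报》 2007, 27(11): 58-60
In current online information dissemination, a special kind of information symbol, internet language, has attracted considerable attention, and related research has mostly been conducted from the perspective of linguistics or communication studies. This paper instead attempts to consider internet language together with the online information dissemination environment, summarizes and outlines this information symbol of the virtual world, and points out the priorities that future research should address.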

15.
GPS-enabled devices and social media popularity have created an unprecedented opportunity for researchers to collect, explore, and analyze text data with fine-grained spatial and temporal metadata. In this sense, text, time and space are different domains with their own representation scales and methods. This poses a challenge on how to detect relevant patterns that may only arise from the combination of text with spatio-temporal elements. In particular, spatio-temporal textual data representation has relied on feature embedding techniques. This can limit a model’s expressiveness for representing certain patterns extracted from the sequence structure of textual data. To deal with the aforementioned problems, we propose an Acceptor recurrent neural network model that jointly models spatio-temporal textual data. Our goal is to focus on representing the mutual influence and relationships that can exist between written language and the time and place where it was produced. We represent space, time, and text as tuples, and use pairs of elements to predict a third one. This results in three predictive tasks that are trained simultaneously. We conduct experiments on two social media datasets and on a crime dataset; we use Mean Reciprocal Rank as the evaluation metric. Our experiments show that our model outperforms state-of-the-art methods, with improvements ranging from 5.5% to 24.7% for location and time prediction.
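
The evaluation metric named in this abstract, Mean Reciprocal Rank (MRR), can be illustrated with a minimal sketch. The function name, the list-based input format, and the choice to score a missing target as 0 are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
def mean_reciprocal_rank(rankings, targets):
    """Average of 1/rank of the correct item over all queries.

    rankings: one ranked candidate list per query (best first).
    targets:  the correct item for each query.
    A target missing from its ranking contributes 0 (an assumed convention).
    """
    total = 0.0
    for ranked, target in zip(rankings, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)


# Hypothetical predictions for three queries.
rankings = [["a", "b", "c"], ["b", "a"], ["c", "d"]]
targets = ["a", "a", "x"]
mrr = mean_reciprocal_rank(rankings, targets)
```

Here the correct items sit at rank 1, at rank 2, and outside the list, so the score is (1 + 1/2 + 0) / 3 = 0.5.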

16.
Sarcasm expression is a pervasive literary technique in which people intentionally express the opposite of what is implied. Accurate detection of sarcasm in a text can facilitate the understanding of speakers’ true intentions and promote other natural language processing tasks, especially sentiment analysis tasks. Since sarcasm is a kind of implicit sentiment expression and speakers deliberately confuse the audience, it is challenging to detect sarcasm from text alone. Existing approaches based on machine learning and deep learning have achieved unsatisfactory performance when handling sarcastic text that uses complex expressions or requires specific background knowledge to understand. In particular, due to the characteristics of the Chinese language itself, sarcasm detection in Chinese is more difficult. To alleviate this dilemma in Chinese sarcasm detection, we propose a sememe and auxiliary enhanced attention neural model, SAAG. At the word level, we introduce sememe knowledge to enhance the representation learning of Chinese words. A sememe is the minimum unit of meaning, which gives a fine-grained portrayal of a word. At the sentence level, we leverage some auxiliary information, such as the news title, to learn the representation of the context and background of sarcasm expression. Then, we construct the representation of text expression progressively and dynamically. The evaluation on a sarcasm dataset, consisting of comments on news text, reveals that our proposed approach is effective and outperforms the state-of-the-art models.

17.
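Analysis and Monitoring of Online News for the Social Sciences   Total citations: 1, self-citations: 0, citations by others: 1
Through the use of natural language processing techniques and mathematical statistics, online news offers great value in social science fields such as economics and finance, public health, political science, research management, and public opinion monitoring and early warning. This paper analyzes and reviews the current state of applications of news analysis and monitoring across the social sciences, covering news sources, key technologies, domain characteristics, implementation methods, and typical systems, and summarizes the characteristics and development trends of current research.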

18.
This paper describes an ongoing research project that involves the study of teachers’ information seeking behaviors, needs and practices in relation to a collection of primary source materials available through the University of North Carolina at Chapel Hill (UNC) Library’s digital library Documenting the American South (DocSouth). By gaining an in-depth understanding of the needs and wants of teachers in the context of their work, we hope to build a collection of learning objects and a domain ontology applied to the collection to improve teachers’ access to the cultural heritage materials and to facilitate their actual use in the classroom.

19.
Stock prediction via market data analysis is an attractive research topic. Both stock prices and news articles have been employed in the prediction processes. However, how to combine technical indicators from stock prices and news sentiments from textual news articles, and how to enable the prediction model to learn sequential information within time series in an intelligent way, is still an unsolved problem. In this paper, we build a stock prediction system and propose an approach that 1) represents numerical price data by technical indicators via technical analysis, and represents textual news articles by sentiment vectors via sentiment analysis, 2) sets up a layered deep learning model to learn the sequential information within the market snapshot series constructed from the technical indicators and news sentiments, and 3) sets up a fully connected neural network to make stock predictions. Experiments have been conducted on more than five years of Hong Kong Stock Exchange data using four different sentiment dictionaries, and results show that 1) the proposed approach outperforms the baselines in both validation and test sets using two different evaluation metrics, 2) models incorporating prices and news sentiments outperform models that only use either technical indicators or news sentiments, at both the individual stock level and the sector level, and 3) among the four sentiment dictionaries, the finance domain-specific sentiment dictionary (Loughran–McDonald Financial Dictionary) models the news sentiments better, bringing larger improvements in prediction performance than the other three dictionaries.

20.
We propose a theory to characterize the information and information processing abilities of metasurfaces, and demonstrate the relation between the information of the metasurface and its radiation pattern in the far-field region. By incorporating a general aperture model with an uncertainty relation in L²-space, we propose a theory to predict the upper bound of information contained in the radiation pattern of a metasurface, and reveal the theoretical upper limit of orthogonal radiation states. The proposed theory also provides guidance for inverse design of the metasurface with respect to given functionalities. Through investigation of the information of disordered-phase modulated metasurfaces, we find the information invariance (1−γ, where γ is Euler's constant) of chaotic radiation patterns. That is to say, the information of the disordered-phase modulated radiation patterns is always equal to 1−γ, regardless of variations in size, the number of elements and the phase pattern of the metasurface. This value might be the lower bound of radiation-pattern information of the metasurface, which can provide a theoretical limit for information modulation applications, including computational imaging, stealth technologies and wireless communications.
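
To give a sense of scale for the invariant quoted in this abstract, the value 1−γ can be evaluated numerically. The sketch below is not taken from the paper: it approximates Euler's constant γ from the harmonic series with the standard correction term 1/(2n), and the helper name `euler_gamma` and the default `n` are illustrative assumptions. The abstract does not state the unit of information, so none is attached here.

```python
import math


def euler_gamma(n=1_000_000):
    """Approximate Euler's constant as H_n - ln(n) - 1/(2n).

    H_n is the n-th harmonic number; the remaining error is on the order of 1/(12 n^2).
    """
    harmonic = math.fsum(1.0 / k for k in range(1, n + 1))
    return harmonic - math.log(n) - 1.0 / (2 * n)


invariant = 1.0 - euler_gamma()
```

With γ ≈ 0.5772, the invariant 1−γ is approximately 0.4228.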
