Found 20 similar articles (search time: 15 ms)
Aspect-level sentiment analysis is important for numerous opinion mining and market analysis applications. In this paper, we study the problem of identifying and rating review aspects, which is the fundamental task in aspect-level sentiment analysis. Previous review aspect analysis methods seldom consider the entity or its rating and instead use only 2-tuples, i.e., head and modifier pairs; e.g., in the phrase “nice room”, “room” is the head and “nice” is the modifier. To solve this problem, we present a novel Quad-tuple Probability Latent Semantic Analysis (QPLSA), which incorporates the entity and its rating together with the 2-tuples into the PLSA model. Specifically, QPLSA not only generates fine-grained aspects but also captures the correlations between words and ratings. We also develop two novel prediction approaches: the Quad-tuple Prediction (from the global perspective) and the Expectation Prediction (from the local perspective). Systematic experiments show that Quad-tuple PLSA significantly outperforms 2-tuple PLSA on both aspect identification and aspect rating prediction for publication datasets. Moreover, for aspect rating prediction, QPLSA shows significant superiority over state-of-the-art baseline methods. The Quad-tuple Prediction and the Expectation Prediction also perform strongly in aspect rating on different datasets.

The polarity shift problem is a major factor that affects the classification performance of machine-learning-based sentiment analysis systems. In this paper, we propose a three-stage cascade model to address the polarity shift problem in the context of document-level sentiment classification. First, we split each document into a set of subsentences and build a hybrid model that employs rules and statistical methods to detect explicit and implicit polarity shifts, respectively. Second, we propose a polarity shift elimination method to remove polarity shifts in negations. Finally, we train base classifiers on training subsets divided by type of polarity shift, and use a weighted combination of the component classifiers for sentiment classification. The results of a range of experiments illustrate that our approach significantly outperforms several alternative methods for polarity shift detection and elimination.

This article describes in-depth research on machine learning methods for sentiment analysis of Czech social media. Whereas in English, Chinese, or Spanish this field has a long history and evaluation datasets for various domains are widely available, in the case of the Czech language no systematic research has yet been conducted. We tackle this issue and establish a common ground for further research by providing a large human-annotated Czech social media corpus. Furthermore, we evaluate state-of-the-art supervised machine learning methods for sentiment analysis. We explore different pre-processing techniques and employ various features and classifiers. We also experiment with five different feature selection algorithms and investigate the influence of named entity recognition and preprocessing on sentiment classification performance. Moreover, in addition to our newly created social media dataset, we also report results for other popular domains, such as movie and product reviews. We believe that this article will not only extend the current sentiment analysis research to another family of languages, but will also encourage competition, potentially leading to the production of high-end commercial solutions.

Although deep learning breakthroughs in NLP are based on learning distributed word representations by neural language models, these methods suffer from a classic drawback of unsupervised learning techniques. Furthermore, the performance of general word embeddings has been shown to be heavily task-dependent. To tackle this issue, recent studies have proposed learning sentiment-enhanced word vectors for sentiment analysis. However, the common limitation of these approaches is that they require external sentiment lexicon sources, and the construction and maintenance of these resources involve a set of complex, time-consuming, and error-prone tasks. In this regard, this paper proposes a method of sentiment lexicon embedding that represents the semantic relationships of sentiment words better than existing word embedding techniques, without a manually annotated sentiment corpus. The major distinguishing factor of the proposed framework is that it jointly encodes morphemes and their POS tags and trains only important lexical morphemes in the embedding space. To verify the effectiveness of the proposed method, we conducted experiments comparing it with two baseline models. The revised embedding approach mitigated the problem of the conventional context-based word embedding method and, in turn, improved the performance of sentiment classification.

Electronic word of mouth (eWOM) is prominent and abundant in consumer domains. Both consumers and product/service providers need help in understanding and navigating the resulting information spaces, which are vast and dynamic. The general tone or polarity of reviews, blogs, or tweets provides such help. In this paper, we explore the viability of automatic sentiment analysis (SA) for assessing the polarity of a product or service review. To do so, we examine the potential of the major approaches to sentiment analysis, along with star ratings, in capturing the true sentiment of a review. We further model contextual factors (specifically, product type and review length) as two moderators affecting SA accuracy. The results of our analysis of 900 reviews suggest that different tools representing the main approaches to SA display differing levels of accuracy, yet overall, SA is very effective in detecting the underlying tone of the analyzed content and can be used as a complement or an alternative to star ratings. The results further reveal that contextual factors such as product type and review length play a role in affecting the ability of a technique to reflect the true sentiment of a review.

Social media data have recently attracted considerable attention as an emerging voice of the customer, as social media have rapidly become a channel for exchanging and storing customer-generated, large-scale, and unregulated voices about products. Although product planning studies using social media data have applied systematic methods, these methods have limitations, such as the difficulty of identifying latent product features due to the use of only term-level analysis, and insufficient consideration of the opportunity potential of the identified features. Therefore, an opportunity mining approach is proposed in this study to identify product opportunities based on topic modeling and sentiment analysis of social media data. For a multifunctional product, this approach can identify latent product topics discussed by product customers in social media using topic modeling, thereby quantifying the importance of each product topic. Next, the satisfaction level of each product topic is evaluated using sentiment analysis. Finally, the opportunity value and improvement direction of each product topic from a customer-centered view are identified by an opportunity algorithm based on the importance and satisfaction of the product topics. We expect that our approach to product planning will contribute to the systematic identification of product opportunities from large-scale customer-generated social media data and will be used as a real-time monitoring tool for analyzing changing customer needs in rapidly evolving product environments.
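
The final step above combines importance and satisfaction into an opportunity value. The abstract does not give the formula, so the sketch below assumes one widely cited formulation, Ulwick's opportunity score (importance plus the unmet gap between importance and satisfaction, both on a 0 to 10 scale). Whether the study uses exactly this form is an assumption, and the function names and topic names are hypothetical.

```python
def opportunity_score(importance, satisfaction):
    """Ulwick-style opportunity score (assumed formulation, not confirmed by the abstract).

    Both inputs are expected on the same 0-10 scale.
    """
    if not (0 <= importance <= 10 and 0 <= satisfaction <= 10):
        raise ValueError("importance and satisfaction must lie in [0, 10]")
    # The gap only counts when importance exceeds satisfaction.
    return importance + max(importance - satisfaction, 0)


def rank_topics(topics):
    """Rank topics by opportunity score, highest first.

    topics: dict mapping a topic name to an (importance, satisfaction) pair.
    """
    scored = {name: opportunity_score(imp, sat) for name, (imp, sat) in topics.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical topics for illustration only.
    print(rank_topics({"battery": (8, 3), "camera": (5, 9)}))
```

Under this assumed formula, a topic that is important but poorly satisfied ranks above one that is already well served, which matches the customer-centered reading of "opportunity" in the abstract.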

This paper presents a model that incorporates contemporary theories of tense and aspect and develops a new framework for extracting temporal relations between two sentence-internal events, given their tense, aspect, and a temporal connecting word relating the two events. A linguistic constraint on event combination has been implemented to detect incorrect parser analyses and potentially apply syntactic reanalysis or semantic reinterpretation, in preparation for subsequent processing for multi-document summarization. An important contribution of this work is the extension of two different existing theoretical frameworks (Hornstein’s 1990 theory of tense analysis and Allen’s 1984 theory on event ordering) and the combination of both into a unified system for representing and constraining combinations of different event types (points, closed intervals, and open-ended intervals). We show that our theoretical results have been verified in a large-scale corpus analysis. The framework is designed to inform a temporally motivated sentence-ordering module in an implemented multi-document summarization system.
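
To make the event-ordering side of this framework concrete, here is a minimal sketch of Allen's thirteen interval relations for two proper closed intervals. It is an illustration only: it does not cover points or open-ended intervals, which the paper's extension is said to handle, and the function name and tuple representation are assumptions.

```python
def allen_relation(a, b):
    """Name the Allen relation that holds from interval a to interval b.

    Each interval is a (start, end) pair with start < end.
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must satisfy start < end")

    # Disjoint or touching intervals.
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"

    # Intervals sharing one or both endpoints.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"

    # Strict containment, then partial overlap.
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    return "overlaps" if a_start < b_start else "overlapped-by"


if __name__ == "__main__":
    print(allen_relation((1, 3), (3, 5)))
    print(allen_relation((2, 3), (1, 5)))
```

A constraint of the kind the abstract describes could then be expressed as a set of relation names permitted for a given tense, aspect, and connective combination, though the actual constraint tables are not given in the abstract.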

The advancement in mobile technology and the introduction of cloud computing systems enable the use of educational materials on mobile devices for a location- and time-agnostic learning process. These educational materials are delivered in the form of data and compute-intensive multimedia-enabled learning objects. Given these constraints, the desired objective of mobile learning (m-learning) may not be achieved. Accordingly, a number of m-learning systems are being developed by the industry and academia to transform society into a pervasive educational institute. However, no guideline on the technical issues concerning the m-learning environment is available. In this study, we present a taxonomy of such technical issues that can impede the life cycle of multimedia-enabled m-learning applications. The taxonomy is devised based on the issues related to mobile device heterogeneity, network performance, content heterogeneity, content delivery, and user expectation. These issues are discussed, along with their causes and measures, to achieve solutions. Furthermore, we identify several trending areas through which the adaptability and acceptability of multimedia-enabled m-learning platforms can be increased. Finally, we discuss open challenges, such as low complexity encoding, data dependency, measurement and modeling, interoperability, and security as future research directions.

We introduce Big Data Analytics (BDA) and Sentiment Analysis (SA) to the study of international negotiations, through an application to the case of the UK-EU Brexit negotiations and the use of Twitter user sentiment. We show that SA of tweets has potential as a real-time barometer of public sentiment towards negotiating outcomes to inform government decision-making. Despite the increasing need for information on collective preferences regarding possible negotiating outcomes, negotiators have been slow to capitalise on BDA. Through SA on a corpus of 13,018,367 tweets on defined Brexit hashtags, we illustrate how SA can provide a platform for decision-makers engaged in international negotiations to grasp collective preferences. We show that BDA and SA can enhance decision-making and strategy in public policy and negotiation contexts of the magnitude of Brexit. Our findings indicate that the preferred or least preferred Brexit outcomes could have been inferred from the emotions expressed by Twitter users. We argue that BDA can be a mechanism to map the different options available to decision-makers and to bring insights to and inform their decision-making. Our work thereby proposes SA as part of the international negotiation toolbox to remedy the existing informational gap between decision-makers and citizens’ preferred outcomes.

朱娟 《现代情报》2017,37(5):166-171
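[Purpose/Significance] To review existing research on online fake reviews, analyze the current state of that research, and clarify future research directions. [Method/Process] Taking literature from CNKI and Web of Science as the object of study, and working from a literature-analysis perspective, the paper combines qualitative and quantitative analysis to examine the state of domestic and international research on fake reviews in terms of detection methods, feature extraction, and prevention and control strategies, and summarizes the research hotspots and open problems in the field. [Result/Conclusion] The study shows that, for fake review detection methods, research on semi-supervised and unsupervised learning needs to be strengthened; for feature extraction, the application of ontology techniques can be considered; and for prevention and control strategies, multidisciplinary and cross-domain cooperation should be considered.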

In this paper we present a relevance ranking algorithm named PolarityRank. This algorithm is inspired by PageRank, the webpage relevance computation method used by Google, and generalizes it to deal with graphs having not only positively but also negatively weighted arcs. Besides the definition of our algorithm, this paper includes the algebraic justification, a convergence proof, and an empirical study in which PolarityRank is applied to two unrelated tasks where a graph with positive and negative weights can be built: the calculation of word semantic orientation and instance selection from a learning dataset.
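
The abstract does not spell out the update rule, so the following is only a sketch of one plausible PageRank-style generalization to signed graphs, not the authors' definition. Each node is assumed to carry a positive and a negative score; a positive arc passes each score on unchanged, while a negative arc swaps them. The function name, the seed vectors `e_pos` and `e_neg`, and the damping factor `d` are all assumptions made for illustration.

```python
def polarity_rank(nodes, edges, e_pos, e_neg, d=0.85, iters=200, tol=1e-12):
    """Iterate a signed PageRank-like update (assumed formulation).

    nodes: iterable of node ids.
    edges: dict mapping (source, target) to a nonzero signed weight.
    e_pos, e_neg: dicts of seed mass per node for the two scores.
    Returns (pos, neg) dicts of scores.
    """
    nodes = list(nodes)
    # Normalize each node's outgoing weights by their total absolute value.
    out_norm = {n: 0.0 for n in nodes}
    for (src, _dst), w in edges.items():
        out_norm[src] += abs(w)

    pos = {n: e_pos.get(n, 0.0) for n in nodes}
    neg = {n: e_neg.get(n, 0.0) for n in nodes}
    for _ in range(iters):
        new_pos = {n: (1 - d) * e_pos.get(n, 0.0) for n in nodes}
        new_neg = {n: (1 - d) * e_neg.get(n, 0.0) for n in nodes}
        for (src, dst), w in edges.items():
            if w == 0:
                continue  # zero-weight arcs carry nothing
            share = d * abs(w) / out_norm[src]
            if w > 0:
                # Positive arc: scores propagate with the same sign.
                new_pos[dst] += share * pos[src]
                new_neg[dst] += share * neg[src]
            else:
                # Negative arc: scores propagate with the sign flipped.
                new_pos[dst] += share * neg[src]
                new_neg[dst] += share * pos[src]
        delta = max(
            max(abs(new_pos[n] - pos[n]) for n in nodes),
            max(abs(new_neg[n] - neg[n]) for n in nodes),
        )
        pos, neg = new_pos, new_neg
        if delta < tol:
            break
    return pos, neg


if __name__ == "__main__":
    # Hypothetical word graph: "a" is a positive seed, linked positively to "b"
    # and negatively to "c".
    graph = {("a", "b"): 1.0, ("a", "c"): -1.0, ("b", "a"): 1.0, ("c", "a"): 1.0}
    pos, neg = polarity_rank(["a", "b", "c"], graph, {"a": 1.0}, {})
    print(pos, neg)
```

In the word semantic orientation task mentioned above, a node's orientation could then be read off by comparing its two scores, again under the assumptions stated here.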

Blockchain is considered to be a potential driver of the digital economy. Blockchain technology overcomes the challenges associated with traditional transaction business governed and regulated by a trusted third party. Interest in studying and leveraging the potential of blockchain is growing among researchers, industry, and academia. Blockchain provides a decentralized and distributed public ledger for all participating parties. Though blockchain seems to be a viable choice and solution for all centrally governed and regulated transactions (in the digital online space), it has potential challenges that need to be resolved, opportunities to be explored, and applications to be studied. This paper utilizes a systematic literature review to study several research endeavors made in the domain of blockchain. To further research on blockchain adoption, the paper theoretically constructs an integrated framework of the blockchain innovation adoption process in an organization, considering organizational and user acceptance perspectives. This would facilitate its widespread adoption, thereby achieving sustained leadership solutions. The paper offers 23 propositions to information systems (IS)/information management (IM) scholars with respect to innovation characteristics, organizational characteristics, environmental characteristics, and user acceptance characteristics. Further, the paper explores several areas of future research and directions that can provide deep insights for overcoming challenges and for the adoption of blockchain technology.

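Virtual Team Research: Review, Analysis, and Prospects   Total citations: 16, self-citations: 0, citations by others: 16
This paper uses qualitative research methods to retrieve, code, and analyze virtual team research published in China in recent years. It reviews domestic virtual team research in terms of research design and methods and of research topics and content, compares domestic and international virtual team research, and finally points out directions for future domestic research on virtual teams.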

Named Entity Recognition (NER) aims to automatically extract specific entities from unstructured text. Compared with NER in English, Chinese NER is more challenging in recognizing entity boundaries because there are no explicit delimiters between Chinese characters. However, most previous studies focused on the semantic information of the Chinese language at the character level but ignored the importance of phonetic characteristics. To address these issues, we integrated phonetic features of Chinese characters with lexicon information to help disambiguate entity boundary recognition by fully exploring the potential of Chinese as a pictophonetic language. In addition, a novel multi-tagging-scheme learning method, based on the multi-task learning paradigm, was proposed to alleviate the data sparsity and error propagation problems of previous tagging schemes by separately annotating the segmentation information of entities and their corresponding entity types. Extensive experiments on four Chinese NER benchmark datasets (OntoNotes4.0, MSRA, Resume, and Weibo) show that our proposed method consistently outperforms the existing state-of-the-art baseline models. The ablation experiments further demonstrated that the introduction of the phonetic feature and the multi-tagging scheme has a significant positive effect on the Chinese NER task.
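
The idea of annotating segmentation and entity type separately can be illustrated with a small sketch that splits joint BIO-style tags into two parallel tag sequences and merges them back. The abstract does not describe the paper's actual tagging schemes, so the tag format, the function names, and the example tags below are assumptions.

```python
def split_tags(tags):
    """Split joint tags such as "B-PER" into segmentation tags and type tags.

    Returns (segmentation, types); "O" is kept as "O" in both sequences.
    """
    segmentation, types = [], []
    for tag in tags:
        if tag == "O":
            segmentation.append("O")
            types.append("O")
        else:
            boundary, _, label = tag.partition("-")
            segmentation.append(boundary)
            types.append(label)
    return segmentation, types


def merge_tags(segmentation, types):
    """Recombine the two sequences into joint tags (inverse of split_tags)."""
    return [s if s == "O" else f"{s}-{t}" for s, t in zip(segmentation, types)]


if __name__ == "__main__":
    # Hypothetical character-level tags for a four-character span.
    joint = ["B-PER", "I-PER", "O", "B-LOC"]
    seg, typ = split_tags(joint)
    print(seg, typ)
    print(merge_tags(seg, typ) == joint)
```

With the two sequences separated, a multi-task model could predict each with its own output head. The segmentation label set then stays small regardless of how many entity types exist, which is one way the data sparsity concern raised above might be eased.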

严威  黄京华  张瑾 《科研管理》2017,38(4):123-131
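This paper reviews 98 microblog research papers published in top information systems conferences and journals, surveying them along four dimensions: theoretical foundation, research method, research topic, and research level. The results show the following. First, the theoretical foundations of microblog research are extremely rich, drawing together theories from information systems, marketing, sociology, psychology, and many other disciplines. Second, microblog research reflects methodological diversity: case studies, secondary data, content analysis, surveys, and mathematical modeling have all been applied. Third, microblog research can be divided into nine major topics: user behavior, online word of mouth, information dissemination, organizational strategy, organizational performance, e-government, group decision-making, social computing, and system tools. Fourth, microblog research spans three research levels: information, service, and network. On this basis, the paper discusses microblog research at the information and network levels in depth to further improve the overall understanding of microblog research. Finally, the paper offers suggestions for future research directions.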

代宝  罗蕊  续杨晓雪 《现代情报》2019,39(9):142-150
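[Purpose/Significance] To grasp the current state of domestic and international research on social media fatigue and to identify possible research opportunities. [Method/Process] The relevant literature is systematically analyzed in terms of the meaning, antecedents, and consequences of social media fatigue. [Result/Conclusion] Social media fatigue is mainly defined from three perspectives: emotional experience, behavioral manifestation, and a combination of the two. Its antecedents mainly include social-media-related factors (system characteristics and information characteristics) and user-related factors (psychological, behavioral, and social factors). Its consequences mainly appear as effects on users' psychology (e.g., dissatisfaction) and behavior (e.g., discontinued use of or switching away from social media, and negative usage behavior).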

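Open source software development, built on the principles of voluntary participation and open service, attracts a growing number of software developers, yet managing cooperation and coordination in open source communities has long been a difficult problem. This paper presents an empirical meta-network analysis of the coordination between open source developer communities and their source code management systems. The number of operations on project code can serve as an important indicator of the success or failure of open source software, and this indicator is closely tied to the interdependence between developers and source code. Using log files from the CVS source code management system of the Sourceforge.net open source software incubation platform, the paper constructs a dependency network between developers and source code, analyzes how the dependencies in this network affect software success, and examines coordination within these interdependencies from five aspects: betweenness, hierarchy, marginality, consistency, and adjacency. The methods and conclusions presented can help developers reduce communication costs and coordinate the dependencies between developers and source code more effectively.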

The proposed work aims to explore and compare the potency of syntactic-semantic linguistic structures in plagiarism detection using natural language processing techniques. The current work explores linguistic features, viz., part-of-speech tags, chunks, and semantic roles, in detecting plagiarized fragments and utilizes a combined syntactic-semantic similarity metric, which extracts semantic concepts from the WordNet lexical database. The linguistic information is utilized for effective pre-processing and for enabling semantically relevant comparisons. Another major contribution is the analysis of the proposed approach on plagiarism cases of various complexity levels. The impact of plagiarism types and complexity levels on the features extracted is analyzed and discussed. Further, unlike existing systems, which were evaluated on limited data sets, the proposed approach is evaluated on a larger scale using the plagiarism corpus provided by the PAN competition from 2009 to 2014. The approach showed considerable improvement in comparison with the top-ranked systems of the respective years. The evaluation and analysis with various cases of plagiarism also reflected the superiority of deeper linguistic features for identifying manually plagiarized data.

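Under the new circumstances, facing the challenge of "chokepoint" problems in key core technologies, systematically raising the overall level of China's science and technology research is urgent. However, the basic approach of existing research on China's knowledge innovation ecosystem remains the "triple helix" framework of universities, enterprises, and government, which cannot capture the parallel coexistence and co-evolution of diverse actors, multiple pathways, and multiple innovation modes in the new era. As a result, researchers lack an integrative perspective and an applicable framework when tackling "chokepoint" problems. Against the background of a paradigm shift in knowledge innovation, this study starts from the core challenges affecting China's use of systemic tools to solve "chokepoint" problems and explores in depth the actor characteristics and topological structure of the new-era "Mode 3" knowledge innovation "multiple helix" (N-tuple helices) ecosystem. On this basis, it discusses feasible paths and concrete measures for systematically improving China's research and innovation capacity and meeting the "chokepoint" challenge.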
