Similar literature
20 similar documents found
1.
Opinion mining is one of the most important research tasks in the information retrieval community. With the huge volume of opinionated data available on the Web, approaches must be developed to differentiate opinion from fact. In this paper, we present a lexicon-based approach for opinion retrieval. Generally, opinion retrieval consists of two stages: relevance to the query and opinion detection. In our work, we focus on the second stage, which detects opinionated documents. We compare the document to be analyzed with opinionated sources that contain subjective information. We hypothesize that a document with a strong similarity to opinionated sources is more likely to be opinionated itself. Typical lexicon-based approaches select and process their opinion sources according to their test collection, then calculate the opinion score based on the frequency of subjective terms in the document. In our work, we use different open opinion collections without any specific treatment and consider them as a reference collection. We then use language models to determine opinion scores. The analysis document and reference collection are represented by different language models (i.e., Dirichlet, Jelinek-Mercer and two-stage models). These language models are generally used in information retrieval to represent the relationship between documents and queries. However, in our study, we modify these language models to represent opinionated documents. We carry out several experiments using Text REtrieval Conference (TREC) Blogs 06 as our analysis collection and Internet Movie Database (IMDB), Multi-Perspective Question Answering (MPQA) and CHESLY as our reference collections. To improve opinion detection, we study the impact of using different language models to represent the document and reference collection alongside different combinations of opinion and retrieval scores. We then use these results to identify the best opinion detection models.
Using the best models, our approach improves on the best baseline of TREC Blog (baseline4) by 30%.
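The smoothing schemes named in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the Jelinek-Mercer and Dirichlet formulas are the standard textbook forms, while the `opinion_score` rule (negative cross-entropy of the smoothed document model against the reference model), the parameter values and the toy token lists are hypothetical and not taken from the paper. The two-stage model is not covered.

```python
import math
from collections import Counter

def jelinek_mercer(tf, doc_len, p_ref, lam=0.5):
    """Linearly interpolate the document estimate with the reference model."""
    p_ml = tf / doc_len if doc_len else 0.0
    return (1.0 - lam) * p_ml + lam * p_ref

def dirichlet(tf, doc_len, p_ref, mu=10.0):
    """Dirichlet-prior smoothing; mu is a pseudo-count given to the reference model."""
    return (tf + mu * p_ref) / (doc_len + mu)

def opinion_score(doc_tokens, ref_tokens, smooth=dirichlet):
    """Hypothetical scoring rule: negative cross-entropy between the reference
    (opinionated) model and the smoothed document model. Higher means the
    document looks more like the opinionated reference collection."""
    ref_counts, doc_counts = Counter(ref_tokens), Counter(doc_tokens)
    ref_len, doc_len = len(ref_tokens), len(doc_tokens)
    score = 0.0
    for term, count in ref_counts.items():
        p_ref = count / ref_len
        score += p_ref * math.log(smooth(doc_counts[term], doc_len, p_ref))
    return score

# Toy data, invented for illustration only.
reference = "i love this great movie i hate that awful film".split()
opinionated = "i love this awful film".split()
factual = "the committee met on tuesday".split()
```

Under this sketch, a document that shares vocabulary with the reference collection receives a higher score than one of the same length that shares none, which mirrors the similarity hypothesis stated in the abstract.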

2.
To evaluate the effectiveness of Information Retrieval (IR) systems, it is key to collect relevance judgments from human assessors. Crowdsourcing has been used successfully to scale up the collection of manual relevance judgments, and previous research has investigated the impact of different judgment task design elements (e.g., highlighting query keywords in the document) on judgment quality and efficiency. In this work we investigate the positive and negative impacts of presenting crowd assessors with more than just the topic and the document to be judged. We deploy different variants of crowdsourced relevance judgment tasks following a between-subjects design in which we present different types of metadata to the human assessor. Specifically, we investigate the effect of human metadata (e.g., what other human assessors think of the current document, such as which relevance level has already been selected by the majority of crowd workers) and machine metadata (e.g., how IR systems scored this document, such as its average position in ranked lists, and statistics about the document, such as term frequencies). We look at the impact of metadata on judgment quality (i.e., the level of agreement with trained assessors) and cost (i.e., the time it takes workers to complete the judgments), as well as at how metadata quality positively or negatively impacts the collected judgments.

3.
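Research on blog-based library knowledge service models   (cited 4 times: 0 self-citations, 4 by others)
Li Wen (李文). Modern Information (《现代情报》), 2009, 29(11): 111-113
Blogs are a new form of online communication in the Web 2.0 era. This paper discusses the nature and principles of blogging, analyzes the feasibility of applying blog technology to library knowledge services based on what the two have in common, and proposes that libraries exploit the strengths of blog technology to establish multiple blog-based knowledge service models, thereby continuously improving the quality of library knowledge services.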

4.
During the Coronavirus Disease 2019 (COVID-19) outbreak, millions of people have taken part in online discussion of COVID-19, which can easily stir up public opinion and threaten social stability. This paper proposes a multi-stage risk grading model of Internet public opinion for public health emergencies. Building on general public opinion risk grading analysis, the model continuously tracks the risk level of Internet public opinion on a time scale defined by regular or major information updates. The model combines the Analytic Hierarchy Process Sort II (AHPSort II) and Swing Weighting (SW) methods into a new Multi-Criteria Decision Making (MCDM) method, AHPSort II-SW. Intuitionistic fuzzy numbers and linguistic fuzzy numbers are introduced into the model to evaluate criteria that cannot be quantified. The multi-stage model is tested on more than 2,000 texts about COVID-19 collected from Microblog, a leading social media platform in China. Seven public opinion risk assessments were conducted from January 23 to April 8, 2020. The empirical results show that, at the macroscopic level, the risk of public opinion was more serious early in the COVID-19 outbreak. In detail, the risk of public opinion decreases slowly with time, but the emergence of important events may still increase it. The results are in line with the actual situation and verify the effectiveness of the method. Comparative analysis indicates that the improved method is superior and effective, and sensitivity analysis confirms its stability. Finally, management suggestions are provided. This study contributes to the literature on public opinion risk assessment and has implications for practice.

5.
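Sun Jianjun (孙建军), Qu Liang (屈良). Information Science (《情报科学》), 2012, (2): 161-165, 172
The study first collected the actual link data from the home pages of all blogs in the Tulin (library science) blog community, then tallied and categorized these more than one hundred thousand links in detail, laying the groundwork for the subsequent analysis. Using a link classification scheme, the collected data were classified and the proportion of each category was calculated. The concept of a "weighted inlink count" is proposed to correct the absolute inlink count used in earlier link-analysis studies. Taking the Tulin blogs as an example, the blogs are ranked in two different ways and the test results are analyzed. Hypothesis tests are run in the statistical tool SPSS to verify the correlation between the two rankings, and corresponding conclusions are given.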

6.
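An analysis of the interaction between professional blogs and subject gateways   (cited 8 times: 0 self-citations, 8 by others)
Starting from the reasons for the rise of blogs, this paper summarizes the application and development status of professional blogs based on knowledge management. It analyzes the connections between professional blogs and subject gateways in terms of service functions, resource integration, and personalization. Based on the similarities between the two, it discusses their interactive relationship in goals, their complementary relationship in content composition, and their similarity in service workflow and customized services. Finally, it proposes that subject gateways strengthen interaction so that the content of professional blogs becomes an important knowledge source for subject gateways.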

7.
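In the new era, online public opinion in universities is direct, sudden, and irrational. With the emergence of new media such as Weibo and WeChat, the Internet shows a distinct "micro" character, and online public opinion has taken on new features. Assessment and response are the two concrete measures university administrators use to govern online public opinion. In assessment, the emphasis should be on establishing a tiered, graded assessment system, building a comprehensive public opinion information network, and training an assessment team with strong all-round competence. In response, early warning is the focus, requiring targeted contingency plans; guidance is the key, requiring qualified opinion "gatekeepers" and "opinion leaders"; openness is the safeguard, requiring stronger information reporting and feedback mechanisms. Follow-up work must also be done well, ultimately forming a governance framework for online public opinion that involves all staff, the whole process, and all dimensions.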

8.
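An analysis of the "invisible college" communication model of academic blogs   (cited 8 times: 0 self-citations, 8 by others)
Jiang Liang (江亮). Information Science (《情报科学》), 2006, 24(2): 296-299
Throughout the history of human communication, informal communication has played a role that formal communication cannot replace, particularly as traditional and networked forms of information exchange merge online. The blog is an extension of interpersonal communication onto the Internet. This paper analyzes the invisible college and the blogs that exhibit invisible-college characteristics in the networked environment, and by comparing the two explores the "invisible college" communication model of academic blogs.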

9.
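Chen Zhen (陈震), Wang Jingru (王静茹). Information Science (《情报科学》), 2020, 38(4): 51-56
[Purpose/Significance] Online public opinion events are closely tied to social stability, and quantitative methods play an important role in analyzing them. [Method/Process] This paper proposes a method based on Bayesian networks (BN) for analyzing the trend of online public opinion events. The BN topology is first designed from prior knowledge and expert guidance; the conditional probability tables are then estimated with the EM algorithm; finally, the validity of the BN is checked using training and test sets. [Result/Conclusion] Experiments use 100 randomly sampled online public opinion events from 2018 as the data source. The results indicate that the BN designed in this paper is reliable for predicting trends of online public opinion events, which provides some theoretical basis for BN-based handling of such events.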

10.
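[Purpose/Significance] Against the backdrop of coexisting official and grassroots public opinion fields, this study investigates how information spreads on heterogeneous networks. [Method/Process] A research framework for heterogeneous hierarchical networks is presented at the theoretical level, and an indicator for evaluating an individual's influence on information dissemination is constructed. Taking the Douyin short-video platform as an example, account types are defined and an empirical analysis is carried out on follow-relationship data between accounts, identifying high-influence individual accounts and confirming the feasibility of the method. [Result/Conclusion] The heterogeneous hierarchical network method provides a theoretical framework for analyzing interactions among different types of opinion individuals. The core dissemination nodes and key dissemination paths it identifies can support decision-making in public opinion management.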

11.
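Zhou Hanjie (周汉杰), Wang Gang (王刚). Information Science (《情报科学》), 2021, 39(12): 118-125
[Purpose/Significance] By constructing countermeasures for managing and mitigating public opinion risk on blockchain-based social networks, this study offers a feasible approach to creating a healthy online environment, improving the effectiveness of online public opinion governance, and coping with complex online risks. [Method/Process] Grounded in blockchain concepts and technology, the study analyzes a conceptual framework for social network public opinion risk management, identifies existing problems, examines the mechanisms by which blockchain technology supports the governance and mitigation of social network public opinion, builds a system model for social network public opinion management, and explores the keys to preventing and controlling such risk in terms of technological advancement, multi-party collaboration, and media guidance. [Result/Conclusion] Blockchain technology contributes positively to governing social network public opinion risk and improving the quality of opinion information dissemination, and opens up many possibilities for a healthy, secure, and shared online social ecosystem. [Innovation/Limitations] The blockchain-based social network public opinion management model constructed here is a preliminary design that does not consider the upper limit on information storage. Future work will improve the system's compatibility and the standardization of data collection, to provide a reference for blockchain-based public opinion management.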

12.
Three-way opinion classification (3WOC) models are based on a human perspective of opinion classification and offer human-like decision-making capabilities. The purpose of this study was to determine the effectiveness of a three-way decision-making framework with multiple features (fuzzy features and semantic features) in simulating human judgement of opinions. This was a quantitative study. A simple prototype of the three-way decision model was run against the Amazon Musical Instrument dataset to evaluate the model. The data used to verify the results were collected from 125 respondents via an online survey. The participants tested the model in context, then immediately filled in the online questionnaire. Results show that the statistical correlation between semantic features and fuzzy features is low. Therefore, classification coverage and accuracy can be increased when both types of features are used together rather than using one type of feature alone. With the integration of semantic features and fuzzy features, we found that our three-way decision model performs better than a two-way classification model. Furthermore, the 3WOC model simulates the judgements people make when taking decisions. Finally, we offer usability recommendations based on our analysis. A three-way decision-making framework is a better solution for simulating human judgement in opinion classification than a two-way decision model. The research outcomes will help in the development of better opinion classification systems that can support businesses and organisations in making strategic plans to improve their products or services based on customer preference patterns.
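A three-way decision rule can be sketched as below. The abstract does not describe the model's internals, so this assumes the common formulation from three-way decision theory: two thresholds split a score into an accept region, a reject region, and a deferred region in between. The function name, the threshold values `alpha` and `beta`, and the region labels are hypothetical.

```python
def three_way_decide(positive_prob, alpha=0.7, beta=0.3):
    """Map a positive-opinion probability to one of three regions.

    alpha and beta are illustrative thresholds, not values from the study.
    A two-way classifier would instead force every case into one of two classes.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if positive_prob >= alpha:
        return "positive"
    if positive_prob <= beta:
        return "negative"
    return "deferred"  # not enough evidence; leave for further review
```

The deferred region is what separates this from a two-way classifier: borderline opinions are set aside instead of being forced into a class.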

13.
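Liu Jiaguo (刘家国). Information Science (《情报科学》), 2008, 26(1): 49-52
Drawing on years of research on blogs, this paper constructs a blog operation model and attempts to explain the reasons for the rise of blogs and the mechanisms by which they operate. The study is significant for understanding the blog phenomenon in depth, understanding why blogs arose, exploring how blogs develop, and promoting the healthy development of China's Internet industry.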

14.
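Application of blog technology in the informatization of higher education   (cited 1 time: 0 self-citations, 1 by others)
In the information age and the knowledge economy, frequent communication with others and resource sharing have become an increasingly important trend. For university teachers engaged in undergraduate teaching, the rise and widespread use of blogs provides this convenience. For educational informatization, blogs are a valuable vehicle: they make knowledge management of course teaching possible even when institutional conditions are not yet mature, and they provide a good platform for resource sharing and interpersonal communication inside and outside the classroom. Further research on blogs therefore has real practical significance for promoting informatized undergraduate teaching in universities.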

15.
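In the Web 2.0 environment, blogs play an increasingly important role in the informal communication of scientific information. This paper gives an overview of Web 2.0 and informal scientific communication, then uses social network analysis to process and analyze in detail the link data collected from professional blogs in library and information science. It finds that the blog network in this field shows clear small-world characteristics, but that the ties between blogs are relatively loose.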

16.
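To address the problem that the topical relevance of results from current topic-specific search engines does not meet the needs of professional users, this paper takes library and information science blogs as a starting point and attempts to build a search engine for them using the open-source search engine Nutch as the technical framework, offering a solution to this problem.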

17.
Opinion summarization can facilitate users' decision-making by mining salient review information. However, due to the lack of sufficient annotated data, most early works are based on extractive methods, which restricts the performance of opinion summarization. In this work, we aim to improve the informativeness of opinion summarization to provide better guidance to users. We consider the setting with only reviews and no corresponding summaries, and propose an aspect-augmented model for unsupervised abstractive opinion summarization, denoted as AsU-OSum. We first employ an aspect-based sentiment analysis system to extract opinion phrases from reviews. Then, we construct a heterogeneous graph consisting of reviews and opinion clusters as nodes, which is used to enhance the Transformer-based encoder–decoder framework. Furthermore, we design a novel cascaded attention mechanism to prompt the decoder to pay more attention to the aspects that are more likely to appear in the summary. During training, we introduce a sentiment accuracy reward that further enhances the learning ability of our model. We conduct comprehensive experiments on the Yelp, Amazon, and Rotten Tomatoes datasets. Automatic evaluation results show that our model is competitive and performs better than the state-of-the-art (SOTA) models on some ROUGE metrics. Human evaluation results further verify that our model can generate more informative summaries and reduce redundancy.

18.
In the traditional evaluation of information retrieval systems, assessors are asked to determine the relevance of a document on a graded scale, independent of any other documents. Such judgments are absolute judgments. Learning to rank brings some new challenges to this traditional evaluation methodology, especially regarding absolute relevance judgments. Recently, preference judgments have been investigated as an alternative. Instead of assigning a relevance grade to a document, an assessor looks at a pair of pages and judges which one is better. In this paper, we generalize pairwise preference judgments to relative judgments. We formulate the problem of relative judgments in a formal way and then propose a new strategy called Select-the-Best-Ones to solve the problem. Through user studies, we compare our proposed method with a pairwise preference judgment method and an absolute judgment method. The results indicate that users can distinguish about one more relevance degree when using relative methods than when using the absolute method. Consequently, the relative methods generate 15–30% more document pairs for learning to rank. Compared to the pairwise method, our proposed method increases the agreement among assessors from 95% to 99%, while halving both the labeling time and the number of pairs discordant with experts' judgments.
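The link between finer relevance distinctions and more training pairs can be illustrated with a short sketch. It assumes the usual learning-to-rank convention that a preference pair is formed whenever two documents for the same query carry different grades. The function name and the toy grades are hypothetical, and the sketch does not implement the Select-the-Best-Ones strategy itself.

```python
from itertools import combinations

def preference_pairs(grades):
    """Return (better, worse) document pairs implied by per-document grades.

    Documents with equal grades yield no pair, so coarser grading scales
    produce fewer training pairs.
    """
    pairs = []
    for (doc_a, grade_a), (doc_b, grade_b) in combinations(grades.items(), 2):
        if grade_a > grade_b:
            pairs.append((doc_a, doc_b))
        elif grade_b > grade_a:
            pairs.append((doc_b, doc_a))
    return pairs

# Toy judgments for one query: the same four documents on a 3-level and a 4-level scale.
coarse = {"d1": 2, "d2": 1, "d3": 1, "d4": 0}
fine = {"d1": 3, "d2": 2, "d3": 1, "d4": 0}
```

In this toy case the tie between `d2` and `d3` on the coarse scale is resolved on the fine scale, so one extra pair becomes available for training.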

19.
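[Purpose/Significance] This study examines multi-stage public opinion events that cause sustained effects or even serious consequences, and attempts to build an escalation ("fermentation") early-warning model based on probabilistic analysis that accurately diagnoses the causes of escalation, with the aim of providing a basis for decisions in online public opinion governance. [Method/Process] Learning from the experience and shortcomings of traditional models, the study reduces subjective evaluation indicators, refines the data-level indicators, and constructs an escalation prediction model based on Bayesian probability. The causes of escalation are diagnosed using the most probable explanation principle. [Result/Conclusion] 55 of 60 multi-stage cases and 27 of 30 single-stage cases are used as training data to construct the multi-stage early-warning model. The remaining 5 multi-stage and 3 single-stage cases form the test set, and the predicted escalation trends match the facts. [Innovation/Limitations] The study uncovers the underlying causes of multi-stage escalation and diagnoses them at multiple levels, achieving precise, quantified prediction indicators and providing useful theoretical support for early warning of online public opinion and the design of interventions.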

20.
Propaganda is a mechanism to influence public opinion, which is inherently present in extremely biased and fake news. Here, we propose a model to automatically assess the level of propagandistic content in an article based on different representations, from writing style and readability level to the presence of certain keywords. We experiment thoroughly with different variations of such a model on a new publicly available corpus, and we show that character n-grams and other style features outperform existing alternatives that identify propaganda based on word n-grams. Unlike previous work, we make sure that the test data comes from news sources that were unseen during training, thus penalizing learning algorithms that model the news sources used at training time as opposed to solving the actual task. We integrate our supervised model in a public website, which organizes recent articles covering the same event on the basis of their propagandistic content. This allows users to quickly explore different perspectives on the same story, and it also enables investigative journalists to dig further into how different media use stories and propaganda to pursue their agenda.
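Character n-gram features of the kind mentioned in this abstract can be extracted with a few lines of code. This is a minimal sketch of the general technique only: the function name and the default `n` are assumptions, and the paper's actual feature pipeline and classifier are not described in the abstract.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Count overlapping character n-grams, including spaces and punctuation.

    Unlike word n-grams, these capture sub-word style cues such as
    punctuation habits, affixes, and capitalization.
    """
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))
```

The resulting counts can be fed to any standard classifier as a sparse feature vector. A text shorter than `n` characters yields an empty counter.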

