Similar literature (20 results)
1.
When speaking of information retrieval, we often mean text retrieval, but many other forms of information retrieval applications exist. A typical example is collaborative filtering, which suggests interesting items to a user by taking into account other users’ preferences or tastes. Because of the uniqueness of the problem, it has been modeled and studied differently in the past, mainly from the viewpoint of preference prediction and machine learning. A few attempts have been made to bring collaborative filtering back to information (text) retrieval modeling, and new collaborative filtering techniques have been derived as a result. In this paper, we show that from the algorithmic viewpoint, there is an even closer relationship between collaborative filtering and text retrieval. Specifically, major collaborative filtering algorithms, such as the memory-based ones, essentially calculate the dot product between the user vector (as the query vector in text retrieval) and the item rating vector (as the document vector in text retrieval). Thus, if we properly structure user preference data and employ the target user’s ratings as query input, major text retrieval algorithms and systems can be used directly without any modification. In this regard, we propose a unified formulation under a common notational framework for memory-based collaborative filtering, and a technique for using any text retrieval weighting function with collaborative filtering preference data. Besides confirming the rationale of the framework, our preliminary experimental results also demonstrate the effectiveness of the approach in using text retrieval models and systems to perform item ranking tasks in collaborative filtering.
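
The query/document reading described in this abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the cosine item-item weighting, the function names (`item_vectors`, `rank_items`), and the toy rating matrix are all assumptions chosen for illustration. Each candidate item is represented as a "document" vector of similarities to the other items, the target user's ratings act as the "query", and unrated items are ranked by the dot product of the two.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length rating columns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def item_vectors(ratings):
    """Build one 'document' vector per item: its similarity to every other item.

    ratings is a list of user rows; 0 means 'not rated' (an assumption).
    """
    n_items = len(ratings[0])
    columns = [[row[j] for row in ratings] for j in range(n_items)]
    return [
        [cosine(columns[i], columns[j]) if i != j else 0.0 for j in range(n_items)]
        for i in range(n_items)
    ]


def rank_items(query, docs):
    """Rank the items the user has not rated by the dot product query . doc."""
    scores = {
        i: sum(q * d for q, d in zip(query, doc))
        for i, doc in enumerate(docs)
        if query[i] == 0
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: 3 users x 3 items.
ratings = [
    [5, 4, 0],
    [4, 5, 1],
    [1, 0, 5],
]
# A target user who has rated only item 0.
print(rank_items([5, 0, 0], item_vectors(ratings)))
```

In this toy data, item 1 is rated much like item 0 by the same users, so it ranks ahead of item 2 for a user who liked item 0. Swapping `cosine` for another weighting function changes only how the "document" vectors are built, which is the kind of substitution the abstract has in mind.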

2.
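This paper introduces traditional collaborative filtering methods and proposes a new context-based multidimensional collaborative filtering recommendation model. Within the model, it introduces the concept of context, describes how to build a context-based multidimensional user model, and presents the components of the recommendation model in detail. It also proposes a new algorithm for computing context similarity. With this algorithm, the ratings that the current user's "nearest neighbors" would give to items in the current user's context can be obtained.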

3.
Privacy-preserving collaborative filtering algorithms are successful approaches. However, they are susceptible to shilling attacks. Recent research has increasingly focused on collaborative filtering that both preserves privacy and resists shilling attacks. Malicious users may add fake profiles to manipulate the output of privacy-preserving collaborative filtering systems, which reduces the accuracy of these systems. Thus, it is imperative to detect fake profiles for overall success. Many methods have been developed for detecting attack profiles and keeping them out of the system. However, these techniques have all been established for non-private collaborative filtering schemes. The detection of shilling attacks in privacy-preserving recommendation systems has not been deeply examined. In this study, we examine the detection of shilling attacks in privacy-preserving collaborative filtering systems. We use four attack-detection methods to filter out fake profiles produced by six well-known shilling attacks on perturbed data. We evaluate these detection methods with respect to their ability to identify bogus profiles. Experiments are performed on real data. Empirical outcomes demonstrate that some of the detection methods are very successful at filtering out fake profiles in privacy-preserving collaborative filtering schemes.

4.
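[Purpose/Significance] User interest recommendation is an important part of information services. Current research on context-aware recommendation mostly treats context as a single isolated factor and gives little consideration to relationships among contexts. This paper takes contextual relationships as its starting point for recommending interests to social media users. [Method/Process] Assuming that users with similar contexts may have similar interests, the user's original interest network is improved in order to produce recommendations. The explicit and implicit networks within the original interest network are built through social network analysis and resource similarity computation; the context network is built from the co-occurrence principle and the similarity of the contexts themselves; direct and indirect interest degrees are computed through interest propagation relationships; finally, recommendations are produced by borrowing the idea of collaborative filtering. [Result/Conclusion] Compared with earlier recommendation methods that consider only a single contextual factor, experiments with this method show that incorporating contextual relationships into the recommendation process not only extends users' social relationships but also yields better recommendation results.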

5.
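[Purpose/Significance] This work addresses the "data sparsity" and "cold start" problems in university library book recommendation. Research shows that optimizing the reader rating matrix and the similarity model is key to improving the quality of book recommendation. [Method/Process] An improved collaborative filtering method is proposed. The user rating matrix is generated with book categories as items, and the effects of borrowing mode, borrowing time, and book similarity on user interest are considered in order to optimize the sample data in the matrix; in addition, reader features and book features are incorporated when computing reader similarity. [Result/Conclusion] Experimental results show that the method effectively addresses the "data sparsity" and "cold start" problems and significantly reduces the amount of computation. Compared with basic collaborative filtering and clustering-improved collaborative filtering, it achieves considerably higher recommendation accuracy and user satisfaction, with better overall recommendation performance.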

6.
Scale and Translation Invariant Collaborative Filtering Systems   Cited by: 1 (self-citations: 0, citations by others: 1)
Collaborative filtering systems are prediction algorithms over sparse data sets of user preferences. We modify a wide range of state-of-the-art collaborative filtering systems to make them scale and translation invariant and generally improve their accuracy without increasing their computational cost. Using the EachMovie and the Jester data sets, we show that learning-free constant time scale and translation invariant schemes outperform other learning-free constant time schemes by at least 3% and perform as well as expensive memory-based schemes (within 4%). Over the Jester data set, we show that a scale and translation invariant Eigentaste algorithm outperforms Eigentaste 2.0 by 20%. These results suggest that scale and translation invariance is a desirable property.
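
As a rough illustration of what scale and translation invariance means, the sketch below normalizes one user's ratings so that shifting them by a constant or multiplying them by a positive factor leaves the result unchanged. The z-score normalization and the name `normalize` are assumptions made for this example; the abstract does not say which normalization the paper uses.

```python
import math


def normalize(ratings):
    """Map one user's ratings to zero mean and unit deviation (z-scores).

    Subtracting the mean removes any constant shift (translation);
    dividing by the deviation removes any positive rescaling (scale).
    """
    mean = sum(ratings) / len(ratings)
    centered = [r - mean for r in ratings]
    scale = math.sqrt(sum(c * c for c in centered) / len(centered))
    if scale == 0:
        # A user who gives every item the same rating carries no preference signal.
        return [0.0] * len(ratings)
    return [c / scale for c in centered]


# The second user rates on a stretched and shifted scale (2 * r + 10),
# yet both users express the same relative preferences.
user_a = normalize([1, 2, 3])
user_b = normalize([12, 14, 16])
print(user_a)
print(user_b)
```

A predictor that works only on such normalized ratings, and maps its output back to the target user's own mean and deviation, gives predictions that do not depend on how generous or how spread out each user's raw ratings are.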

7.
This article introduces a new language-independent approach for creating a large-scale high-quality test collection of tweets that supports multiple information retrieval (IR) tasks without running a shared-task campaign. The adopted approach (demonstrated over Arabic tweets) designs the collection around significant (i.e., popular) events, which enables the development of topics that represent frequent information needs of Twitter users for which rich content exists. That inherently facilitates the support of multiple tasks that generally revolve around events, namely event detection, ad-hoc search, timeline generation, and real-time summarization. The key highlights of the approach include diversifying the judgment pool via interactive search and multiple manually crafted queries per topic, collecting high-quality annotations via crowd-workers for relevancy and in-house annotators for novelty, filtering out low-agreement topics and inaccessible tweets, and providing multiple subsets of the collection for better availability. Applying our methodology to Arabic tweets resulted in EveTAR, the first freely available tweet test collection for multiple IR tasks. EveTAR includes a crawl of 355M Arabic tweets and covers 50 significant events for which about 62K tweets were judged with substantial average inter-annotator agreement (Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating existing algorithms in the respective tasks. Results indicate that the new collection can support reliable ranking of IR systems that is comparable to similar TREC collections, while providing strong baseline results for future studies over Arabic tweets.

8.
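A New Collaborative Filtering-Based Model for Predicting User Navigation   Cited by: 2 (self-citations: 0, citations by others: 2)
This paper proposes UNCPM, a new collaborative filtering-based model for predicting user navigation, which effectively addresses the low accuracy and low coverage of current collaborative filtering prediction methods. UNCPM obtains user browsing information from Web logs, and the system is divided into two parts: an offline component and an online component. The offline component performs K-means clustering of users' browsing histories and, during clustering, takes URL similarity analysis fully into account to avoid the synonymy and dispersion weaknesses of collaborative filtering. The online component makes predictions for active users. The model can be applied to user navigation prediction on large e-commerce websites.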

9.
Collaborative filtering is concerned with making recommendations about items to users. Most formulations of the problem are specifically designed for predicting user ratings, assuming past data of explicit user ratings is available. However, in practice we may only have implicit evidence of user preference; furthermore, a better view of the task is that of generating a top-N list of items that the user is most likely to like. In this regard, we argue that collaborative filtering can be directly cast as a relevance ranking problem. We begin with the classic Probability Ranking Principle of information retrieval, proposing a probabilistic item ranking framework. In the framework, we derive two different ranking models, showing that despite their common origin, different factorizations reflect two distinctive ways to approach item ranking. For the model estimations, we limit our discussion to implicit user preference data, and adopt an approximation method introduced in the classic text retrieval model (i.e., the Okapi BM25 formula) to effectively decouple frequency counts and presence/absence counts in the preference data. Furthermore, we extend the basic formula by proposing Bayesian inference to estimate the probability of relevance (and non-relevance), which largely alleviates the data sparsity problem. Apart from a theoretical contribution, our experiments on real data sets demonstrate that the proposed methods perform significantly better than other strong baselines.
Marcel J. T. Reinders

10.
Collaborative filtering is a general technique for exploiting the preference patterns of a group of users to predict the utility of items for a particular user. Three different components need to be modeled in a collaborative filtering problem: users, items, and ratings. Previous research on applying probabilistic models to collaborative filtering has shown promising results. However, there is a lack of systematic studies of different ways to model each of the three components and their interactions. In this paper, we conduct a broad and systematic study of different mixture models for collaborative filtering. We discuss general issues related to using a mixture model for collaborative filtering, and propose three properties that a graphical model is expected to satisfy. Using these properties, we thoroughly examine five different mixture models: Bayesian Clustering (BC), Aspect Model (AM), Flexible Mixture Model (FMM), Joint Mixture Model (JMM), and the Decoupled Model (DM). We compare these models both analytically and experimentally. Experiments over two datasets of movie ratings under different configurations show that, in general, whether a model satisfies the proposed properties tends to be correlated with its performance. In particular, the Decoupled Model, which satisfies all three desired properties, outperforms the other mixture models as well as many other existing approaches for collaborative filtering. Our study shows that graphical models are powerful tools for modeling collaborative filtering, but careful design is necessary to achieve good performance.

11.
Document Recommender Systems: A Path to Improving Information Retrieval Efficiency   Cited by: 2 (self-citations: 0, citations by others: 2)
Traditional Information Retrieval (IR) systems have limitations in improving search performance in today’s information environment. Traditional IR systems tend toward high recall and poor precision, and their results are only as good as the accuracy of the search query, which is usually difficult for the user to construct. It is also time-consuming for the user to evaluate each search result. The recommendation techniques developed since the early 1990s help solve these problems of traditional IR systems. This paper explains the basic process and major elements of document recommender systems, especially the two recommendation techniques of content-based filtering and collaborative filtering. Also discussed are the evaluation issue and the problems that current document recommender systems face, which need to be taken into account in future system designs.

12.
Modeling users in information filtering systems is a difficult challenge due to dimensions such as the nature, scope, and variability of interests. Numerous machine-learning approaches have been proposed for user modeling in filtering systems. The focus has been primarily on techniques for user model capture and representation, with relatively simple assumptions made about the type of users' interests. Although many studies claim to deal with adaptive techniques, and thus acknowledge that different types of interests must be modeled or even that changes in interests have to be captured, few studies have actually focused on the dynamic nature and variability of user interests and their impact on the modeling process. A simulation-based information filtering environment called SIMSIFTER was developed to overcome some of the barriers associated with conducting studies on user-oriented factors that can impact interests. SIMSIFTER implemented a user modeling approach known as reinforcement learning that has proven to be effective in previous filtering studies involving humans. This paper reports on several studies conducted using SIMSIFTER that examined the impact of key dimensions such as type of interests, rate of change of interests, and level of user involvement on modeling accuracy and ultimately on filtering effectiveness.

13.
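To address the shortcomings of traditional collaborative filtering recommendation algorithms, and drawing on real-life experience, this paper argues that it is essential to consider users' expert trust in the collaborative filtering recommendation process. It describes in detail the concept of expert trust and a method for computing expert trust from user rating data, and proposes a collaborative filtering recommendation algorithm based on expert-priority trust. Experimental results on the public GroupLens dataset show that the algorithm clearly outperforms the traditional nearest-neighbor method in both the accuracy and the success rate of predicting user ratings.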

14.
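To address the lack of explicit feedback, sparse borrowing data, and the poor performance of traditional recommendation algorithms in university library settings, this paper proposes a recommendation algorithm that optimizes collaborative filtering with temporal context. It comprises three elements: scoring of readers' reading behavior, temporal context, and shifts in content interest. In the data preparation stage, an interest scoring model based on user behavior is built by defining score conversion rules and designing a normalization function, which solves the problem of missing user ratings. In the recall stage, a nonlinear time-decay model is proposed to optimize the rating matrix and improve recommendation quality. In the ranking stage, an interest-capture model is proposed to fine-rank the recalled results by book category, which alleviates data sparsity and further improves recommendation quality. Experimental results show that the proposed algorithm improves the Top-5 F-measure by 141% over unoptimized collaborative filtering.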
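
The abstract does not give the form of its nonlinear time-decay model, so the following is only a sketch of one common choice, exponential decay with a half-life. The function names and the 90-day half-life are hypothetical; the idea is simply that an older borrowing event contributes less to the rating matrix than a recent one.

```python
def decay_weight(age_days, half_life_days=90.0):
    """Exponential decay: the weight halves every half_life_days (assumed form)."""
    return 0.5 ** (age_days / half_life_days)


def decayed_score(score, age_days, half_life_days=90.0):
    """Scale a behavior-derived interest score by the age of the interaction."""
    return score * decay_weight(age_days, half_life_days)


# A score of 4.0 from a loan made today, 90 days ago, and 180 days ago.
for age in (0, 90, 180):
    print(age, decayed_score(4.0, age))
```

With this weighting, two readers who borrowed the same book count as more similar when both loans are recent, which is one way a rating matrix can reflect interests that drift over time.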

15.
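Collaborative filtering is the most successful and most widely used technique in recommender systems. However, as the numbers of users and items in a system keep growing, user rating data over the whole item space becomes extremely sparse, and the nearest-neighbor search of traditional collaborative filtering algorithms shows serious weaknesses, causing recommendation quality to drop sharply. To address this problem, this paper proposes a collaborative filtering recommendation algorithm based on item-category preference. It first finds, for the target user, a set of candidate neighbors with consistent item-category preferences; these candidates have interests close to the target user's and share many co-rated items. Searching for nearest neighbors among the candidates excludes interference from users with few co-rated items and improves the overall accuracy of the nearest-neighbor search. Experimental results show that the algorithm effectively improves recommendation quality.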

16.
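A Collaborative Filtering-Based Method for Intelligent Web Information Recommendation   Cited by: 1 (self-citations: 0, citations by others: 1)
何波 (He Bo), 《图书情报工作》 (Library and Information Service), 2010, 54(19): 115-110
The main problem with traditional collaborative filtering methods is that ratings must be supplied manually. The collaborative filtering method designed in this paper improves on this by deriving user ratings automatically from user patterns and constructing the rating matrix from them. Applying this method to personalized information recommendation, the paper proposes a collaborative filtering-based Web intelligent information recommendation method (WIIRM). WIIRM takes into account the time characteristics of users' page visits, does not require user registration, considers page novelty when recommending, and combines offline processing with online recommendation. Experimental results show that WIIRM is effective.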

17.
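Research on Trust-Based Recommendation Diversity in E-Commerce   Cited by: 3 (self-citations: 0, citations by others: 3)
Most existing research on recommender systems concentrates on improving the accuracy of recommendation algorithms. Given the breadth of users' interests, the drawback of this approach is that it considers only the accuracy of individual items in the recommendation list and ignores the effect of the diversity of the whole list on user satisfaction. Research in recent years has shown that incorporating trust mechanisms into personalized recommendation has a positive effect on the accuracy and robustness of traditional collaborative filtering. This paper proposes a recommendation diversity algorithm based on social network trust, which balances the accuracy and diversity of recommendation results by selecting trusted neighbors with good topic diversity. A series of experiments shows that the algorithm effectively improves recommendation diversity.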

18.
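Recommender systems have become an indispensable technology for personalized services in digital libraries. Current recommendation methods are mainly rule-based recommendation and collaborative filtering. Each has its strengths and weaknesses, and their shared weakness is that they ignore the influence of contextual information on recommendation, which leads to poor results. Based on an analysis of the role of contextual information in the recommendation process, this paper integrates contextual information into a multidimensional recommendation model, uses the ability of data warehouses and OLAP to handle hierarchical aggregation, builds a recommendation framework with multidimensional information collection and analysis, and analyzes its modules.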

19.
Collaborative filtering systems predict a user's interest in new items based on the recommendations of other people with similar interests. Instead of performing content indexing or content analysis, collaborative filtering systems rely entirely on interest ratings from members of a participating community. Since predictions are based on human ratings, collaborative filtering systems have the potential to provide filtering based on complex attributes, such as quality, taste, or aesthetics. Many implementations of collaborative filtering apply some variation of the neighborhood-based prediction algorithm. Many variations of similarity metrics, weighting approaches, combination measures, and rating normalization have appeared in each implementation. For these parameters and others, there is no consensus as to which choice of technique is most appropriate for which situations, nor how significant an effect on accuracy each parameter has. Consequently, every person implementing a collaborative filtering system must make hard design choices with little guidance. This article provides a set of recommendations to guide the design of neighborhood-based prediction systems, based on the results of an empirical study. We apply an analysis framework that divides the neighborhood-based prediction approach into three components and then examines variants of the key parameters in each component. The three components identified are similarity computation, neighbor selection, and rating combination.
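
The three components named in this abstract can be sketched as three small functions. This is one conventional variant, not the configuration the article recommends: Pearson correlation for similarity, top-k positively correlated users for neighbor selection, and a similarity-weighted average of deviations from each neighbor's mean for rating combination. The function names, the default `k=2`, and the toy data are assumptions.

```python
import math


def pearson(a, b):
    """Similarity computation: Pearson correlation over co-rated items."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * math.sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def select_neighbors(target, others, item, k):
    """Neighbor selection: the k most similar users who rated the item."""
    candidates = [(pearson(target, r), r) for r in others if item in r]
    candidates = [c for c in candidates if c[0] > 0]
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:k]


def predict(target, others, item, k=2):
    """Rating combination: weighted average of deviations from neighbor means."""
    target_mean = sum(target.values()) / len(target)
    neighbors = select_neighbors(target, others, item, k)
    if not neighbors:
        return target_mean  # fall back to the user's own average
    num = sum(w * (r[item] - sum(r.values()) / len(r)) for w, r in neighbors)
    den = sum(w for w, _ in neighbors)
    return target_mean + num / den


# Hypothetical ratings keyed by item name.
target = {"a": 4, "b": 2}
others = [{"a": 5, "b": 3, "c": 4}, {"a": 1, "b": 5, "c": 2}]
print(predict(target, others, "c"))
```

Each function can be replaced independently, for example cosine similarity in place of `pearson` or a similarity threshold in place of top-k, which is the kind of per-component variation the article's empirical study compares.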

20.
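A book recommendation engine is built on the open-source Hadoop distributed computing framework and the Mahout collaborative filtering recommendation engine, and the traditional collaborative filtering algorithm is improved using the cloud model and the Pearson coefficient. This addresses the poor system performance and inaccurate results caused by running traditional single-machine recommendation algorithms on high-dimensional sparse matrices. Experiments evaluate the overall performance of the distributed recommendation platform and the improved collaborative filtering algorithm. As the number of virtual machine nodes increases, the computation time of the collaborative filtering recommendation engine keeps decreasing, which indicates that the overall performance of the system improves on that of a traditional single-machine recommendation engine. Evaluating the original and the improved algorithms with MAE shows that the recommendation accuracy of the improved algorithm is 13.1% higher than before the improvement.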
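
For readers unfamiliar with the metric used in this evaluation, MAE (mean absolute error) is the average absolute difference between predicted and actual ratings, so lower is better. The sketch below uses a hypothetical helper name and made-up numbers; it does not reproduce the paper's data or its 13.1% figure.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over all rated test items."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Made-up predictions from two hypothetical recommenders on the same test ratings.
actual = [4, 4, 2, 5]
baseline = [3, 4, 4, 5]
improved = [4, 4, 3, 5]
print(mean_absolute_error(baseline, actual))
print(mean_absolute_error(improved, actual))
```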
