Similar documents
1.
Some insight into the behavior of adaptive filtering systems may be gained by comparing them with similar ranked-output retrieval systems. This is not easy; however, a new optimization measure, introduced for the TREC-9 filtering track, makes some such comparison possible. A series of experiments using the TREC-9 filtering data shows that filtering effectiveness is comparable to routing effectiveness, and demonstrates the gains to be made from adaptation.

2.
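Research on information filtering methods based on content and collaboration   Total citations: 7 (self-citations: 0, citations by others: 7)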
白丽君 《情报学报》2005,24(3):304-308
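With the rapid growth of information on the Internet, information filtering technology is being applied ever more widely. This paper discusses two information filtering techniques, content-based filtering and collaborative filtering, and, in view of their respective problems, proposes a method that combines the two. Experimental results show that the method resolves these problems reasonably well and improves the accuracy of the filtering results, making it a better information filtering method.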

3.
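To address the shortcomings of traditional TF-IDF in text filtering, a text filtering algorithm based on feature-word extraction is proposed. The principle and workflow of document information filtering are briefly analyzed, with emphasis on the design and technical implementation of the filtering algorithm. Experimental results show that the proposed algorithm filters document information effectively and can improve the quality of information retrieval.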

4.
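[Purpose/Significance] To resolve the conflict between information overload in mobile libraries and users' personalized information needs, this study configures the information reception context for the different scenes users are in, so as to meet users' information reception expectations as fully as possible, make the user experience more enjoyable, and promote service innovation in mobile libraries. [Method/Process] The concept of scene-based service is introduced; with scene elements, user information behavior, and information reception context as the main dimensions, an information reception adaptation model for mobile libraries is constructed and the information reception process is planned. [Result/Conclusion] Based on this adaptation model, a collaborative filtering algorithm is used to implement scene-based recommendation for information reception in mobile libraries.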

5.
It is a big challenge to clearly identify the boundary between positive and negative streams for information filtering systems. Several attempts have used negative feedback to address this challenge; however, there are two issues in using negative relevance feedback to improve the effectiveness of information filtering. The first is how to select constructive negative samples in order to reduce the space of negative documents. The second is how to decide which noisy extracted features should be updated based on the selected negative samples. This paper proposes a pattern-mining-based approach to select some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on the RCV1 data collection, and substantial experiments show that the proposed approach achieves encouraging performance and that the performance is also consistent for adaptive filtering.

6.
Collaborative filtering systems predict a user's interest in new items based on the recommendations of other people with similar interests. Instead of performing content indexing or content analysis, collaborative filtering systems rely entirely on interest ratings from members of a participating community. Since predictions are based on human ratings, collaborative filtering systems have the potential to provide filtering based on complex attributes, such as quality, taste, or aesthetics. Many implementations of collaborative filtering apply some variation of the neighborhood-based prediction algorithm. Many variations of similarity metrics, weighting approaches, combination measures, and rating normalization have appeared in each implementation. For these parameters and others, there is no consensus as to which choice of technique is most appropriate for what situations, nor how significant an effect on accuracy each parameter has. Consequently, every person implementing a collaborative filtering system must make hard design choices with little guidance. This article provides a set of recommendations to guide design of neighborhood-based prediction systems, based on the results of an empirical study. We apply an analysis framework that divides the neighborhood-based prediction approach into three components and then examines variants of the key parameters in each component. The three components identified are similarity computation, neighbor selection, and rating combination.
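
The three components named in this abstract can be illustrated with a minimal Python sketch. The specific choices here (Pearson correlation over co-rated items, top-k positively correlated neighbors, and a similarity-weighted average of deviations from each neighbor's mean) are common variants assumed for illustration; the abstract does not say which variants the article recommends, and the names `pearson`, `predict`, and `k` are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Similarity computation: Pearson correlation over co-rated items."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(ratings, user, item, k=2):
    """Predict `user`'s rating of `item` from a dict of {user: {item: rating}}."""
    target = ratings[user]
    user_mean = sum(target.values()) / len(target)

    # Neighbor selection: the k most similar users who rated the item,
    # keeping only positive correlations (an assumed cutoff).
    candidates = [
        (pearson(target, other_ratings), other)
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbors = sorted((c for c in candidates if c[0] > 0), reverse=True)[:k]
    if not neighbors:
        return user_mean  # fall back to the user's own mean rating

    # Rating combination: similarity-weighted average of each neighbor's
    # deviation from their own mean, added to the target user's mean.
    num = 0.0
    den = 0.0
    for weight, other in neighbors:
        other_mean = sum(ratings[other].values()) / len(ratings[other])
        num += weight * (ratings[other][item] - other_mean)
        den += weight
    return user_mean + num / den
```

Each of the three steps is isolated so that a different similarity metric, neighborhood rule, or normalization can be swapped in without touching the others, which is the kind of design space the abstract describes.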

7.
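A collaborative filtering algorithm based on matrix partitioning and interest variance   Total citations: 14 (self-citations: 4, citations by others: 10)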
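Data sparsity is a major challenge for collaborative filtering systems. This paper proposes a new recommendation algorithm: a collaborative filtering algorithm based on matrix partitioning and interest variance. The algorithm partitions the matrix into blocks to narrow the nearest-neighbor search. Clustering is used for the partitioning, which greatly reduces the dimensionality and sparsity level of the matrix. The concept of interest variance is also introduced, which improves the accuracy of nearest-neighbor computation. Experiments show that the proposed algorithm predicts considerably more accurately than traditional recommendation algorithms.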

8.
Modeling users in information filtering systems is a difficult challenge due to dimensions such as nature, scope, and variability of interests. Numerous machine-learning approaches have been proposed for user modeling in filtering systems. The focus has been primarily on techniques for user model capture and representation, with relatively simple assumptions made about the type of users' interests. Although many studies claim to deal with adaptive techniques, and thus acknowledge that different types of interests must be modeled or even that changes in interests have to be captured, few studies have actually focused on the dynamic nature and the variability of user interests and their impact on the modeling process. A simulation-based information filtering environment called SIMSIFTER was developed to overcome some of the barriers associated with conducting studies on user-oriented factors that can impact interests. SIMSIFTER implemented a user modeling approach known as reinforcement learning that has proven to be effective in previous filtering studies involving humans. This paper reports on several studies conducted using SIMSIFTER that examined the impact of key dimensions such as type of interests, rate of change of interests, and level of user involvement on modeling accuracy and ultimately on filtering effectiveness.

9.
樊康新 《图书情报工作》2009,53(23):107-127
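Optimizing the dissemination threshold is one of the key difficulties in adaptive information filtering. This paper analyzes problems common to existing threshold adjustment methods and, taking the TREC utility measure as the objective function, compares the maximum likelihood estimation method with the local optimization method. It proposes an adaptive filtering threshold adjustment algorithm that combines global maximum likelihood estimation based on TREC objective optimization with local optimization of the utility measure. Experimental results show that the method effectively improves the performance of information filtering systems.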

10.
When speaking of information retrieval, we often mean text retrieval. But there exist many other forms of information retrieval applications. A typical example is collaborative filtering, which suggests interesting items to a user by taking into account other users’ preferences or tastes. Due to the uniqueness of the problem, it has been modeled and studied differently in the past, mainly drawing on the preference prediction and machine learning viewpoint. Only a few attempts have so far been made to bring collaborative filtering back to information (text) retrieval modeling, and new and interesting collaborative filtering techniques have been derived as a result. In this paper, we show that from the algorithmic viewpoint, there is an even closer relationship between collaborative filtering and text retrieval. Specifically, major collaborative filtering algorithms, such as the memory-based ones, essentially calculate the dot product between the user vector (as the query vector in text retrieval) and the item rating vector (as the document vector in text retrieval). Thus, if we properly structure user preference data and employ the target user’s ratings as query input, major text retrieval algorithms and systems can be directly used without any modification. In this regard, we propose a unified formulation under a common notational framework for memory-based collaborative filtering, and a technique to use any text retrieval weighting function with collaborative filtering preference data. Besides confirming the rationale of the framework, our preliminary experimental results have also demonstrated the effectiveness of the approach in using text retrieval models and systems to perform item ranking tasks in collaborative filtering.
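
The dot-product view in this abstract can be sketched in Python as a term-at-a-time scan over an inverted index, the way a text retrieval engine scores documents. The mapping used here is an assumption made for illustration: the target user's ratings play the role of query term weights, and each candidate item's "document vector" holds a precomputed weight against every rated item. The names `rank_items`, `query`, and `index` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def rank_items(query, index, top_n=3):
    """Score candidate items as dot products, scanning an inverted index.

    query: {rated_item: rating} for the target user (the "query vector").
    index: {rated_item: {candidate_item: weight}} (posting lists holding
           each candidate's "document vector" entry for that term).
    """
    scores = defaultdict(float)
    for term, q_weight in query.items():
        for candidate, d_weight in index.get(term, {}).items():
            if candidate in query:
                continue  # skip items the user has already rated
            scores[candidate] += q_weight * d_weight
    # Highest score first; ties broken by item id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

Because scoring is just an accumulation over posting lists, any weighting function that produces `q_weight` and `d_weight` can be plugged in without changing the loop.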

11.
12.
牛欣伊 《大观周刊》2012,(32):118-119
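Inverse Q filtering commonly suffers from two important problems: numerical instability and low efficiency. In "A stable and efficient inverse Q filtering method", Y-ghua Wang proposes a stable inverse Q filtering method. This paper follows up on that method: after implementing the algorithm, it validates the method on real production data and confirms its stability and its ability to improve data quality.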

13.
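Recommender systems have become an indispensable technology for personalized services in digital libraries. Current recommendation methods are mainly rule-based recommendation and collaborative filtering; each has strengths and weaknesses, and their shared weakness is that they ignore the influence of contextual information on recommendation, which leads to poor results. Based on an analysis of the role contextual information plays in the recommendation process, this paper integrates contextual information into a multidimensional recommendation model, uses the ability of data warehouses and OLAP to handle hierarchical aggregation to build a recommendation framework with multidimensional information collection and analysis, and analyzes its modules.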

14.
Large-scale retrieval systems are often implemented as a cascading sequence of phases: a first filtering step, in which a large set of candidate documents is extracted using a simple technique such as Boolean matching and/or static document scores; and then one or more ranking steps, in which the pool of documents retrieved by the filter is scored more precisely using dozens or perhaps hundreds of different features. The documents returned to the user are then taken from the head of the final ranked list. Here we examine methods for measuring the quality of filtering and preliminary ranking stages, and show how to use these measurements to tune the overall performance of the system. Standard top-weighted metrics used for overall system evaluation are not appropriate for assessing filtering stages, since the output is a set of documents, rather than an ordered sequence of documents. Instead, we use an approach in which a quality score is computed based on the discrepancy between filtered and full evaluation. Unlike previous approaches, our methods do not require relevance judgments, and thus can be used with virtually any query set. We show that this quality score directly correlates with actual differences in measured effectiveness when relevance judgments are available. Since the quality score does not require relevance judgments, it can be used to identify queries that perform particularly poorly for a given filter. Using these methods, we explore a wide range of filtering options using thousands of queries, categorize the relative merits of the different approaches, and identify useful parameter combinations.

15.
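Research on information retrieval technology based on the concept space approach   Total citations: 14 (self-citations: 0, citations by others: 14)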
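Thesaurus construction is important in information retrieval systems as a way to address the vocabulary mismatch problem. The concept space approach uses a computer to automatically build a conceptual semantic network (a thesaurus) and performs concept-based retrieval on top of it. Terms serve as the nodes of the semantic network, and the association weight between two terms is computed from their co-occurrence rate in a given document collection; its magnitude represents their similarity. At retrieval time, the system uses artificial intelligence methods to activate terms or concepts related to the entry term and offers the user interactive suggestions for search terms. The approach consists of four stages: collection of documents and object lists, object filtering and automatic indexing, co-occurrence analysis, and associative retrieval. It is mostly used in English-language retrieval systems, but it also offers valuable lessons for information retrieval systems in China.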
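
The co-occurrence analysis and associative retrieval stages can be sketched in Python as follows. The weighting used here (the fraction of documents containing term a that also contain term b) is a deliberately simple assumption; the abstract does not give the exact formula, and the names `cooccurrence_weights` and `suggest` are hypothetical.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_weights(docs):
    """Build asymmetric association weights from indexed documents.

    docs: a list of term lists, one per document (the output of the
    filtering and automatic indexing stage).
    Returns {(a, b): weight}, where weight is the share of documents
    containing a that also contain b (an assumed, simplified measure).
    """
    doc_freq = Counter()
    pair_freq = Counter()
    for terms in docs:
        unique = set(terms)
        doc_freq.update(unique)
        # Count every ordered pair of distinct terms once per document.
        pair_freq.update(permutations(sorted(unique), 2))
    return {(a, b): n / doc_freq[a] for (a, b), n in pair_freq.items()}


def suggest(term, weights, top_n=3):
    """Associative retrieval: return the terms most strongly linked to `term`."""
    related = [(b, w) for (a, b), w in weights.items() if a == term]
    return sorted(related, key=lambda pair: (-pair[1], pair[0]))[:top_n]
```

The weights are asymmetric on purpose: a rare term can point strongly to a common one without the reverse being true, which matters when the network is used to suggest search terms.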

16.
Research on the basic issues of network information filtering   Total citations: 3 (self-citations: 0, citations by others: 3)
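Although the techniques and methods of network information filtering are well applied on the Internet, people's understanding of it remains rather vague. By examining the basic issues of network information filtering, in particular its relationship to network information retrieval and its significance and limitations, this paper aims to clarify that understanding and thereby support the development of research on network information filtering.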

17.
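This paper distinguishes document filtering, information filtering, and text filtering, and reviews the state of research on document filtering techniques. It puts forward the idea of Ontology-based document filtering, whose advantages are held to include flexibility, good shareability, and suitability for personalized services. It discusses the implementation process of Ontology-based document filtering, including preparation, ontology construction, and ontology invocation, with emphasis on how to construct the common ontology, user ontology, and document ontology and on the technologies involved in implementation. Finally, it points out directions for future work.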

18.
文阳  陈文宇  袁野  朱建 《图书情报工作》2014,58(20):125-130
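The authors argue that although traditional topic-based link filtering algorithms are widely used in domain-specific focused crawlers, they consider only the relevance between crawled pages and the topic and ignore the structural characteristics of a site's own links. A domain-name-based link filtering algorithm is proposed, which compares the structural characteristics of the domain names in page links and uses a topic-based link filtering algorithm as an aid to identify useless junk links. Compared with a topic-based link filtering algorithm alone, the domain-name-based algorithm judges links more comprehensively and filters them more efficiently, and can thus effectively improve the crawling efficiency of web crawlers and the efficiency of information retrieval. Finally, simulation experiments demonstrate the effectiveness of the algorithm.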

19.
Research on network information filtering systems   Total citations: 22 (self-citations: 0, citations by others: 22)
黄晓斌  邱明辉 《情报学报》2004,23(3):326-332
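Network information filtering is a series of processes that, according to certain criteria and with certain tools, select relevant information from, or remove irrelevant information from, the dynamic stream of network information. This paper discusses the principles of network information filtering, outlines the main types of network information filtering systems, analyzes the structure and functions of filtering software, and introduces methods for evaluating and selecting filtering software.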

20.
Collaborative filtering is a popular recommendation technique. Although researchers have focused on the accuracy of the recommendations, real applications also need efficient algorithms. An index structure can be used to store the rating matrix and compute recommendations very fast. In this paper we study how compression techniques can reduce the size of this index structure and, at the same time, speed up recommendations. We show how coding techniques commonly used in Information Retrieval can be effectively applied to collaborative filtering, reducing the matrix size by up to 75% and almost doubling the recommendation speed. Additionally, we propose a novel identifier reassignment technique that achieves high compression rates, reducing the size of an already compressed matrix by 40%. It is a very simple approach based on assigning the smallest identifiers to the items and users with the highest number of ratings, and it can be efficiently computed using two-pass indexing. The use of the proposed compression techniques can significantly reduce the storage and time costs of recommender systems, which are two important factors in many real applications.
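
The identifier reassignment idea can be illustrated with a small Python sketch. It assumes, for illustration only, that each user's item list is stored as sorted gaps under variable-byte coding, and it reassigns item identifiers only (the abstract applies the idea to users as well). The abstract does not name the coding schemes the paper evaluates, and `vbyte_len`, `reassign_ids`, and `encoded_size` are hypothetical names.

```python
from collections import Counter


def vbyte_len(n):
    """Bytes needed to store integer n with variable-byte coding (7 bits per byte)."""
    length = 1
    while n >= 128:
        n >>= 7
        length += 1
    return length


def reassign_ids(ratings):
    """First pass: count ratings per item, then give the smallest new
    identifiers (starting at 1) to the most-rated items."""
    counts = Counter(item for _, item in ratings)
    ordered = sorted(counts, key=lambda item: (-counts[item], item))
    return {old: new for new, old in enumerate(ordered, start=1)}


def encoded_size(ratings, mapping=None):
    """Second pass: build per-user sorted item lists and sum the size of
    their gaps under variable-byte coding.

    ratings: list of (user, item) pairs with positive integer item ids,
    each pair appearing at most once.
    """
    per_user = {}
    for user, item in ratings:
        per_user.setdefault(user, []).append(mapping[item] if mapping else item)
    total = 0
    for items in per_user.values():
        previous = 0
        for item in sorted(items):
            total += vbyte_len(item - previous)  # store the gap, not the id
            previous = item
    return total
```

Popular items appear in many lists, so giving them small identifiers shrinks both the identifiers themselves and the gaps between them; calling `encoded_size(ratings)` and `encoded_size(ratings, reassign_ids(ratings))` on the same data shows the difference.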
