21.
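Sequential pattern mining is widely used in network intrusion detection. The sequential pattern mining algorithm in the Weka software is applied to the denial-of-service attack records in the KDDCUP99 dataset, and the resulting frequent sequences provide a basis for developing intrusion detection systems.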
22.
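Proposes a new information retrieval system model based on negative association rules and frequent itemset mining. The paper describes in detail the design of the model, the function of each module, the three key techniques used to implement the retrieval system (frequent itemset mining, negative association rule mining, and query optimization and expansion), and the retrieval algorithm. Experimental results show that the system effectively improves information retrieval performance.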
23.
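Association rule mining proceeds in two steps: first, find the frequent itemsets that satisfy the minimum support requirement; then, generate from those itemsets the association rules that satisfy the minimum confidence requirement. Current research concentrates on frequent itemset generation; however, as part of the overall mining process, the algorithms that generate association rules from frequent itemsets also need further study and improvement. This paper first extends the traditional set operations and then, on the basis of the extended operations, proposes ARD-ES, an algorithm that generates association rules from already-mined maximal frequent itemsets, and analyzes its complexity both theoretically and experimentally. Experiments show that, as the transaction database grows, the running time of ARD-ES increases roughly linearly, while its space usage fluctuates around a fixed value.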
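The second step described in this abstract (turning frequent itemsets into rules that meet a minimum confidence) can be sketched as follows. This is the plain textbook procedure, not ARD-ES, whose extended set operations are not given in the abstract. The function name `generate_rules` and the toy support counts are assumptions made for illustration.

```python
from itertools import combinations

def generate_rules(support, min_conf):
    """Derive rules A -> B from a table of frequent itemsets.

    support: dict mapping frozenset(itemset) -> support count.
             Assumed to contain every non-empty subset of each itemset it lists.
    Returns a list of (antecedent, consequent, confidence) tuples.
    """
    rules = []
    for itemset, count in support.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                # confidence(A -> B) = support(A ∪ B) / support(A)
                conf = count / support[a]
                if conf >= min_conf:
                    rules.append((a, itemset - a, conf))
    return rules

# Toy support counts, made up for illustration.
support = {
    frozenset({"a"}): 4,
    frozenset({"b"}): 3,
    frozenset({"a", "b"}): 3,
}
rules = generate_rules(support, min_conf=0.8)
```

With these counts, only the rule b -> a (confidence 3/3) passes the 0.8 threshold; a -> b has confidence 3/4 and is dropped.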
24.
25.
王强 《现代图书情报技术》2008,3(8):63-69
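Designs and implements in Java a frequent itemset generation algorithm, TidlistApriori, based on transaction identifier lists (tid-lists) of the transaction database. A comparison with the Hash-Tree-based Apriori algorithm shows that TidlistApriori improves the efficiency of frequent itemset generation and can serve as an effective algorithmic tool for topic association mining.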
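The general tid-list idea can be illustrated with a short sketch: each item keeps the set of ids of the transactions that contain it, and the support of an itemset is the size of the intersection of its items' tid-lists. This is not the author's TidlistApriori (which the abstract says is written in Java and does not detail); the function names and the toy transactions are assumptions.

```python
from collections import defaultdict

def build_tidlists(transactions):
    """Map each item to the set of transaction ids (tids) that contain it."""
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)
    return tidlists

def support_count(itemset, tidlists):
    """Support of a non-empty itemset = size of the intersection of its tid-lists."""
    tids = set.intersection(*(tidlists[item] for item in itemset))
    return len(tids)

# Toy transaction database, made up for illustration.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
tidlists = build_tidlists(transactions)
ab_support = support_count({"a", "b"}, tidlists)
```

After the tid-lists are built in a single pass, support counts for larger itemsets come from set intersections rather than from rescanning the database.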
26.
This paper presents a new efficient algorithm for mining frequent closed itemsets. It enumerates the closed set of frequent itemsets by using a novel compound frequent itemset tree that facilitates fast growth and efficient pruning of the search space. It also employs a hybrid approach that adapts search strategies, representations of projected transaction subsets, and projecting methods to the characteristics of the dataset. Efficient local pruning, global subsumption checking, and fast hashing methods are detailed in this paper. The principle that balances the overheads of search space growth and pruning is also discussed. Extensive experimental evaluations on real-world and artificial datasets show that the algorithm outperforms CHARM by a factor of five and is one to three orders of magnitude more efficient than CLOSET and MAFIA.
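For readers unfamiliar with the term, an itemset is closed when no proper superset has the same support. The brute-force filter below only illustrates that definition; it is not the tree-based algorithm of the abstract, and the name `closed_itemsets` and the toy counts are assumptions.

```python
def closed_itemsets(support_counts):
    """Keep the itemsets that have no proper superset with equal support.

    support_counts: dict mapping frozenset(itemset) -> support count.
    """
    closed = {}
    for itemset, count in support_counts.items():
        absorbed = any(
            itemset < other and count == other_count  # '<' is proper subset
            for other, other_count in support_counts.items()
        )
        if not absorbed:
            closed[itemset] = count
    return closed

# Toy support counts, made up for illustration.
support_counts = {
    frozenset({"a"}): 3,
    frozenset({"c"}): 4,
    frozenset({"a", "c"}): 3,
}
closed = closed_itemsets(support_counts)
```

Here {a} is not closed because {a, c} has the same support, while {c} and {a, c} are both closed.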
27.
As the number of text documents on the Internet grows explosively, hierarchical document clustering has proven useful for grouping similar documents in a variety of applications. However, most document clustering methods still struggle with high dimensionality, scalability, accuracy, and meaningful cluster labels. This paper presents a Fuzzy Frequent Itemset-Based Hierarchical Clustering (F2IHC) approach, which uses a fuzzy association rule mining algorithm to improve the clustering accuracy of the Frequent Itemset-Based Hierarchical Clustering (FIHC) method. In this approach, key terms are extracted from the document set, and each document is pre-processed into the designated representation for the subsequent mining process. A fuzzy association rule mining algorithm for text is then employed to discover a set of highly related fuzzy frequent itemsets, whose key terms are regarded as the labels of the candidate clusters. Finally, the documents are clustered into a hierarchical cluster tree by referring to these candidate clusters. Experiments were conducted on the Classic4, Hitech, Re0, Reuters, and Wap datasets. The results show that the approach not only retains the merits of FIHC but also improves its accuracy.
28.
白川平 《宁夏师范学院学报》2014,35(3):86-89
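Proposes Multiple Data Stream Mining (MDSM), an algorithm for mining frequent patterns over multiple data streams. MDSM uses a Multiple Frequent Pattern Tree (MFP-Tree) structure to store the frequent and potentially frequent itemsets in multiple data streams, and mines collaborative frequent patterns and comparative frequent patterns across the streams efficiently through incremental updates. Theoretical analysis and experiments demonstrate its feasibility.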
29.
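After analyzing the shortcomings of the Apriori algorithm for association rules, and in order to reduce the number of scans of the transaction database and shorten the time needed to generate frequent itemsets, the paper presents two hash-table-based methods for computing itemset support counts, together with a method that uses a hash table to locate itemset addresses. These methods improve the efficiency of frequent itemset generation.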
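The abstract does not specify its two hash-based counting methods, so the sketch below shows only the generic idea: candidate k-itemsets are used as keys of a hash table (a Python `Counter`), and all support counts are accumulated in one pass over the database. The function names and the toy transactions are assumptions.

```python
from collections import Counter
from itertools import combinations

def count_k_itemsets(transactions, k):
    """Count every k-itemset in one scan, keyed by a sorted tuple in a hash table."""
    counts = Counter()
    for items in transactions:
        for combo in combinations(sorted(items), k):
            counts[combo] += 1
    return counts

def frequent(counts, min_support):
    """Keep only the itemsets whose count reaches the minimum support."""
    return {itemset: n for itemset, n in counts.items() if n >= min_support}

# Toy transaction database, made up for illustration.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
pair_counts = count_k_itemsets(transactions, 2)
frequent_pairs = frequent(pair_counts, min_support=3)
```

Sorting each transaction before enumerating combinations gives every itemset a single canonical key, so lookups and updates are constant-time on average.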