Found 18 similar documents (search time: 46 ms)
1.
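SA-DBSCAN: An Adaptive Density-Based Clustering Algorithm  Total citations: 10, self-citations: 0, citations by others: 10
DBSCAN is a classic density-based clustering algorithm that determines the number of clusters automatically and handles clusters of arbitrary shape effectively. However, DBSCAN requires two parameters, Eps and minPts, to be set by hand, so clustering cannot proceed without manual intervention. Building on DBSCAN, the SA-DBSCAN clustering algorithm is proposed: it determines Eps and minPts automatically by analyzing the statistical properties of the dataset, which removes manual intervention and makes the clustering process fully automatic. Experiments show that SA-DBSCAN selects reasonable Eps and minPts values and produces clustering results of relatively high accuracy.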
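As a rough illustration of deriving Eps from dataset statistics, the sketch below uses the common k-distance heuristic: compute each point's distance to its k-th nearest neighbour and take the median. This is an assumption chosen for illustration, not the SA-DBSCAN procedure itself, which the abstract does not spell out; the function names `k_distances` and `estimate_eps` are hypothetical.

```python
import math


def k_distances(points, k):
    """Return, for every point, the distance to its k-th nearest neighbour."""
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, in ascending order.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


def estimate_eps(points, k=4):
    """Hypothetical heuristic (not SA-DBSCAN): median k-distance as Eps."""
    kd = sorted(k_distances(points, k))
    return kd[len(kd) // 2]


# Two well-separated square clusters with unit side length.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
eps = estimate_eps(points, k=2)
print(eps)
```

With k used as minPts, a point whose Eps-neighbourhood holds at least k other points would be treated as a core point; other summary statistics (for example a knee in the sorted k-distance curve) could be substituted for the median.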
2.
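Network security is a major concern in today's information society, and intrusion detection is an effective means of defending against network attacks. Clustering algorithms are an important tool for building intrusion detection models. Among them, density clustering groups data by density rather than by distance, which overcomes the bias toward roughly circular clusters, while genetic algorithms borrow techniques from biology to search for optimal solutions. An intrusion detection algorithm that combines a genetic algorithm with density clustering can identify anomalous network behavior more accurately and thus improve network security.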
3.
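The DBSCAN Algorithm Based on Data Mining and Its Application  Total citations: 1, self-citations: 0, citations by others: 1
Using the DBSCAN algorithm from data mining, a new approach to image segmentation is proposed. An image sample database is built from the distribution of points in the digital image, and density clustering with DBSCAN is then used to segment the image. The method finds the denser regions of the image samples and summarizes them into relatively concentrated classes. It can also cluster images that contain noise and complete the segmentation, showing strong robustness to noise.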
4.
5.
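By combining spatial coordinates with attribute features, three Manhattan spatial distances are defined. An ACA-Cluster clustering algorithm based on these distances is implemented in MATLAB and applied to cluster analysis and type zoning of ecological environment quality in Shandong Province. Experiments show that the method reflects well the spatial clustering requirement that cluster members be both spatially adjacent and similar in attributes.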
6.
7.
8.
9.
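Because the results of the traditional K-means algorithm depend on the initial number of clusters and the initial cluster centers, this paper proposes a K-means algorithm based on optimized initial cluster centers. The algorithm determines the number of clusters K by quantifying the distances between samples and the compactness of the clusters, and it selects mutually distant data points as initial cluster centers according to the distribution of the dataset, avoiding the random choice of cluster number and cluster centers in traditional K-means. Experiments on datasets from the UCI Machine Learning Repository show that the improved algorithm achieves good clustering results and relatively high clustering accuracy.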
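The idea of choosing mutually distant points as initial centers can be sketched with a farthest-first traversal. This is a generic technique used here as an assumption; the abstract does not give the paper's exact selection rule or its method for choosing K, and the function name `farthest_first_centers` is hypothetical.

```python
import math


def farthest_first_centers(points, k):
    """Pick k mutually distant points as initial centers (farthest-first traversal)."""
    dim = len(points[0])
    centroid = tuple(sum(p[d] for p in points) / len(points) for d in range(dim))
    # Deterministic start (an arbitrary choice): the point farthest from the centroid.
    centers = [max(points, key=lambda p: math.dist(p, centroid))]
    while len(centers) < k:
        # Next center: the point whose nearest already-chosen center is farthest away.
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers


points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 20)]
centers = farthest_first_centers(points, 3)
print(centers)
```

The selected centers would then seed the usual K-means assign-and-update iterations in place of random starting points.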
10.
11.
《Information processing & management》2023,60(1):103109
As the number of data instances grows, data processing operations (e.g. clustering) require an increasing amount of computational resources, and for sufficiently large datasets such operations often cannot be executed on a single workstation. This requires the use of a server computer for carrying out the operations. However, to ensure privacy of the shared data, a privacy-preserving data processing workflow involves applying an encoding transformation to the set of data points before the computation. This encoding should ideally cater to two objectives: first, it should be difficult to reconstruct the data; second, the results of the operation executed on the encoded space should be as close as possible to the results of the same operation executed on the original data. While standard encoding mechanisms, such as locality sensitive hashing, cater to the first objective, the second objective may not always be adequately satisfied. In this paper, we specifically focus on clustering as the data processing operation. We apply a deep metric learning approach to learn a parameterized encoding transformation function with the objective of maximizing the alignment of the clusters in the encoded space with those in the original data. We conduct experiments on four standard benchmark datasets: MNIST and Fashion-MNIST (each containing 70K grayscale images), CIFAR-10, consisting of 60K color images, and 20-Newsgroups, containing 18K news articles. Our experiments demonstrate that the proposed method yields better clusters than approaches in which the encoding process is agnostic of the clustering objective.
12.
《Information processing & management》2005,41(2):177-194
In this paper, we propose a re-ranking algorithm using post-retrieval clustering for content-based image retrieval (CBIR). In conventional CBIR systems, it is often observed that images visually dissimilar to a query image are ranked high in retrieval results. To remedy this problem, we utilize the similarity relationship of the retrieved results via post-retrieval clustering. In the first step of our method, images are retrieved using visual features such as color histograms. Next, the retrieved images are analyzed using hierarchical agglomerative clustering methods (HACM), and the rank of the results is adjusted according to the distance of a cluster from the query. In addition, we analyze the effects of clustering methods, query-cluster similarity functions, and weighting factors in the proposed method. We conducted a number of experiments using several clustering methods and cluster parameters. Experimental results show that the proposed method improves retrieval effectiveness by over 10% on average in the average normalized modified retrieval rank (ANMRR) measure.
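The two-step procedure described above (retrieve, then cluster and adjust ranks by cluster-to-query distance) can be sketched as follows. This is a minimal illustration under stated assumptions: naive average-linkage clustering, Euclidean distance on toy feature vectors, and a cluster-centroid distance as the query-cluster similarity. The paper's actual similarity functions and weighting factors are not given in the abstract, and all function names are hypothetical.

```python
import math


def agglomerate(vectors, n_clusters):
    """Naive average-linkage agglomerative clustering; returns lists of indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(math.dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average pairwise distance.
        a, b = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]),
        )
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters


def centroid(vectors, members):
    dim = len(vectors[0])
    return tuple(sum(vectors[i][d] for i in members) / len(members) for d in range(dim))


def rerank(query, vectors, n_clusters=2):
    """Order clusters by centroid distance to the query, then items within each cluster."""
    clusters = agglomerate(vectors, n_clusters)
    clusters.sort(key=lambda c: math.dist(query, centroid(vectors, c)))
    ranking = []
    for c in clusters:
        ranking.extend(sorted(c, key=lambda i: math.dist(query, vectors[i])))
    return ranking


query = (0.0, 0.0)
retrieved = [(1.0, 0.0), (1.0, 0.5), (6.0, 6.0), (6.0, 6.5), (1.5, 0.0)]
print(rerank(query, retrieved))
```

Ranking whole clusters before individual items is one simple way to keep visually similar results together; a weighted combination of item distance and cluster distance would be an alternative.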
13.
This study presents a simple yet effective carrier frequency offset (CFO) estimation algorithm for orthogonal frequency division multiplexing (OFDM) systems. At the transmitter, the proposed algorithm uses null subcarriers to render the OFDM signal periodic in the time domain. At the receiver, these periodic time samples become CFO-bearing signals, from which a maximum likelihood (ML) CFO estimator can be derived. In addition to providing reliable and efficient CFO estimation, the proposed algorithm has an adjustable acquisition region that is linearly proportional to the order of the null subcarrier insertion scheme.
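A small noise-free sketch of the underlying principle, assuming the simplest insertion scheme: nulling every odd subcarrier makes the time-domain symbol repeat with period N/2, so a CFO of eps (in units of the subcarrier spacing) shows up as a phase rotation of pi * eps between the two halves. The correlation-based estimator below is the standard repeated-block estimator, used here as an assumption rather than the paper's exact algorithm; all function names are hypothetical.

```python
import cmath
import math


def ofdm_symbol(freq_bins):
    """Naive O(N^2) inverse DFT of the frequency-domain bins."""
    n_fft = len(freq_bins)
    return [
        sum(x * cmath.exp(2j * math.pi * k * n / n_fft) for k, x in enumerate(freq_bins)) / n_fft
        for n in range(n_fft)
    ]


def apply_cfo(samples, eps):
    """Rotate samples by a CFO of eps subcarrier spacings (no noise, no channel)."""
    n_fft = len(samples)
    return [s * cmath.exp(2j * math.pi * eps * n / n_fft) for n, s in enumerate(samples)]


def estimate_cfo(received):
    """Estimate eps from the phase of the correlation between the two identical halves."""
    half = len(received) // 2
    corr = sum(received[n].conjugate() * received[n + half] for n in range(half))
    return cmath.phase(corr) / math.pi  # unambiguous for |eps| < 1


# 16 subcarriers: data on even bins, nulls on odd bins.
data = [1, -1, 1j, -1j, 1, 1, -1, 1j]
bins = [data[k // 2] if k % 2 == 0 else 0 for k in range(16)]
estimate = estimate_cfo(apply_cfo(ofdm_symbol(bins), 0.3))
print(estimate)
```

Keeping only every P-th subcarrier would shorten the period to N/P and widen the unambiguous range to |eps| < P/2, which matches the abstract's remark that the acquisition region scales with the order of the insertion scheme.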
14.
This paper presents an optimal fuzzy partition based Takagi-Sugeno Fuzzy Model (TSFM), for which a novel clustering algorithm, the Modified Fuzzy C-Regression Model (MFCRM), is proposed. The objective function of the MFCRM algorithm is developed by considering the geometrical structure of the input data and the linear functional relation between input–output data. The MFCRM partitions the data space to create fuzzy subspaces (rules). A new validation criterion is developed for detecting the right number of rules (subspaces) in a given data set. The obtained fuzzy partition is used to build the fuzzy structure and identify the premise parameters. Once the right number of rules and the premise parameters have been identified, the consequent parameters are identified by the orthogonal least squares (OLS) approach. The cluster validation index has been tested on a synthetic data set. The effectiveness of the MFCRM-based TSFM has been validated on benchmark examples such as the Boiler Turbine system, Mackey–Glass time series data and the Box–Jenkins model. The model performance is also validated on high-dimensional data such as the Auto-MPG and Boston Housing data.
15.
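To study multi-target feature-clustering tracking under complex lighting conditions, this article analyzes vehicle video footage recorded from dusk into night and designs a multi-feature tracking algorithm that combines lamp-group clustering tracking, lamp-shadow removal, and vehicle-body clustering tracking. Experimental results show that the multi-feature clustering tracking algorithm achieves good tracking performance under complex lighting conditions.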
16.
《Information processing & management》1999,35(4):541-557
We present an efficient document clustering algorithm that uses a term frequency vector for each document instead of a huge proximity matrix. The algorithm has the following features: (1) it requires a relatively small amount of memory and runs fast, (2) it produces a hierarchy in the form of a document classification tree and (3) the hierarchy obtained by the algorithm explicitly reveals the structure of the collection. We confirm these features and thus show the algorithm's feasibility through clustering experiments on two collections of Japanese documents, containing 83,099 and 14,701 documents. We also introduce an application of this algorithm to a document browser. This browser is used in our Japanese-to-English translation aid system. The browsing module of the system consists of a huge database of Japanese news articles and their English translations. The Japanese article collection is clustered into a hierarchy by our method. Since each node in the hierarchy corresponds to a topic in the collection, we can use the hierarchy to directly access articles by topic. A user can learn general translation knowledge of each topic by browsing the Japanese articles and their English translations. We also discuss techniques for presenting a large tree-shaped hierarchy on a computer screen.
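To make the term frequency vector representation concrete, the sketch below builds a sparse term-count vector per document and compares two documents by cosine similarity, which needs no pairwise proximity matrix over the whole collection. Whitespace tokenization and cosine similarity are assumptions for illustration; the abstract does not state the paper's tokenizer or similarity measure, and the function names are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Sparse term frequency vector: term -> count (whitespace tokenization assumed)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term frequency vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


doc_a = tf_vector("apple banana apple")
doc_b = tf_vector("apple banana")
doc_c = tf_vector("cherry durian")
print(cosine(doc_a, doc_b), cosine(doc_a, doc_c))
```

Because each cluster can itself be summarized by the sum of its members' vectors, a document can be compared against a cluster summary rather than against every other document.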
17.
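A competitiveness evaluation index system is constructed for the electronic computer and office equipment manufacturing industry, and a density-based clustering algorithm is used to evaluate competitiveness quantitatively. The resulting conclusions provide a reference for government and enterprise decision-making.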
18.
《Information processing & management》2005,41(3):587-598
The Internet, together with the large amount of textual information available in document archives, has increased the relevance of tools related to information retrieval. In this work we present an extension of the Gambal system for clustering and visualization of documents based on fuzzy clustering techniques. The tool allows the set of documents to be structured hierarchically (using a fuzzy hierarchical structure) and represents this structure in a graphical interface (a 3D sphere) that the user can navigate. Gambal allows the analysis of the documents and the computation of their similarity not only on the basis of the syntactic similarity between words but also based on a dictionary (Wordnet 1.7) and latent semantic analysis.