 INTRODUCTION  STING  WAVECLUSTER  CLIQUE-Clustering in QUEST  FAST PROCESSING TIME
 The grid-based clustering approach uses a multi-resolution grid data structure.
 The object space is quantized into a finite number of cells that form a grid structure.
 The major advantage of this method is fast processing time: it depends only on the number of cells in each dimension of the quantized space, not on the number of data objects.
 STING: STatistical INformation Grid.
 The spatial area is divided into rectangular cells.
 There are several levels of cells, at different levels of resolution.
 Each high-level cell is partitioned into several lower-level cells.
 Statistical attributes (mean, maximum, minimum) are stored in each cell.
 Computation is query-independent.
 Parallel processing is supported.
 Data is processed in a single pass.
 Clustering quality depends on the granularity of the lowest grid level.
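The grid statistics above can be sketched in a few lines. This is a minimal, illustrative simplification (not the full STING algorithm, and the function name and grid size are assumptions): points in the unit square are assigned to rectangular cells, and each cell stores summary statistics of an attribute value.

```python
import numpy as np

def cell_stats(points, n_cells=4):
    """Partition (x, y, value) points in [0, 1)^2 into an n_cells x n_cells
    grid and return per-cell (count, mean, min, max) of the attribute value."""
    stats = {}
    for x, y, value in points:
        cell = (int(x * n_cells), int(y * n_cells))  # which rectangular cell
        stats.setdefault(cell, []).append(value)
    return {c: (len(v), float(np.mean(v)), min(v), max(v))
            for c, v in stats.items()}

points = [(0.1, 0.1, 5.0), (0.2, 0.15, 7.0), (0.8, 0.9, 3.0)]
stats = cell_stats(points)
print(stats[(0, 0)])  # two points fall in cell (0, 0)
```

In STING proper, higher-level cells aggregate these statistics from their children, so queries can be answered from the precomputed summaries alone, which is why computation is query-independent.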
 A multi-resolution clustering approach that applies a wavelet transform to the feature space.
 A wavelet transform is a signal-processing technique that decomposes a signal into different frequency sub-bands.
 Both grid-based and density-based.
 Input parameters: the number of cells for each dimension, the wavelet, and the number of applications of the wavelet transform.
 Complexity O(N).
 Detects arbitrarily shaped clusters at different scales.
 Not sensitive to noise and not sensitive to input order.
 Only applicable to low-dimensional data.
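The core WaveCluster idea can be sketched with one level of a 2-D Haar transform (an assumed simplification; WaveCluster itself supports other wavelets and multiple decomposition levels): quantize the data into a grid of cell counts, then keep the low-frequency (approximation) sub-band, where clusters appear as connected regions of high average density.

```python
import numpy as np

def haar_lowpass_2d(grid):
    """One level of 2-D Haar averaging: each output cell is the mean of a
    2x2 block of input cells, i.e. the LL (approximation) sub-band."""
    g = np.asarray(grid, dtype=float)
    return (g[0::2, 0::2] + g[0::2, 1::2] +
            g[1::2, 0::2] + g[1::2, 1::2]) / 4.0

grid = np.zeros((4, 4))
grid[:2, :2] = 8            # a dense 2x2 block of cells (one "cluster")
ll = haar_lowpass_2d(grid)
print(ll)                   # the cluster survives as one high-valued cell
```

Clustering is then done on the transformed grid; applying the transform repeatedly yields the coarser scales at which differently sized clusters become visible.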
CLIQUE can be considered as both density-based and grid-based:
1. It partitions each dimension into the same number of equal-length intervals.
2. It partitions an m-dimensional data space into non-overlapping rectangular units.
3. A unit is dense if the fraction of total data points contained in the unit exceeds an input model parameter.
4. A cluster is a maximal set of connected dense units within a subspace.
 Attempts to optimize the fit between the data and some mathematical model.
 ASSUMPTION: data are generated by a mixture of underlying probability distributions.
 TECHNIQUES:
 Expectation-maximization
 Conceptual clustering
 Neural network approach
 An ITERATIVE REFINEMENT ALGORITHM used to find parameter estimates; an EXTENSION OF K-MEANS.
 Assigns an object to a cluster according to a weight representing its probability of membership.
 Starts with an initial estimate of the parameters.
 Iteratively reassigns scores and re-estimates the parameters.
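The loop described above can be sketched for a 1-D two-component Gaussian mixture (an assumed simplification with equal, fixed variances): the E-step computes the membership weights, and the M-step re-estimates each mean as a weighted average, which is exactly the soft-assignment extension of k-means.

```python
import numpy as np

def em_means(x, mu, n_iter=25, sigma=1.0):
    """Estimate the two component means of a 1-D Gaussian mixture by EM."""
    x = np.asarray(x, dtype=float)
    mu = np.array(mu, dtype=float)
    for _ in range(n_iter):
        # E-step: responsibility (membership weight) of each component
        d = np.exp(-0.5 * ((x[:, None] - mu[None, :]) / sigma) ** 2)
        w = d / d.sum(axis=1, keepdims=True)
        # M-step: update each mean as the responsibility-weighted average
        mu = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
    return mu

data = [0.0, 0.2, -0.1, 5.0, 5.3, 4.9]
print(sorted(em_means(data, mu=[0.5, 4.0])))  # means near 0 and 5
```

Replacing the soft weights `w` with hard 0/1 assignments recovers ordinary k-means.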
 A form of clustering in machine learning.
 Produces a classification scheme for a set of unlabeled objects.
 Finds a characteristic description for each concept (class).
 COBWEB
 A popular and simple method of incremental conceptual learning.
 Creates a hierarchical clustering in the form of a classification tree.
Classification tree (animal example):
Animal: P(C0)=1.0, P(scales|C0)=0.25
  Fish: P(C1)=0.25, P(scales|C1)=1.0
  Amphibian: P(C2)=0.25, P(moist|C2)=1.0
  Mammal/bird: P(C3)=0.5, P(hair|C3)=0.5
    Mammal: P(C4)=0.5, P(hair|C4)=1.0
    Bird: P(C5)=0.5, P(feathers|C5)=1.0
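Each node of such a tree stores the conditional probabilities of attribute values given the concept. A minimal sketch of how those node statistics are estimated from counts (the toy data below is hypothetical, chosen to echo the animal example):

```python
# Hypothetical labeled toy data echoing the classification tree above.
animals = [
    {"class": "fish",   "scales": True,  "hair": False},
    {"class": "mammal", "scales": False, "hair": True},
    {"class": "mammal", "scales": False, "hair": True},
    {"class": "bird",   "scales": False, "hair": False},
]

def p_attr_given_class(data, attr, cls):
    """Estimate P(attr | class) as the fraction of class members with attr."""
    members = [a for a in data if a["class"] == cls]
    return sum(a[attr] for a in members) / len(members)

print(p_attr_given_class(animals, "hair", "mammal"))   # all mammals have hair
print(p_attr_given_class(animals, "scales", "mammal")) # no mammal has scales
```

COBWEB itself builds the tree incrementally, placing each new object so as to maximize a category-utility score computed from exactly these probabilities.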
 Represents each cluster as an exemplar, acting as a “prototype” of the cluster.
 New objects are assigned to the cluster whose exemplar is the most similar according to some distance measure.
SELF-ORGANIZING MAP (SOM)
 Competitive learning.
 Involves a hierarchical architecture of several units (neurons).
 The organization of the units forms a feature map.
 Used, for example, for web document clustering.
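One competitive-learning step of a SOM can be sketched as follows (a 1-D map of units for brevity; the learning rate and neighborhood radius are illustrative constants): the best-matching unit wins the competition, and it and its neighbors move toward the input, which is what gradually organizes the units into a feature map.

```python
import numpy as np

def som_step(weights, x, lr=0.5, radius=1):
    """One update step. weights: (n_units, dim) array; x: input vector."""
    # competition: find the best-matching unit (closest weight vector)
    bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
    # cooperation: the winner and its map neighbors move toward the input
    for j in range(len(weights)):
        if abs(j - bmu) <= radius:
            weights[j] += lr * (x - weights[j])
    return weights, bmu

w = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
w, winner = som_step(w, np.array([1.1, 0.9]))
print(winner)  # unit 1 is the best-matching unit
```

Repeating this over many inputs, with the learning rate and radius shrinking over time, yields the trained feature map.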
FEATURE TRANSFORMATION METHODS
 PCA, SVD: summarize the data by creating linear combinations of the attributes.
 But they do not remove any attributes; the transformed attributes can be complex to interpret.
FEATURE SELECTION METHODS
 Find the most relevant subset of attributes with respect to the class labels.
 Example: entropy analysis.
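The feature-transformation point can be sketched with PCA computed via SVD (the toy matrix below is illustrative): the principal components are linear combinations of the original attributes, so nothing is removed, but the new axes no longer correspond to single, interpretable attributes.

```python
import numpy as np

# Toy data: the first attribute has far more variance than the second.
X = np.array([[2.0, 0.1], [4.0, -0.2], [6.0, 0.15], [8.0, -0.05]])

Xc = X - X.mean(axis=0)                       # center each attribute
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
projected = Xc @ Vt.T                         # scores along the components

print(Vt[0])  # first component: a linear combination of both attributes,
              # dominated by the high-variance one
```

Feature selection, by contrast, would simply keep the original attributes deemed most relevant (e.g. by entropy analysis), which preserves interpretability.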

Grid-based method & model-based clustering method
