class Ai4r::Clusterers::WardLinkage
Implementation of an agglomerative hierarchical clusterer using Ward's method linkage, also known as the minimum variance method (Everitt et al., 2001; Jain and Dubes, 1988; Ward, 1963). A hierarchical clusterer creates one cluster per element and then progressively merges clusters until the required number of clusters is reached. Ward's method chooses merges so as to minimize the within-cluster variance.
D(cx, (ci U cj)) = ((ni+nx)/(ni+nj+nx))*D(cx, ci) + ((nj+nx)/(ni+nj+nx))*D(cx, cj) - (nx/(ni+nj)^2)*D(ci, cj)
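As a worked illustration of the update rule, the following minimal sketch evaluates it for three singleton clusters. It follows the weights used by linkage_distance in this class (ni+nx and nj+nx in the numerators). The helper name ward_update and the sample distances are assumptions made for this example; they are not part of ai4r.

```ruby
# Hypothetical standalone helper (not part of ai4r) that mirrors the
# arithmetic of linkage_distance for already-known distances and sizes.
#   d_xi = D(cx, ci), d_xj = D(cx, cj), d_ij = D(ci, cj)
#   ni, nj, nx = number of items in ci, cj, cx
def ward_update(d_xi, d_xj, d_ij, ni, nj, nx)
  total = (ni + nj + nx).to_f
  ((ni + nx) * d_xi + (nj + nx) * d_xj) / total -
    nx.to_f * d_ij / (ni + nj)**2
end

# Three singleton clusters with made-up distances:
# D(cx, ci) = 4.0, D(cx, cj) = 9.0, D(ci, cj) = 1.0
d = ward_update(4.0, 9.0, 1.0, 1, 1, 1)
puts d
```

With these inputs the first part is (2*4.0 + 2*9.0)/3 = 26/3 and the subtracted part is 1*1.0/2^2 = 0.25, so the result is roughly 8.42.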
Public Instance Methods
build(data_set, number_of_clusters)
Build a new clusterer, using data examples found in data_set. Items will be clustered in “number_of_clusters” different clusters.
Calls superclass method
Ai4r::Clusterers::SingleLinkage#build
# File lib/ai4r/clusterers/ward_linkage.rb, line 38
def build(data_set, number_of_clusters)
  super
end
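Since build only delegates to the superclass, the merge loop itself is not visible here. The following self-contained sketch shows one way such a loop can work: start with one cluster per item, repeatedly merge the closest pair, and refresh distances to the merged cluster with the same arithmetic as linkage_distance. The names squared_distance, ward_update and agglomerate, the hash-based distance table, and the choice of squared Euclidean distance are all assumptions for illustration; this is not ai4r's actual implementation.

```ruby
# Squared Euclidean distance between two numeric vectors (assumed metric).
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Same arithmetic as linkage_distance, on plain numbers (hypothetical helper).
def ward_update(d_xi, d_xj, d_ij, ni, nj, nx)
  ((ni + nx) * d_xi + (nj + nx) * d_xj) / (ni + nj + nx).to_f -
    nx.to_f * d_ij / (ni + nj)**2
end

# Hypothetical helper: merge clusters until only k remain.
def agglomerate(points, k)
  # One cluster per element, keyed by an integer id.
  clusters = {}
  points.each_with_index { |point, id| clusters[id] = [point] }

  # Pairwise distances, keyed by [smaller_id, larger_id].
  dist = {}
  clusters.keys.combination(2) do |i, j|
    dist[[i, j]] = squared_distance(points[i], points[j])
  end
  next_id = points.length

  while clusters.size > k
    # Pick the closest pair of clusters.
    (i, j), d_ij = dist.min_by { |_pair, d| d }

    # Distance from every other cluster cx to the merged cluster (ci U cj).
    (clusters.keys - [i, j]).each do |x|
      d_xi = dist.delete([x, i].sort)
      d_xj = dist.delete([x, j].sort)
      dist[[x, next_id]] = ward_update(d_xi, d_xj, d_ij,
                                       clusters[i].length,
                                       clusters[j].length,
                                       clusters[x].length)
    end

    # Replace ci and cj with the merged cluster.
    dist.delete([i, j])
    clusters[next_id] = clusters.delete(i) + clusters.delete(j)
    next_id += 1
  end

  clusters.values
end

result = agglomerate([[0, 0], [0, 1], [10, 10], [10, 11]], 2)
p result
```

On this toy input the two nearby pairs end up in separate clusters. A real implementation would typically keep a distance matrix rather than a hash, which is what read_distance_matrix suggests for this class.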
eval(data_item)
This algorithm does not allow classification of new data items once the clusterer has been built. Rebuild the clusterer with the new data item included in the data set.
# File lib/ai4r/clusterers/ward_linkage.rb, line 44
def eval(data_item)
  # Kernel#raise is lowercase; a capitalized Raise would fail with NoMethodError.
  raise "Eval of new data is not supported by this algorithm."
end
Protected Instance Methods
linkage_distance(cx, ci, cj)
Returns the distance between cluster cx and the merged cluster (ci U cj), using Ward's method linkage.
# File lib/ai4r/clusterers/ward_linkage.rb, line 52
def linkage_distance(cx, ci, cj)
  ni = @index_clusters[ci].length
  nj = @index_clusters[cj].length
  nx = @index_clusters[cx].length
  ( ( ( 1.0* (ni+nx) * read_distance_matrix(cx, ci) ) +
      ( 1.0* (nj+nx) * read_distance_matrix(cx, cj) ) ) / (ni + nj + nx) -
    ( 1.0 * nx * read_distance_matrix(ci, cj) / (ni+nj)**2 ) )
end