class Ai4r::Clusterers::BisectingKMeans
The bisecting k-means algorithm is a variation of the “k-means” algorithm that is somewhat less sensitive to the initial selection of centroids than the original.
More about the k-means algorithm: en.wikipedia.org/wiki/K-means_algorithm
Attributes
centroids[R]
clusters[R]
data_set[R]
distance_function[RW]
max_iterations[RW]
number_of_clusters[R]
refine[RW]
Public Instance Methods
build(data_set, number_of_clusters)
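Builds a new clusterer from the data examples found in data_set. Items are grouped into number_of_clusters clusters: starting from a single cluster that holds the whole data set, the largest cluster is repeatedly split in two with KMeans until number_of_clusters clusters exist. If refine is set, the superclass build then runs, using the resulting centroids as its initial centroids.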
Calls superclass method
# File lib/ai4r/clusterers/bisecting_k_means.rb, line 51
def build(data_set, number_of_clusters)
  @data_set = data_set
  @number_of_clusters = number_of_clusters

  @clusters = [@data_set]
  @centroids = [@data_set.get_mean_or_mode]
  while @clusters.length < @number_of_clusters
    biggest_cluster_index = find_biggest_cluster_index(@clusters)
    clusterer = KMeans.new.
      set_parameters(get_parameters).
      build(@clusters[biggest_cluster_index], 2)
    @clusters.delete_at(biggest_cluster_index)
    @centroids.delete_at(biggest_cluster_index)
    @clusters.concat(clusterer.clusters)
    @centroids.concat(clusterer.centroids)
  end

  super if @refine

  return self
end
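The splitting loop in build can be illustrated without the library. The following is a minimal sketch on plain arrays of numeric points, not the ai4r implementation: the helper names (squared_distance, mean, two_means, bisecting_k_means) are hypothetical, the 2-means step is seeded with the first and last points purely so the result is deterministic, and the optional refinement pass is left out.

```ruby
# Hypothetical, self-contained sketch; it does not use or reproduce ai4r.

# Squared Euclidean distance between two numeric vectors.
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Component-wise mean of a non-empty list of vectors.
def mean(points)
  points.transpose.map { |column| column.sum.to_f / column.length }
end

# Plain 2-means. Seeding with the first and last points is an assumption
# made here for determinism; it is not how ai4r picks initial centroids.
def two_means(points, max_iterations = 100)
  centroids = [points.first, points.last]
  groups = [points, []]
  max_iterations.times do
    # Assign every point to the nearer of the two centroids.
    assigned = points.group_by do |point|
      squared_distance(point, centroids[0]) <= squared_distance(point, centroids[1]) ? 0 : 1
    end
    new_groups = [assigned.fetch(0, []), assigned.fetch(1, [])]
    break if new_groups == groups # assignments stopped changing
    groups = new_groups
    # Move each centroid to the mean of its group (keep it if the group is empty).
    centroids = groups.each_with_index.map do |group, i|
      group.empty? ? centroids[i] : mean(group)
    end
  end
  groups
end

# Repeatedly split the largest cluster in two until k clusters exist.
def bisecting_k_means(points, k)
  clusters = [points]
  while clusters.length < k
    biggest = clusters.each_index.max_by { |i| clusters[i].length }
    largest = clusters.delete_at(biggest)
    clusters.concat(two_means(largest))
  end
  clusters
end

points = [[0, 0], [0, 1], [1, 0],
          [10, 10], [10, 11], [11, 10],
          [20, 0], [20, 1], [21, 0]]
clusters = bisecting_k_means(points, 3)
```

As in build, every pass removes one cluster and appends two, so the loop ends after k - 1 splits even if a split yields an empty half.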
intialize()
# File lib/ai4r/clusterers/bisecting_k_means.rb, line 44
def intialize
  @refine = true
end
Protected Instance Methods
calc_initial_centroids()
# File lib/ai4r/clusterers/bisecting_k_means.rb, line 74
def calc_initial_centroids
  @centroids # Use existing centroids
end
find_biggest_cluster_index(clusters)
# File lib/ai4r/clusterers/bisecting_k_means.rb, line 78
def find_biggest_cluster_index(clusters)
  max_index = 0
  max_length = 0
  clusters.each_index do |cluster_index|
    cluster = clusters[cluster_index]
    if max_length < cluster.data_items.length
      max_length = cluster.data_items.length
      max_index = cluster_index
    end
  end
  return max_index
end
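The selection rule above (the first cluster with the most items wins, and index 0 is returned when no cluster has any items) can be tried in isolation. This is a small sketch under assumptions: cluster_struct is a hypothetical stand-in that models only the data_items reader of a real data set, and biggest_cluster_index is an illustrative name, not a library method.

```ruby
# Hypothetical stand-in for a cluster; only `data_items` is modelled.
cluster_struct = Struct.new(:data_items)

# Index of the first cluster holding the most items; 0 if the list is
# empty or every cluster is empty, matching the method shown above.
def biggest_cluster_index(clusters)
  max_index = 0
  clusters.each_with_index do |cluster, index|
    # Strict comparison keeps the earliest cluster when sizes tie.
    if cluster.data_items.length > clusters[max_index].data_items.length
      max_index = index
    end
  end
  max_index
end

clusters = [
  cluster_struct.new([1, 2]),
  cluster_struct.new([3, 4, 5]),
  cluster_struct.new([6, 7, 8])
]
index = biggest_cluster_index(clusters)
```

Here the second and third clusters tie at three items, and the strict comparison keeps the earlier one.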