KMeans
The bisecting k-means algorithm is a variation of the k-means algorithm that is somewhat less sensitive to the initial choice of centroids than the original.
More about the k-means algorithm: en.wikipedia.org/wiki/K-means_algorithm
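Builds a new clusterer from the data examples found in data_set. Items are grouped into number_of_clusters clusters. The build starts with a single cluster holding the whole data set and repeatedly splits the biggest cluster in two with KMeans until the requested number of clusters is reached. If the refine parameter is set, a final k-means pass is run, seeded with the centroids found so far.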
  # File lib/ai4r/clusterers/bisecting_k_means.rb, line 51
  def build(data_set, number_of_clusters)
    @data_set = data_set
    @number_of_clusters = number_of_clusters

    @clusters = [@data_set]
    @centroids = [@data_set.get_mean_or_mode]
    while @clusters.length < @number_of_clusters
      biggest_cluster_index = find_biggest_cluster_index(@clusters)
      clusterer = KMeans.new.
        set_parameters(get_parameters).
        build(@clusters[biggest_cluster_index], 2)
      @clusters.delete_at(biggest_cluster_index)
      @centroids.delete_at(biggest_cluster_index)
      @clusters.concat(clusterer.clusters)
      @centroids.concat(clusterer.centroids)
    end

    super if @refine

    return self
  end
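The splitting loop in build can be illustrated outside the library. The following is a minimal standalone sketch, not the ai4r implementation: it works on plain arrays of numeric points instead of a DataSet, and the helper names (squared_distance, mean_point, two_means, bisecting_k_means) are hypothetical. The 2-means step uses a deterministic seeding (the first point and the point farthest from it), which is an assumption made here to keep the example reproducible.

```ruby
# Standalone sketch of bisecting k-means on plain arrays of numbers.
# None of these helpers are part of ai4r; all names are hypothetical.

def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Component-wise mean of a non-empty list of points.
def mean_point(points)
  dims = points.first.length
  (0...dims).map { |d| points.sum { |p| p[d] }.to_f / points.length }
end

# Split points into two groups with a plain 2-means loop.
# Seeding (an assumption): the first point and the point farthest from it.
def two_means(points, max_iterations = 100)
  seed = points.first
  centroids = [seed, points.max_by { |p| squared_distance(p, seed) }]
  halves = [points, []]
  max_iterations.times do
    new_halves = points.partition do |p|
      squared_distance(p, centroids[0]) <= squared_distance(p, centroids[1])
    end
    break if new_halves == halves # assignments stopped changing
    halves = new_halves
    centroids = halves.each_with_index.map do |half, i|
      half.empty? ? centroids[i] : mean_point(half)
    end
  end
  halves
end

# Repeatedly bisect the biggest cluster until enough clusters exist.
def bisecting_k_means(points, number_of_clusters)
  clusters = [points]
  while clusters.length < number_of_clusters
    biggest_index = clusters.each_index.max_by { |i| clusters[i].length }
    break if clusters[biggest_index].length < 2 # nothing left to split
    biggest = clusters.delete_at(biggest_index)
    clusters.concat(two_means(biggest))
  end
  clusters
end

points = [[1, 1], [1, 2], [2, 1],
          [10, 10], [10, 11], [11, 10],
          [20, 1], [21, 1], [20, 2]]
clusters = bisecting_k_means(points, 3)
p clusters.map(&:length) # prints [3, 3, 3]
```

As in build, the split cluster is removed and its two halves are appended, so the order of the resulting clusters does not follow the order of the input.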
  # File lib/ai4r/clusterers/bisecting_k_means.rb, line 74
  def calc_initial_centroids
    @centroids # Use existing centroids
  end
  # File lib/ai4r/clusterers/bisecting_k_means.rb, line 78
  def find_biggest_cluster_index(clusters)
    max_index = 0
    max_length = 0
    clusters.each_index do |cluster_index|
      cluster = clusters[cluster_index]
      if max_length < cluster.data_items.length
        max_length = cluster.data_items.length
        max_index = cluster_index
      end
    end
    return max_index
  end
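The index search above could also be written with Enumerable#max_by. This is only a sketch of an alternative, not code from the library: the Cluster struct is a hypothetical stand-in for any object that responds to data_items, and the method name biggest_cluster_index is made up for the example. The `|| 0` fallback mirrors the original's return value of 0 for an empty list.

```ruby
# Hypothetical stand-in for an object exposing data_items (not an ai4r class).
Cluster = Struct.new(:data_items)

# Index of the cluster with the most data items, or 0 for an empty list.
def biggest_cluster_index(clusters)
  clusters.each_index.max_by { |i| clusters[i].data_items.length } || 0
end

clusters = [Cluster.new([1]), Cluster.new([1, 2, 3]), Cluster.new([1, 2])]
puts biggest_cluster_index(clusters) # prints 1
```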