T - type of the points to cluster
public class KMeansPlusPlusClusterer<T extends Clusterable<T>>
extends java.lang.Object
Modifier and Type | Class | Description |
---|---|---|
static class | KMeansPlusPlusClusterer.EmptyClusterStrategy | Strategies to use for replacing an empty cluster. |
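This page documents the default strategy as splitting the cluster with the largest distance variance. The following is a minimal sketch of what such a selection could look like, not the library's code. The class `LargestVarianceSketch`, its method names, the use of `double[]` points with parallel `clusters`/`centers` lists, and the choice of population variance are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Random;

// Hypothetical illustration; not part of KMeansPlusPlusClusterer.
final class LargestVarianceSketch {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Population variance of the point-to-center distances (two-pass). */
    static double distanceVariance(List<double[]> points, double[] center) {
        if (points.isEmpty()) {
            return 0.0;
        }
        double mean = 0.0;
        for (double[] p : points) {
            mean += distance(p, center);
        }
        mean /= points.size();
        double variance = 0.0;
        for (double[] p : points) {
            double diff = distance(p, center) - mean;
            variance += diff * diff;
        }
        return variance / points.size();
    }

    /**
     * Removes and returns a random point from the non-empty cluster whose
     * distances to its center have the largest variance.
     */
    static double[] takePointFromLargestVarianceCluster(List<List<double[]>> clusters,
                                                        List<double[]> centers,
                                                        Random random) {
        int best = -1;
        double bestVariance = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < clusters.size(); i++) {
            if (clusters.get(i).isEmpty()) {
                continue; // an empty cluster has no point to give away
            }
            double v = distanceVariance(clusters.get(i), centers.get(i));
            if (v > bestVariance) {
                bestVariance = v;
                best = i;
            }
        }
        if (best < 0) {
            throw new IllegalStateException("all clusters are empty");
        }
        List<double[]> chosen = clusters.get(best);
        // remove(int) returns the removed element; the list must be mutable.
        return chosen.remove(random.nextInt(chosen.size()));
    }
}
```

The returned point could then serve as the center of a replacement cluster. Whether the real class uses a bias-corrected variance or a different tie-breaking rule is not stated on this page.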
Modifier and Type | Field | Description |
---|---|---|
private KMeansPlusPlusClusterer.EmptyClusterStrategy | emptyStrategy | Selected strategy for empty clusters. |
private java.util.Random | random | Random generator for choosing initial centers. |
Constructor | Description |
---|---|
KMeansPlusPlusClusterer(java.util.Random random) | Build a clusterer. |
KMeansPlusPlusClusterer(java.util.Random random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy) | Build a clusterer. |
Modifier and Type | Method | Description |
---|---|---|
private static <T extends Clusterable<T>> void | assignPointsToClusters(java.util.Collection<Cluster<T>> clusters, java.util.Collection<T> points) | Adds the given points to the closest Cluster. |
private static <T extends Clusterable<T>> java.util.List<Cluster<T>> | chooseInitialCenters(java.util.Collection<T> points, int k, java.util.Random random) | Use K-means++ to choose the initial centers. |
java.util.List<Cluster<T>> | cluster(java.util.Collection<T> points, int k, int maxIterations) | Runs the K-means++ clustering algorithm. |
private T | getFarthestPoint(java.util.Collection<Cluster<T>> clusters) | Get the point farthest from its cluster center. |
private static <T extends Clusterable<T>> Cluster<T> | getNearestCluster(java.util.Collection<Cluster<T>> clusters, T point) | Returns the nearest Cluster to the given point. |
private T | getPointFromLargestNumberCluster(java.util.Collection<Cluster<T>> clusters) | Get a random point from the Cluster with the largest number of points. |
private T | getPointFromLargestVarianceCluster(java.util.Collection<Cluster<T>> clusters) | Get a random point from the Cluster with the largest distance variance. |
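The `chooseInitialCenters` entry says only that K-means++ is used to pick the initial centers. As a rough, self-contained sketch of that seeding idea (not the library's implementation), the first center is drawn uniformly and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The class name `KMeansPlusPlusSeedingSketch`, the helper `squaredDistance`, and the use of plain `double[]` points instead of `Clusterable` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration; not the code behind chooseInitialCenters.
final class KMeansPlusPlusSeedingSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Picks k distinct entries of points as centers using D^2-weighted sampling. */
    static List<double[]> chooseInitialCenters(List<double[]> points, int k, Random random) {
        if (k < 1 || k > points.size()) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        List<double[]> centers = new ArrayList<>();
        boolean[] taken = new boolean[points.size()];

        // First center: chosen uniformly at random.
        int first = random.nextInt(points.size());
        centers.add(points.get(first));
        taken[first] = true;

        // minDist2[i] holds the squared distance from point i to its nearest chosen center.
        double[] minDist2 = new double[points.size()];
        for (int i = 0; i < points.size(); i++) {
            minDist2[i] = squaredDistance(points.get(i), points.get(first));
        }

        while (centers.size() < k) {
            double total = 0.0;
            for (int i = 0; i < minDist2.length; i++) {
                if (!taken[i]) {
                    total += minDist2[i];
                }
            }
            // Draw a threshold in [0, total) and walk the cumulative sum of weights.
            double threshold = random.nextDouble() * total;
            double cumulative = 0.0;
            int next = -1;
            for (int i = 0; i < minDist2.length; i++) {
                if (taken[i]) {
                    continue;
                }
                next = i; // fallback: the last untaken point if rounding or zero weights prevent a hit
                cumulative += minDist2[i];
                if (cumulative > threshold) {
                    break;
                }
            }
            centers.add(points.get(next));
            taken[next] = true;

            // Update nearest-center distances with the newly added center.
            for (int i = 0; i < minDist2.length; i++) {
                double d2 = squaredDistance(points.get(i), points.get(next));
                if (d2 < minDist2[i]) {
                    minDist2[i] = d2;
                }
            }
        }
        return centers;
    }
}
```

Passing a seeded `java.util.Random` makes the selection reproducible, which mirrors how the documented constructors accept the generator from the caller.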
private final java.util.Random random
private final KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy
public KMeansPlusPlusClusterer(java.util.Random random)
Build a clusterer. The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.
random - random generator to use for choosing initial centers
public KMeansPlusPlusClusterer(java.util.Random random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
Build a clusterer.
random - random generator to use for choosing initial centers
emptyStrategy - strategy to use for handling empty clusters that may appear during algorithm iterations
public java.util.List<Cluster<T>> cluster(java.util.Collection<T> points, int k, int maxIterations)
Runs the K-means++ clustering algorithm.
points - the points to cluster
k - the number of clusters to split the data into
maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
private static <T extends Clusterable<T>> void assignPointsToClusters(java.util.Collection<Cluster<T>> clusters, java.util.Collection<T> points)
Adds the given points to the closest Cluster.
private static <T extends Clusterable<T>> java.util.List<Cluster<T>> chooseInitialCenters(java.util.Collection<T> points, int k, java.util.Random random)
Use K-means++ to choose the initial centers.
T - type of the points to cluster
points - the points to choose the initial centers from
k - the number of centers to choose
random - random generator to use
private T getPointFromLargestVarianceCluster(java.util.Collection<Cluster<T>> clusters)
Get a random point from the Cluster with the largest distance variance.
clusters - the Clusters to search
private T getPointFromLargestNumberCluster(java.util.Collection<Cluster<T>> clusters)
Get a random point from the Cluster with the largest number of points.
clusters - the Clusters to search
private T getFarthestPoint(java.util.Collection<Cluster<T>> clusters)
Get the point farthest from its cluster center.
clusters - the Clusters to search
private static <T extends Clusterable<T>> Cluster<T> getNearestCluster(java.util.Collection<Cluster<T>> clusters, T point)
Returns the nearest Cluster to the given point.
Copyright (c) 2003-2016 Apache Software Foundation