org.apache.commons.math3.stat.clustering
T
- type of the points to clusterpublic class DBSCANClusterer<T extends Clusterable<T>> extends java.lang.Object
The DBSCAN algorithm forms clusters based on the idea of density connectivity, i.e. a point p is density connected to another point q, if there exists a chain of points pi, with i = 1 .. n and p1 = p and pn = q, such that each pair <pi, pi+1> is directly density-reachable. A point q is directly density-reachable from point p if it is in the ε-neighborhood of this point.
Any point that is not density-reachable from a formed cluster is treated as noise, and will thus not be present in the result.
The algorithm requires two parameters:
Note: as DBSCAN is not a centroid-based clustering algorithm, the resulting
Cluster
objects will have no defined center, i.e. Cluster.getCenter()
will
return null
.
Modifier and Type | Class and Description |
---|---|
private static class |
DBSCANClusterer.PointStatus
Status of a point during the clustering process.
|
Modifier and Type | Field and Description |
---|---|
private double |
eps
Maximum radius of the neighborhood to be considered.
|
private int |
minPts
Minimum number of points needed for a cluster.
|
Constructor and Description |
---|
DBSCANClusterer(double eps,
int minPts)
Creates a new instance of a DBSCANClusterer.
|
Modifier and Type | Method and Description |
---|---|
java.util.List<Cluster<T>> |
cluster(java.util.Collection<T> points)
Performs DBSCAN cluster analysis.
|
private Cluster<T> |
expandCluster(Cluster<T> cluster,
T point,
java.util.List<T> neighbors,
java.util.Collection<T> points,
java.util.Map<Clusterable<T>,DBSCANClusterer.PointStatus> visited)
Expands the cluster to include density-reachable items.
|
double |
getEps()
Returns the maximum radius of the neighborhood to be considered.
|
int |
getMinPts()
Returns the minimum number of points needed for a cluster.
|
private java.util.List<T> |
getNeighbors(T point,
java.util.Collection<T> points)
Returns a list of density-reachable neighbors of a
point . |
private java.util.List<T> |
merge(java.util.List<T> one,
java.util.List<T> two)
Merges two lists together.
|
private final double eps
private final int minPts
public DBSCANClusterer(double eps, int minPts) throws NotPositiveException
eps
- maximum radius of the neighborhood to be consideredminPts
- minimum number of points needed for a clusterNotPositiveException
- if eps < 0.0
or minPts < 0
public double getEps()
public int getMinPts()
public java.util.List<Cluster<T>> cluster(java.util.Collection<T> points) throws NullArgumentException
Note: as DBSCAN is not a centroid-based clustering algorithm, the resulting
Cluster
objects will have no defined center, i.e. Cluster.getCenter()
will
return null
.
points
- the points to clusterNullArgumentException
- if the data points are nullprivate Cluster<T> expandCluster(Cluster<T> cluster, T point, java.util.List<T> neighbors, java.util.Collection<T> points, java.util.Map<Clusterable<T>,DBSCANClusterer.PointStatus> visited)
cluster
- Cluster to expandpoint
- Point to add to clusterneighbors
- List of neighborspoints
- the data setvisited
- the set of already visited pointsprivate java.util.List<T> getNeighbors(T point, java.util.Collection<T> points)
point
.point
- the point to look forpoints
- possible neighborsCopyright (c) 2003-2013 Apache Software Foundation