org.apache.cassandra.service
Class AntiEntropyService

java.lang.Object
  extended by org.apache.cassandra.service.AntiEntropyService

public class AntiEntropyService
extends java.lang.Object

AntiEntropyService encapsulates "validating" (hashing) individual column families, exchanging MerkleTrees with remote nodes via a TreeRequest/Response conversation, and then triggering repairs for disagreeing ranges.

Every Tree conversation has an 'initiator', where valid trees are sent after generation and where the local and remote tree will rendezvous in rendezvous(cf, endpoint, tree). Once the trees rendezvous, a Differencer is executed and the service can trigger repairs for disagreeing ranges. Tree comparison and repair triggering occur in the single threaded AE_SERVICE_STAGE.

The steps taken to enact a repair are as follows:

1. A major compaction is triggered either via nodeprobe, or automatically:
   * Nodeprobe sends TreeRequest messages to all neighbors of the target node: when a node receives a TreeRequest, it will perform a read-only compaction to immediately validate the column family.
   * Automatic compactions will also validate a column family and broadcast TreeResponses, but since TreeRequest messages are not sent to neighboring nodes, repairs will only occur if two nodes happen to perform automatic compactions within TREE_STORE_TIMEOUT of one another.
2. The compaction process validates the column family by:
   * Calling getValidator(), which can return a NoopValidator if validation should not be performed,
   * Calling IValidator.prepare(), which samples the column family to determine key distribution,
   * Calling IValidator.add() on every row in the column family, in order,
   * Calling IValidator.complete() to indicate that all rows have been added.
   * If getValidator() decided that the column family should be validated, calling complete() indicates that a valid MerkleTree has been created for the column family.
   * The valid tree is broadcast to neighboring nodes via TreeResponse, and stored locally.
3. When a node receives a TreeResponse, it passes the tree to rendezvous(), which checks for trees to rendezvous with / compare to:
   * If the tree is local, it is cached, and compared to any trees that were received from neighbors.
   * If the tree is remote, it is immediately compared to a local tree if one is cached. Otherwise, the remote tree is stored until a local tree can be generated.
   * A Differencer object is enqueued for each comparison.
4. Differencers are executed in AE_SERVICE_STAGE, to compare the two trees.
   * Based on the fraction of disagreement between the trees, the differencer will either perform repair via the io.Streaming api, or via RangeCommand read repairs.
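
The rendezvous rules in step 3 can be illustrated with a small, self-contained sketch. This is not the Cassandra implementation: RendezvousSketch, Tree and Comparison are hypothetical names, a tree is reduced to a single root hash, Comparison merely stands in for an enqueued Differencer, and expiry of stored trees after TREE_STORE_TIMEOUT is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the rendezvous step; not the Cassandra implementation. */
class RendezvousSketch {
    /** A tree reduced to its root hash (assumption: a real MerkleTree hashes many ranges). */
    static final class Tree {
        final String rootHash;
        Tree(String rootHash) { this.rootHash = rootHash; }
    }

    /** Stand-in for an enqueued Differencer: one local tree paired with one remote tree. */
    static final class Comparison {
        final Tree local;
        final Tree remote;
        Comparison(Tree local, Tree remote) { this.local = local; this.remote = remote; }
        boolean agrees() { return local.rootHash.equals(remote.rootHash); }
    }

    private final String localEndpoint;
    // Column family name -> cached local tree.
    private final Map<String, Tree> localTrees = new HashMap<>();
    // Column family name -> remote trees waiting for a local tree.
    private final Map<String, List<Tree>> waitingRemoteTrees = new HashMap<>();

    RendezvousSketch(String localEndpoint) { this.localEndpoint = localEndpoint; }

    /** Returns the comparisons that become possible once this tree arrives. */
    List<Comparison> rendezvous(String cf, String endpoint, Tree tree) {
        List<Comparison> comparisons = new ArrayList<>();
        if (endpoint.equals(localEndpoint)) {
            // Local tree: cache it and compare against every remote tree already received.
            localTrees.put(cf, tree);
            List<Tree> waiting = waitingRemoteTrees.remove(cf);
            if (waiting != null) {
                for (Tree remote : waiting) {
                    comparisons.add(new Comparison(tree, remote));
                }
            }
        } else {
            Tree local = localTrees.get(cf);
            if (local != null) {
                // Remote tree with a cached local tree: compare immediately.
                comparisons.add(new Comparison(local, tree));
            } else {
                // No local tree yet: store the remote tree until one is generated.
                waitingRemoteTrees.computeIfAbsent(cf, k -> new ArrayList<>()).add(tree);
            }
        }
        return comparisons;
    }
}
```

In this sketch a remote tree that arrives first produces no comparison; the comparison appears only when the local tree for the same column family is passed to rendezvous().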


Nested Class Summary
static class AntiEntropyService.Differencer
          Compares two trees, and launches repairs for disagreeing ranges.
static interface AntiEntropyService.IValidator
          A Strategy to handle building and validating a merkle tree for a column family.
static class AntiEntropyService.NoopValidator
          The IValidator to be used before a cluster has stabilized, or when repairs are disabled.
static class AntiEntropyService.TreeRequestVerbHandler
          Handler for requests from remote nodes to generate a valid tree.
static class AntiEntropyService.TreeResponseVerbHandler
          Handler for responses from remote nodes which contain a valid tree.
static class AntiEntropyService.Validator
          The IValidator to be used in normal operation.
 
Field Summary
static AntiEntropyService instance
           
static long NATURAL_REPAIR_FREQUENCY
           
static long TREE_STORE_TIMEOUT
           
 
Constructor Summary
protected AntiEntropyService()
          Protected constructor.
 
Method Summary
static java.util.Set<java.net.InetAddress> getNeighbors(java.lang.String table)
          Return all of the neighbors with whom we share data.
 AntiEntropyService.IValidator getValidator(java.lang.String table, java.lang.String cf, java.net.InetAddress initiator, boolean major)
          Return a Validator object which can be used to collect hashes for a column family.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

TREE_STORE_TIMEOUT

public static final long TREE_STORE_TIMEOUT
See Also:
Constant Field Values

NATURAL_REPAIR_FREQUENCY

public static final long NATURAL_REPAIR_FREQUENCY
See Also:
Constant Field Values

instance

public static final AntiEntropyService instance
Constructor Detail

AntiEntropyService

protected AntiEntropyService()
Protected constructor. Use AntiEntropyService.instance.

Method Detail

getNeighbors

public static java.util.Set<java.net.InetAddress> getNeighbors(java.lang.String table)
Return all of the neighbors with whom we share data.


getValidator

public AntiEntropyService.IValidator getValidator(java.lang.String table,
                                                  java.lang.String cf,
                                                  java.net.InetAddress initiator,
                                                  boolean major)
Return a Validator object which can be used to collect hashes for a column family. A Validator must be prepared with prepare() before use, and completed with complete() afterward.

Parameters:
table - The table name containing the column family.
cf - The column family name.
initiator - Endpoint that initially triggered this validation, or null if the validation is occurring due to a natural major compaction.
major - True if the validator will see all of the data contained in the column family.
Returns:
A Validator.
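
The prepare / add / complete lifecycle required of a validator can be sketched as follows. Since this page does not list the method signatures of AntiEntropyService.IValidator, the RowValidator interface below is a hypothetical, simplified stand-in: rows are plain strings, and a single SHA-256 digest replaces the MerkleTree. It only illustrates the call order, not the actual hashing scheme.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.SortedMap;

/** Hypothetical sketch of the validator lifecycle; not the Cassandra API. */
class ValidatorLifecycleSketch {
    /** Simplified stand-in for IValidator (assumed shape, not the real signatures). */
    interface RowValidator {
        void prepare();
        void add(String key, String row);
        void complete();
    }

    /** Does nothing, mirroring the role of NoopValidator when validation is skipped. */
    static final class NoopRowValidator implements RowValidator {
        public void prepare() { }
        public void add(String key, String row) { }
        public void complete() { }
    }

    /** Folds every row into one digest; a real validator would build a MerkleTree. */
    static final class HashingRowValidator implements RowValidator {
        private MessageDigest digest;
        private byte[] result;

        public void prepare() {
            try {
                digest = MessageDigest.getInstance("SHA-256");
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException(e);
            }
        }

        public void add(String key, String row) {
            if (digest == null) {
                throw new IllegalStateException("prepare() must be called before add()");
            }
            digest.update(key.getBytes(StandardCharsets.UTF_8));
            digest.update((byte) 0); // separator between key and row
            digest.update(row.getBytes(StandardCharsets.UTF_8));
        }

        public void complete() {
            if (digest == null) {
                throw new IllegalStateException("prepare() must be called before complete()");
            }
            result = digest.digest();
        }

        /** Returns the hash, or null if complete() has not been called yet. */
        byte[] hash() { return result; }
    }

    /** Drives a validator the way the compaction process is described to: in key order. */
    static void validate(RowValidator validator, SortedMap<String, String> rows) {
        validator.prepare();
        for (Map.Entry<String, String> e : rows.entrySet()) {
            validator.add(e.getKey(), e.getValue());
        }
        validator.complete();
    }
}
```

Two HashingRowValidator instances driven over identical rows produce identical hashes, which is the property that lets two nodes detect disagreement by comparing hashes rather than data.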


Copyright © 2010 The Apache Software Foundation