Package Halberd :: Package clues :: Module analysis |
|
Function Summary | |
---|---|
list | analyze(clues): Draw conclusions from the clues obtained during the scanning phase. |
dict | classify(seq, *classifiers): Classify a sequence according to one or several criteria. |
tuple | clusters(clues, step=3): Finds clusters of clues. |
list | deltas(xs): Computes the differences between the elements of a sequence of integers. |
list | diff_fields(clues): Study differences between fields. |
list | filter_proxies(clues, maxdelta=3): Detect and merge clues pointing to a proxy cache on the remote end. |
str | get_digest(clue): Returns the specified clue's digest. |
int | hits(clues): Compute the total number of hits in a sequence of clues. |
| | ignore_changing_fields(clues): Tries to detect and ignore MIME fields with ever-changing content. |
Clue | merge(clues): Merges a sequence of clues into one. |
| | reanalyze(clues, analyzed, threshold): Identify and ignore changing header fields. |
list | sections(classified, sects=None): Returns sections (and their items) from a nested dict. |
list of slice | slices(start, xs): Returns slices of a given sequence separated by the specified indices. |
| | sort_clues(clues): Sorts clues according to their time difference. |
list | uniq(clues): Return a list of unique clues. |
Function Details |
---|
analyze(clues)

Draw conclusions from the clues obtained during the scanning phase.
|
classify(seq, *classifiers)

Classify a sequence according to one or several criteria.

We store each item in a nested dictionary, using the classifiers as key generators (all of them must be callable objects).

In the following example we classify a list of clues according to their digest and their time difference.

    >>> a, b, c = Clue(), Clue(), Clue()
    >>> a.diff, b.diff, c.diff = 1, 2, 2
    >>> a.info['digest'] = 'x'
    >>> b.info['digest'] = c.info['digest'] = 'y'
    >>> get_diff = lambda x: x.diff
    >>> classified = classify([a, b, c], get_digest, get_diff)
    >>> digests = classified.keys()
    >>> digests.sort()  # We sort these so doctest won't fail.
    >>> for digest in digests:
    ...     print digest
    ...     for diff in classified[digest].keys():
    ...         print ' ', diff
    ...         for clue in classified[digest][diff]:
    ...             if clue is a: print ' a'
    ...             elif clue is b: print ' b'
    ...             elif clue is c: print ' c'
    ...
    x
      1
     a
    y
      2
     b
     c
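The nested-dictionary scheme described above can be sketched in a few lines. This is a minimal Python 3 illustration and not Halberd's actual code: the name classify_sketch and the sample data are hypothetical, and the sketch assumes that at least one classifier is given.

```python
def classify_sketch(seq, *classifiers):
    """Group the items of seq into nested dicts, one level per classifier.

    Hypothetical re-implementation for illustration only; it assumes that
    at least one classifier is passed.
    """
    classified = {}
    for item in seq:
        section = classified
        # Every classifier but the last selects (or creates) a nested dict.
        for classifier in classifiers[:-1]:
            section = section.setdefault(classifier(item), {})
        # The last classifier selects the list that holds the item itself.
        section.setdefault(classifiers[-1](item), []).append(item)
    return classified


words = ["apple", "avocado", "bean", "beet", "corn"]
# Classify first by initial letter, then by word length.
by_initial_and_length = classify_sketch(words, lambda w: w[0], len)
print(by_initial_and_length)
```

Each additional classifier adds one level of nesting, so the innermost values are always lists of the original items.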
|
clusters(clues, step=3)

Finds clusters of clues.

A cluster is a group of at most step clues that differ by only 1 second from each other.
|
deltas(xs)

Computes the differences between the elements of a sequence of integers.

    >>> deltas([-1, 0, 1])
    [1, 1]
    >>> deltas([1, 1, 2, 3, 5, 8, 13])
    [0, 1, 1, 2, 3, 5]
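One way to obtain the behaviour shown in the doctest is to pair each element with its successor. The following Python 3 sketch is an assumption about how such a function could be written, not the module's own source; deltas_sketch is a hypothetical name.

```python
def deltas_sketch(xs):
    """Return the differences between consecutive elements of xs.

    Hypothetical illustration; xs must support slicing (list or tuple).
    """
    # zip stops at the shorter input, so the last element has no pair.
    return [later - earlier for earlier, later in zip(xs, xs[1:])]


print(deltas_sketch([-1, 0, 1]))
print(deltas_sketch([1, 1, 2, 3, 5, 8, 13]))
```

A sequence of n elements yields n - 1 differences, and an empty or single-element sequence yields an empty list.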
|
diff_fields(clues)

Study differences between fields.
|
filter_proxies(clues, maxdelta=3)

Detect and merge clues pointing to a proxy cache on the remote end.
|
get_digest(clue)

Returns the specified clue's digest.

This function is usually passed as a parameter for classify so it can separate clues according to their digest (among other fields).
|
hits(clues)

Compute the total number of hits in a sequence of clues.
|
ignore_changing_fields(clues)

Tries to detect and ignore MIME fields with ever-changing content.

Some servers might include fields that vary with time, randomly, etc. Those fields are likely to alter the clue's digest and interfere with analyze, producing many false positives and making the scan useless. This function detects those fields and recalculates each clue's digest so they can be safely analyzed again.
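The idea of detecting volatile fields and recomputing digests without them can be illustrated as follows. This Python 3 sketch rests on several assumptions that do not come from the module: each response is modelled as a list of (name, value) pairs, a field counts as changing when its value never repeats across responses, and the digest is a SHA-1 hash. The names changing_fields and digest_ignoring are hypothetical.

```python
import hashlib


def changing_fields(responses):
    """Return the names of fields whose value differs in every response."""
    values_by_name = {}
    for headers in responses:
        for name, value in headers:
            values_by_name.setdefault(name, []).append(value)
    return {
        name
        for name, values in values_by_name.items()
        # Flag a field only if it was seen more than once and never repeated.
        if len(values) > 1 and len(set(values)) == len(values)
    }


def digest_ignoring(headers, ignored):
    """Hash the header fields, skipping the ignored field names."""
    sha = hashlib.sha1()
    for name, value in headers:
        if name not in ignored:
            sha.update(f"{name}: {value}\n".encode("utf-8"))
    return sha.hexdigest()


# Two hypothetical servers (A and B), each setting a fresh cookie per reply.
responses = [
    [("Server", "A"), ("Set-Cookie", "id=1")],
    [("Server", "A"), ("Set-Cookie", "id=2")],
    [("Server", "B"), ("Set-Cookie", "id=3")],
    [("Server", "B"), ("Set-Cookie", "id=4")],
]
ignored = changing_fields(responses)
digests = [digest_ignoring(headers, ignored) for headers in responses]
print(ignored)
print(len(set(digests)))
```

With very few responses this heuristic can mistake a genuine difference between servers for a changing field, so a real implementation would need a more careful criterion.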
|
merge(clues)

Merges a sequence of clues into one.

A new clue will store the total count of the clues. Note that each Clue has a starting count of 1.

    >>> a, b, c = Clue(), Clue(), Clue()
    >>> sum([x.getCount() for x in [a, b, c]])
    3
    >>> a.incCount(5), b.incCount(11), c.incCount(23)
    (None, None, None)
    >>> merged = merge((a, b, c))
    >>> merged.getCount()
    42
    >>> merged == a
    True
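The counting behaviour in the doctest can be reproduced with a small stand-in class. The Python 3 sketch below is hypothetical: CountedClue only mimics the count-related part of Clue, and folding the totals into the first clue is an assumption consistent with the doctest rather than a description of the real implementation.

```python
class CountedClue:
    """Hypothetical stand-in for Clue that only tracks a hit count."""

    def __init__(self):
        self._count = 1  # every clue starts with a count of 1

    def get_count(self):
        return self._count

    def inc_count(self, num=1):
        self._count += num


def merge_sketch(clues):
    """Fold the counts of all clues into the first one and return it."""
    merged = clues[0]
    for clue in clues[1:]:
        merged.inc_count(clue.get_count())
    return merged


a, b, c = CountedClue(), CountedClue(), CountedClue()
a.inc_count(5)
b.inc_count(11)
c.inc_count(23)
merged = merge_sketch((a, b, c))
print(merged.get_count())
```

The counts before merging are 6, 12 and 24, which add up to the 42 shown in the doctest.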
|
reanalyze(clues, analyzed, threshold)

Identify and ignore changing header fields.

After initial analysis one must check that there aren't as many real servers as obtained clues. If there were, it could be a sign that something is wrong: each clue is different from the others due to one or more MIME header fields which change unexpectedly.
|
sections(classified, sects=None)

Returns sections (and their items) from a nested dict.

See also: classify
|
slices(start, xs)

Returns slices of a given sequence separated by the specified indices.

If we wanted to get the slices necessary to split range(20) into sub-sequences of 5 items each, we'd do:

    >>> seq = range(20)
    >>> indices = [5, 10, 15]
    >>> for piece in slices(0, indices):
    ...     print seq[piece]
    [0, 1, 2, 3, 4]
    [5, 6, 7, 8, 9]
    [10, 11, 12, 13, 14]
    [15, 16, 17, 18, 19]
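The doctest suggests that each index closes one slice and opens the next, with a final open-ended slice covering the remainder. A minimal Python 3 sketch under that assumption follows; slices_sketch is a hypothetical name and this is not the module's own code.

```python
def slices_sketch(start, indices):
    """Return slice objects delimited by indices, beginning at start."""
    result = []
    for index in indices:
        result.append(slice(start, index))
        start = index  # the next slice begins where this one ended
    # The last slice is open-ended and covers the rest of the sequence.
    result.append(slice(start, None))
    return result


seq = list(range(20))
pieces = [seq[piece] for piece in slices_sketch(0, [5, 10, 15])]
for piece in pieces:
    print(piece)
```

Because the slices are plain slice objects, the same list can be reused to cut any sequence that supports slicing.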
|
sort_clues(clues)

Sorts clues according to their time difference.
|
uniq(clues)

Return a list of unique clues.

This is needed when merging clues coming from different sources. Clues with the same time diff and digest are not discarded; they are merged into one clue with the aggregated number of hits.
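Merging clues that share a time diff and a digest amounts to grouping by that pair and summing the hit counts. The Python 3 sketch below illustrates this under stated assumptions: SimpleClue is a hypothetical record with only the three relevant fields, and uniq_sketch is not the module's own implementation.

```python
from dataclasses import dataclass


@dataclass
class SimpleClue:
    """Hypothetical clue record: time diff, digest and hit count."""
    diff: int
    digest: str
    count: int = 1


def uniq_sketch(clues):
    """Merge clues with equal (diff, digest), adding up their counts."""
    unique = {}
    for clue in clues:
        key = (clue.diff, clue.digest)
        if key in unique:
            unique[key].count += clue.count
        else:
            # Copy the clue so the caller's objects are left untouched.
            unique[key] = SimpleClue(clue.diff, clue.digest, clue.count)
    # Dicts keep insertion order, so first occurrences set the output order.
    return list(unique.values())


clues = [
    SimpleClue(1, "x"),
    SimpleClue(1, "x", 4),
    SimpleClue(2, "x"),
    SimpleClue(1, "y"),
]
unique_clues = uniq_sketch(clues)
for clue in unique_clues:
    print(clue.diff, clue.digest, clue.count)
```

The total number of hits is preserved: the four input clues carry 7 hits, and so do the three merged clues.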
|
Generated by Epydoc 2.1 on Wed Jul 18 22:25:57 2007 | http://epydoc.sf.net |