randomlinks - generate a data file with random links between chromosomes
randomlinks -karyotype KARYOTYPE_FILE [-chr_rx REGEX] -size AVG[,SD] [-nointra] [-nointer]
Generate a link data file suitable for use in Circos containing random links between chromosomes. Chromosomes are sampled from the karyotype file KARYOTYPE_FILE and optionally further filtered using the regular expression REGEX.
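The chromosome selection step described above can be sketched in Python as follows. This is an illustration only, not the script's actual code: the function name filter_chromosomes is hypothetical, the karyotype layout (chr - ID LABEL START END COLOR) is the usual Circos convention, and matching REGEX against the ID field rather than the label is an assumption.

```python
import re

def filter_chromosomes(karyotype_lines, chr_rx=None):
    """Return chromosome IDs from karyotype lines, optionally filtered by a regex."""
    names = []
    for line in karyotype_lines:
        fields = line.split()
        # Assumed layout: chr - ID LABEL START END COLOR.
        # Band definitions and malformed lines are skipped.
        if len(fields) < 6 or fields[0] != "chr":
            continue
        name = fields[2]
        # Assumption: the regex is applied to the chromosome ID, unanchored.
        if chr_rx is None or re.search(chr_rx, name):
            names.append(name)
    return names

karyotype = [
    "chr - hs1 1 0 249250621 chr1",
    "chr - hs2 2 0 243199373 chr2",
    "chr - hs10 10 0 135534747 chr10",
    "band hs1 p36.33 p36.33 0 2300000 gneg",
]
selected = filter_chromosomes(karyotype, r"hs1")
```

Because the match is unanchored in this sketch, the pattern hs1 selects both hs1 and hs10; a pattern such as hs1$ would be needed to select hs1 alone.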
The number of links between any pair of chromosomes is determined by rules (described below). The size of each link is determined by the average and standard deviation values provided by -size.
Intrachromosomal links can be avoided using -nointra. Similarly, interchromosomal links can be avoided using -nointer. The -nointer option is much less useful.
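As a rough sketch of how -size, -nointra and -nointer might act, the following Python enumerates candidate chromosome pairs and draws a link size. The names candidate_pairs and sample_size are hypothetical. Drawing the size from a normal distribution and clamping it to at least 1 are assumptions; the text above only states that the size depends on the average and standard deviation.

```python
import random
from itertools import combinations_with_replacement

def candidate_pairs(chromosomes, nointra=False, nointer=False):
    """List unordered chromosome pairs, honouring the two exclusion flags."""
    pairs = []
    for a, b in combinations_with_replacement(chromosomes, 2):
        if a == b and nointra:
            continue  # skip links within a single chromosome
        if a != b and nointer:
            continue  # skip links between different chromosomes
        pairs.append((a, b))
    return pairs

def sample_size(avg, sd=0.0, rng=random):
    """Draw one link size; the normal distribution here is an assumption."""
    # Clamp to 1 so that a negative draw never yields an empty or inverted span.
    return max(1, round(rng.gauss(avg, sd)))

pairs = candidate_pairs(["hs1", "hs2", "hs3"], nointra=True)
size = sample_size(1_000_000, 250_000)
```

With both flags set, no pairs remain, so a real implementation would presumably treat that combination as an error.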
Given a filtered set of chromosomes (first sampled from the KARYOTYPE_FILE and then passed through the regular expression REGEX), the number of links joining any pair of chromosomes is determined by a set of rules.
Each rule contains two regular expressions, one for each of the chromosomes in the pair, and these determine which pairs of chromosomes the rule will apply to.
For example, if the regular expressions are ``.'' and ``.'' then all chromosome pairs are matched. However, if the regular expressions are ``.'' and ``chr10'' then only pairs in which at least one chromosome name matches chr10 are affected.
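The pair selection in the example above can be illustrated with a small Python function. The name rule_matches is hypothetical, and testing the two expressions against the pair in either order is an assumption drawn from the wording of the chr10 example.

```python
import re

def rule_matches(rx1, rx2, chr_a, chr_b):
    """True if the pair satisfies the rule's two regexes in either order (assumed)."""
    forward = re.search(rx1, chr_a) and re.search(rx2, chr_b)
    reverse = re.search(rx1, chr_b) and re.search(rx2, chr_a)
    return bool(forward or reverse)

# The pattern "." matches any non-empty name, so every pair is selected.
all_pairs = rule_matches(".", ".", "chr1", "chr2")

# Only pairs that include a name matching chr10 are selected.
with_chr10 = rule_matches(".", "chr10", "chr10", "chr3")
without_chr10 = rule_matches(".", "chr10", "chr1", "chr3")
```

Note that an unanchored pattern such as chr10 would also match a name like chr100; anchoring it as chr10$ avoids this.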
In addition to the regular expression selection filter, each rule contains either (a) avg/sd parameters used to generate a normally distributed random number which is used as the number of links between the selected chromosomes, or (b) a multiplier which is used to multiply the number of links as determined by a previous rule.
Optionally, rules may contain a sampling parameter which determines how frequently the rule is applied.
Rules are applied in increasing order of specificity. Thus, rules that affect the largest number of chromosome pairs are applied first, followed by rules that affect fewer pairs.
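The ordering and the two kinds of rule can be sketched as below. This is a hypothetical model rather than the script's implementation: the Rule class, its field names and the link_counts function are invented for illustration, "specificity" is approximated by the number of pairs a rule matches, and the sampling parameter is assumed to be a probability evaluated per pair. The real syntax is defined in etc/randomlinks.conf.

```python
import random
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    rx1: str
    rx2: str
    avg: Optional[float] = None         # (a) mean of the link count
    sd: float = 0.0                     # (a) standard deviation of the link count
    multiplier: Optional[float] = None  # (b) scales the count set by an earlier rule
    sampling: float = 1.0               # assumed: probability that the rule fires

def matches(rule, a, b):
    """Assumed: the two regexes may match the pair in either order."""
    fwd = re.search(rule.rx1, a) and re.search(rule.rx2, b)
    rev = re.search(rule.rx1, b) and re.search(rule.rx2, a)
    return bool(fwd or rev)

def link_counts(pairs, rules, rng):
    """Apply rules from broadest to most specific; return link counts per pair."""
    # Broadest rules (most matched pairs) first, so narrower rules override them.
    ordered = sorted(
        rules,
        key=lambda r: sum(matches(r, a, b) for a, b in pairs),
        reverse=True,
    )
    counts = {pair: 0.0 for pair in pairs}
    for rule in ordered:
        for pair in pairs:
            if not matches(rule, *pair):
                continue
            if rng.random() >= rule.sampling:
                continue  # rule skipped for this pair
            if rule.avg is not None:
                counts[pair] = rng.gauss(rule.avg, rule.sd)
            elif rule.multiplier is not None:
                counts[pair] *= rule.multiplier
    # Counts must be non-negative whole numbers.
    return {pair: max(0, round(n)) for pair, n in counts.items()}

pairs = [("chr1", "chr2"), ("chr1", "chr10"), ("chr2", "chr10")]
rules = [
    Rule(".", "chr10", multiplier=3),  # specific: triples links involving chr10
    Rule(".", ".", avg=5, sd=0),       # general: 5 links for every pair
]
counts = link_counts(pairs, rules, random.Random(0))
```

In this example the general rule is applied first even though it is listed second, so the chr10 multiplier acts on the count of 5 already assigned to each pair and yields 15 links for the two pairs that involve chr10.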
For more details about the syntax of rules, see etc/randomlinks.conf.
Added documentation and refined rule set syntax.
Started and versioned.
Martin Krzywinski
Martin Krzywinski Genome Sciences Centre Vancouver BC Canada www.bcgsc.ca martink@bcgsc.ca