Bio::PhyloXML::Parser is for parsing phyloXML format files.
The libxml2 XML parser is required. Install the libxml-ruby bindings from libxml.rubyforge.org, or run:
gem install -r libxml-ruby
require 'bio'

# Create new phyloxml parser
phyloxml = Bio::PhyloXML::Parser.open('example.xml')

# Print the names of all trees in the file
phyloxml.each do |tree|
  puts tree.name
end
phyloXML schema documentation: www.phyloxml.org/documentation/version_100/phyloxml.xsd.html
Once all trees have been parsed, any remaining content in another XML format is stored in this array of PhyloXML::Other objects.
Initializes LibXML::Reader and reads from the IO until it reaches the first phylogeny element.
Create a new Bio::PhyloXML::Parser object.
p = Bio::PhyloXML::Parser.for_io($stdin)
Arguments:
(required) io: IO object
(optional) validate: For IO reader, the “validate” option is ignored and no validation is executed.
Returns: Bio::PhyloXML::Parser object
# File lib/bio/db/phyloxml/phyloxml_parser.rb, line 218
def self.for_io(io, validate=true)
  obj = new(nil, validate)
  obj.instance_eval {
    @reader = XML::Reader.io(io,
                             { :options =>
                               LibXML::XML::Parser::Options::NONET })
    _skip_leader
  }
  obj
end
Initializes LibXML::Reader and reads the PhyloXML-formatted string until it reaches the first phylogeny element.
Create a new Bio::PhyloXML::Parser object.
str = File.read("./phyloxml_examples.xml")
p = Bio::PhyloXML::Parser.new(str)
Deprecated usage: Reads data from a file. str is a filename.
p = Bio::PhyloXML::Parser.new("./phyloxml_examples.xml")
Passing a filename is deprecated. Use Bio::PhyloXML::Parser.open(filename) instead.
Arguments:
(required) str: PhyloXML-formatted string
(optional) validate: Whether to validate the file against schema or not. Default value is true.
Returns: Bio::PhyloXML::Parser object
# File lib/bio/db/phyloxml/phyloxml_parser.rb, line 318
def initialize(str, validate=true)

  @other = []

  return unless str

  # For compatibility, if filename-like string is given,
  # treat it as a filename.
  if /[\<\>\r\n]/ !~ str and File.exist?(str) then
    # assume that str is filename
    warn "Bio::PhyloXML::Parser.new(filename) is deprecated. Use Bio::PhyloXML::Parser.open(filename)."
    filename = _secure_filename(str)
    _validate(:file, filename) if validate
    @reader = XML::Reader.file(filename)
    _skip_leader
    return
  end

  # initialize for string
  @reader = XML::Reader.string(str,
                               { :options =>
                                 LibXML::XML::Parser::Options::NONET })
  _skip_leader
end
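The compatibility branch in the listing above decides whether its argument is a filename or XML text with a simple check: the string must contain no '<', '>', CR or LF, and a file of that name must exist. The sketch below isolates that check in a hypothetical helper, looks_like_filename? (not part of BioRuby), so the rule can be tried without libxml-ruby installed.

```ruby
# Hypothetical helper (not part of BioRuby). It repeats the condition used by
# Bio::PhyloXML::Parser#initialize to detect the deprecated filename usage.
def looks_like_filename?(str)
  # true only if the string has no '<', '>', CR or LF and names an existing file
  /[<>\r\n]/ !~ str && File.exist?(str)
end

# XML markup contains '<' and '>', so it is never taken for a filename.
puts looks_like_filename?('<phyloxml></phyloxml>')

# A plain name is only treated as a filename when the file actually exists.
puts looks_like_filename?('surely_missing_phyloxml_file.xml')
```

One consequence of this rule is that a mistyped path is silently handed to the XML string reader, which is one more reason to prefer Bio::PhyloXML::Parser.open for files.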
Initializes LibXML::Reader and reads the file until it reaches the first phylogeny element.
Example: Create a new Bio::PhyloXML::Parser object.
p = Bio::PhyloXML::Parser.open("./phyloxml_examples.xml")
If the optional code block is given, the Bio::PhyloXML::Parser object is passed to the block as an argument. When the block terminates, the parser object is automatically closed, and the open method returns the value of the block.
Example: Get the first tree in the file.
tree = Bio::PhyloXML::Parser.open("example.xml") do |px|
  px.next_tree
end
Arguments:
(required) filename: Path to the file to parse.
(optional) validate: Whether to validate the file against schema or not. Default value is true.
Returns: (without block) Bio::PhyloXML::Parser object
Returns: (with block) the value of the block
# File lib/bio/db/phyloxml/phyloxml_parser.rb, line 102
def self.open(filename, validate=true)
  obj = new(nil, validate)
  obj.instance_eval {
    filename = _secure_filename(filename)
    _validate(:file, filename) if validate
    # XML::Parser::Options::NONET for security reason
    @reader = XML::Reader.file(filename,
                               { :options =>
                                 LibXML::XML::Parser::Options::NONET })
    _skip_leader
  }
  if block_given? then
    begin
      ret = yield obj
    ensure
      obj.close if obj and !obj.closed?
    end
    ret
  else
    obj
  end
end
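The block form of open follows the same pattern as File.open: the ensure clause closes the parser whether the block finishes normally or raises. The sketch below reproduces only that block handling on a hypothetical stand-in class, BlockFormDemo, which tracks an open/closed flag and does no parsing at all.

```ruby
# Hypothetical stand-in for Bio::PhyloXML::Parser; it only tracks whether it
# has been closed, so the block handling can be run without libxml-ruby.
class BlockFormDemo
  def initialize
    @closed = false
  end

  def close
    @closed = true
    nil
  end

  def closed?
    @closed
  end

  # Same block handling as Bio::PhyloXML::Parser.open in the listing above.
  def self.open
    obj = new
    return obj unless block_given?
    begin
      yield obj
    ensure
      # runs on normal exit and when the block raises
      obj.close unless obj.closed?
    end
  end
end

seen = nil
result = BlockFormDemo.open { |px| seen = px; :block_value }
p result        # => :block_value
p seen.closed?  # => true
```

Because the parser is closed as soon as the block returns, anything needed later (for example a tree) has to be returned from the block rather than read from the parser afterwards.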
Initializes LibXML::Reader and reads the data at the given URI until it reaches the first phylogeny element.
Create a new Bio::PhyloXML::Parser object.
p = Bio::PhyloXML::Parser.open_uri("http://www.phyloxml.org/examples/apaf.xml")
If the optional code block is given, the Bio::PhyloXML::Parser object is passed to the block as an argument. When the block terminates, the parser object is automatically closed, and the open_uri method returns the value of the block.
Arguments:
(required) uri: (URI or String) URI to the data to parse
(optional) validate: For URI reader, the “validate” option is ignored and no validation is executed.
Returns: (without block) Bio::PhyloXML::Parser object
Returns: (with block) the value of the block
# File lib/bio/db/phyloxml/phyloxml_parser.rb, line 143
def self.open_uri(uri, validate=true)
  case uri
  when URI
    uri = uri.to_s
  else
    # raises error if not a String
    uri = uri.to_str
    # raises error if invalid URI
    URI.parse(uri)
  end

  obj = new(nil, validate)
  obj.instance_eval {
    @reader = XML::Reader.file(uri)
    _skip_leader
  }
  if block_given? then
    begin
      ret = yield obj
    ensure
      obj.close if obj and !obj.closed?
    end
  else
    obj
  end
end
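The first part of open_uri normalizes its argument: a URI object is converted to a string, and anything else must be String-like and must parse as a URI. The sketch below repeats that step in a hypothetical helper, normalize_uri_argument (not part of BioRuby), using only Ruby's standard uri library; no network access takes place.

```ruby
require 'uri'

# Hypothetical helper (not part of BioRuby) repeating the argument handling
# at the top of Bio::PhyloXML::Parser.open_uri.
def normalize_uri_argument(uri)
  case uri
  when URI
    uri.to_s
  else
    str = uri.to_str   # NoMethodError unless the object is String-like
    URI.parse(str)     # URI::InvalidURIError if the string is not a valid URI
    str
  end
end

puts normalize_uri_argument(URI.parse('http://www.phyloxml.org/examples/apaf.xml'))
puts normalize_uri_argument('http://www.phyloxml.org/examples/apaf.xml')
```

Both calls print the same URL string. An Integer argument would raise NoMethodError, and a string containing a space would raise URI::InvalidURIError, before any reader is created.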
Access the specified tree in the file. It parses trees until the specified tree is reached.
# Get 3rd tree in the file (starts counting from 0).
parser = Bio::PhyloXML::Parser.open('phyloxml_examples.xml')
tree = parser[2]
# File lib/bio/db/phyloxml/phyloxml_parser.rb, line 364
def [](i)
  tree = nil
  (i+1).times do
    tree = self.next_tree
  end
  return tree
end
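Because [] is implemented as i+1 calls to next_tree, the index is relative to the parser's current position rather than to the start of the file. The sketch below copies the [] method onto a hypothetical stand-in class, SequentialTrees, whose next_tree hands out strings in order; the real parser reads trees from a LibXML reader instead.

```ruby
# Hypothetical stand-in: next_tree hands out pre-made "trees" (plain strings)
# one at a time, the way the real parser advances through the file.
class SequentialTrees
  def initialize(trees)
    @queue = trees.dup
  end

  def next_tree
    @queue.shift   # nil once the trees are exhausted
  end

  # Copied from Bio::PhyloXML::Parser#[] in the listing above.
  def [](i)
    tree = nil
    (i+1).times do
      tree = self.next_tree
    end
    return tree
  end
end

parser = SequentialTrees.new(%w[t0 t1 t2 t3 t4])
p parser[2]  # => "t2", the third tree from the start
p parser[0]  # => "t3", the tree after it, not the first tree in the file
```

To index from the start of the file again, open a fresh parser.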
Closes the LibXML::Reader inside the object. It also closes the underlying file if the parser was created with the Bio::PhyloXML::Parser.open method.
Closing an already closed object, or using a closed object, raises LibXML::XML::Error.
Returns: nil
# File lib/bio/db/phyloxml/phyloxml_parser.rb, line 188
def close
  @reader.close
  @reader = ClosedPhyloXMLParser.new
  nil
end
If the object is closed by using the close method or equivalent, returns true. Otherwise, returns false.
Returns: true or false
# File lib/bio/db/phyloxml/phyloxml_parser.rb, line 198
def closed?
  if @reader.kind_of?(ClosedPhyloXMLParser) then
    true
  else
    false
  end
end
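The close and closed? listings above show the reader being replaced by a ClosedPhyloXMLParser placeholder, whose class then serves as the "closed" flag. The body of ClosedPhyloXMLParser is not shown on this page, so the sketch below assumes a placeholder that raises on every call, which would match the documented behaviour of closing twice. All class names here are hypothetical, and DemoClosedError stands in for LibXML::XML::Error.

```ruby
# Stand-in for LibXML::XML::Error so the sketch runs without libxml-ruby.
class DemoClosedError < StandardError; end

# Assumed shape of the placeholder: any method call on it raises.
class DemoClosedReader
  def method_missing(name, *args)
    raise DemoClosedError, "closed parser (called #{name})"
  end

  def respond_to_missing?(name, include_private = false)
    false
  end
end

# Minimal fake reader that can be closed.
class DemoReader
  def close; end
end

class DemoParser
  def initialize
    @reader = DemoReader.new
  end

  # Same structure as Bio::PhyloXML::Parser#close in the listing above.
  def close
    @reader.close
    @reader = DemoClosedReader.new
    nil
  end

  # Equivalent to Bio::PhyloXML::Parser#closed? in the listing above.
  def closed?
    @reader.kind_of?(DemoClosedReader)
  end
end

px = DemoParser.new
px.close
p px.closed?   # => true
begin
  px.close     # second close reaches the placeholder and raises
rescue DemoClosedError => e
  puts e.message
end
```

Code that may close a parser more than once can guard the call with closed?, as the open method itself does.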
Iterate through all trees in the file.
phyloxml = Bio::PhyloXML::Parser.open('example.xml')
phyloxml.each do |tree|
  puts tree.name
end
# File lib/bio/db/phyloxml/phyloxml_parser.rb, line 351
def each
  while tree = next_tree
    yield tree
  end
end
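The each method shown above yields trees until next_tree returns nil. This page does not show whether Enumerable is mixed into the parser, but Object#to_enum works with any each method, so collection-style calls such as map are available either way. The sketch below demonstrates this on a hypothetical stand-in class, EachDemoParser, with DemoTree as a placeholder for Bio::PhyloXML::Tree.

```ruby
# Placeholder for Bio::PhyloXML::Tree; only the name attribute is modelled.
DemoTree = Struct.new(:name)

# Hypothetical stand-in that serves pre-built trees instead of parsing XML.
class EachDemoParser
  def initialize(names)
    @queue = names.map { |n| DemoTree.new(n) }
  end

  def next_tree
    @queue.shift   # nil when no trees are left
  end

  # Copied from Bio::PhyloXML::Parser#each in the listing above.
  def each
    while tree = next_tree
      yield tree
    end
  end
end

parser = EachDemoParser.new(%w[alpha beta gamma])
names = parser.to_enum(:each).map { |tree| tree.name }
p names  # => ["alpha", "beta", "gamma"]
```

As with next_tree, iteration consumes the parser: a second pass over the same object yields nothing.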
Parse and return the next phylogeny tree. If there are no more phylogeny elements, nil is returned. Anything other than phylogeny elements is saved in PhyloXML::Parser#other.
p = Bio::PhyloXML::Parser.open("./phyloxml_examples.xml")
tree = p.next_tree
Returns: Bio::PhyloXML::Tree object, or nil
# File lib/bio/db/phyloxml/phyloxml_parser.rb, line 381
def next_tree()

  if not is_element?('phylogeny')
    if @reader.node_type == XML::Reader::TYPE_END_ELEMENT
      if is_end_element?('phyloxml')
        return nil
      else
        @reader.read
        @reader.read
        if is_end_element?('phyloxml')
          return nil
        end
      end
    end
    # phyloxml can hold only phylogeny and "other" elements. If this is not
    # phylogeny element then it is other. Also, "other" always comes after
    # all phylogenies
    @other << parse_other
    #return nil for tree, since this is not valid phyloxml tree.
    return nil
  end

  tree = Bio::PhyloXML::Tree.new

  # keep track of current node in clades array/stack. Current node is the
  # last element in the clades array
  clades = []
  clades.push tree

  #keep track of current edge to be able to parse branch_length tag
  current_edge = nil

  # we are going to parse clade iteratively by pointing (and changing) to
  # the current node in the tree. Since the property element is both in
  # clade and in the phylogeny, we need some boolean to know if we are
  # parsing the clade (there can be only max 1 clade in phylogeny) or
  # parsing phylogeny
  parsing_clade = false

  while not is_end_element?('phylogeny') do
    break if is_end_element?('phyloxml')

    # parse phylogeny elements, except clade
    if not parsing_clade

      if is_element?('phylogeny')
        @reader["rooted"] == "true" ? tree.rooted = true : tree.rooted = false
        @reader["rerootable"] == "true" ? tree.rerootable = true : tree.rerootable = false
        parse_attributes(tree, ["branch_length_unit", 'type'])
      end

      parse_simple_elements(tree, [ "name", 'description', "date"])

      if is_element?('confidence')
        tree.confidences << parse_confidence
      end

    end

    if @reader.node_type == XML::Reader::TYPE_ELEMENT
      case @reader.name
      when 'clade'
        #parse clade element
        parsing_clade = true

        node= Bio::PhyloXML::Node.new

        branch_length = @reader['branch_length']

        parse_attributes(node, ["id_source"])

        #add new node to the tree
        tree.add_node(node)
        # The first clade will always be root since by xsd schema phyloxml can
        # have 0 to 1 clades in it.
        if tree.root == nil
          tree.root = node
        else
          current_edge = tree.add_edge(clades[-1], node,
                                       Bio::Tree::Edge.new(branch_length))
        end
        clades.push node
        #end if clade element
      else
        parse_clade_elements(clades[-1], current_edge) if parsing_clade
      end
    end

    #end clade element, go one parent up
    if is_end_element?('clade')

      #if we have reached the closing tag of the top-most clade, then our
      # curent node should point to the root, If thats the case, we are done
      # parsing the clade element
      if clades[-1] == tree.root
        parsing_clade = false
      else
        # set current node (clades[-1) to the previous clade in the array
        clades.pop
      end
    end

    #parsing phylogeny elements
    if not parsing_clade

      if @reader.node_type == XML::Reader::TYPE_ELEMENT
        case @reader.name
        when 'property'
          tree.properties << parse_property

        when 'clade_relation'
          clade_relation = CladeRelation.new
          parse_attributes(clade_relation, ["id_ref_0", "id_ref_1", "distance", "type"])

          #@ add unit test for this
          if not @reader.empty_element?
            @reader.read
            if is_element?('confidence')
              clade_relation.confidence = parse_confidence
            end
          end
          tree.clade_relations << clade_relation

        when 'sequence_relation'
          sequence_relation = SequenceRelation.new
          parse_attributes(sequence_relation, ["id_ref_0", "id_ref_1", "distance", "type"])
          if not @reader.empty_element?
            @reader.read
            if is_element?('confidence')
              sequence_relation.confidence = parse_confidence
            end
          end
          tree.sequence_relations << sequence_relation

        when 'phylogeny'
          #do nothing
        else
          tree.other << parse_other
          #puts "Not recognized element. #{@reader.name}"
        end
      end
    end
    # go to next element
    @reader.read
  end #end while not </phylogeny>
  #move on to the next tag after /phylogeny which is text, since phylogeny
  #end tag is empty element, which value is nil, therefore need to move to
  #the next meaningful element (therefore @reader.read twice)
  @reader.read
  @reader.read

  return tree
end
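The core of next_tree is an iterative walk over nested clade elements that keeps the currently open clades on a stack: an opening clade tag adds an edge from the top of the stack to the new node and pushes it, and a closing tag pops it. The sketch below is a simplified illustration of that idea only, not the library's code. It assumes a hypothetical pre-tokenized event list ([:open, name] and [:close]) in place of a LibXML reader, and plain strings in place of Bio::PhyloXML::Node objects.

```ruby
# Simplified illustration of stack-based clade parsing (not BioRuby code).
# events: array of [:open, name] for <clade> and [:close] for </clade>.
def build_edges(events)
  stack = []   # clades currently open; stack[-1] is the current node
  edges = []   # [parent, child] pairs in document order
  root  = nil

  events.each do |type, name|
    case type
    when :open
      if root.nil?
        root = name                  # the first clade becomes the root
      else
        edges << [stack[-1], name]   # attach to the innermost open clade
      end
      stack.push(name)
    when :close
      stack.pop                      # go one parent up
    end
  end

  { root: root, edges: edges }
end

# Events for a tree shaped like ((A,B),C)
events = [
  [:open, 'root'],
  [:open, 'ab'], [:open, 'A'], [:close], [:open, 'B'], [:close], [:close],
  [:open, 'C'], [:close],
  [:close]
]
result = build_edges(events)
p result[:root]   # => "root"
p result[:edges]  # => [["root", "ab"], ["ab", "A"], ["ab", "B"], ["root", "C"]]
```

The real method differs in detail: it seeds the stack with the tree object, never pops the root clade, and parses other child elements of each clade (such as branch length) along the way.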