Class HTML::SGMLParser
In: lib/html/htmlparser.rb
Parent: Object

A parser for SGML that uses the derived class as a static DTD.

Methods

Constants

Regular expressions used for parsing:

Interesting = /[&<]/
Incomplete = Regexp.compile('&([a-zA-Z][a-zA-Z0-9]*|#[0-9]*)?|' + '<([a-zA-Z][^<>]*|/([a-zA-Z][^<>]*)?|' + '![^<>]*)?')
Entityref = /&([a-zA-Z][-.a-zA-Z0-9]*)[^-.a-zA-Z0-9]/
Charref = /&#([0-9]+)[^0-9]/
Starttagopen = /<[>a-zA-Z]/
Endtagopen = /<\/[<>a-zA-Z]/
Endbracket = /<|>|\/>/   Assaf: fixed to allow tag to close itself (XHTML)
Special = /<![^<>]*>/
Commentopen = /<!--/
Commentclose = /--[ \t\n]*>/
Tagfind = /[a-zA-Z][a-zA-Z0-9.-]*/
Attrfind = Regexp.compile('[\s,]*([a-zA-Z_][a-zA-Z_0-9.-]*)' + '(\s*=\s*' + "('[^']*'" + '|"[^"]*"' + '|[-~a-zA-Z0-9,.:+*%?!()_#=]*))?')   Assaf: / is no longer part of allowed attribute value
Entitydefs = {'lt'=>'<', 'gt'=>'>', 'amp'=>'&', 'quot'=>'"', 'apos'=>'\''}
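
As a rough illustration of what some of these patterns match, the sketch below copies a few of them into standalone constants and applies them to sample strings. The upper-cased constant names and the helpers `first_entity` and `scan_attributes` are hypothetical and exist only for this example; they are not part of the class, and the parser's own scanning logic may differ.

```ruby
# Local copies of constants from the listing above (upper-cased so the
# sketch runs on its own, without loading lib/html/htmlparser.rb).
ENTITYREF  = /&([a-zA-Z][-.a-zA-Z0-9]*)[^-.a-zA-Z0-9]/
CHARREF    = /&#([0-9]+)[^0-9]/
TAGFIND    = /[a-zA-Z][a-zA-Z0-9.-]*/
ATTRFIND   = Regexp.compile('[\s,]*([a-zA-Z_][a-zA-Z_0-9.-]*)' +
                            '(\s*=\s*' +
                            "('[^']*'" +
                            '|"[^"]*"' +
                            '|[-~a-zA-Z0-9,.:+*%?!()_#=]*))?')
ENTITYDEFS = {'lt'=>'<', 'gt'=>'>', 'amp'=>'&', 'quot'=>'"', 'apos'=>'\''}

# Hypothetical helper: look up the first named entity reference in +text+.
# ENTITYREF needs one terminating character after the name, so a reference
# cut off at the end of the input does not match.
def first_entity(text)
  m = ENTITYREF.match(text)
  m && ENTITYDEFS[m[1]]
end

# Hypothetical helper: split raw attribute text into [name, value] pairs.
# Group 1 is the attribute name, group 3 the value (nil when absent).
def scan_attributes(raw)
  raw.scan(ATTRFIND).map do |name, _assignment, value|
    # Strip one pair of matching surrounding quotes, if present.
    value = value[1..-2] if value && value =~ /\A(['"]).*\1\z/m
    [name, value]
  end
end

p first_entity('a &amp; b')              # => "&"
p first_entity('a &amp')                 # => nil (no terminator yet)
p CHARREF.match('&#65;')[1].to_i.chr     # => "A"
p TAGFIND.match('<img src=x>', 1)[0]     # => "img"
p scan_attributes('href="a.html" class=big checked')
# => [["href", "a.html"], ["class", "big"], ["checked", nil]]
```

The unterminated case returning nil suggests why a separate Incomplete pattern is listed: it can recognise a reference or tag that has started but not yet finished in the text seen so far.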

Public Class methods

Public Instance methods
