
Class hplb.xml.Tokenizer

java.lang.Object
   |
   +----hplb.xml.Tokenizer

public class Tokenizer
extends Object
implements Parser
A hand-written lexical analyzer for XML/HTML markup. The parser is simple, fast, and quite robust. Element and attribute names are mapped to lower case. Comments are returned as (part of) PCDATA tokens. Markup elements within comments are not recognized as markup.
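The behaviour described above can be illustrated with a small stand-alone sketch. This is not the hplb.xml implementation: the class name ToyLexer, its tokenize(String) method, and the "TAG:"/"PCDATA:" token strings are hypothetical names chosen for this example only, and attributes are ignored. It shows tag names being lower-cased and a comment, including any markup inside it, being folded into the surrounding PCDATA token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy lexer for illustration only; hypothetical, not the hplb.xml source. */
final class ToyLexer {

    /** Splits the input into "TAG:name" and "PCDATA:text" strings. */
    static List<String> tokenize(String in) {
        List<String> out = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int i = 0;
        while (i < in.length()) {
            if (in.startsWith("<!--", i)) {
                // A comment is copied verbatim into the pending PCDATA,
                // so markup inside it is never examined.
                int end = in.indexOf("-->", i + 4);
                int stop = (end < 0) ? in.length() : end + 3;
                text.append(in, i, stop);
                i = stop;
            } else if (in.charAt(i) == '<') {
                int end = in.indexOf('>', i);
                if (end < 0) {
                    // Unterminated tag: treat the rest as character data.
                    text.append(in, i, in.length());
                    break;
                }
                flush(text, out);
                // Keep only the name and map it to lower case.
                String body = in.substring(i + 1, end).trim();
                String name = body.split("\\s+")[0];
                out.add("TAG:" + name.toLowerCase(Locale.ROOT));
                i = end + 1;
            } else {
                text.append(in.charAt(i++));
            }
        }
        flush(text, out);
        return out;
    }

    /** Emits the accumulated character data, if any, as one PCDATA token. */
    private static void flush(StringBuilder text, List<String> out) {
        if (text.length() > 0) {
            out.add("PCDATA:" + text);
            text.setLength(0);
        }
    }

    public static void main(String[] args) {
        for (String token : tokenize("<P>Hi<!-- <B>x</B> -->there</P>")) {
            System.out.println(token);
        }
    }
}
```

For the sample input in main, the sketch yields three tokens: the tag p, one PCDATA token containing the text and the whole comment, and the tag /p. The B element inside the comment is never reported as a tag.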

Author:
Anders Kristensen

Variable Index

 o _column
 o _line
 o atomize
 o BOOLATTR
The value of boolean attributes is this string.
 o column
 o entMngr
 o line
 o noCaseElms
 o qchar
 o rcgnzCDATA
 o rcgnzComments
 o rcgnzEntities
 o rcgnzWS
 o state

Constructor Index

 o Tokenizer()

Method Index

 o fatal(String)
 o gotAttr(boolean)
 o gotComment()
 o gotPCDATA(boolean)
 o gotTag(boolean)
 o ignoreCase(String)
 o isCtlOrTspecial(int)
Returns true if c is either an ASCII control character or a tspecial according to the HTTP specification.
 o isWS(int)
 o keysToLowerCase(SAXAttributeMap)
 o parse(InputStream)
 o parse(String, String)
 o parseCDATA()
 o parsePI()
 o pos()
 o rcgnzWS(boolean)
 o read()
 o read_ex()
 o setDocumentHandler(DocumentHandler)
 o setEntityHandler(EntityHandler)
 o setErrorHandler(ErrorHandler)
 o tokenize()
 o toStart()
 o warning(String)

Variables

 o BOOLATTR
 public static final String BOOLATTR
The value of boolean attributes is this string.

 o noCaseElms
 protected Hashtable noCaseElms
 o rcgnzWS
 public boolean rcgnzWS
 o rcgnzEntities
 public boolean rcgnzEntities
 o rcgnzCDATA
 public boolean rcgnzCDATA
 o rcgnzComments
 public boolean rcgnzComments
 o atomize
 public boolean atomize
 o entMngr
 public final EntityManager entMngr
 o state
 protected int state
 o _line
 protected int _line
 o _column
 protected int _column
 o line
 public int line
 o column
 public int column
 o qchar
 protected int qchar

Constructors

 o Tokenizer
 public Tokenizer()

Methods

 o setEntityHandler
 public void setEntityHandler(EntityHandler handler)
 o setDocumentHandler
 public void setDocumentHandler(DocumentHandler handler)
 o setErrorHandler
 public void setErrorHandler(ErrorHandler handler)
 o parse
 public void parse(String publicID,
                   String sysID) throws Exception
 o parse
 public void parse(InputStream in) throws Exception
 o pos
 protected void pos()
 o ignoreCase
 public void ignoreCase(String elementName)
 o rcgnzWS
 public void rcgnzWS(boolean b)
 o toStart
 protected void toStart()
 o tokenize
 public void tokenize() throws Exception
 o read
 public final int read() throws IOException
 o read_ex
 public final int read_ex() throws IOException, EmptyInputStream
 o gotAttr
 protected final void gotAttr(boolean isBoolean) throws Exception
 o gotTag
 protected void gotTag(boolean isEmpty) throws Exception
 o keysToLowerCase
 public final void keysToLowerCase(SAXAttributeMap attrs)
 o gotPCDATA
 protected void gotPCDATA(boolean toomuch) throws Exception
 o gotComment
 protected void gotComment() throws IOException, EmptyInputStream
 o parsePI
 protected void parsePI() throws Exception
 o parseCDATA
 protected void parseCDATA() throws Exception
 o isWS
 public boolean isWS(int c)
 o isCtlOrTspecial
 public static final boolean isCtlOrTspecial(int c)
Returns true if c is either an ASCII control character or a tspecial according to the HTTP specification.
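As a rough illustration of what such a check involves, here is a stand-alone sketch. It is not taken from the hplb.xml source: the class name HttpChars and the TSPECIALS constant are hypothetical, and it assumes the definitions in the HTTP/1.1 specification (RFC 2068, section 2.2), where CTL is octets 0 to 31 plus DEL (127) and tspecials are ( ) < > @ , ; : \ " / [ ] ? = { } plus space and horizontal tab.

```java
/** Stand-alone sketch; hypothetical, not the hplb.xml source. */
final class HttpChars {

    // tspecials per RFC 2068 section 2.2, including SP and HT (assumed here).
    private static final String TSPECIALS = "()<>@,;:\\\"/[]?={} \t";

    /** True if c is an ASCII control character or an HTTP tspecial. */
    static boolean isCtlOrTspecial(int c) {
        // CTL: US-ASCII control characters 0..31 and DEL (127).
        if ((c >= 0 && c <= 31) || c == 127) {
            return true;
        }
        // Negative values (such as an end-of-stream marker) never match.
        return c >= 0 && TSPECIALS.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isCtlOrTspecial('('));
        System.out.println(isCtlOrTspecial('a'));
    }
}
```

Taking an int rather than a char mirrors the documented signature and lets the result of a read() call be passed in directly without a cast.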

 o warning
 protected final void warning(String s) throws Exception
 o fatal
 protected final void fatal(String s) throws Exception
