IBM's XML Parser for Java - FAQ


General Questions

What is XML?

Extensible Markup Language (XML) is a data format for structured document interchange on the Web. It was designed by the World Wide Web Consortium (W3C). Its specification is at http://www.w3.org/TR/1998/REC-xml-19980210.

What is `XML Parser for Java'?

`XML Parser for Java' is an XML processor written in Java: a library for parsing and generating XML documents.

What version of Java can XML Parser for Java run on?

It runs on Java 1.1 and Java 1.2 Beta, but not on Java 1.0. The February 1998 version of XML for Java could not run on Java 1.2 Beta.


Programming Questions

The parser reports the error `sun.io.MalformedInputException' for my XML document. What does it mean?

This error means that your XML document contains a byte sequence that is invalid in its character encoding. When an XML document has no <?xml ... encoding="..."?> declaration, the parser processes the document as if it were written in UTF-8. UTF-8 is compatible with US-ASCII for the characters #x00-#x7f, but not for any other characters. If the document is written in another encoding, declare that encoding, for example <?xml version="1.0" encoding="ISO-8859-1"?>.

The parser cannot report the position of the invalid character because sun.io.MalformedInputException has no interface for getting the position.
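Because the parser cannot report where the invalid character is, it can help to check the file yourself before parsing it. The following is a minimal sketch, not part of XML Parser for Java: the class name `Utf8Check` and its method are made up for illustration, and it assumes a newer JDK that provides java.nio.charset (this package does not exist in Java 1.1). It decodes the bytes strictly as UTF-8 and returns the byte offset of the first invalid sequence.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of XML Parser for Java.
class Utf8Check {

    /**
     * Returns the byte offset of the first sequence that is not valid UTF-8,
     * or -1 if the whole input decodes cleanly.
     */
    static int firstMalformedOffset(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        // UTF-8 never decodes to more chars than it has bytes.
        CharBuffer out = CharBuffer.allocate(data.length);
        CoderResult result = decoder.decode(in, out, true);
        // On an error the input buffer is positioned at the offending bytes.
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        int offset = firstMalformedOffset(data);
        if (offset < 0) {
            System.out.println("The file is valid UTF-8.");
        } else {
            System.out.println("Invalid UTF-8 sequence at byte offset " + offset);
        }
    }
}
```

If the check reports an offset, the file was most likely saved in an encoding other than UTF-8, and the document needs an encoding declaration that matches.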

The object tree contains unintended text elements (TXText class).
      <ROOT>
        <FOO>....</FOO>
        <BAR>....</BAR>
      </ROOT>
      

In the above example, the ROOT element appears to have 2 child elements: first FOO, then BAR. This is incorrect. In fact, ROOT has 5 children, because the parser keeps the whitespace between tags as text. The first is a text element (TXText class) containing "\n ". The second is the FOO element (TXElement class). The third is a text element (TXText) containing "\n ". The fourth is the BAR element (TXElement). The fifth is a text element (TXText) containing "\n".
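The same effect can be seen with any DOM-style parser. The sketch below does not use the TXText and TXElement classes; as a stand-in it assumes the standard javax.xml.parsers and org.w3c.dom APIs that ship with newer JDKs. The class `WhitespaceChildren` and its methods are made up for illustration. It counts all children of ROOT and then collects only the element children, skipping the whitespace text nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper using the standard DOM API, not the XML4J classes.
class WhitespaceChildren {

    /** Parses an XML string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    /** Returns only the element children, ignoring text nodes. */
    static List<Element> elementChildren(Element parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<ROOT>\n  <FOO>....</FOO>\n  <BAR>....</BAR>\n</ROOT>";
        Element root = parseRoot(xml);
        System.out.println("All children:     " + root.getChildNodes().getLength());
        System.out.println("Element children: " + elementChildren(root).size());
    }
}
```

Filtering by node type when walking the tree is usually simpler than trying to remove the whitespace from the document itself.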


I can't import the sources of XML Parser for Java into VisualAge for Java.

VisualAge for Java 1.0 doesn't support compiling inner classes. See VisualAge for Java FAQ Q35.


I cannot parse files authored using the LPEX editor (part of IBM's VisualAge tools).

LPEX uses the character 0x1A as the end-of-file (EOF) marker. According to the XML specification, this character is illegal in a document. The parser correctly reports the illegal character as the error "Invalid document structure". To parse such files, open the file in any other editor and save it again to remove this special EOF character.
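If many files are affected, the marker can also be removed programmatically. The following is a minimal sketch, not a tool shipped with the parser: the class `EofMarkerStripper` is made up for illustration, and it assumes a newer JDK that provides java.nio.file. It removes only 0x1A bytes at the very end of the data and leaves everything else untouched.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of XML Parser for Java.
class EofMarkerStripper {

    /** The EOF marker (Ctrl-Z) that LPEX appends to files. */
    static final byte CTRL_Z = 0x1A;

    /** Returns a copy of the data without any trailing 0x1A bytes. */
    static byte[] stripTrailingEofMarker(byte[] data) {
        int end = data.length;
        while (end > 0 && data[end - 1] == CTRL_Z) {
            end--;
        }
        return Arrays.copyOf(data, end);
    }

    public static void main(String[] args) throws Exception {
        // Usage: java EofMarkerStripper input.xml output.xml
        Path input = Paths.get(args[0]);
        Path output = Paths.get(args[1]);
        Files.write(output, stripTrailingEofMarker(Files.readAllBytes(input)));
    }
}
```

Working on raw bytes rather than decoded text means the file's character encoding does not have to be known in order to strip the marker.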