XML Parser for Java - FAQ

General Questions

What is XML?
What is 'XML Parser for Java'?
What version of Java can 'XML Parser for Java' run on?
Is XML4J 100% pure Java compliant?

Programming Questions

The parser makes an error `sun.io.MalformedInputException' for my XML document. What is it?
An object tree has some unexpected text elements (TXText class).
How do I eliminate TXText nodes (whitespace) from a DOM Tree?
I can't import sources of XML for Java in VisualAge for Java 1.0
I can't import sources of XML for Java in VisualAge for Java 2.0


General Questions

What is XML?

Extensible Markup Language (XML) is a data format for structured document interchange on the Web. It was designed by the World Wide Web Consortium (W3C). Its specification is at http://www.w3.org/TR/1998/REC-xml-19980210.

What is 'XML Parser for Java'?

`XML for Java' is an XML processor written in Java: a library for parsing and generating XML documents.

What version of Java can 'XML Parser for Java' run on?

XML4J runs on Java 1.1 and Java 2, but not on Java 1.0. However, some of the samples use Swing 1.1, which requires Java version 1.1.5 or later. Refer to the Swing-JDK Compatibility Table for details.

Is XML4J 100% pure Java compliant?

XML4J has not been formally certified as 100% pure Java compliant. Running 'JavaPureCheck' produces a few warnings. As required, the explanations for these warnings are given below.

###################### JavaPureCheck Report ##########################
#
# Generated on : December 10, 1998 1:33:22 PM PST
# System Model Version : jdk11
# JavaPureCheck Version : 3.15
# Rule Base Version : 1.92
#
# Summary:
#
# PURE: 210 WARNING: 3 ERROR: 0
#
# Final Result : WARNING
#
######################################################################

Class: org.xml.sax.helpers.ParserFactory
Warning: method reference: java.lang.Class.forName(java.lang.String)
Note: May load impure class
Explanation:
This SAX API class defines a convenience method 'makeParser(java.lang.String className)', which is used to instantiate the class implementing the org.xml.sax.Parser interface. It is assumed here that all such implementations comply with the 100% pure Java guidelines. An example of an XML4J class that could be instantiated this way is com.ibm.xml.parser.SAXDriver.
Status: WARNING

Class: com.ibm.xml.parser.StylesheetPI
Warning: possible hard-coded path: text/css
Note: Defines a bad path
Explanation:
This text is part of a string definition and refers to the MIME type for Cascading Style Sheets. It does not refer to a path.
Status: WARNING

Class: com.ibm.xml.parser.TXAttribute
Warning: possible hard-coded path: com.ibm.xml.parser.TXAttribute#realInsert(): Nodes except Text/EntityReference are not allowed as attribute children.
Note: Defines a bad path
Explanation:
This text is part of a string definition and is passed as an error message when an exception is thrown. It does not refer to a path; rather, it uses the JavaDoc linking syntax to refer to the realInsert() method of class TXAttribute.
Status: WARNING


Programming Questions

The parser makes an error `sun.io.MalformedInputException' for my XML document. What is it?

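This error means that your XML document contains a byte sequence that is invalid in its character encoding. When an XML document has no <?xml ... encoding="..."?> declaration, the parser processes the document as if it were written in UTF-8. UTF-8 is compatible with US-ASCII for the characters #x00-#x7f, but it is incompatible for other characters. A document saved in another encoding (for example, ISO-8859-1) that contains such characters must therefore name that encoding in its XML declaration.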

The parser cannot report the position of the invalid character because sun.io.MalformedInputException provides no interface for obtaining it.
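
Because the exception does not report a position, one way to locate the offending byte is to check the document yourself before parsing. The following is a minimal sketch only; it is not part of XML4J. It assumes a newer Java runtime that provides java.nio.charset (Java 7 or later for StandardCharsets), and the names Utf8Check and firstInvalidOffset are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an XML4J class.
class Utf8Check {

    /**
     * Returns the byte offset of the first sequence that is not valid UTF-8,
     * or -1 if the whole array decodes cleanly.
     */
    static int firstInvalidOffset(byte[] data) {
        // REPORT makes the decoder stop at a bad sequence instead of replacing it.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);

        ByteBuffer in = ByteBuffer.wrap(data);
        // UTF-8 never produces more chars than it consumes bytes.
        CharBuffer out = CharBuffer.allocate(data.length + 1);

        // endOfInput = true, so a truncated trailing sequence is also reported.
        CoderResult result = decoder.decode(in, out, true);
        if (result.isError()) {
            // On an error the input position is left at the start of the bad sequence.
            return in.position();
        }
        result = decoder.flush(out);
        return result.isError() ? in.position() : -1;
    }
}
```

If the method returns a non-negative offset for the bytes of your document, either correct the byte at that offset or add an encoding="..." declaration that names the encoding the document was actually saved in.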

An object tree has some unexpected text elements (TXText class).
      <ROOT>
        <FOO>....</FOO>
        <BAR>....</BAR>
      </ROOT>
      

In the example above, the ROOT element appears to have two children: FOO first and BAR second. That is incorrect. In fact, ROOT has five child nodes. The first is a text node (TXText class) containing "\n ". The second is the FOO element (TXElement class). The third is a text node (TXText) containing "\n ". The fourth is the BAR element (TXElement). The fifth is a text node (TXText) containing "\n".
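
The same effect can be observed with any DOM parser. The sketch below is an illustration only: it uses the standard javax.xml.parsers and org.w3c.dom interfaces from a newer Java runtime rather than the XML4J classes, so the nodes are reported by DOM node type instead of as TXText and TXElement. The class and method names (ChildNodeDump, countRootChildren) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class, not part of XML4J.
class ChildNodeDump {

    // The document from the question, with its line breaks and indentation.
    static final String XML =
            "<ROOT>\n  <FOO>....</FOO>\n  <BAR>....</BAR>\n</ROOT>";

    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    /** Counts all direct children of the root element, text nodes included. */
    static int countRootChildren(String xml) {
        return parse(xml).getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) {
        NodeList children = parse(XML).getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            String kind = (child.getNodeType() == Node.TEXT_NODE) ? "text" : "element";
            System.out.println(i + ": " + kind + " " + child.getNodeName());
        }
        System.out.println("children of ROOT: " + children.getLength());
    }
}
```

For this document, countRootChildren returns 5, not 2: the three whitespace-only text nodes are counted along with FOO and BAR.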

How do I eliminate TXText nodes (whitespace) from a DOM Tree? (see the previous question)

Since white space is significant in XML, it is always preserved in the tree, unless you take action. There are two ways to deal with this:

  1. Ignore it when you encounter it: The function getIsIgnorableWhitespace() returns true for TXText instances that consist only of white space. If it returns true while you are traversing the tree, ignore the node (see the Whitespace section of the User Guide). Note that this does not prevent the nodes from being created.
  2. Prevent the whitespace nodes from being inserted: See "How to get a "filtered" parse tree" in the User Guide. If a TXText node has isIgnorableWhitespace set, have handleElement return null, and the element will not be inserted.
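
If the tree has already been built, the whitespace nodes can also be removed afterwards. The sketch below is one possible approach, not the XML4J mechanism described above: it uses only standard org.w3c.dom calls from a newer Java runtime, and it treats a text node as ignorable when String.trim() leaves nothing, which is an assumption that stands in for getIsIgnorableWhitespace(). The names WhitespaceStripper, stripWhitespace, and countAfterStrip are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper, not part of XML4J.
class WhitespaceStripper {

    /** Recursively removes text nodes that contain only white space. */
    static void stripWhitespace(Node parent) {
        Node child = parent.getFirstChild();
        while (child != null) {
            // Remember the sibling first; removing the child detaches it.
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                parent.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                stripWhitespace(child);
            }
            child = next;
        }
    }

    /** Parses xml, strips whitespace nodes, and returns the root's child count. */
    static int countAfterStrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            stripWhitespace(doc.getDocumentElement());
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }
}
```

Applied to the ROOT/FOO/BAR document from the previous question, this leaves ROOT with only its two element children. Text nodes that contain any non-whitespace character, such as the "...." inside FOO, are left untouched.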

I can't import sources of XML for Java in VisualAge for Java 1.0.

VisualAge for Java 1.0 doesn't support compiling inner classes. See VisualAge for Java FAQ Q35.

I can't import sources of XML for Java in VisualAge for Java 2.0.

The current version of the XML4J parser uses Swing 1.1, while VisualAge for Java 2.0 comes with Swing 1.0.2. The free update for the Professional version of VisualAge for Java 2.0 installs Swing 1.0.3. The most important difference between Swing 1.0.2 - 1.0.3 and Swing 1.1 is that the Java package name changed from com.sun.java.swing.* to javax.swing.*.

To fix the errors, you must download the Java Foundation Classes 1.1 with Swing 1.1 from Sun's Java home page and import the "swingall.jar" file into VisualAge for Java 2.0. The Swing 1.1 package can be found at the following URL: http://java.sun.com/products/jfc/index.html

Refer to the VisualAge for Java 2.0 documentation for information about how to import a JAR file into the repository and add that code to your workspace.

  • Are there any other tips for importing the XML4J parser into VisualAge for Java 2.0?

    The most useful tip applies to *any* updated code that you import into the VisualAge for Java 2.0 product. Before updating code, do the following:

    1. version the old code
    2. delete it from your workspace
    3. import the new code

    Deleting code from your workspace does not actually delete the code permanently -- the versioned code is moved to the repository where it can be retrieved later. Be aware, though, that removing code from your workspace will cause problems with all of the other classes that use that code. VisualAge for Java 2.0 will flag them as errors but this situation is temporary. When you import the new code, the errors found when deleting the old code will be fixed.

    If you are unsure as to how to perform any of these steps, refer to the VisualAge for Java 2.0 documentation.