What are the new features in XML4J version 2?
Modular architecture
XML4J's new architecture means that you only pay for the features that you use. You can construct a parser configuration that includes only the features your application needs, which reduces the number of class files (or the size of the jar file) required. It also means that we will be able to add new functionality more easily, and that developers will be able to extend the functionality of the parser in new ways. As examples of this extensibility, XML4J version 2 offers:
- Pluggable Validator
This allows the DTD based validator to be replaced with a validator based on some other method, such as the DCD, SOX, or DDML proposals under consideration in the W3C.
- Pluggable DOM implementation
This allows a customized DOM implementation to be supplied. The TX compatibility classes are implemented using this feature.
- Pluggable Catalog Support
Support for a catalog is modular. In this release, we provide two catalog modules.
Performance
The performance of XML4J has been greatly improved in version 2. This is especially true for SAX.
Footprint
The main-memory footprint of XML4J has improved in version 2. This is especially true for SAX.
Validating SAX
One of the benefits of the new architecture is that we are able to offer a validating parser that uses the SAX API.
XCatalog
In addition to supporting SGML Open catalogs, XML4J version 2 now supports version 0.2 of the XCatalog specification.
Revalidation
XML4J version 2 will allow you to invoke the validator on a DOM tree after parsing has completed. This allows you to modify the DOM tree programmatically and then check that the resulting tree is valid.
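As an illustration of what validating through the SAX API looks like from the application side, here is a minimal sketch. It assumes the standard JAXP classes (`SAXParserFactory`, `DefaultHandler`) as a stand-in; it does not use XML4J's own parser classes, which this FAQ does not name, and the class name `ValidatingSaxSketch` is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: JAXP stands in for the XML4J parser classes.
public class ValidatingSaxSketch {

    /** Parses the document with DTD validation on and returns the validity errors reported. */
    public static List<String> validate(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // request DTD-based validation
        SAXParser parser = factory.newSAXParser();

        final List<String> errors = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Validity errors are recoverable, so collect them and keep parsing.
                errors.add(e.getMessage());
            }
        };
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>";
        String valid = dtd + "<note><to>Ann</to></note>";
        String invalid = dtd + "<note><from>Bob</from></note>";

        System.out.println("valid document errors: " + validate(valid));
        System.out.println("invalid document errors: " + validate(invalid));
    }
}
```

Validity errors arrive through the handler's `error` callback rather than as exceptions, so the application decides whether to stop or continue.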
What features are not supported in this new release?
- This release does not yet provide the readDTDStream() method on the TX Compatibility parser.
What international encodings are supported by XML4J?
- UTF-8
- UTF-16 Big Endian, UTF-16 Little Endian
- IBM-1208
- ISO Latin-1 (ISO-8859-1)
- ISO Latin-2 (ISO-8859-2) [Bosnian, Croatian, Czech,
Hungarian, Polish, Romanian, Serbian (in Latin transcription),
Serbocroatian, Slovak, Slovenian, Upper and Lower Sorbian]
- ISO Latin-3 (ISO-8859-3) [Maltese, Esperanto]
- ISO Latin-4 (ISO-8859-4)
- ISO Latin Cyrillic (ISO-8859-5)
- ISO Latin Arabic (ISO-8859-6)
- ISO Latin Greek (ISO-8859-7)
- ISO Latin Hebrew (ISO-8859-8)
- ISO Latin-5 (ISO-8859-9) [Turkish]
- Extended Unix Code, packed for Japanese (euc-jp, eucjis)
- Japanese Shift JIS (shift-jis)
- Chinese (big5)
- Chinese for PRC (mixed 1/2 byte) (gb2312)
- Japanese ISO-2022-JP (iso-2022-jp)
- Cyrillic (koi8-r)
- Extended Unix Code, packed for Korean (euc-kr)
- Russian Unix, Cyrillic (koi8-r)
- Windows Thai (cp874)
- Latin 1 Windows (cp1252)
- cp858
- EBCDIC encodings:
  - EBCDIC US (ebcdic-cp-us)
  - EBCDIC Canada (ebcdic-cp-ca)
  - EBCDIC Netherlands (ebcdic-cp-nl)
  - EBCDIC Denmark (ebcdic-cp-dk)
  - EBCDIC Norway (ebcdic-cp-no)
  - EBCDIC Finland (ebcdic-cp-fi)
  - EBCDIC Sweden (ebcdic-cp-se)
  - EBCDIC Italy (ebcdic-cp-it)
  - EBCDIC Spain & Latin America (ebcdic-cp-es)
  - EBCDIC Great Britain (ebcdic-cp-gb)
  - EBCDIC France (ebcdic-cp-fr)
  - EBCDIC Hebrew (ebcdic-cp-he)
  - EBCDIC Switzerland (ebcdic-cp-ch)
  - EBCDIC ROECE (ebcdic-cp-roece)
  - EBCDIC Yugoslavia (ebcdic-cp-yu)
  - EBCDIC Iceland (ebcdic-cp-is)
  - EBCDIC Urdu (ebcdic-cp-ar2)
  - Latin 0 EBCDIC
  - EBCDIC Arabic (ebcdic-cp-ar1)
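A parser picks one of these encodings from the document's XML declaration. The sketch below shows that mechanism with ISO-8859-1 from the list above. It assumes the standard JAXP `DocumentBuilder` as a stand-in for the XML4J parser, and the class name `EncodingDeclarationSketch` is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical sketch: JAXP stands in for the XML4J parser classes.
public class EncodingDeclarationSketch {

    /** Parses raw bytes and returns the text content of the root element. */
    public static String rootText(byte[] xmlBytes) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The parser reads the encoding declaration from the byte stream itself.
        Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // \u00e9 is "e with acute accent"; in ISO-8859-1 it is the single byte 0xE9.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><word>caf\u00e9</word>";
        byte[] latin1Bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        String text = rootText(latin1Bytes);
        System.out.println("decoded correctly: " + text.equals("caf\u00e9"));
    }
}
```

The bytes are handed to the parser as a stream rather than a `String`, so the encoding declaration, not the Java platform default, governs how they are decoded.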