This test focuses on the performance of XML document models in Java. Ease of use is at least as important as performance for real-world applications, though, and for that you may want to see some code comparisons between the models. My IBM XML developerWorks article provides just this type of comparison. The code in that article is a little behind the current version of the performance test, but it is still accurate for all the models as of now (the next JDOM beta will require some changes). If you combine the code samples from that article with the performance test results shown here, you'll have a good overview of working with the different models.

XPP2 is the performance leader in most respects (though not for large documents). Besides offering very good performance overall, XPP2 provides the best option for handling documents which may not need to be processed completely, in the form of the pull node model. For middleware-type applications that need good performance but do not require validation, entities, processing instructions, or comments, XPP2 looks to be an excellent choice. This is especially true for applications running as browser applets or in limited memory environments. It's somewhat awkward to use, though, so check out the example code in the developerWorks article before you decide to go down this path!
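To make the "may not need to be processed completely" point concrete, here is a minimal sketch of pull-style processing that stops as soon as it has what it needs. It uses the JDK's StAX API (javax.xml.stream) purely as a stand-in, since that ships with Java; it is not XPP2's API and it works at the event level rather than with XPP2's pull node model. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class PullEarlyExitSketch {

    // Returns the text of the first element with the given name, or null if
    // there is none. Parsing stops at the first match, so the remainder of
    // the document is never read. Assumes the target element holds only text.
    static String firstText(String xml, String elementName) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
            .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><title>Report</title><body><p>long content</p></body></doc>";
        System.out.println(firstText(xml, "title"));
    }
}
```

The application drives the parser and can simply stop asking for more, which is where the savings come from when only the first part of a document matters.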

The DOM offers the advantages of cross-language compatibility and standardized APIs, but is not as easy to use as the pure Java models such as JDOM, dom4j, and EXML. If you're working with multiple languages, the advantages of the DOM in this area probably outweigh Java-specific ease-of-use issues; if you're only using Java, the choice is tougher.

Both versions of Xerces DOM do very well in most performance results, especially with large documents. If you're working with small documents, though, you'll definitely want to disable the deferred node expansion option of Xerces by calling setFeature("http://apache.org/xml/features/dom/defer-node-expansion", false) on the DOMParser instance. Even with larger documents this option doesn't seem to offer much of a benefit. Crimson DOM does not seem to offer any substantial advantages over the Xerces versions at this point, so you're probably best off staying with Xerces - this is where new development effort is focused in any case (especially Xerces2).
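As a rough sketch of turning off deferred node expansion, the example below sets the same feature URI through the standard JAXP DocumentBuilderFactory rather than directly on a Xerces DOMParser, so that it runs without a separate Xerces jar. That substitution is an assumption on my part: it only has an effect when the underlying parser is Xerces-based, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EagerDomSketch {

    // Xerces feature URI quoted in the text above.
    static final String DEFER_FEATURE =
        "http://apache.org/xml/features/dom/defer-node-expansion";

    // Parses the XML text into a DOM with deferred node expansion disabled
    // where the parser supports that feature.
    static Document parseEagerly(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            factory.setFeature(DEFER_FEATURE, false);
        } catch (ParserConfigurationException e) {
            // Assumption: a parser that is not Xerces-based may reject the
            // feature; in that case fall back to the parser's defaults.
        }
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseEagerly("<order><item id=\"1\"/></order>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

When using the Xerces DOMParser class directly, the same URI and value go to the parser's own setFeature method before parsing.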

dom4j offers good performance with a stable and fully functional implementation, including built-in support for SAX2, DOM, and even XPath. dom4j also provides a fairly compact representation of documents. If you're looking for a Java-specific XML document model, dom4j is probably your best choice overall.

JDOM doesn't really have much to recommend it from the performance standpoint, though the developers have said they intend to focus on performance before the official release. It also suffers from an API that's still changing every few months with a new beta release (the next version, beta 8, is in preparation as I write this). JDOM does offer a convenient way of building and accessing simple XML documents, but the lack of stability makes it difficult to justify for use in production projects.

EXML is very small (in jar file size) and does well in some of the performance tests. It's also much easier to use than XPP2. If you don't care about EXML discarding whitespace separating elements, both XPP2 and EXML are good choices in limited-memory environments. The tradeoff probably comes down to XPP2 for performance and a relatively open license, EXML for ease of use and better functionality.

Currently none of the models can offer good performance for Java serialization, though dom4j does the best. If you need to transfer a document representation between programs, generally your best alternative is to write the document out as text and parse it back in to reconstruct the representation. Custom serialization formats may offer a better alternative in the future.
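The text round trip suggested above might look something like the following sketch, which uses the standard DOM and an identity Transformer from the JDK to write a document out as a string and parse it back in. The class and method names are hypothetical, and the other models have their own output and builder classes that would take the place of these calls.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class TextRoundTripSketch {

    // Sender side: write the document out as XML text using an identity
    // transform (a Transformer created without a stylesheet).
    static String toText(Document doc) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Receiver side: parse the text to reconstruct the representation.
    static Document fromText(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document original = fromText("<note><to>Ann</to></note>");
        String wire = toText(original);      // this string is what gets transferred
        Document copy = fromText(wire);
        System.out.println(copy.getDocumentElement().getTagName());
    }
}
```

The string returned by toText is what would be handed to the other program in place of a Java-serialized object.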

  Dennis M. Sosnoski