The tests currently implemented are:

The test framework runs one particular test some number of times (10 for the results shown, except in the large document case) on a document or collection of documents, tracking the best and average times for that test. When one test is completed, the framework moves on to the next test on the same document. After the full sequence of tests has been completed on one document, it repeats the process with the next document or collection of documents. To prevent interactions between the document models, only one model is tested in each execution of the test framework.

Timing benchmarks using HotSpot and similar dynamically optimizing JVMs are notoriously tricky; small changes in the test sequence often cause large variations in timing results. I've found that this is especially true for average times executing a particular piece of code; the best times are much more consistent, and are the values I've shown in these results.

Testing the memory usage of the representations works a little differently: the program keeps all the constructed copies of the document and pauses between relevant tests to encourage garbage collection. Memory usage per copy of the representation is found by dividing the total memory used by the number of copies.

All tests involving I/O use memory buffers to avoid any external timing variables. Input and output use streams (specifically ByteArrayInputStream and ByteArrayOutputStream) to most closely simulate normal usage. Some of the models support direct input from character arrays or Strings with higher performance than stream input, but using this type of input for testing gives misleading results; in real-world applications, text documents are rarely resident in memory to be passed directly to parsers. Validation is turned off in all tests, and the documents used for the tests do not specify DTDs.
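The measurement approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual test framework: the class name BenchSketch, the TimedTest interface, and the copyThrough method (a stand-in for a real parse or output test) are all hypothetical, and the 100 ms pause after requesting garbage collection is an assumed value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class BenchSketch {

    /** One timed operation; a hypothetical stand-in for a real test. */
    interface TimedTest {
        void run() throws Exception;
    }

    /** Runs the test the given number of times; returns {best, average} in nanoseconds. */
    static long[] timeTest(TimedTest test, int passes) throws Exception {
        long best = Long.MAX_VALUE;
        long total = 0;
        for (int i = 0; i < passes; i++) {
            long start = System.nanoTime();
            test.run();
            long elapsed = System.nanoTime() - start;
            best = Math.min(best, elapsed);
            total += elapsed;
        }
        return new long[] { best, total / passes };
    }

    /** Stand-in for a parse: reads the stream fully and writes it to a memory buffer. */
    static byte[] copyThrough(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    /** Requests garbage collection, pauses briefly, then reports the heap in use. */
    static long usedMemory() throws InterruptedException {
        Runtime rt = Runtime.getRuntime();
        System.gc();        // only a hint; the JVM is free to ignore it
        Thread.sleep(100);  // assumed pause length
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Builds and retains the given number of copies; returns approximate bytes per copy. */
    static long memoryPerCopy(byte[] document, int copies) throws Exception {
        List<byte[]> retained = new ArrayList<>(copies);
        long before = usedMemory();
        for (int i = 0; i < copies; i++) {
            retained.add(copyThrough(new ByteArrayInputStream(document)));
        }
        long after = usedMemory();
        // Touch the list after measuring so the copies stay reachable until here.
        if (retained.size() != copies) {
            throw new IllegalStateException("unexpected copy count");
        }
        return (after - before) / copies;
    }

    public static void main(String[] args) throws Exception {
        byte[] document = "<root><item>text</item></root>".getBytes(StandardCharsets.UTF_8);
        long[] times = timeTest(() -> copyThrough(new ByteArrayInputStream(document)), 10);
        System.out.println("best ns: " + times[0] + ", average ns: " + times[1]);
        System.out.println("approx. bytes per copy: " + memoryPerCopy(document, 1000));
    }
}
```

The memory figure from a sketch like this is only approximate, since garbage collection cannot be forced and other allocations affect the heap totals; retaining many copies and dividing helps average out that noise.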