Changes between Initial Version and Version 1 of XmlVersionNumbers


Timestamp:
Jan 25, 2013, 3:45:23 PM (6 years ago)
Author:
flip
Comment:

--

= XML Versions =

This technical document explains the version numbers Vespa writes
into the XML (VIFF files) it creates. They have not turned out to be as useful
as we expected.

== Introduction ==

In all of the XML (VIFF files) written by Vespa, we write a `version` attribute
on major blocks. This is not to be confused with the Vespa version information
which is written into a comment at the top of each VIFF file.

For instance, in the sample below (taken from a real file)
there are version attributes on the top level `vespa_export` element, the
`dataset` element, and the `user_prior` element --

{{{
#!xml
<vespa_export version="1.0.0">
        <!--
This XML file is in Vespa Interchange File Format (VIFF). You can download
applications that read and write VIFF files and learn more about VIFF here:
http://scion.duhs.duke.edu/vespa/

It was created with Vespa version 0.6.4.
-->
        <timestamp>2013-01-18T17:28:39</timestamp>
        <comment />
        <dataset id="2b92bc53-d98d-48ea-a60e-d374658926b2" version="1.1.0">
                <user_prior version="1.0.0">
                        <auto_b0_range_start>1.7</auto_b0_range_start>
                        <auto_b0_range_end>3.4</auto_b0_range_end>
                        <auto_phase0_range_start>1.85</auto_phase0_range_start>
}}}
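
As a rough illustration of how those attributes appear to code that reads the file, the sketch below parses a trimmed, well-formed stand-in for the sample with Python's `xml.etree.ElementTree` and lists each element that carries a `version` attribute. The trimmed XML string and the `collect_versions` name are assumptions made for this example and are not part of Vespa.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the sample above (closing tags added
# for illustration; the real file contains many more elements).
SAMPLE = """
<vespa_export version="1.0.0">
    <timestamp>2013-01-18T17:28:39</timestamp>
    <dataset id="2b92bc53-d98d-48ea-a60e-d374658926b2" version="1.1.0">
        <user_prior version="1.0.0">
            <auto_b0_range_start>1.7</auto_b0_range_start>
        </user_prior>
    </dataset>
</vespa_export>
"""


def collect_versions(root):
    """Return (tag, version) pairs for every element with a version attribute."""
    return [
        (element.tag, element.get("version"))
        for element in root.iter()
        if element.get("version") is not None
    ]


versions = collect_versions(ET.fromstring(SAMPLE))
```

Each major block carries its own version, independent of the others and of the Vespa version named in the comment.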

== Purpose ==

The purpose behind these was to allow us developers to change the XML format
as necessary without breaking our ability to read in XML files written by older
versions of Vespa. For instance, in version 1.0.0 of Analysis' dataset XML,
there was a `block_basic` element that moved to a `user_prior` element in
subsequent versions. Our code can read both the older and newer formats.

== In Practice ==

In practice, the XML versions aren't very useful for two reasons.

First, as far as I can tell, in every instance where we've changed an XML
version number, the change to the version number has been superfluous. For
instance, in the example mentioned above where `block_basic` was eliminated in
favor of `user_prior`, the XML version is just a proxy for the presence of a
`block_basic` element.

Our current code does this --

{{{
#!python
if xml_version == "1.0.0":
    # Version 1.0.0 files may contain a block_basic element; locate it.
    for i, block_element in enumerate(block_elements):
        if block_element.tag == "block_basic":
            break
    # ...change block_basic into a user_prior element...
}}}
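
But it could just as well do this --

{{{
#!python
# Test directly for the element rather than for the version that implies it.
# (Assumes block_elements is an ElementTree element, so find() is available.)
block_basic = block_elements.find("block_basic")
if block_basic is not None:
    # ...change block_basic into a user_prior element...
    pass
}}}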

IMO the second version is clearer because it explicitly tests for the condition
in which it's really interested.
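
To make the comparison concrete, here is a minimal, self-contained sketch using the standard library's `xml.etree.ElementTree`. The `upgrade_block_basic` helper and the simple rename it performs are assumptions for illustration only; the real conversion in Vespa may do considerably more than change the tag.

```python
import xml.etree.ElementTree as ET


def upgrade_block_basic(dataset):
    """Rename a child block_basic element to user_prior, if one is present.

    Hypothetical helper: the rename stands in for whatever conversion the
    real code performs. Returns True if an element was converted.
    """
    block_basic = dataset.find("block_basic")
    if block_basic is None:
        return False
    block_basic.tag = "user_prior"
    return True


# An old-style dataset and a new-style one, reduced to the relevant elements.
old_dataset = ET.fromstring('<dataset version="1.0.0"><block_basic /></dataset>')
new_dataset = ET.fromstring('<dataset version="1.1.0"><user_prior /></dataset>')

old_converted = upgrade_block_basic(old_dataset)
new_converted = upgrade_block_basic(new_dataset)
```

The helper never consults the `version` attribute, so it gives the same result whatever version string the file happens to carry.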

The version number's redundancy was underscored when Vespa 0.6.1 changed the
dataset XML format but, due to an oversight, left the XML version number
unchanged. The practical consequences of this were nil (or None, to be
Pythonic).

A second, less important reason that the XML version numbers are not as useful
as they could be is that the format is unnecessarily complex. I (Philip) made
the bad choice of using app-style version numbers in x.y.z format. A simple
integer would have been sufficient and easier to work with (especially in
comparisons).

Compared as strings, x.y.z versions order as one would hope (e.g.
'1.0.0' < '1.0.1', and '2.9.9' < '3.0.0') only while every component is a
single digit. Once a component reaches two digits, string comparison misorders
them (e.g. '1.10.0' < '1.9.0'). Either way, such ordering is not as intuitive
as simple int comparison.
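
The comparison point can be illustrated with a short sketch. The `parse_version` helper is hypothetical (it is not taken from Vespa's code); it turns an x.y.z string into a tuple of ints so that components compare numerically rather than character by character.

```python
def parse_version(version):
    """Convert an 'x.y.z' string into a tuple of ints, e.g. '1.10.0' -> (1, 10, 0)."""
    return tuple(int(part) for part in version.split("."))


# Plain string comparison agrees with intuition for single-digit components...
single_digit_ok = "1.0.0" < "1.0.1" and "2.9.9" < "3.0.0"

# ...but it works character by character, so '1.10.0' sorts before '1.9.0'.
string_misorders = "1.10.0" < "1.9.0"

# Comparing parsed tuples orders the components numerically.
tuple_orders = parse_version("1.9.0") < parse_version("1.10.0")
```

A single integer version would need no parsing at all, which is the simplification argued for above.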

== Conclusion ==

I'm not ready to advocate abandoning the XML version numbers yet. There might
be a case where they're necessary that we just haven't come across yet.
Currently it costs us next to nothing to keep writing them to the XML, and if
we find we need them we'll be glad they're there.
