
Tokenised XML

PathEngine uses a tokenised XML scheme to reduce file sizes while remaining structurally equivalent to the subset of XML used.

Tokenised XML files are binary, and contain strings for elements and attributes once, at the top of the file.
In the remainder of the file, tokens are used instead of those strings.
Attribute values are either stored as integers in binary form or left as strings, depending on the contents.

Tokenised XML files are typically around a third of the size of their XML counterparts.
Tokenisation is not intended to take the place of traditional compression.
Tokenised XML still contains a lot of redundancy, and tokenised XML files are amenable to further compression.

Converting files to tokenised form

Currently, the simplest way to do this is to load the file into PathEngine and then have PathEngine save it back out in tokenised form.
So for a mesh you would load the mesh using iPathEngine::loadMeshFromBuffer(), and then save the mesh using iMesh::saveGroundEx() with format set to "tok".

There is normally no need to understand or to work directly with the tokenised XML format on the side of the client application.
An example is provided nevertheless, for reference purposes.

An example

For a trivial square mesh, in XML:

Note that this example uses a legacy mesh format, which is not supported by releases 4.81 and above.
Nevertheless, since the tokenised XML representation is independent of the information represented, this mesh remains a valid example of the representation.
Note also that only a subset of the tokenised XML format is presented here, but this should be sufficient for the purpose of generating ground mesh files directly in this format where this is desired.

<mesh>
    <polygon>
	    <vertex x="-100" y="100" height="0"/>
	    <vertex x="100" y="100" height="0" connectedpolygon="1"/>
	    <vertex x="-100" y="-100" height="0"/>
    </polygon>
    <polygon>
	    <vertex x="100" y="100" height="0"/>
	    <vertex x="100" y="-100" height="0"/>
	    <vertex x="-100" y="-100" height="0" connectedpolygon="0"/>
    </polygon>
</mesh>

The following records exactly the same information in the tokenised XML form:

0000 6d 65 73 68 00 70 6f 6c 79 67 6f 6e 00 76 65 72 mesh.polygon.ver
0010 74 65 78 00 00 01 78 00 01 79 00 01 68 65 69 67 tex...x..y..heig
0020 68 74 00 01 63 6f 6e 6e 65 63 74 65 64 70 6f 6c ht..connectedpol
0030 79 67 6f 6e 00 00 01 00 02 00 03 01 2d 31 30 30 ygon........-100
0040 00 02 31 30 30 00 03 30 00 00 00 03 01 31 30 30 ..100..0.....100
0050 00 02 31 30 30 00 03 30 00 04 31 00 00 00 03 01 ..100..0..1.....
0060 2d 31 30 30 00 02 2d 31 30 30 00 03 30 00 00 00 -100..-100..0...
0070 00 02 00 03 01 31 30 30 00 02 31 30 30 00 03 30 .....100..100..0
0080 00 00 00 03 01 31 30 30 00 02 2d 31 30 30 00 03 .....100..-100..
0090 30 00 00 00 03 01 2d 31 30 30 00 02 2d 31 30 30 0.....-100..-100
00a0 00 03 30 00 04 30 00 00 00 00 00                ..0..0.....
00ab
        

Description of the tokenised format

All element and attribute strings for the document are enumerated at the start of the file.
The order in which these are encountered is used to assign an index for each element or attribute string.
A type is also assigned to each attribute at this point.
The structure of the XML document follows, with indices in place of element and attribute strings.

The elements and attributes are stored in the form of C strings, i.e. as a sequence of characters terminated by zero.
The element enumeration comes first.
This is a sequence of strings terminated by an empty (null) string:

0000 6d 65 73 68 00 70 6f 6c 79 67 6f 6e 00 76 65 72 mesh.polygon.ver
0010 74 65 78 00 00                                  tex..
0015
        

This specifies that elements may be one of "mesh", "polygon", and "vertex".
The index zero is reserved to mark the end of an element, so index 1 = "mesh", index 2 = "polygon" and index 3 = "vertex".
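
As a rough illustration of the element enumeration described above, the following Python sketch reads zero-terminated strings until it meets the empty string. The helper names (`read_c_string`, `read_element_names`) are made up for this example, and ASCII is assumed for the string encoding, which the text above does not state explicitly.

```python
def read_c_string(data, pos):
    """Read one zero-terminated string at pos; return (text, next position)."""
    end = data.index(0, pos)
    # ASCII is an assumption made for this sketch.
    return data[pos:end].decode("ascii"), end + 1


def read_element_names(data, pos=0):
    """Read element names until the empty string that ends the enumeration."""
    names = [None]  # index 0 is reserved for the end-of-element marker
    while True:
        name, pos = read_c_string(data, pos)
        if name == "":
            return names, pos
        names.append(name)


# Bytes 0000 to 0014 of the example file.
header = bytes.fromhex("6d65736800 706f6c79676f6e00 76657274657800 00")
names, next_pos = read_element_names(header)
print(names[1:], hex(next_pos))  # ['mesh', 'polygon', 'vertex'] 0x15
```

Keeping a placeholder at list position 0 means that a token read from the file can be used directly as a list index.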

This is followed by the attribute enumeration.
This is a sequence of entries, each consisting of a type specifier followed by an attribute string.
Each type specifier is a single byte.
The attribute enumeration is terminated by a type specifier with a value of zero.

0015                01 78 00 01 79 00 01 68 65 69 67      .x..y..heig
0020 68 74 00 01 63 6f 6e 6e 65 63 74 65 64 70 6f 6c ht..connectedpol
0030 79 67 6f 6e 00 00                               ygon..
0036
        

So the attributes "x", "y", "height", and "connectedpolygon" are enumerated.
Again, index zero is reserved, so index 1 = "x", index 2 = "y", index 3 = "height" and index 4 = "connectedpolygon".
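
The attribute enumeration can be read in much the same way. The sketch below is only illustrative: the function name `read_attribute_table` is hypothetical, and ASCII strings are assumed.

```python
def read_attribute_table(data, pos=0):
    """Return ([None, (type_specifier, name), ...], next position)."""
    attributes = [None]  # index 0 is reserved as the list terminator
    while True:
        type_specifier = data[pos]
        pos += 1
        if type_specifier == 0:
            # A zero type specifier ends the enumeration.
            return attributes, pos
        end = data.index(0, pos)
        # ASCII is an assumption made for this sketch.
        attributes.append((type_specifier, data[pos:end].decode("ascii")))
        pos = end + 1


# Bytes 0015 to 0035 of the example file.
table = (b"\x01" b"x\x00"
         b"\x01" b"y\x00"
         b"\x01" b"height\x00"
         b"\x01" b"connectedpolygon\x00"
         b"\x00")
attributes, next_pos = read_attribute_table(table)
for index, entry in enumerate(attributes[1:], start=1):
    print(index, entry)
```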

In this example each attribute is assigned type 1, which means that the attribute value will be represented by an arbitrary string, just as in the original XML.

The following values are possible for this type specifier:

Type specifier   Meaning
1                C string.
2                32 bit signed integer.
3                16 bit signed integer.
4                8 bit signed integer.
5                32 bit unsigned integer.
6                16 bit unsigned integer.
7                8 bit unsigned integer.

Integers are stored in binary form, with low bytes first.
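
One way to picture the type specifiers is as a lookup from specifier to a little-endian `struct` format. The sketch below is an illustration based only on the table above: the example file uses type 1 exclusively, so the integer cases are not checked against a real PathEngine file, and two's complement is assumed for the signed types. `FORMATS` and `read_value` are names invented for this example.

```python
import struct

# Type specifier -> struct format ("<" selects low byte first).
FORMATS = {
    2: "<i",  # 32 bit signed
    3: "<h",  # 16 bit signed
    4: "<b",  # 8 bit signed
    5: "<I",  # 32 bit unsigned
    6: "<H",  # 16 bit unsigned
    7: "<B",  # 8 bit unsigned
}


def read_value(data, pos, type_specifier):
    """Decode one attribute value; return (value, next position)."""
    if type_specifier == 1:
        # C string; ASCII is an assumption made for this sketch.
        end = data.index(0, pos)
        return data[pos:end].decode("ascii"), end + 1
    fmt = FORMATS[type_specifier]
    (value,) = struct.unpack_from(fmt, data, pos)
    return value, pos + struct.calcsize(fmt)


print(read_value(b"-100\x00", 0, 1))  # ('-100', 5)
print(read_value(b"\x9c\xff", 0, 3))  # (-100, 2)
```

Under these assumptions, storing "-100" as a 16 bit integer would take two bytes rather than the five needed for the string form.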

The remainder of the file records the structure of the XML document.
Each element is stored by index, and is followed by a sequence of attributes.

First of all we have the opening mesh element, with no attributes:

0036                   01 00                               ..
0038
        

An attribute list is a sequence of pairs, each consisting of an attribute index followed by the value for that attribute.
As usual, the value zero is used to terminate the attribute list.

This is followed by a polygon element, again with no attributes.
Just as in the original XML, this nests inside the "mesh" element.

0038                         02 00                           ..
003a
        

And now a vertex element, with attributes:

003a                               03 01 2d 31 30 30           ..-100
0040 00 02 31 30 30 00 03 30 00 00                   ..100..0..
004a
        

Here attribute index 1 indicates an "x" attribute.
This is followed by a null terminated string "-100" for the value.
"y" and "height" attributes then follow, each with a null terminated string for value.
Finally a value of zero terminates the sequence of attributes.
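
The attribute list of this first vertex can be decoded with a short loop. This is a minimal sketch under stated assumptions: attribute indices are taken to be single bytes (as they are in the example), only string-typed attributes (type 1) are handled, and the attribute table is written out by hand rather than read from the file. The name `read_attribute_list` is hypothetical.

```python
# Attribute table for the example file: position in the list == attribute index.
ATTRIBUTES = [None, (1, "x"), (1, "y"), (1, "height"), (1, "connectedpolygon")]


def read_attribute_list(data, pos, attributes):
    """Read (index, value) pairs until a zero index; return (dict, next position)."""
    values = {}
    while True:
        index = data[pos]
        pos += 1
        if index == 0:
            return values, pos
        type_specifier, name = attributes[index]
        if type_specifier != 1:
            raise NotImplementedError("only C string values are handled here")
        end = data.index(0, pos)
        values[name] = data[pos:end].decode("ascii")
        pos = end + 1


# Bytes 003b to 0049 of the example file (the attribute list of the first vertex).
first_vertex = bytes.fromhex("01 2d313030 00 02 313030 00 03 30 00 00")
values, next_pos = read_attribute_list(first_vertex, 0, ATTRIBUTES)
print(values)  # {'x': '-100', 'y': '100', 'height': '0'}
```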

The next element index is zero.
This indicates the end of the current element (the first vertex in the first polygon).
Unlike in XML, the type of element is not explicitly represented at an end element marker.

004a                               00                          .     
004b
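
Because an end marker does not name the element it closes, a reader needs to keep track of which elements are currently open. The following Python sketch decodes the complete example file using a stack for that purpose. It is not PathEngine's own loader: it assumes ASCII strings and single-byte indices, handles only type 1 attributes, and the names `parse_tokenised` and `DUMP` are invented for the example.

```python
DUMP = bytes.fromhex("""
6d 65 73 68 00 70 6f 6c 79 67 6f 6e 00 76 65 72
74 65 78 00 00 01 78 00 01 79 00 01 68 65 69 67
68 74 00 01 63 6f 6e 6e 65 63 74 65 64 70 6f 6c
79 67 6f 6e 00 00 01 00 02 00 03 01 2d 31 30 30
00 02 31 30 30 00 03 30 00 00 00 03 01 31 30 30
00 02 31 30 30 00 03 30 00 04 31 00 00 00 03 01
2d 31 30 30 00 02 2d 31 30 30 00 03 30 00 00 00
00 02 00 03 01 31 30 30 00 02 31 30 30 00 03 30
00 00 00 03 01 31 30 30 00 02 2d 31 30 30 00 03
30 00 00 00 03 01 2d 31 30 30 00 02 2d 31 30 30
00 03 30 00 04 30 00 00 00 00 00
""")


def c_string(data, pos):
    end = data.index(0, pos)
    return data[pos:end].decode("ascii"), end + 1


def parse_tokenised(data):
    """Return (root element, position after the root's end marker)."""
    pos = 0
    elements = [None]  # index 0 is the end-of-element marker
    while True:
        name, pos = c_string(data, pos)
        if name == "":
            break
        elements.append(name)
    attributes = [None]  # index 0 terminates an attribute list
    while data[pos] != 0:
        type_specifier = data[pos]
        name, pos = c_string(data, pos + 1)
        attributes.append((type_specifier, name))
    pos += 1  # skip the zero type specifier

    stack = []  # currently open elements, innermost last
    while True:
        index = data[pos]
        pos += 1
        if index == 0:
            closed = stack.pop()  # the marker closes whichever element is innermost
            if not stack:
                return closed, pos
            continue
        node = {"name": elements[index], "attributes": {}, "children": []}
        while data[pos] != 0:
            type_specifier, attr_name = attributes[data[pos]]
            if type_specifier != 1:
                raise NotImplementedError("only C string values are handled here")
            value, pos = c_string(data, pos + 1)
            node["attributes"][attr_name] = value
        pos += 1  # skip the attribute list terminator
        if stack:
            stack[-1]["children"].append(node)
        stack.append(node)


root, end = parse_tokenised(DUMP)
print(root["name"], len(root["children"]), hex(end))  # mesh 2 0xab
```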
        

The rest of the file follows the same structure, and is read until the root mesh element has been closed.
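
Where a file is to be generated directly in this format, the same rules can be applied in reverse. The sketch below is one possible approach rather than PathEngine's own writer: it stores every attribute as a type 1 string, assumes ASCII and single-byte indices (so at most 255 distinct element or attribute names), and represents elements as hypothetical `(name, attributes, children)` tuples.

```python
def tokenise(root):
    """Encode a (name, attributes, children) tree as tokenised XML bytes."""
    element_names, attribute_names = [], []

    def collect(node):
        # Names are indexed in the order in which they are first encountered.
        name, attributes, children = node
        if name not in element_names:
            element_names.append(name)
        for key in attributes:
            if key not in attribute_names:
                attribute_names.append(key)
        for child in children:
            collect(child)

    collect(root)
    out = bytearray()
    for name in element_names:
        out.extend(name.encode("ascii") + b"\x00")
    out.append(0)  # empty string ends the element enumeration
    for name in attribute_names:
        out.extend(b"\x01" + name.encode("ascii") + b"\x00")  # type 1: C string
    out.append(0)  # zero type specifier ends the attribute enumeration

    def write(node):
        name, attributes, children = node
        out.append(element_names.index(name) + 1)
        for key, value in attributes.items():
            out.append(attribute_names.index(key) + 1)
            out.extend(str(value).encode("ascii") + b"\x00")
        out.append(0)  # end of attribute list
        for child in children:
            write(child)
        out.append(0)  # end of element

    write(root)
    return bytes(out)


def vertex(x, y, **extra):
    return ("vertex", {"x": x, "y": y, "height": 0, **extra}, [])


mesh = ("mesh", {}, [
    ("polygon", {}, [vertex(-100, 100),
                     vertex(100, 100, connectedpolygon=1),
                     vertex(-100, -100)]),
    ("polygon", {}, [vertex(100, 100),
                     vertex(100, -100),
                     vertex(-100, -100, connectedpolygon=0)]),
])
encoded = tokenise(mesh)
print(len(encoded), encoded[:0x15])
```

For the square mesh used in this section, the intent is that the output matches the hex dump shown earlier byte for byte.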


Documentation for PathEngine release 6.04 - Copyright © 2002-2024 PathEngine