(How to) Decode your own custom descriptors

Starting with version 2.99 of the MPEG Analyzer software, you can import definitions of your own descriptors. The structure of each descriptor is described in an XML dialect that is documented on this page. There is also limited support for logic of the form "if this bit is set, or that field has a value between [a] and [b], then this field appears here". About 99% of the descriptors we are aware of in the current MPEG2 & DVB world can be decoded with this logic. The ability to map numeric field values to descriptive, human-readable text rounds out the feature.

All files should be encoded in UTF-8.
The name of the XML root element is irrelevant. The root element can contain <struct> and <enum> elements. The following image shows the file containing (among other things) the description of the LCN Descriptor for the private range 00000028, which has been assigned to EACEM:

Graphics showing the contents of private_EACEM.xml

Download the example private_EACEM.xml.

Configuration

You need to tell the MPEG analyzer where your custom descriptor files are located on your PC and in which context (tables or other descriptor-carrying data structures) your descriptors can appear. This option is not available in the demo/trial version.

The configuration dialog.

In most cases, you will have to select "DVB" as usage. This will include your descriptors in the decoders for most MPEG2 & DVB related tables. Selecting MHP will include them in the AIT parser, which uses a completely separate range of descriptor tags. ATSC will include them in the decoder for the limited ATSC decoding that the software provides. DSMCC will include them in the decoder for the DII's descriptor loop (please contact us if you need this, as it works a bit differently).

The software has to be restarted for the new configuration to take effect. If the file(s) contain syntactical errors, a dialog box giving the filename and line of the error will appear.
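Because every change requires a restart, it can save time to check that a file is well-formed XML before loading it. The following is a minimal Python sketch using the standard library; the function name `check_descriptor_xml` and the sample strings are hypothetical. It only checks XML syntax, not whether the analyzer accepts the element and attribute names.

```python
import xml.etree.ElementTree as ET


def check_descriptor_xml(text):
    """Return None if the XML is well-formed, else a (line, column) tuple."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # position is a (line, column) tuple for the first syntax error
        return err.position
    return None


# Hypothetical file contents, for illustration only.
good = '<descriptors><struct name="Demo" tagname="descriptor_83"/></descriptors>'
bad = '<descriptors>\n<struct name="Demo">\n</descriptors>'  # <struct> is never closed

print(check_descriptor_xml(good))
print(check_descriptor_xml(bad))
```

For a file on disk, `ET.parse(path)` raises the same `ParseError` and can be used in the same way.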

<struct> elements

<struct> elements need to have the following two attributes:

"name" => contains the free-text name that will be displayed when your descriptor is encountered.
"tagname" => is used to identify your descriptor. It needs to take the form
- "descriptor_[TAG]"
- or "descriptor_[TAG]_[PRIVATERANGE]"
- or "mpeg2exdescriptor_[TAG]"
where [TAG] needs to be replaced with the 2-digit hexadecimal descriptor tag and
[PRIVATERANGE] needs to be replaced with the 8-digit (4-byte) hexadecimal private data specifier ID if you are trying to decode a private descriptor.
The mpeg2exdescriptor form is used for descriptors that make use of the extension_descriptor and descriptor_tag_extension. In this case, [TAG] needs to be replaced with the descriptor_tag_extension value of your descriptor. The analyzer automatically handles the mapping from the extension_descriptor, which has its own descriptor_tag.
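The three tagname forms can be generated and checked mechanically. The following Python sketch is illustrative only: `make_tagname` and `TAGNAME_RE` are hypothetical helpers, and writing the hex digits in upper case is an assumption (this page does not say whether case matters). The example uses tag 0x83 with the EACEM private range 00000028 mentioned above.

```python
import re

# Matches the three documented forms; hex digit case is an assumption.
TAGNAME_RE = re.compile(
    r"^(?:descriptor_[0-9A-Fa-f]{2}(?:_[0-9A-Fa-f]{8})?"
    r"|mpeg2exdescriptor_[0-9A-Fa-f]{2})$"
)


def make_tagname(tag, private_range=None, extension=False):
    """Build a tagname string from numeric tag values."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("descriptor tag must fit in one byte")
    if extension:
        # tag is the descriptor_tag_extension value here
        return f"mpeg2exdescriptor_{tag:02X}"
    if private_range is None:
        return f"descriptor_{tag:02X}"
    return f"descriptor_{tag:02X}_{private_range:08X}"


print(make_tagname(0x83, private_range=0x00000028))
print(make_tagname(0x0A))
print(make_tagname(0x0D, extension=True))
```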

Inside <struct> elements, there can be a wide range of other elements that describe, in order of appearance, what your descriptor contains after the descriptor_tag and length fields. The parser currently supports the elements listed below. Except for the logic elements <looplen>, <loopnum> and <if>, all of them need to contain a "name" attribute. Some of them need a "length" attribute. Some of them support a "ref4loop" attribute that copies their value onto a stack, where it can be used as an input parameter for loops or <if> blocks. Elements describing a numerical value can also have an "isenum" attribute to look up the number in an <enum> map element.

<bitfield>

Describes a bitfield. The "length" attribute indicates the size (= number of bits). Can contain a "ref4loop" attribute.
Bitfields should always be followed by other bitfields until a byte boundary is reached (i.e. until the bit lengths add up to a multiple of 8).
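The byte-boundary rule is easy to check before loading a file. The Python sketch below is a rough illustration under several assumptions: the "length" values are taken to be decimal, the sample struct is a simplified, hypothetical layout (it omits the loop a real LCN descriptor would need), and the helper `unaligned_bitfield_runs` only looks at direct children, not into nested loops.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical struct: 16-bit word, then 1 + 5 + 10 bits.
SNIPPET = """
<struct name="Demo entry" tagname="descriptor_83_00000028">
  <word name="service_id"/>
  <bitfield name="visible_service_flag" length="1"/>
  <bitfield name="reserved" length="5"/>
  <bitfield name="logical_channel_number" length="10"/>
</struct>
"""


def unaligned_bitfield_runs(parent):
    """Return the bit total of each bitfield run that ends off a byte boundary."""
    problems = []
    run_bits = 0
    for child in parent:
        if child.tag == "bitfield":
            run_bits += int(child.get("length"))  # assumption: decimal length
        else:
            if run_bits % 8:
                problems.append(run_bits)
            run_bits = 0
    if run_bits % 8:
        problems.append(run_bits)
    return problems


struct = ET.fromstring(SNIPPET)
print(unaligned_bitfield_runs(struct))
```

An empty list means every run of consecutive bitfields fills whole bytes.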

<byte>

Describes a single byte. Can contain a "ref4loop" attribute.

<word>

Describes a 16-bit word. Can contain a "ref4loop" attribute.

<dword>

Describes a 32-bit dword. Can contain a "ref4loop" attribute.

<hexblock>

Describes a block of bytes which will always be decoded as a hex dump. The "length" attribute contains the expected number of bytes and can have the special value "exhaust" to indicate that this block ends at the end of the descriptor. Can contain a "ref4loop" attribute.

<char>

Describes a series of ASCII characters. The decoder will display the corresponding text. The length attribute contains the expected number of bytes and can have the special value of "exhaust" to indicate that this block ends at the end of the descriptor.

<dvbchar>

Describes a string which is encoded according to EN 300 468 Annex A. The decoder will try to display the corresponding text, based on the control code encoded at the beginning of the string. The "length" attribute contains the expected number of bytes and can have the special value "exhaust" to indicate that this block ends at the end of the descriptor.

<loopnum>

Describes a loop of elements. The mandatory "count" attribute references a value that has been stored by a previous "ref4loop" attribute. It tells the analyzer how many times the loop body is repeated in the descriptor. <loopnum> elements can contain all elements that a <struct> can contain, so loops can be nested. Please note that the "ref4loop" stack is shared between the inside and the outside of the loop. There is intentionally only one such memory block per descriptor being parsed, to allow complex decoding logic.
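To make the interplay of "ref4loop" and "count" concrete, here is a small Python sketch of what a count-driven loop amounts to when decoding. It is not the analyzer's implementation. The field layout (one count byte followed by 16-bit values) and the key name `entry_count` are hypothetical, and treating the "ref4loop" value as a lookup key is an assumption. Multi-byte fields are read most significant byte first, as is usual for MPEG-2 and DVB syntax.

```python
def decode_counted_loop(payload):
    """Decode one count byte followed by that many 16-bit values."""
    refs = {}  # stands in for the single shared "ref4loop" store
    pos = 0

    # Comparable to a byte element carrying ref4loop="entry_count"
    refs["entry_count"] = payload[pos]
    pos += 1

    entries = []
    # Comparable to a loopnum element with count="entry_count"
    for _ in range(refs["entry_count"]):
        entries.append(int.from_bytes(payload[pos:pos + 2], "big"))
        pos += 2
    return entries, pos


payload = bytes([0x02, 0x00, 0x01, 0x12, 0x34])
print(decode_counted_loop(payload))
```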

<looplen>

Describes a loop of elements. The mandatory "length" attribute references a value that has been stored by a previous "ref4loop" attribute. It tells the analyzer how many bytes the loop occupies. Instead of a reference, it can contain the keyword "exhaust" to indicate that this loop ends at the end of the descriptor. <looplen> elements can contain all elements that a <struct> can contain, so loops can be nested. Please note that the "ref4loop" stack is shared between the inside and the outside of the loop. There is intentionally only one such memory block per descriptor being parsed, to allow complex decoding logic.
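A length-driven loop with "exhaust" simply repeats its body until the descriptor payload is used up. The Python sketch below illustrates this for a 4-byte entry consisting of a 16-bit service_id, a 1-bit visible_service_flag, 5 reserved bits and a 10-bit logical_channel_number. This layout is an assumption based on the commonly published form of the EACEM logical channel descriptor, so check it against your own specification. The function name is hypothetical.

```python
def decode_lcn_entries(payload):
    """Decode 4-byte entries until the payload is exhausted."""
    entries = []
    pos = 0
    while pos + 4 <= len(payload):  # "exhaust": run to the end of the descriptor
        service_id = int.from_bytes(payload[pos:pos + 2], "big")
        bits = int.from_bytes(payload[pos + 2:pos + 4], "big")
        visible = bits >> 15   # top bit of the 16-bit group
        lcn = bits & 0x03FF    # lowest 10 bits; the 5 reserved bits are skipped
        entries.append((service_id, visible, lcn))
        pos += 4
    return entries


# One entry: service_id 0x044D, visible flag set, reserved bits all 1, LCN 5
payload = bytes([0x04, 0x4D, 0xFC, 0x05])
print(decode_lcn_entries(payload))
```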

<if>

This is a logic element that tells the analyzer that a certain set of data fields only appears in the descriptor when certain conditions are met. These elements need to contain the following attributes:

condleft: references a value on the "ref4loop" stack that contains the value from the descriptor you want to compare

operator: can take the values "<", ">", "=="(equal), "!="(not equal)

condright: contains the actual value you want to compare against
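The three attributes amount to a simple comparison against a previously stored value. The Python sketch below shows the idea. The key name `stream_count` is hypothetical, treating "condleft" as a lookup key is an assumption, and "condright" is parsed as a decimal number here, which is also an assumption. One point that follows from XML itself: a literal "<" is not allowed inside an attribute value, so that operator has to be written as `&lt;` in the file.

```python
import operator
import xml.etree.ElementTree as ET

OPERATORS = {
    "<": operator.lt,
    ">": operator.gt,
    "==": operator.eq,
    "!=": operator.ne,
}


def condition_holds(if_element, refs):
    """Evaluate the condition of an <if> element against stored values."""
    left = refs[if_element.get("condleft")]
    right = int(if_element.get("condright"))  # assumption: decimal value
    return OPERATORS[if_element.get("operator")](left, right)


# "&lt;" is the XML-escaped form of "<"; the parser returns it as "<".
if_element = ET.fromstring('<if condleft="stream_count" operator="&lt;" condright="4"/>')

print(condition_holds(if_element, {"stream_count": 2}))
print(condition_holds(if_element, {"stream_count": 7}))
```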

The following image for the actual MPEG-H_3daudio_multi-stream_descriptor contains an example which probably explains this better than any dry text description:

<enum> elements

<enum> elements can be used to translate numerical values from bitfields, bytes, words and dwords into human-readable text. They contain a series of <enumentry> elements which all need to contain a "name" and a "value" attribute.

Please note that the value attribute

- is encoded in hexadecimal
- can accept ranges (like "12-15" in the following example):
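As a rough illustration of these two rules, the Python sketch below interprets "value" attributes as hexadecimal numbers or hexadecimal ranges and looks up a number. The enum contents are made up, the "name" attribute on the <enum> element is an assumption (this page does not say how an enum is identified), and the helpers `parse_value` and `lookup` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up enum; the name attribute on <enum> is an assumption.
ENUM_XML = """
<enum name="demo_type">
  <enumentry name="reserved" value="0"/>
  <enumentry name="video" value="1"/>
  <enumentry name="user defined" value="12-15"/>
</enum>
"""


def parse_value(value):
    """Turn "1" or "12-15" into an inclusive (first, last) pair, read as hex."""
    low, _, high = value.partition("-")
    first = int(low, 16)
    last = int(high, 16) if high else first
    return first, last


def lookup(enum_element, number):
    """Return the name of the first entry whose value or range covers number."""
    for entry in enum_element.iter("enumentry"):
        first, last = parse_value(entry.get("value"))
        if first <= number <= last:
            return entry.get("name")
    return None


enum_element = ET.fromstring(ENUM_XML)
print(lookup(enum_element, 0x13))  # inside the range 0x12 to 0x15
print(lookup(enum_element, 12))    # decimal 12 is 0x0C, outside the range
```

The second lookup shows the typical pitfall: "12-15" covers 0x12 to 0x15 (decimal 18 to 21), not decimal 12 to 15.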

Submit your descriptors for inclusion in the public analyzer version (please)

Once you are done describing your own descriptors, please consider submitting your file for inclusion in the public MPEG analyzer version, ideally with a link to the original specification. Please note that we will not accept submissions that conflict with existing official MPEG2 or DVB ranges, or that make use of user-private tag ranges without the corresponding private data specifier ID. (Fun fact: no, these 4 bytes will not kill you or cost you any real bandwidth, even if you send them 10 times in 10 different tables.)