The CME Group Market Data Platform (MDP) disseminates market data for CME Group markets. The client state is updated in real time with an ordered stream of Incremental Refresh messages. These messages are encoded using Simple Binary Encoding (SBE), gathered in packets, and are sent over two redundant UDP connections: Incremental Refresh A and Incremental Refresh B. This redundancy is needed to reduce the chance of packet loss in the UDP.
Furthermore, there is another pair of UDP feeds: Market Recovery A and Market Recovery B. The recovery feeds are used to disseminate market data snapshots of the active books. These snapshot messages can be used to recover from a possible loss of Incremental Refresh packets from both of the incremental channels.
Generating test cases JSON files using imandra-analyser
imandra-analyser-cli strategy.yaml | python extract_packets.py generatedJsons
Starting the packets and transitions visualizer on http://localhost:8000
cd visualizer/
python -m SimpleHTTPServer
We model the CME book-builder as an infinite state machine. The feed_state
datatype contains all of the information about CME book-builder state:
feed_sec_type
: the ID of the security that the exchange is working on.books
: current state of books in the exchangechannels
: state of the communication channels. Contains a sequence of incoming unprocessed packets in the chronological order, a sequence of already processed packets, latest received snapshot, an ordered set of cached packets for recovery and a pair ofcycle_history
structures for channels A and B.feed_status
: the current status of the feed can be eitherNormal
orInRecovery
.internal_changes
: stores all the internal changes in the algorithm - how books are updated with every message, current state of the cache and cycle histories, whether the status isNormal
orInRecovery
, e.t.c.cur_time
: the timestamp of last processed packet
The model takes the next packet from the list of unprocessed packets and iterates over the messages in it.
In the Normal
state, the message sequence number is checked and the message
is either processed normally or, if the message is off-sequence, added to cache
with the feed transitioning to the InRecovery
state.
While InRecovery
, incremental update messages are added to cache, and
snapshot messages are saved as a last_snapshot
. After each message a recovery
is attempted: if all cached messages are in chronological order since last
snapshot -- the feed transitions back into the Normal
state:
The flowchart diagram above gives an overview of the
simulate
: a toplevel function that recursively applies theone_step
function untilfeed_state
is not changed by it.one_step
: performs one step of the simulation of thefeed_state
. The function extracts the next message from the stream of incoming packets and forwards it either toprocess_msg_normal
andprocess_msg_recovery
, depending on the currentfeed_status
. If we are currentlyInRecovery
, then recovery is attempted (viaattempt_recovery
) after the message is processed.process_msg_normal
: processes the next message, when our feed is in theNormal
state. All snapshot messages are ignored. If a refresh messageis_msg_relevant
and has the sequence number that is exactly (last_processed + 1
), then it is applied to the books in the exchange. If there is a gap in the sequence number, then thefeed_status
is changed toInRecovery
.is_msg_relevant
: used by the previous function, it checks whether the next incremental refresh message is "relevant": its security ID equals to this feed's security, if its sequirity sequence number is greater that the last processed one and if the order level number is smaller than the book size of the feed.process_md_update_action
: applies the current incremental refresh action (could be creation, deletion or update of some order) to the books in the exchange.process_msg_recovery
: processes the next message, when our feed isInRecovery
. The incremental refresh messages are placed in the cache (cache is kept sorted). The snapshots for "our" security are stored, and the snapshot for the "reference" security are used to detect whether our security isLiquid
orIlliquid
.attemp_recovery
: there are three possibilities for the status to change back fromInRecovery
The Simple Binary Encoding (SBE) is a binary encoding format for serialization/deserialization of a sequence of hierachically structured messages in a stream of raw binary data.
In the SBE framework, one first defines an XML schema -- an XML description
the binary layout of the messages. Typically, a special codegenerating tool
then reads this XML file, and produces a code (usually Java
or C++
) that
contains the message type declarations and read/write routines for them. The
generated code can then be linked with the business logic software.
In this chapter we'll describe such a code-generation tool for the OCaml programming language.
At the root of it, SBE relies on a small number of common types: ASCII-characters, signed or unsigned integers of various sizes and IEE754 floating-point numbers.
In the XML schema any such 'primitive' type might be augmented by one of the following modifiers:
- It can have a
nullValue
, which denotes the null or none value for the field of that type. - It can have an associated
length
value, meaning that the given field is a sequence of values of that type. - It can have a
constant
"presence" -- in that case the corresponding field is never read or written to the binary stream. Instead, the field is always equal to a constant value, provided in the XML file.
The following conversions between the SBE simple types and their modifiers is used.
SBE Type | OCaml type |
---|---|
int8 | int |
int16 | int |
int32 | Int32.t |
int64 | Int63.t |
char | char |
SBE type modifier | OCaml parametrized type |
---|---|
length | 'a list |
nullValue | 'a option |
Three kinds of "complex" types can then be constructed based on the primitive types described above:
- The
composite
type is a sequence (a record) of fields of various types. - The
enum
represents a number of mutually-exclusive cases. Each case encoded with a constantcase-id
, provided in the XML schema. - The
set
is a collection of boolean fields, packed into a single bit field.
For enum
and set
types, an encodingType
name must be provided.
The composite types are represented as OCaml record-types with each record entry having the corresponding primitive type.
<composite name="FLOAT">
<type name="mantissa" primitiveType="int64"/>
<type name="exponent" primitiveType="int"/>
</composite>
type t_FLOAT = {
f_FLOAT_mantissa : int64;
f_FLOAT_exponent : int
}
The enum types are represented as OCaml variant-types (2a) with each variant case being a constant.
<enum name="LegSide" encodingType="uInt8">
<validValue name="BuySide" >1</validValue>
<validValue name="SellSide">2</validValue>
</enum>
type t_LegSide =
| V_LegSide_BuySide
| V_LegSide_SellSide
If the enum
type has a nullable encodingType
, then one extra case is added to the variant.
<enum name="AggressorSide" encodingType="uInt8NULL">
<validValue name="NoAggressor">0</validValue>
<validValue name="Buy">1</validValue>
<validValue name="Sell">2</validValue>
</enum>
type t_AggressorSide =
| V_AggressorSide_NoAggressor
| V_AggressorSide_Buy
| V_AggressorSide_Sell
| V_AggressorSide_Null
Finally, the set types are treated as records, but with all entries
being of the boolean
type.
<set name="SettlPriceType" encodingType="uInt8">
<choice name="Final">0</choice>
<choice name="Actual">1</choice>
</set>
type t_SettlPriceType = {
r_SettlPriceType_Final : bool;
r_SettlPriceType_Actual : bool;
}
In the XML schema, the declaration of all the necessary simple and complex types is followed by the declaration of various messages. Each message contains a block of fields that are always present in the message, followed by a number of variable-sized groups. Each group is stored as a sequence of repeated blocks, with each block containing the same fixed number of fields.
The OCaml codegenerator takes as an input the XML schema file (set with -i
flag) and writes three files into a specified directory (set with -d
flag)
$ _build/default/src-codegenerator/cme_codegen.bc -i templates.xml -d outputdir
$ ls outputdir
message_types.ml readers.ml writers.ml
The message_types.ml
file contains all the OCaml type declarations. At the
very bottom of the file, the message
type is declared -- it encompasses all
the messages in a single variant type.
The readers.ml
and writers.ml
files contain the reading and writing
routines for various types, for individual messages and for the top-level
Message_types.message
.