Clouds (Partially Ordered Sets) – Streams (Linearly Ordered Sets) – Part 2
Here is my follow-up note on posets (partially ordered sets) and tosets (totally or linearly ordered sets) as background set theory for event processing, and in particular CEP and ESP.
In my last note, we discussed posets and tosets in the context of ESP (tosets) and CEP (posets), and we confirmed, via set theory, that tosets (chains) are a special case of posets with the added property of comparability: every pair of elements is ordered one way or the other. Kindly refer to this link for a quick review and this link for my prior post on set theory.
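To make the comparability property concrete, here is a minimal sketch that checks the poset axioms (reflexive, antisymmetric, transitive) and then the extra toset requirement that every pair is comparable. The function names and the sample sets are my own illustrations, not taken from any ESP or CEP product.

```python
from itertools import combinations


def is_partial_order(elements, leq):
    """Return True if leq is reflexive, antisymmetric and transitive on elements."""
    elems = list(elements)
    reflexive = all(leq(a, a) for a in elems)
    antisymmetric = all(
        a == b or not (leq(a, b) and leq(b, a))
        for a in elems for b in elems
    )
    transitive = all(
        leq(a, c) or not (leq(a, b) and leq(b, c))
        for a in elems for b in elems for c in elems
    )
    return reflexive and antisymmetric and transitive


def is_total_order(elements, leq):
    """A toset is a poset in which every pair of elements is comparable."""
    elems = list(elements)
    comparable = all(leq(a, b) or leq(b, a) for a, b in combinations(elems, 2))
    return comparable and is_partial_order(elems, leq)


# Divisibility on {1, 2, 3, 6}: a poset, but 2 and 3 are incomparable.
numbers = [1, 2, 3, 6]
divides = lambda a, b: b % a == 0
print(is_partial_order(numbers, divides))  # True
print(is_total_order(numbers, divides))    # False

# Distinct trade timestamps under the usual <=: a toset (a chain).
timestamps = [100, 120, 140]
print(is_total_order(timestamps, lambda a, b: a <= b))  # True
```

The divisibility example plays the role of a small "cloud" and the timestamps play the role of a "stream".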
On CEP-Interest we have enjoyed excellent discussions of how many ESP software vendors process tosets (like stock market data for a particular stock) and reorder out-of-order events/transactions. The idea to note here, for our discussion on set theory, is that market data for a particular stock is a toset, ordered by the time of trade execution. If these transactions are then sent to an event processing server, for example a nice ESP server by one of our friends here, the ESP server may have to reorder events that arrive out of order, but the underlying stream of market data is a toset, by definition.
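As a rough illustration of that reordering step, the sketch below buffers trades in a heap and releases them in execution-time order. It is not how any particular vendor does it. The `Trade` fields, the `max_delay_ms` bound, and the sample prices are all assumptions of mine; in particular, I assume no trade arrives more than `max_delay_ms` later (in execution time) than the newest trade seen so far, and I use a sequence number to break ties so that every pair of trades stays comparable.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Trade:
    exec_time_ms: int                     # primary ordering key
    seq: int                              # tie-breaker, keeps the order total
    price: float = field(compare=False)
    size: int = field(compare=False)


def reorder(trades, max_delay_ms):
    """Yield trades in execution-time order.

    Assumes (hypothetically) that a trade is never more than max_delay_ms
    behind the latest execution time already seen.
    """
    heap = []
    latest = None
    for trade in trades:
        heapq.heappush(heap, trade)
        latest = trade.exec_time_ms if latest is None else max(latest, trade.exec_time_ms)
        # Anything at least max_delay_ms older than the newest trade is safe to emit.
        while heap and heap[0].exec_time_ms <= latest - max_delay_ms:
            yield heapq.heappop(heap)
    # End of input: flush whatever is still buffered, in order.
    while heap:
        yield heapq.heappop(heap)


# Arrival order differs from execution order (the 120 ms trade arrives late).
arrivals = [
    Trade(100, 1, 9.21, 100),
    Trade(140, 3, 9.23, 200),
    Trade(120, 2, 9.22, 300),
    Trade(160, 4, 9.24, 100),
]
ordered = list(reorder(arrivals, max_delay_ms=30))
print([t.exec_time_ms for t in ordered])  # [100, 120, 140, 160]
```

The server changes the order of arrival, not the order of the set: the trades were always a chain under execution time.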
Now, in tosets, what are the main properties that define the notion of “order” or “comparability”? Well, there can be many, for example: time, causality, association, taxonomy, ontology, etc. Market data, of course, is generally processed as a toset where the time of execution is the property that makes the trades comparable. Generally, ESP applications do not look, at this point in time, for the causality behind why one trade was executed at $9.213 per share and the next, 20 ms later, at $9.222 per share. Instead, ESP applications today generally run continuous queries across defined sliding time windows of market data and calculate some interesting value, such as VWAP (volume-weighted average price).
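For readers who have not seen one, a continuous VWAP query over a sliding time window can be sketched as below. VWAP is the sum of price times size divided by the sum of size over the trades in the window. The class name, the one-second window, and the trade sizes are hypothetical; only the two prices and the 20 ms gap come from the example above. The sketch assumes trades arrive in time order with positive sizes.

```python
from collections import deque


class SlidingVwap:
    """Volume-weighted average price over the last window_ms of trades."""

    def __init__(self, window_ms):
        self.window_ms = window_ms
        self.trades = deque()   # (time_ms, price, size), oldest first
        self.notional = 0.0     # running sum of price * size in the window
        self.volume = 0         # running sum of size in the window

    def update(self, time_ms, price, size):
        """Add one trade (assumed to arrive in time order) and return the VWAP."""
        self.trades.append((time_ms, price, size))
        self.notional += price * size
        self.volume += size
        # The window covers (time_ms - window_ms, time_ms]; evict older trades.
        while self.trades[0][0] <= time_ms - self.window_ms:
            _, old_price, old_size = self.trades.popleft()
            self.notional -= old_price * old_size
            self.volume -= old_size
        return self.notional / self.volume


vwap = SlidingVwap(window_ms=1000)
values = [
    vwap.update(0, 9.213, 100),
    vwap.update(20, 9.222, 300),     # 20 ms later, as in the text
    vwap.update(1010, 9.250, 100),   # the first trade has now left the window
]
print([round(v, 5) for v in values])  # [9.213, 9.21975, 9.229]
```

Notice that nothing in the query asks why the price moved; it relies only on the time order of the toset.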
On the other hand, CEP defines a more general poset application for event processing, where, for example, the set of events may or may not be linearly ordered in time, yet causality is unknown. In this case, the comparable property of tosets is unknown, because the events may or may not be related or comparable, from a cause-and-effect perspective. One obvious example is the CEP application of fraud detection, where many distributed events happen and we want to determine the cause, and the effect, in real-time. The set of seemingly unrelated events is a poset because the relationship or comparability between all the members of the set of events is (currently) unknown. Dr. Luckham’s seminal CEP work in this area, at Stanford University, was based on debugging distributed systems, or problems in causality.
If we understood all the cause-and-effect relationships in a set of events, and we could determine the order, then we would have effectively transformed a poset into a toset, or an “event cloud” into an “event stream”. This is precisely what many classes of CEP applications are designed to do.
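In set theory terms, this step is finding a linear extension of the partial order, and a topological sort is one standard way to compute it. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The event names and the causal links between them are invented for illustration; a real CEP engine would have to discover those links first. Note also that when some events remain causally unrelated, more than one stream is compatible with the same cloud, so the sort returns one valid ordering rather than the only one.

```python
from graphlib import TopologicalSorter

# Hypothetical on-line banking events; each event maps to the set of events
# that are known (by assumption) to have caused it.
caused_by = {
    "login_from_new_device": set(),
    "password_change": {"login_from_new_device"},
    "add_payee": {"login_from_new_device"},
    "large_transfer": {"password_change", "add_payee"},
}


def linearize(caused_by):
    """Return one ordering of the events in which every cause precedes its effects."""
    return list(TopologicalSorter(caused_by).static_order())


stream = linearize(caused_by)
position = {event: i for i, event in enumerate(stream)}

# Check that the resulting chain respects every known cause-and-effect link.
respects_causality = all(
    position[cause] < position[effect]
    for effect, causes in caused_by.items()
    for cause in causes
)
print(stream)
print(respects_causality)  # True
```

Here `password_change` and `add_payee` are incomparable in the cloud, so either may come first in the stream; only the links we actually know are guaranteed to be preserved.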
Now, to close this note, let us consider the universal set of all events. Let us assume (to make this easier) that the master clock of the universe is so granular, so precise, say to a trillion decimal places (maybe like calculating Pi), that all events in the universe can be ordered in time. If we were only processing all events based on time order, then we would have a stream of events, or a toset.
On the other hand, if we are interested in causality in the universal set of events, then we have a poset, because we do not possess the observation and computing power to deal with such a set of events. So, what do we do?
We create subsets of the universal set, and we process these sets, because we possess this computing and observational capability. For example, we can work on sets of data called “market transactions” and run VWAP across a stream of stock transactions. The set is a toset. When we create subsets of events from the set of distributed network application events, like on-line banking, and look for causes and effects from seemingly incomparable events, we are operating on a poset. The first application, on totally ordered events, is what vendors are calling ESP. The second event processing application, based on processing posets, is what Dr. Luckham (and I and others) refer to as CEP.
Thank you for reading.