Distributed Event Processing

In my last post, Analytical Patterns for Complex Event Processing, I provided an overview of a few slides I presented in March of 2006 at the first event processing symposium, titled Processing Patterns for Predictive Business. In that same presentation (slide 15), I also introduced a generic high-level architecture (HLA) for event processing in the illustration below:

HLA for Distributed Event Processing

The figure above applies historical work from more than a decade ago on distributed blackboard architectures to complex event processing.

The genesis of the blackboard architectural concept was in the field of Artificial Intelligence (AI) to address issues of information sharing among multiple heterogeneous problem-solving agents.   As the name suggests,  the term “blackboard architecture” is a processing metaphor where intelligent agents collaborate around a blackboard to solve a complex problem.

Basically, the HLA consists of two key functional elements: (1) a distributed data set (the “blackboard”) and (2) “knowledge sources” (KSs) that function as (self-actuated or orchestrated) intelligent agents working together to solve complex problems.
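To make the two elements concrete, here is a minimal, illustrative sketch of the blackboard pattern. All names (Blackboard, KnowledgeSource, and so on) are invented for this example; a real distributed implementation would replace the in-memory dictionary with a shared or replicated data set.

```python
class Blackboard:
    """Shared data set that knowledge sources read from and write to."""
    def __init__(self):
        self.facts = {}

    def post(self, key, value):
        self.facts[key] = value

    def get(self, key):
        return self.facts.get(key)


class KnowledgeSource:
    """An agent that contributes when its precondition holds on the blackboard."""
    def __init__(self, name, precondition, action):
        self.name = name
        self.precondition = precondition  # Blackboard -> bool
        self.action = action              # Blackboard -> None

    def try_contribute(self, board):
        if self.precondition(board):
            self.action(board)
            return True
        return False


def run(board, sources, max_cycles=10):
    """Simple orchestrator: cycle until no knowledge source can contribute."""
    for _ in range(max_cycles):
        if not any(ks.try_contribute(board) for ks in sources):
            break


# Usage: two knowledge sources cooperate to derive a conclusion from raw events.
board = Blackboard()
board.post("raw_events", [3, 7, 12])

aggregator = KnowledgeSource(
    "aggregator",
    lambda b: b.get("raw_events") is not None and b.get("total") is None,
    lambda b: b.post("total", sum(b.get("raw_events"))),
)
classifier = KnowledgeSource(
    "classifier",
    lambda b: b.get("total") is not None and b.get("alert") is None,
    lambda b: b.post("alert", b.get("total") > 20),
)

run(board, [aggregator, classifier])
print(board.get("alert"))  # the classifier can only fire after the aggregator
```

Neither agent knows about the other; they collaborate solely through what they post to and read from the shared blackboard, which is the essence of the pattern.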

When I was first introduced to Dr. David Luckham’s book, The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, I immediately understood that distributed (complex) event processing, which includes an event processing network (EPN) and collaborative distributed event processing agents (EPAs), follows the same generic architectural pattern as other distributed, collaborative problem-solving software architectures. This also made perfect sense to me considering Dr. Luckham’s strong background in AI at Stanford.

In a nutshell, in my mind, “CEP engines” should operate as intelligent agents collaborating to solve complex distributed computing problems. Professionally, I have a much stronger interest in collaborative, distributed, agent-based network computing than in stand-alone event processing.

Exciting complementary technologies for complex event processing are distributed object caching and grid computing, which I will discuss in more detail in a later post. Together, these architectures, analytics and technologies help paint a total picture of the future of event processing, at least in my mind.

7 COMMENTS

  1. I hope you are planning to lay out how this can scale in terms of end-to-end latency, not to mention how you are going to debug it? If you are going to make something correct and very fast, designing it with a high degree of complexity and non-determinism is just going to make your life miserable.

It is my claim that “intelligent” agents cannot be used in critical large-scale systems. The degree of determinism is not sufficient, and I do not expect the performance to scale.

  2. Interesting stuff. It sounds similar to my own work with distributed reasoning. Since 2000, I’ve been working with Said Tabet on an efficient way to distribute reasoning. The work resulted in a patent filing in 2004 for distributed reasoning. I stumbled on a technique for extending RETE to support distributed reasoning. You should be able to find the patent filing on fresh patents website.

    Collaborative agents or distributed agents work well for some situations, but in some cases they do not scale well. The work I did was inspired by a situation where the data is distributed and large. By large I mean potentially terabytes of data distributed across dozens of locations. My approach is to distribute just the beta node indexes, which allows the system to perform index joins. Rather than have agents share data, it’s much more efficient to distribute the indexes of the partial matches, which reduces the work each agent has to perform.

    peter

  3. Hi Peter and Lars,

    Yes, I agree with both of you, with some caveats. There are classes of applications where distributed architectures and specialization are appropriate, and other applications where distributed architectures can become unwieldy.

    In practice, we sometimes find the architectural decisions are technology based, but more often the architectural decisions are based on organizational politics, funding and culture.

    For example, if you have a large distributed network and you are working on fraud detection, insider trading detection, anti-money laundering detection or similar classes of detection-oriented applications, there are generally many administrative domains and internal organizations in the architecture.

    Some of these folks already have specialized “purpose built” detection systems and they are not about to give up administrative control. These stand-alone systems are only able to “see” a part of the overall situational picture. If we are working on integration and inference, we can model these specialized systems as event processing nodes with an eye toward interoperability and collaboration.

    Latency may or may not be the critical issue in these classes of applications. There are classes of applications, like detection-oriented solutions in market data, where latency (along with accuracy and determinism) is the key discriminator. On the other hand, this is not necessarily true for all classes of event processing applications.

    For these reasons, I disagree with generalizations that event processing and low latency are one and the same. It really depends on the business application.

    Thank you for your comments!

    Yours faithfully, Tim

  4. I agree completely that technology is just one factor. From my short experience with large institutions, politics plays a huge role. Each group has its own way of doing things, and integrating with different systems is a pain point. Whether the integration is done with messaging, SOA or some other component model, I find the hardest part is the politics. Distributing events, data grids, distributed agents and distributed reasoning are fascinating topics. Short blogs can only scratch the surface.

    peter

  5. Dear Peter,

    Thank you for your insightful comments.

    To elaborate a bit, one of the best operational examples, with which I have many years of direct experience, is network and security management for large global organizations.

    Naturally, there are folks who want visibility into the entire network so they can perform a variety of enterprise management tasks.

    Operationally, however, folks want to make sure that “events” that leave their administrative domain have been “seen” and “acted upon” before sending them outside of their domain.

    For this reason, and others, distributed cooperative architectures designed by architects with operational experience and business experience have a much greater chance of success than technology-driven architectures.

    The largest reason large IT projects fail is not technological; it is political. We can solve latency issues with out-of-band networks (and more money) if required. However, operational and organizational control issues are not so easily solved by purchasing more bandwidth or building out-of-band networks!

    To close, distributed architectures provide a functional framework for working within complex political and organizational constraints. In addition, the model for distributed architectures can always be simplified to mimic a centralized architecture; but the reverse is not necessarily true.

    Yours faithfully, Tim

  6. […] CEP Cloud? Most if not all of the Complex Event Processing (CEP) products today require the product purchaser to build a specific CEP cluster. It’s logical therefore to assume (possibly incorrectly) that CEP needs to move onto the compute grids that a large proportion of the investment banks are investing in. Since we are moving into cloud land, we might as well add CEP into the equation as well – CEP Cloud, or just Event Cloud. […]
