The Motivation Behind Adaptive Analytics and CEP
This is a continuation of The Genesis of Complex Event Processing: Asymmetric Capabilities and CEP, and Event Noise and Asymmetric Event Processing, where I have been discussing the motivation behind CEP and adaptive analytics in cyberspace.
Around the same time that Professor Luckham and his team were working on CEP applications in network management and security management, I was leading efforts to build network and security management control centers for the United States Air Force. In the beginning, dating back to 1994, my Internet-related work was for Air Combat Command (ACC), working out of ACC headquarters at Langley Air Force Base.
In 1997, I led a technical team that developed countermeasures against an actual distributed Internet-based attack on the Langley AFB SMTP email infrastructure. This attack was documented in a technical paper, E-Mail Bombs and Countermeasures: Cyber Attacks on Availability and Brand Integrity, IEEE Network Magazine, Vol. 12, No. 2, pp. 10-17, March/April 1998. In addition, this attack and the countermeasures I designed were featured in Popular Science Magazine in a 1998 article, War.Com, and in other news channels. I also published a number of related papers on this topic.
Our team used a rule-based approach for countermeasures against massive email bomb attacks on the Langley Air Force Base email infrastructure. We called this rule-based system BombShelter, and it was written in Perl. I developed both the original software architecture and the original working prototype for BombShelter (in two days), and then we turned the software over to our team, who used the rule-based approach for daily attack countermeasures.
I watched for days, and then weeks, as my team designed rules, and the attackers wrote new attacks that circumvented the rules. Some folks in the Pentagon used to say that I “led the effort to fight the first war in cyberspace”. It might have been the first cyberwar, I am not sure, but it was certainly the first publicly documented cyberwar. There is no doubt about this.
Without getting into all the historical footnotes and significance of this cyberwar that was fought with experts and rule-based systems, I would like to jump to an important conclusion.
Rule-based systems are useful, but have limited functionality and scaleability in most complex event processing applications.
Rule-based systems are human resource intensive because rule-based systems cannot learn and adapt on their own, humans learn and then write new rules. This is how rule-based systems work.
This is why I spend a lot of time searching for new, more efficient and adaptive methods as alternatives to rule-based systems. After extensive research, I published a series of papers on the future of intrusion detection in the Internet. One of them, Intrusion Detection Systems & Multisensor Data Fusion - Creating Cyberspace Situational Awareness [1], helped lead an evolution in Internet security, particularly in the area of network-based intrusion detection systems (IDS).
In my published research work, motivated by limitations with rule-based approaches, I used the same mature functional model that is used to process missile attacks, control global air traffic, and other complex event processing applications in physical space; but I applied these concepts to cyberspace.
Around the same time, Professor Luckham and others were working on similar problems, all related to real-time detection and response to threats in cyberspace. They were also funded by the US government.
Sidebar: Stream processing of transaction-based systems (databases), another area of interest, was focused on a totally different problem, which was low-latency straight-through processing in database-oriented systems. These stream processing systems were, and remain, rule-based systems. The problems we were trying to solve in cyberspace, however, cannot be efficiently and pragmatically solved by rule-based systems alone. Only relatively simple scenarios can be efficiently detected by rule-based stream processing systems.
The vast majority of complex event processing classes of problems require rules plus advanced algorithms that can learn and adapt in real-time. I know this, not from reading papers or taking university classes on rule-based systems, but from working on some very challenging operational problems in real-time. This is why I remain interested in complex event processing and why I continue to elaborate on why rule-based systems have limitations.
OK, I’ll bite.
You say that “Rule-based systems are human resource intensive because rule-based systems cannot learn and adapt on their own, humans learn and then write new rules. This is how rule-based systems work.”
Well, for most commercial BRMS products, that is certainly true. I certainly haven’t seen business rules engines that double as learning systems. Maybe some will evolve in that direction over time.
The trouble is that you don’t say ‘BRMS’ - you simply say ‘rule-based systems’. This is surely not a logical statement. Learning systems, almost by definition, are rule-based systems, and must, by definition, have the ability to generate new rules through inference. Some learning systems, like SOAR for example, are not only rule-based, but are built around the same core technology used by many BRMS products.
Learning systems are rule-based systems, so I am finding it difficult to understand the point you are making. Did you really mean simply that modern BRMS technology doesn’t generally implement learning or adaptive features? If you did, I agree with your observation.
Hi Charles,
Thanks for “biting”. I am speaking of rule-based systems in the context of expert systems as defined in the field of AI and related fields. For example, see this link:
http://www.ramalila.net/Adventures/AI/rule_based_systems.html
Rule-based systems do not “learn”, according to AI theory, which is different from Neural Nets, which can “learn” or be “trained”. For example, from the same website, see:
http://www.ramalila.net/Adventures/AI/neural_networks.html
This is also true for Bayesian Networks (BN), Self-Organizing Feature Maps (SOFM), and a number of other analytics which can be classified as “learning algorithms”.
Rules, as I understand them, are generally described in expert systems as IF-THEN-ELSE processing; at least that is how the AI references refer to rules.
If we take, for example, a rule-based system and “bolt it” on to a network without humans writing rules, you get nothing. On the other hand, if you “bolt” a learning algorithm on to a network, and give it a bit of a “kick start” it can “learn” similar to how Neural Nets (NN) are used for anomaly detection in network security and telecommunication fraud, etc.
Since it is not feasible to know every possible “rule” to detect patterns of abuse or misuse, we use rules (signature detection) and also anomaly detection (for example, a NN that has been trained to know what normal usage looks like) that detects deviations from normal.
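To make the "rules plus anomaly detection" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the event fields (`sender`, `size_kb`, `msgs_per_min`), the two signature rules, and the simple mean-and-standard-deviation detector are all hypothetical, and the statistical baseline stands in for a trained neural net rather than being one.

```python
from statistics import mean, stdev

# Hypothetical hand-written signatures: each is an IF-THEN test on one event.
SIGNATURES = {
    "oversized_message": lambda e: e["size_kb"] > 10_000,
    "known_bad_sender": lambda e: e["sender"] in {"bomber@example.com"},
}


def signature_hits(event):
    """Return the names of the hand-written rules that this event matches."""
    return [name for name, rule in SIGNATURES.items() if rule(event)]


class RateAnomalyDetector:
    """Learns a baseline message rate and flags large deviations from it.

    A z-score test is used here as a stand-in for a trained model.
    """

    def __init__(self, threshold=3.0):
        self.threshold = threshold
        self.mu = None
        self.sigma = None

    def train(self, normal_rates):
        # "Kick start": learn what normal usage looks like from sample data.
        self.mu = mean(normal_rates)
        self.sigma = stdev(normal_rates)

    def is_anomalous(self, rate):
        if self.mu is None:
            raise RuntimeError("detector has not been trained")
        if self.sigma == 0:
            return rate != self.mu
        return abs(rate - self.mu) / self.sigma > self.threshold


def detect(event, detector):
    """Run signature detection and anomaly detection in tandem."""
    alerts = signature_hits(event)
    if detector.is_anomalous(event["msgs_per_min"]):
        alerts.append("rate_anomaly")
    return alerts


detector = RateAnomalyDetector()
detector.train([18, 22, 20, 19, 21, 20])

# A flood from a sender that no rule knows about is still caught.
novel = {"sender": "unknown@example.org", "size_kb": 4, "msgs_per_min": 400}
print(detect(novel, detector))
```

In this sketch no human had to write a rule for the new sender; the anomaly detector flags it because the rate deviates from the learned baseline, while the signatures continue to catch the known patterns.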
So, I am working from detection theory using terms from the broad field of AI. Hopefully, I am doing it correctly!
If you think we should blur the lines by calling all this detection theory “rule-based” then you will need to redefine “rule-based” in AI, isn’t that right?
Did I understand you correctly? Or are we simply converging on semantics?
Yours faithfully, Tim
Thanks Tim. We’ve had discussions like this before, of course, and will probably continue to do so in future. We certainly do seem to be starting from different places here, but hopefully we can converge on semantics.
I don’t believe there is one definitive source of authority with regards to defining the term ‘rule-based system’ from an AI perspective. To understand what a ‘rule-based system’ is, we would first need to define what a ‘rule’ is. That is a surprisingly complex subject. Typical rule taxonomies recognise reactive and transformational rules, terms, integrity constraints, alethic and deontic rules, etc. Production rules stand apart from these taxonomies in that they are a curiously informal concept, and can be mapped to various more formal concepts. For example, I came across a claim a little while ago that production rules are Horn clauses. In truth, they can be used to represent Horn clauses, but are certainly not constrained in this way.
Rule-based systems certainly don’t ‘learn’ if they are not programmed to do so (and they often are not). However, as there is a significant history of building rule-based learning systems, and as much of this effort is rooted ultimately in the work done by people like Allen Newell and the like, I don’t think your statement re. AI theory can possibly be regarded as correct. Rule engines are typically associated with inference. Most commercial business rules engines draw the line at data inference. However, more advanced applications of production system technology support rule inference as well. As they chain, they infer new rules and add these to their knowledge base (e.g., they extend their inference network by adding new nodes). This is surely the clearest example of what is meant by a ‘learning system’. Expert systems typically require human input to add knowledge, but this is not a technological constraint - it simply reflects the type of applications that expert systems are traditionally used to implement. There is absolutely no reason why rule-based systems cannot infer new rules from the application of existing rules, or be used in conjunction with analytics to infer new knowledge. In truth, there is the widest spectrum of possibilities here.
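The difference between data inference and rule inference can be shown with a toy Python sketch. This is an assumption-laden illustration, not how SOAR or any Rete-based engine is implemented: rules are modelled as simple (premises, conclusion) pairs, the fact names are hypothetical, and "rule inference" is reduced to composing two chained rules into one new shortcut rule.

```python
def forward_chain(facts, rules):
    """Data inference: apply rules repeatedly until no new facts appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def infer_rules(rules):
    """Rule inference: compose chained rules into new rules.

    If rule A concludes something that rule B needs, add a rule that goes
    straight from A's premises (plus B's other premises) to B's conclusion.
    """
    learned = set(rules)
    changed = True
    while changed:
        changed = False
        for prem_a, concl_a in list(learned):
            for prem_b, concl_b in list(learned):
                if concl_a not in prem_b:
                    continue
                new_premises = prem_a | (prem_b - {concl_a})
                new_rule = (new_premises, concl_b)
                if concl_b not in new_premises and new_rule not in learned:
                    learned.add(new_rule)
                    changed = True
    return learned


# Hypothetical hand-written rules for an email bomb scenario.
rules = {
    (frozenset({"many_msgs", "same_sender"}), "mail_flood"),
    (frozenset({"mail_flood", "spoofed_header"}), "email_bomb"),
}

learned = infer_rules(rules)
shortcut = (
    frozenset({"many_msgs", "same_sender", "spoofed_header"}),
    "email_bomb",
)
print(shortcut in learned)  # True: the rule base now holds a rule nobody wrote
```

Whether adding composed rules like this counts as "learning" in the sense used in detection theory is exactly the point under discussion; the sketch only shows that a rule base can grow without a human writing the new rule.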
I’m sorry to be so combative, but I do feel that you have not fully grasped the scope and breadth of rules-processing technology, the complexities of that technology or its true relationship to analytics, reasoning over uncertainty, cognitive systems and the like.
I’ll give you a hostage to fortune, just to make things interesting. Have a read of the following PDF.
http://www.soartech.com/projects/16%20SoarOverviewWP.pdf
You will find plenty of language in there that will seem, at first glance, to give some weight to your position. The authors are very keen to distinguish their technology from ‘typical’ rule-based systems. The technology in question (SOAR) is characterised by a number of advanced concepts that go way beyond anything you will find in a business rule engine or an expert system. They also try to relegate the centrality of production rule processing in defining what SOAR is there to do.
That’s fine, but at the end of the day, SOAR remains a complex learning/cognitive system built over a rules engine (using the Rete algorithm). Next to OPS5, SOAR is undoubtedly the second most widely used system in academic research into Rete and production systems. SOAR has the same roots as most of the commercial business rules processing systems. It isn’t a business rule engine. It isn’t an expert system. It certainly is rule-based at its very core.
Hi Charles,
No need to apologize. I don’t find you combative at all, you are making good points. To give your reasoning the considered response it deserves, I’ll need to read the references and think about what you have said.
You might be right, that I have missed something along the way that limits my full grasp of the subject, and if so, I’ll be happy to take a fresh look and learn some new concepts on rules-processing technologies.
I would like nothing better than to have more faith in rule-based approaches, so if there are advanced concepts to review, I am all eyes and ears.
Yours faithfully, Tim
I thought I’d chime in on the topic. From my limited study of the existing literature around expert systems, business rule engines, case-based reasoning, machine learning and production rules, the definitions do vary quite a bit.
Many of the older papers on expert systems were used to build machine learning applications. One can view an expert system as the foundation for building machine learning applications. On top of that, the developer must develop the metarules, ontology and knowledge extraction routines to build machine learning. The field of machine learning varies dramatically, from user-guided training to statistical methods. Then there’s the combination of various techniques to complement each other.
No single technique is able to handle all situations, so it requires integrating several different approaches. My knowledge of the field is limited, but it’s pretty clear to me the field is diverse. It would take 20-30 years to get a solid understanding of the field of machine learning.
For those interested in machine learning, I would highly recommend Stanford’s paper on Stanley, the autonomous vehicle that won the Grand Challenge. The paper gives a tiny glimpse of what it takes to build a complex machine learning application.
peter
Well, this is where we always seem to end up in general agreement. Many different approaches need to be integrated together. I couldn’t have put it better.
I was really struck, a few weeks ago, at finding some ancient journal article (from the mid 1980s) on commercial expert systems designed for the new-fangled world of ‘microcomputers’. Most of them happily managed to pack Bayesian or Dempster-Shafer-based analytics into the same small address space inhabited by the rule-based system. Nothing new under the sun!
Hello Again Charles,
(Hi, Peter, great to see you here!)
Charles, I did a quick, and probably woefully inadequate, survey of SOAR and a few related SOAR papers and manuals. I think I might see where our disconnect lies. Let me try to explain, and perhaps you can find what I missed or left out.
You are discussing rules in the context of goal-directed learning systems, for example your SOAR example. In layman’s terms, I would say SOAR is a scheduling-oriented application. I have previously blogged about how rules are critical to the scheduling process and I believe, if my memory is not failing me, that I addressed this in terms of distributed computing architectures like blackboards.
The scheduling function is critical (very critical) to a large class of applications and this is a very interesting technology area.
My post, however, was toward detection theory, not scheduling, or goal-driven processing. CEP, by definition, is about “detecting opportunities and threats” and I was not able to find any examples of SOAR in a detection-oriented application scenario. (I do admit, however, that scheduling is critical in detection processing, so it is not possible to entirely decouple the two topics, which I am seemingly attempting to do!)
Back to detection theory, which is one of the angles that I am coming from, it is well established (sorry for the “it is well established” phrase) that both pattern matching (signature detection) and anomaly detection algorithms, working in tandem, are required for most classes of complex detection-oriented problems.
I believe this was the original premise of my post. So upon further review, you have made a strong (and good) argument for an advanced rules application that is interesting, but seemingly orthogonal, to the core detection concept that is at the heart of CEP.
What did I miss?
Yours faithfully, Tim
Well, now, I am no SOAR expert, but I understand SOAR to be a ‘cognitive’ system. It was designed to emulate human cognitive abilities (including the ability to learn), and is based on some very specific theories, I believe, about the way human brains work. In the same way that SOAR is not a business rules engine and not an expert system, it isn’t a CEP system or event processor.
My point, of course, is that it is rule-based, and it is also a learning system (amongst other things). I was reacting very specifically to your statement that “Rule-based systems are human resource intensive because rule-based systems cannot learn and adapt on their own, humans learn and then write new rules. This is how rule-based systems work.” Of course, that statement was made in the context of the statement just above, which is specifically to do with CEP. Given that rule systems can (if built to do so) learn and can adapt on their own (two of the central features of a cognitive system), I wonder if the statement you highlighted in yellow remains intact? Why shouldn’t rule-based cognitive systems be used within intrusion detection applications in order to remove any need for intensive human interaction?
Of course, there are not too many rule-based cognitive systems out there. I note, though, that Sandia National Labs live in this same space. They are the people responsible for one of the best known Java rules engines (JESS). I don’t know how much of a tie-up there is there.
Hi Charles,
Great discussion; however, no information has surfaced that has caused me to retract the statement I made in the original blog post (that you ask about):
“Rule-based systems are useful, but have limited functionality and scaleability in most complex event processing applications.”
I am not a member of any group or team that believes all (or most) of the world’s detection-oriented problems can be solved with rule-based systems. Nor am I a believer in reductionism.
On rules, I have never said that rules were not “good”, I have said they are “not enough” and “limited” in both “functionality” and “scaleability”. I think this is absolutely a true statement.
Since you have kindly shared some reading literature with me, I would hope you would take up my offer and review the key concepts in this book:
Handbook of Multisensor Data Fusion
http://www.amazon.com/Handbook-Multisensor-Electrical-Engineering-Processing/dp/0849323797
The leading experts in the field of sensor fusion (who have been doing CEP longer than any of the “CEP companies in the news”) have not attempted to reduce the art-and-science of MSDF to only rule-based systems.
Yours sincerely, Tim
I think it’s pretty clear that a rule-only solution won’t scale to handle a very dynamic pattern discovery/detection environment. My takeaway from the discussion and from reading existing papers on machine learning is that it really requires a wide mix of technologies. All machine learning systems start out with a base set of rules from which to grow.
Going back to Sebastian Thrun’s paper on Stanley: teaching a system to drive autonomously required a wide mix of techniques. The systems that weren’t able to adapt and distinguish noise from useful data weren’t able to complete the course. Here is a link to the paper: http://robots.stanford.edu/papers/thrun.stanley05.pdf
In many ways, these are complementary technologies. It just takes years and years of dedicated study to learn how to use them effectively.
peter
Thanks Tim. I concur with pretty much everything you say in your latest comment. Although I have a very specific interest in rule-based processing, I trust that I avoid taking the extremely naive (and plain wrong) position of suggesting that rules-based systems are the panacea for the areas of computing which you highlight on this blog. As I pointed out above, we have tended, in the past, to arrive at general agreement that CEP is a multi-disciplinary subject which requires many different approaches used in combination.
To state my position again, I was specifically reacting to the statement that “Rule-based systems are human resource intensive because rule-based systems cannot learn and adapt on their own”. They can, if they are designed to do so, and there is a reasonable amount of prior art to illustrate this.
Your comment about reductionism hints at the underlying profound nature of this discussion. Forgive me if I am being unfair to you by thinking that this was meant as a criticism of the position you believe I am taking (maybe I am paranoid). From my perspective, it is precisely because I am so conscious of the need to avoid reductionism that I reacted to your statement about rules engines. I don’t really see the concept of rules, or rules-processing, as existing in precisely delineated boxes (e.g., specific types of rules engine technology or specific types of application). In an absolute sense, I don’t believe in the concept of ‘rules’ as a useful first-class notion in computer science at all! After all, pretty much anything that constrains a computational device could be classified as a ‘rule’ – every opcode represents a rule of sorts (“if at the current program location, move register ‘a’ to register ‘b’”). Rules processing is a broad and multifaceted subject. From this perspective, statements that rules systems can’t ‘learn’, or can’t reason over uncertainty, or can’t scale make little sense to me. In practice, to believe this not only runs the risk of missing the contribution that rules-based approaches can offer within the greater picture, but actually partitions the world of computation in ways that don’t reflect reality.
Now, if you meant that modern business rules applications and expert systems have only limited application to CEP, I would absolutely agree with you.
Hi Charles,
Well, I think the literature abounds with statements that rule-based systems are neither optimal learning systems nor scaleable for large-scale problems.
Rules are useful, but there is a lot more to solving complex problems than rule-based systems, pattern matching and signature detection.
Yours faithfully, Tim
Dear Charles,
I forgot to mention that I admire your passion for rule-based systems and your professional on-line discussion style. You are an admirable professional; and I also like your blog, and will add it, if you don’t mind, to our blog roll.
Yours faithfully, Tim
I recommend Charles’s blog to people. He’s knowledgeable and takes time to think things through. Unlike my blog, Charles’s blog doesn’t contain nasty typos and errors.
Thanks Tim. And I reciprocate those feelings (let’s have a ‘love-in’ here) - I’m fascinated by the world of CEP and the opportunities it presents, and am therefore a regular visitor to your site which is one of the best places to visit for news and views on CEP. Like others, I particularly appreciate the way in which you regularly draw attention back to the core, and distinctive, characteristics and value of CEP.
No doubt we will have similar discussions from time to time. We have somewhat different viewpoints on the nature of rule processing. It occurred to me that, human nature being what it is, I tend only to react to the few things on which I disagree with you, rather than the many things on which I agree.
I thought I had added some CEP sites to my own blog roll, but I seem only to have a section on rule engine-related blogs. I will be starting a new section, and adding your site to it.
Dear Charles and Peter,
Thanks for the great discussion. I look forward to the next time we meet in cyberspace.
Also, Charles, thanks for adding my blog to your blog roll. I hope our Google page rankings go up by two!
Yours sincerely, Tim
Hi Tim, You have raised a good issue and follow up comments are equally great.
Based on my industry experience, none of the self-learning systems are good for the medium to long term. They require human intervention just like other static systems. And if they are not monitored well, they may well over-learn and overfit, causing more harm than benefit.
Regarding BRMS systems, they can very well be automated too. While building a Decision Management system for a sub-prime lender, we took the data from their operations and bureau, and then built variables over it. This data was put through a cleanser and then pushed to CART. The segments which had less than 50% of the average performance of the overall sample led to forming new rules. This system worked well but needed our intervention once every 3 to 6 months.
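The rule-forming step described above can be sketched roughly in Python. This is only a guess at the shape of such a process, not the system that was actually built: the segment labels are assumed to come from the leaves of a CART tree (the tree itself is not built here), `performance` is assumed to be 1 for a good outcome and 0 for a bad one, and the record fields and the `refer_for_review` action are hypothetical names.

```python
from collections import defaultdict


def propose_rules(records, ratio=0.5):
    """Propose a rule for each segment whose mean performance falls below
    `ratio` times the mean performance of the overall sample."""
    overall = sum(r["performance"] for r in records) / len(records)

    by_segment = defaultdict(list)
    for r in records:
        by_segment[r["segment"]].append(r["performance"])

    rules = []
    for segment, values in sorted(by_segment.items()):
        segment_mean = sum(values) / len(values)
        if segment_mean < ratio * overall:
            rules.append(f"IF segment == '{segment}' THEN refer_for_review")
    return rules


# Hypothetical sample: segment labels stand in for CART leaves.
sample = (
    [{"segment": "A", "performance": p} for p in (1, 1, 1, 0)]
    + [{"segment": "B", "performance": p} for p in (1, 1, 0, 1)]
    + [{"segment": "C", "performance": p} for p in (0, 0, 0, 1)]
)

for rule in propose_rules(sample):
    print(rule)
```

Re-running a step like this on fresh data is one way new rules could appear without hand-authoring, although, as noted, the results would still need periodic human review.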
So the underlying argument about BRMS limitations, when compared to self-learning systems, is not one I buy.
Correct me if I have gone the wrong way.
Bhupendra