Trend Prediction in Network Monitoring Systems

Following up on Real-Time Predictive Analytics for Web Servers, I thought we should “move up a level” and look at various open network monitoring platforms with trend prediction capabilities.

Our web server management team picked Zabbix to monitor a busy production server, and only afterwards did we start to look into adding predictive analytics. Alberto recommended we look into The R Project for open source predictive analytics, which was interesting because I was just about to blog on TIBCO’s integration of S-Plus with Spotfire. Following up on Alberto’s recommendation, my research then led me to an interesting comparative analysis of S, S-Plus and R. (Thanks Alberto!)

Instead of writing on S, S-Plus and R today, I thought it might be good to take a look at potential trend prediction capabilities in network monitoring systems, especially the “open, free ones” under the GPL or a similar license. According to the Wikipedia chart “A short comparison between the most common network monitoring systems,” only 3 of the 40 listed NMS platforms have trend prediction capabilities: GroundWork Community, Osmius and Zenoss. Unfortunately for us, Zabbix does not yet have trend prediction capabilities. The Zabbix project leader says he plans to add this functionality “in the future,” which is not very encouraging, since we know neither when that will be nor what the functionality will look like.

Osmius claims to be event-oriented software with a “realistic and practical platform” for applying research and investigative results, including AI and event correlation processes. Osmius aims to reduce the volume of “final events” that must be processed to identify the root cause of problems, and to predict problems before they occur. Osmius also boasts an off-line data mining capability with a pattern language for discovering event occurrence patterns. We need to look into Osmius more and see if there is any substance to the marketing claims.

Unfortunately, we could not find any concrete trend prediction capabilities in GroundWork, especially in the free and open community version of the software. This makes sense since GroundWork is based on Nagios, and Nagios does not have built-in forecasting and predictive analytics. Also, a preliminary look into Zenoss was not very encouraging, as we could not find solid evidence of predictive analytics and forecasting functionality.

As for next steps, I think we’ll look a bit deeper into a few of these software platforms and see if we can find out exactly what forecasting methods they use, if any, for outage prediction. If anyone has any knowledge of or experience with these NMS event processing platforms and their capabilities regarding predictive analytics and outage forecasting, please comment. Thanks!
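
To make “trend prediction” a bit more concrete, here is a minimal sketch of about the simplest forecasting method one could imagine for a monitored metric: fit a straight line through recent samples by least squares and extrapolate to the point where it crosses an alarm threshold. This is only an illustration under my own assumptions. I do not know that any of the platforms above work this way, and the function names (fit_line, time_to_threshold) and the sample disk-usage numbers are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fit_line(times: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of: value = slope * time + intercept."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (time, value) samples of equal length")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("all samples share the same timestamp")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    return slope, intercept


def time_to_threshold(
    times: Sequence[float], values: Sequence[float], threshold: float
) -> Optional[float]:
    """Estimated time after the last sample until the fitted line reaches
    `threshold`. Returns None when the trend is flat or falling, because
    this sketch assumes a metric that alarms when it rises."""
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None
    crossing_time = (threshold - intercept) / slope
    # If the line is already above the threshold, report "now" (0.0).
    return max(0.0, crossing_time - times[-1])


# Hypothetical example: disk usage (%) sampled once an hour, alarm at 90%.
hours = [0, 1, 2, 3, 4]
disk_used_pct = [50, 52, 54, 56, 58]
print(time_to_threshold(hours, disk_used_pct, 90))
```

With these made-up samples the disk fills by two percentage points an hour, so the fitted line reaches 90% about 16 hours after the last sample. A straight line is obviously naive for bursty or seasonal metrics, which is exactly why it would be good to know what methods, if any, these platforms actually use.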

Also, I still have some blogging to do on TIBCO’s integration of Spotfire and Insightful’s S-Plus, both acquired by TIBCO last year, as I recall.   I am interested to see when and how TIBCO integrates off-line analytics (Spotfire, Insightful, S-Plus) with real-time event processing.


20 Responses to “Trend Prediction in Network Monitoring Systems”

  1. Hello Tim,

    I’m Constantino Malagon, a researcher from the Osmius development team. Let me tell you about predictive analysis in Osmius. We are developing a serious research project funded by the Spanish Ministry of Industry and carried out among several Spanish universities. In this research, we are applying sequential pattern mining algorithms to predict instance failures based on a historical event database, and we are using artificial neural networks for time prediction modeling. Using these predictions, Osmius could anticipate problems and take corrective actions before the predicted event occurs.
    We are currently in the final phase of this research project, and we are going to publish the results at an upcoming international monitoring conference.
    We also expect to incorporate these prediction modules into the Osmius architecture, making them available in upcoming software releases.

    Of course, any ideas or suggestions are welcome.

    Yours faithfully,
    Constan.

  2. Hi Constan,

    Thanks for visiting and for your comment.

    When will this event prediction capability be available in Osmius? Will this capability be freely available and open source?

    Yours faithfully, Tim

  3. Hello, Tim.

    A very interesting blog!

    The next release of Osmius will be in July 2009, and we’ll give a demo to the main developers around that date. If everything fits, they’ll start adding the new capabilities the following month, so expect it to be implemented before the end of 2009.
    Regarding the license, the Spanish company behind Osmius, Peopleware Spain, releases it under the GPL 2.0 license. Anyway, if you want more information about this you can ask the project manager directly: joseluis[dot]marina[at]peopleware[dot]es.

    Yours faithfully,
    Constan.

  4. Hi Constan,

    Thanks for visiting and for your reply. I am finalizing my evaluation of Zabbix now and will look into Osmius next, when time permits. So far, I have been pleased with Zabbix for monitoring a production web server (not for predictive analytics). I’ll write more on Zabbix soon.

    A few questions:

    (1) Do you have a user forum where users can ask questions and get answers?

    (2) Are you aware of any paper that compares and contrasts the various open source NMS platforms like Zabbix and Osmius?

    Thanks!

    Yours sincerely, Tim

  5. Hi Tim,

    Regarding the first question, there is a forum where users can discuss and ask any question about Osmius (installation, configuration, …):

    http://sourceforge.net/forum/?group_id=160907

    Maybe you already know it, but there is a very interesting study in which the author, Jane Curry, compares Nagios, OpenNMS and Zenoss.
    http://www.skills-1st.co.uk/papers/jane/open_source_mgmt_options.pdf

    If you are going to evaluate Osmius, please, let us know so we can help you in the process.

    Yours sincerely,
    Constan.

  6. Hi Constan!

    Thanks for visiting.

    Regarding the Osmius forum at sourceforge.net, that forum shows about 33 posts over a three-year period. So, you are basically saying that there is not an active Osmius user community compared to, say, Zabbix, where the user community has around 42,500 posts over a four-and-a-half-year period.

    How do you account for such a dramatic difference in both the size and user activity between the Osmius user community, and say, the Zabbix user community?

    Thanks.

    Yours faithfully, Tim

  7. Tim,

    The reason is that we have just released a full version of Osmius with a platform-independent installer, so anybody can download and install it in a few steps. In fact, you can see that Osmius is currently ranked #311 on the Sourceforge most active projects ranking, while Zabbix is ranked #711.

    Yours faithfully,
    Constan.

  8. Hi Constan!

    Sorry, my apologies for being so dense; however, I am not following your reply.

    Are you saying this is the first Osmius release?

    Also, the Sourceforge ranking is more related to development and downloads; my question is more specifically about the User Community (the social ecosystem of users). So far, I have not found an active Osmius user community online.

    Folks who use Osmius don’t enjoy discussing it?

    Yours sincerely, Tim

  9. Hi, Tim.

    Don’t worry, at least I really enjoy discussing Osmius and everything related to Open Source software :)

    No, this is the first easy-to-install release, and we expect that this will help foster a more active Osmius user community. And of course, it is directly related to the download ranking position, though it doesn’t guarantee that we are going to have an active user community.
    Anyway, I appreciate your suggestion because we know that the social aspect you have pointed out is very important for open source software.

    Thank you for this (I hope so) exciting discussion,

    Constan.

  10. Hi Constan!

    Great to get a reply from you. I wonder how much of my probing and questioning you can take!?

    Anyway, I was working on the Osmius install this evening; and I did not find the install to be “easy to install” as you described.

    Basically, we already have LAMP up and running with Tomcat. Osmius does not even provide a basic PHP script to set up the MySQL Osmius database; everything has to be done by hand, manually importing a series of .sql files.

    This type of manual, step-by-step process does not fall into the realm of “easy to install” :-)

    Anyway, I put the installation aside after I got to the boring “create your own MySQL database, step-by-step part”, that was just too boring for a “fun night out on the server”.

    On the point about “User Community”, there is very little that is more important (in my opinion) than a healthy and active user community. Zabbix is nice in that area, with a busy and vibrant community of users and developers, all interacting on various pieces and parts in a nice vBulletin forum/community.

    Cheers!

    Yours sincerely, Tim

  11. Hi Tim!

    Thanks for your interest in Osmius. As you noticed, Osmius is a young product, and that’s why there is not a big community around it today.

    You’re right: there is the easy way, an installer with a basic one-click wizard that installs everything you need in a couple of minutes, and the hard way, doing everything from the source code and following the instructions you can find in the Osmius Wiki (http://osmius.net/osmwiki).

    Whichever way you choose, please use the forums and post your questions and suggestions there instead of on this site, so we can:
    - Improve the visibility of the problems.
    - Make them accessible to more people.
    - Centralize the search for common problems.
    - Increase our not-so-big knowledge base.
    - Let the Osmius team optimize its efforts (we have plenty of things to develop or improve).

    We’ll try to answer your questions ASAP and we’ll do our best to keep you satisfied.

    Thanks again for your interest, and give us some time to grow ourselves, our research lines, and the user and developer community ;)

  12. Hi Joselu!

    Thanks for visiting! I think I will wait until you guys get a stronger user community in place.

    No, I never finished the installation because I was too lazy when I got to the “add MySQL manually” part :-)

    I did not find any “one-click install file” sorry. Maybe you can post the link to it?

    Yours sincerely, Tim

    Here is the link to the one-click install file:

    http://sourceforge.net/project/showfiles.php?group_id=160907&package_id=311902&release_id=664430

    We look forward to your comments about the installer and Osmius. Thanks for your interest,

    David

  14. Dear David,

    Thanks. I will get back to you in more detail later.

    Yours faithfully, Tim

  15. Dear David, Joselu and Constan,

    I installed Osmius and, so far, have found the actual capabilities very minimal compared to Zabbix. Osmius is like a high level dashboard that provides very little information compared to Zabbix, which really lets us understand the details of every aspect of our host. In addition, because Osmius runs as a Tomcat application, it is very slow compared to Zabbix, which runs as a “bare bones” web app (http). In fact, I can configure Zabbix on my mobile over GPRS if necessary!

    So far, frankly speaking, I could not imagine using Osmius compared to Zabbix, I am sorry to say.

    Yours sincerely, Tim

  16. Hi Tim!

    Thank you very much for installing Osmius and for your comments, so we can learn from your experience and we can know where to focus with Osmius capabilities and features.

    I agree with you. The Osmius Java console needs to improve and show more information, and it is not built to work on mobile devices. Java and Tomcat give us great scalability in big installation environments.

    The main Osmius features that make it different (not better, but why not?) are:
    - Easy to monitor new “things” using C++ and native APIs. This lets Osmius be fast and extendable. And it is really multiplatform… thanks to ACE we use the very same code for Linux, Solaris and Windows.
    - Users can see not only that server X is having problems, but also whether these problems affect the Intranet and/or the book selling application. Business integrated.
    - Round Robin Database. You decide the level of detail you need for your data. For this week I want CPU load every minute. For last year’s data I only need the hourly average, etc.
    - Silent mode. Don’t send me events if the severity is unchanged. Only send events when there are significant changes, saving network resources and preventing starvation.
    - Benchmarking. We’ve tested Osmius with thousands of instances and services, collecting 1130 events/sec for 24 hours. We don’t allow a query to last more than 2 seconds.
    - SLA. You can define your own availability and severity objectives, and check them in the data warehouse Osmius offers. This data is calculated in batch mode without affecting operational activities.
    - Connection Pools. Osmius keeps connection pools, so when monitoring Oracle (or anything that needs a connection) the agent does not need to connect and disconnect every time it needs to get “free memory” or whatever, preventing resource starvation.
    - Notifications: You can subscribe to changes on services or instances and receive e-mail, SMS, Jabber messages, etc., depending on working times, holidays, time/date, etc.
    - ITIL: Osmius integrates some of the ITIL best practices. You can see which elements need to improve because of capacity planning trends, for instance.

    Osmius cannot compete with network-oriented tools, just as it cannot compete with database administration tools. We built it to be generic, to monitor everything in the same way. This has disadvantages, but it also allows us to reach other areas like industrial monitoring, the solar panel industry, etc.

    Let’s see what happens in the near future; we’re young and there are plenty of things to do.
    We’ll use your suggestions to improve Osmius. Make a light view perhaps, so you can see the console from your mobile? ;)

    Thanks again.

  17. Hi Joselu,

    Thank you for visiting and for your follow-up comments and patience. I will be posting a review of Zabbix in the main forums soon.

    One point: it is refreshing to see your comments that these “user experience” reviews and opinions are constructive. FWIW, the “CEP/EP event processing community” likes to “beat up” the (non financial services) users who find limitations in their software (go figure!). At least you guys are willing to listen, and that is a very good point. Thank you!

    In a nutshell, Zabbix can do most, if not all, of the things you outline above, and more. In fact, I can quickly extend Zabbix to monitor just about anything “our heart” desires. Currently, we are monitoring 10 Apache2 KPIs, 53 KPIs related to availability of the server, 46 KPIs related to the server file system, 6 KPIs related to file system integrity, 5 KPIs related to memory utilization, 171 MySQL parameters related to performance, configuration, administration, user accounts, caching, and much more (truly amazing!), 6 KPIs related to network performance, 8 parameters related to OS configuration, users, connections, etc., 13 additional performance-related KPIs, 8 KPIs related to monitoring Linux processes, 8 related to CPU load and more.

    I think we are currently monitoring about 340 parameters/KPIs. Many I built myself within a day or two of our Zabbix initial installation. I can graph each one easily and it is very fast. Triggers and alarms are also very easy. We can set sliding time windows over just about any KPI to reduce false alarms, basic rules-based event-event correlation is very easy, and alarms via email and SMS are equally easy.

    What is really great about Zabbix is how easy it is to add and extend the functionality, plus, as I mentioned earlier, it is quite fast.

    Also, when I write a review on Osmius, I will discuss other issues, such as terminology. I think you guys should adopt a more “agent based” vocabulary, versus calling your agents “master services” in one place, and “master agents” in another. In other words, your basic architectural vocabulary is not easy to understand. You use terms like “service” and “agent” to describe the same functions in different places, for example, and this makes Osmius confusing for the Osmius beginner, even when they have 20+ years of experience :-)

    Zabbix, on the other hand, is the best network and service monitoring tool I have seen so far. It is not limited to NMS; it is only limited by the user’s needs and ability to write simple scripts and build an XML template, which is really easy.

    Good luck and thanks for being so friendly and helpful! I hope Osmius grows to a very large community. We agree that there is a lot of work that needs to be done to get there.

    As a side note: I wish the CEP/EP software community would be one tenth as helpful as you guys!

    Yours sincerely, Tim

  18. Hi Tim!

    All your comments, suggestions and even the things you don’t like about Osmius ;) are TREASURES for us. So, thanks!

    As a result, we’ve decided to install Zabbix, Zenoss, Hyperic and OpenNMS in our lab’s testing environment and try to build the same monitoring infrastructure a typical Osmius customer needs, write a comparison report, and take a humble look at the results and conclusions to make our product easier for the user to deal with. We’ll share the results with you, if you don’t mind. Give us a couple of weeks.
    We’re also revisiting the documentation to standardize names and concepts as you suggest.

    Yours sincerely, Joselu

  19. [...] Trend Prediction in Network Monitoring Systems [...]

  20. [...] received some constructive criticism from Tim Bass on his blog, in a post about trend prediction in monitoring tools. Clearly, if you monitor something, you collect values that, once processed, can help you find [...]


Copyright © 2007-2008, The CEP Blog, All Rights Reserved.