Event Details

Please note: All times listed below are in the Central Time Zone.


Efficient Monitoring and Root Cause Analysis in Complex Systems

Real-life scenarios show that finding the root cause of a failure in a complex system can be complicated and time-consuming, leading, in the worst case, to lengthy outages. Operational real-time monitoring of the infrastructure is therefore crucial for quickly identifying and alerting on potential problems. But monitoring alone is not enough: when a failure occurs in a backend component such as memcached or a database, the configured alarms will typically trigger an avalanche of notifications for all affected services and resources. A flexible mechanism for defining and analyzing the relations between these services and resources is urgently needed.

In this presentation, we will show you how this can be achieved with Monasca and Vitrage, two OpenStack projects working together under the umbrella of the Self-Healing SIG. We will also refer to other possible integration points to implement fully automatic remediation.
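
To illustrate the idea of relating alarms to one another, here is a minimal sketch in Python. It does not use the Monasca or Vitrage APIs; the topology, the resource names, and the function names are all hypothetical and chosen only for this example. The assumption is that an alarmed resource is a probable root cause when none of the resources it depends on, directly or indirectly, is alarmed as well.

```python
from typing import Dict, List, Set

# Hypothetical topology: each resource maps to the resources it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "web-frontend": ["api-service"],
    "api-service": ["memcached", "database"],
    "memcached": [],
    "database": [],
}


def transitive_dependencies(resource: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Collect every resource reachable by following dependency edges."""
    seen: Set[str] = set()
    stack = list(depends_on.get(resource, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(depends_on.get(dep, []))
    return seen


def probable_root_causes(alarmed: Set[str], depends_on: Dict[str, List[str]]) -> Set[str]:
    """Keep only alarmed resources whose own dependencies are all healthy."""
    return {
        resource
        for resource in alarmed
        if not (transitive_dependencies(resource, depends_on) & alarmed)
    }


# A memcached failure raises alarms on everything that depends on it.
alarmed_resources = {"web-frontend", "api-service", "memcached"}
print(probable_root_causes(alarmed_resources, DEPENDS_ON))  # {'memcached'}
```

Under these assumptions, three alarms collapse into a single candidate cause, which is the kind of reduction a relation-aware analysis aims for.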


What can I expect to learn?

In this presentation, we demonstrate how Monasca and Vitrage can be used together to proactively detect problems and prevent possible outages.

Finally, we show how implementing Healthcheck APIs can improve the observability of OpenStack services and of our solution.

Tuesday, November 5, 3:20pm - 4:00pm (7:20am - 8:00am UTC)
Difficulty Level: Beginner
Senior Software Developer
Witek Bedyk is a senior software developer at SUSE (Munich, Germany) and Project Team Lead of the OpenStack Monitoring Service (Monasca). He holds an MSc in Computational Physics from the Lodz University of Technology and has also studied Computer Science at FernUniversität Hagen. Prior to working with OpenStack, he was a developer of GIS software. Witek has spoken about Monasca at OpenStack...