Fujitsu Cloud had a performance issue with the OpenStack API. The average API response time was usually good (less than a few seconds); however, once the trouble occurred, a large number of timeout errors followed. We tried to detect the trouble by monitoring metrics (CPU, memory, ...), but could not configure the threshold for each metric properly. We simply got a ton of alarms after the trouble had already happened. It is very hard for operators to check, for every alarm, whether it is necessary or not.
We introduced Monasca-analytics to reduce the load on operators. By configuring the alarm thresholds, Monasca-analytics could surface only the alarms that actually needed to be checked.
In this presentation, we will show you how Monasca and Monasca-analytics work and how our system is configured.
The architecture and data flow of Monasca and Monasca-analytics
