Event Details

Please note: All times listed below are in Central Time Zone


Unified monitoring and second-level alarm scheme for OpenStack and Kubernetes

As hybrid cloud adoption grows, production environments urgently need a unified, loosely coupled, fast-responding monitoring and alerting scheme.

Telemetry cannot be extended to provide unified monitoring of the underlying infrastructure (e.g. physical resources on hosts) or of Kubernetes resources. As cluster size increases, responding to alarms quickly also becomes a key issue.

We propose an architecture that, by optimizing prometheus-operator, uniformly collects and manages all kinds of resources in a multi-cloud scenario and solves the problem of persisting monitoring data. It can raise fault alerts within seconds, and it provides automated deployment of the monitoring stack.

At present, we have applied it to a production environment with 500 hosts.


What can I expect to learn?
  • Prometheus architecture
  • Prometheus operator
  • Kubernetes
  • Telemetry
  • Docker
  • Ansible
Monday, November 4, 10:50am-11:30am (2:50am - 3:30am UTC)
Difficulty Level: N/A
OpenStack R&D Engineer
Roger Yu currently works as an OpenStack R&D Engineer at 99cloud. He is an active contributor to OpenStack development across many projects, mainly Nova and kolla-ansible. He has worked in different domains such as power electronics, system drives, cloud, and virtualization. He started his career as a software developer in the power-electronics domain, working with C for around 3 years, and then has...
OpenStack R&D Engineer
Fan Guiju (Mickey Fan) currently works as a full-time OpenStack R&D engineer at 99cloud and actively contributes code to the Kolla project. Love open source, embrace open source.
OpenStack R&D Engineer
Bai Yongjun currently works as an OpenStack R&D Engineer at 99cloud. He actively contributes code to the Kolla project.