
Maintaining and Operating Swift at Public Cloud Scale

In this session we will explain how we manage and operate the Object Storage service in HP’s Public Cloud. We do this at large scale, across thousands of servers and two geographical regions. The system is managed from different continents, with only a small local presence in each data center to handle break-fix activities. Swift is very robust and incorporates numerous high-availability and resiliency features. In fact, it is almost too good: a Swift system can keep operating even when many drives, servers or networks are down. The system may appear to operate normally, but your users will suffer from erratic performance. Worse yet, your data resiliency and availability are being compromised. We will review in detail how we monitor the system, using a mixture of Swift’s own monitoring capabilities and our own tooling. In addition to obvious things such as drive status, we will discuss warning signs that Swift itself reports. Our experience has shown that small warning signs often indicate impending problems. We use Icinga for the majority of our monitoring, but this session will be useful to users of any monitoring system. We will also explain how to correct and resolve common issues, and share tips for upgrading software without downtime.

Speakers: Lorcan Browne, Donagh McCabe