"Service Resiliency Doesn't Always mean "HA"" or ""Cluster""""

By: Randy Bias, Dan Sneddon

While traditional HA and clustering techniques work in many cases, they are also frequently at the root of catastrophic failures. In this presentation we will cover other well-understood and proven approaches to providing greater uptime. Load balancing and service distribution patterns provide an alternative that allows for better horizontal scaling and greater aggregate throughput, with failure characteristics that reduce the chance of cascading failures. In addition, these alternative approaches are typically simpler to implement and have fewer moving parts, which means they are less prone to failure and carry significantly less operational overhead. In this session we'll discuss general principles for using equal-cost multi-path (ECMP) routing for IP flow load balancing and for using routing protocols judiciously to manage ECMP flows. We will also show what happens under various failure conditions and how this differs from traditional load balancers, HA pairs, and clustering.
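
As a rough illustration of the ECMP idea mentioned above, the following minimal Python sketch hashes a flow's 5-tuple onto a set of equal-cost next hops. It is not taken from the presentation: the function name `pick_next_hop`, the addresses, and the use of SHA-256 with a simple modulo are all assumptions made for illustration, and real routers use their own vendor-specific hash functions.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Map a flow 5-tuple to one of several equal-cost next hops.

    Hypothetical helper: hashes the flow fields and takes the result
    modulo the number of available next hops.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical service nodes, each advertising a route to the same service IP.
next_hops = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# (source IP, source port, destination IP, destination port, protocol)
flow = ("192.0.2.10", 40000, "198.51.100.5", 443, "tcp")

# Packets of the same flow always hash to the same next hop.
assert pick_next_hop(flow, next_hops) == pick_next_hop(flow, next_hops)

# If a node fails and its route is withdrawn, flows are hashed over the survivors.
survivors = [hop for hop in next_hops if hop != "10.0.0.2"]
print(pick_next_hop(flow, survivors))
```

With a plain modulo as assumed here, removing a next hop can also move flows that were pinned to healthy nodes; consistent or resilient hashing schemes are commonly used to limit that churn.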

