A guide to the key principles of chaos engineering

November 12, 2018 / Nazareno Feito

Chaos engineering is the practice of running experiments on a distributed system at scale in order to build confidence that the system will behave as expected under adverse and unexpected conditions. The concept was first popularised by Netflix and its Chaos Monkey approach. As the company put it as far back as 2010: "The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most - in the event of an unexpected outage." The foundation of chaos engineering lies in controlled experiments; a simple approach follows.