Chaos Engineering: Key Future Trends

In our previous blog post (https://www.consulteer.com/the-ultimate-guide-to-chaos-engineering/) we introduced the term Chaos Engineering, the risks and opportunities that come with this approach, its applications, and the prerequisites for introducing it into software development.

Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production.
https://principlesofchaos.org

As the Principles of Chaos suggest, Chaos Engineering examines systems, and those systems can just as well belong to Artificial Intelligence or cyber security. In this article we give an overview of two areas in which the discipline is emerging, namely:

Security Chaos Engineering
Chaos Engineering applied in Artificial Intelligence

Security Chaos Engineering

One of the biggest trends in Chaos Engineering is the focus on cyber security. As our world becomes increasingly digitized, the need for strong cyber security measures becomes more and more apparent.

Vast amounts of money are spent on developing new information security solutions, but it is hardly ever checked whether those investments work as planned. The need to keep up with constantly changing attacks has driven ongoing growth in security investments.

As humans, with our limited processing capacity, we cannot fully prepare ourselves for the unknown, especially not continuously and at scale. Whereas traditional testing expects a specific output for a given input, in cyber security it is not clear what shape a novel attack might take.

Aaron Rinehart, CTO of Verica, has identified four use cases in his article with ShiftLeft:

Incident Response - Testing the efficacy of runbooks and response plans, and determining whether data collection and alerting systems work as intended
Security Control Validation - Introducing conditions that should induce behavior from one or more controls and observing behavior
Security Observability - Introducing unexpected attacks or breaches to see if your systems report and observe the incident accurately
Compliance Monitoring - Applicable across all Chaos Engineering events by documenting that systems do what they say they are supposed to do
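
To make the Security Control Validation use case more tangible, here is a minimal sketch in Python. It is not taken from Rinehart's article: the Environment class, the port allow-list, and the port_policy_control function are hypothetical stand-ins for a real system and a real security control. The experiment checks the steady state, injects a condition that should trigger the control, observes whether it did, and rolls the change back.

```python
from dataclasses import dataclass, field

# Hypothetical policy: only HTTPS may be exposed.
ALLOWED_PORTS = {443}


@dataclass
class Environment:
    """Hypothetical stand-in for the system under test."""
    open_ports: set = field(default_factory=lambda: {443})
    alerts: list = field(default_factory=list)


def port_policy_control(env: Environment) -> None:
    """Hypothetical security control: alert on any port outside the allow-list."""
    for port in sorted(env.open_ports - ALLOWED_PORTS):
        env.alerts.append(f"unauthorized port open: {port}")


def run_experiment(env: Environment, injected_port: int) -> dict:
    """Inject a misconfiguration and record whether the control noticed it."""
    # Steady state: the control should be silent before anything is injected.
    port_policy_control(env)
    steady_state_ok = not env.alerts

    # Inject the condition that is supposed to trigger the control.
    env.open_ports.add(injected_port)
    port_policy_control(env)
    detected = any(str(injected_port) in alert for alert in env.alerts)

    # Roll back so the experiment leaves no residue behind.
    env.open_ports.discard(injected_port)
    return {"steady_state_ok": steady_state_ok, "detected": detected}


result = run_experiment(Environment(), injected_port=2222)
```

In this toy setup the control fires as expected. In a real experiment, the interesting outcome is the opposite one: a result where detected is False tells the team that a control they paid for does not behave as assumed, and it does so outside of a real incident.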

Rinehart sums up the case for Security Chaos Engineering when he states:

Worse, incidents cause stress, fear and irrational behaviors. People freak out. Not surprisingly, a security incident is not a good learning environment. Yet security incidents serve the same purpose of asking questions about a system, and helping teams see how to improve their systems. Chaos engineering offers a better way to create these questions under conditions where it is easier to learn and observe system behaviors, providing clues to improving security.
Aaron Rinehart, CTO of Verica

Chaos Engineering applied in Artificial Intelligence (AI)

Another major trend is the incorporation of Chaos Engineering into Artificial Intelligence (AI) systems. With the ever-increasing amount of data being generated every day, it is becoming more and more difficult for humans to make sense of it all. AI, and specifically Machine Learning (ML), helps to discover trends and patterns by using mathematical and computational approaches. Well-known examples include the detection of objects in images and the prediction of patient outcomes. ML helps to answer questions that traditional tools or even humans cannot.

The rise of AI also means that there are new risks to consider. After all, if a machine can learn to do something on its own, it can also learn to do something wrong. Thus, incorporating AI systems into a business or engineering process introduces potential points of failure.

Everything Fails, All the Time
Werner Vogels, CTO of Amazon

ML workflows are very dynamic and complex. Even though research is being conducted on interpretable Machine Learning, most models are a black box that takes an input (e.g. an image) and produces an output (e.g. an annotation), while humans cannot understand how or why the model came to that result. This makes ML very vulnerable to attacks.

Changing the input slightly, by adding noise to an image or swapping some of its pixels, can lead to unexpected results. Examining this potential chaos is therefore extremely important. Think about a self-driving car, which uses AI to detect objects in front of the car such as road signs or people crossing the street. Unintentionally modified software or unobserved scenarios could lead to disastrous outcomes.
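
A small experiment can show how sensitive a model is to such input changes. The following Python sketch is only an illustration under simplifying assumptions: toy_classifier is a hypothetical stand-in for a real model, an image is reduced to a flat list of intensities between 0 and 1, and flip_rate is a name we made up for the share of noisy copies whose label differs from the unperturbed result.

```python
import random


def toy_classifier(pixels):
    """Hypothetical model: label an image 'bright' if its mean intensity is above 0.5."""
    return "bright" if sum(pixels) / len(pixels) > 0.5 else "dark"


def add_noise(pixels, scale, rng):
    """Return a copy with uniform noise in [-scale, scale] added, clipped to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.uniform(-scale, scale))) for p in pixels]


def flip_rate(model, pixels, scale, trials=200, seed=0):
    """Fraction of noisy copies for which the model changes its answer."""
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    baseline = model(pixels)
    flips = sum(
        model(add_noise(pixels, scale, rng)) != baseline for _ in range(trials)
    )
    return flips / trials


confident_image = [0.9] * 16    # far away from the decision boundary
borderline_image = [0.51] * 16  # just above the decision boundary

stable = flip_rate(toy_classifier, confident_image, scale=0.1)
fragile = flip_rate(toy_classifier, borderline_image, scale=0.2)
```

The confident image keeps its label under this amount of noise, while the borderline image changes its label in a noticeable share of the trials. With a real model the decision boundary is not known in advance, which is exactly why such perturbation experiments are worth running.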

This raises the question of who would be responsible for such outcomes. Is it the car manufacturer, the supplier of the software, or the passengers, who should still have intervened? And can these situations largely be avoided by incorporating Chaos Engineering into the development and testing process, as well as while the car is driving?

The latter might be very interesting to explore. Perturbations could be applied to a copy of the real sensor inputs (cameras, radar etc.), and the results analyzed on the fly and compared to the values of the production system. Furthermore, the gathered data could be streamed to the development team for additional investigation. This is where iteration serves its purpose: to learn and improve continuously.
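
The idea of perturbing a copy of the sensor inputs can be sketched in a few lines of Python. Everything here is an assumption made for illustration: nearest_obstacle is a hypothetical perception model that returns the distance in metres to the closest object, drop_reading simulates a single sensor reading going missing, and the findings list stands in for a stream to the development team.

```python
from typing import Callable, List, Sequence


def nearest_obstacle(distances: Sequence[float]) -> float:
    """Hypothetical perception model: distance in metres to the closest object."""
    return min(distances)


def drop_reading(index: int) -> Callable[[List[float]], List[float]]:
    """Build a perturbation that simulates one sensor reading going missing."""
    def perturb(frame: List[float]) -> List[float]:
        return [d for i, d in enumerate(frame) if i != index]
    return perturb


def shadow_experiment(model, frame, perturb, tolerance):
    """Compare the production result with the result on a perturbed copy."""
    production = model(frame)             # the value the vehicle actually acts on
    shadow = model(perturb(list(frame)))  # computed on a copy, never fed back
    return {
        "production": production,
        "shadow": shadow,
        "diverged": abs(production - shadow) > tolerance,
    }


findings = []  # stand-in for a stream to the development team
frame = [12.0, 8.5, 30.0]
report = shadow_experiment(nearest_obstacle, frame, drop_reading(1), tolerance=1.0)
if report["diverged"]:
    findings.append(report)
```

Because the perturbation only touches a copy of the frame, the production path is unaffected. In this example, losing the reading of the closest object shifts the estimate by several metres, so the report is recorded for later investigation.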

Wrapping up

Chaos Engineering is on the rise, and for a good reason. As the digital world becomes more complex, novel tools and techniques are needed to help understand and manage that complexity. Security Chaos Engineering is one such tool, and it's quickly gaining popularity as organizations strive to keep their data safe in an increasingly hostile online environment.

Moreover, as we move into an age where Artificial Intelligence is more prevalent, Chaos Engineering will become even more important in order to ensure that our systems remain stable and reliable.

We have seen the power of Chaos Engineering first-hand, and we believe that it will only continue to grow in importance in the years to come. Are you ready for the challenges that lie ahead? Let us help you get prepared.