Building Resiliency into Application Security

Clea Ostendorf
September 3, 2024

Let's talk about resiliency in software.

Application security resiliency refers to the ability of software applications to withstand disruptions, adapt to issues, and quickly return to normal operations. That sounds especially relevant now that the world has witnessed just how interconnected we are, and how a break in one critical software application can disrupt business to the tune of billions of dollars in losses.

Resilience, however, goes beyond the technology and extends into every aspect of the business. It is crucial for maintaining uninterrupted performance and enhancing user experience, ultimately protecting revenue and brand reputation.

Juliet Okafor, JD, CEO and Founder of RevolutionCyber and a resilience specialist, says, “When organizations prioritize clear communication and integrate resilience into their core digital processes, they build a foundation that can withstand challenges today and evolve with the threats of tomorrow.”

Let’s go one step further and think about how you actually test your resilience: enter the concept of chaos engineering.

Building security chaos engineering (SCE) into an application security program focused on resiliency requires a systematic approach that integrates resilience principles throughout the software development lifecycle. The recent CrowdStrike outage serves as a stark reminder of the critical importance of such practices.

The first step is to identify and understand the critical functionality of your application. This involves mapping out key components, dependencies, and potential failure points. Think about moving your threat model into the realm of reality.
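One way to make that mapping concrete is to write the dependencies down as data and ask which critical functions break when a given component fails. The sketch below is only an illustration under assumed names: the components, the `DEPENDS_ON` map, and the `CRITICAL` list are all hypothetical, and a real inventory would come from your own architecture.

```python
# Hypothetical dependency map: each component lists what it calls directly.
DEPENDS_ON = {
    "checkout": ["payments-api", "auth-service"],
    "login": ["auth-service"],
    "payments-api": ["orders-db"],
    "auth-service": ["users-db"],
    "orders-db": [],
    "users-db": [],
}

# Hypothetical list of business-critical functions.
CRITICAL = ["checkout", "login"]


def transitive_deps(component, graph):
    """Return every component reachable from `component`, excluding itself."""
    seen = set()
    stack = list(graph.get(component, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def blast_radius(graph, critical):
    """Map each dependency to the critical functions that break if it fails."""
    impact = {}
    for function in critical:
        for dep in transitive_deps(function, graph):
            impact.setdefault(dep, set()).add(function)
    return impact


if __name__ == "__main__":
    for dep, functions in sorted(blast_radius(DEPENDS_ON, CRITICAL).items()):
        print(f"{dep}: affects {sorted(functions)}")
```

Components that affect every critical function are natural first targets for an experiment, since a failure there has the widest impact.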

At a high level, SCE “experiments” should simulate real-world attack scenarios and failure modes. Of course, the experiment framework and the testing itself should follow best practices.

Consider a few examples:

  • Simulating a DDoS attack on critical APIs: how does the application respond?
  • Injecting latency into database queries: are your controls working?
  • Corrupting configuration files: when something breaks, how do you respond holistically, across people, process, and tech?
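As a rough illustration of the latency experiment in the list above, the sketch below wraps a query in artificial delay and checks whether a timeout control falls back as intended. Everything in it is an assumption made for the example: `query_orders` stands in for a real database call, and the timeout and fallback values are arbitrary. A real experiment would run against a staging or carefully scoped production environment.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def query_orders():
    """Hypothetical stand-in for a real database query."""
    return ["order-1", "order-2"]


def inject_latency(func, delay_seconds):
    """Wrap `func` so each call sleeps first, simulating a slow dependency."""
    def slowed(*args, **kwargs):
        time.sleep(delay_seconds)
        return func(*args, **kwargs)
    return slowed


def fetch_with_timeout(query, timeout_seconds, fallback):
    """The control under test: give up on a slow query and serve a fallback."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(query)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return fallback
    finally:
        # Do not block on the slow call; let it finish in the background.
        executor.shutdown(wait=False)


def run_experiment(delay_seconds, timeout_seconds=0.2):
    """Inject latency and report what the caller received and how long it took."""
    slow_query = inject_latency(query_orders, delay_seconds)
    start = time.monotonic()
    result = fetch_with_timeout(slow_query, timeout_seconds, fallback=[])
    elapsed = time.monotonic() - start
    return result, elapsed


if __name__ == "__main__":
    result, elapsed = run_experiment(delay_seconds=0.5)
    print(f"caller received {result!r} after {elapsed:.2f}s")
```

The hypothesis being tested is that callers get a degraded answer within the time budget instead of hanging. If the fallback never fires, the control is not working as assumed.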

Ryan Maynord, managing consultant at Wolfpack, not only applies these concepts in his security assessments when they are in scope, but also applies them to Wolfpack's own infrastructure. "Chaos Engineering is not only a practice but also a mindset. Applying these principles during security testing will do nothing but further results, however, embracing this philosophy as an organization, no matter the task, can help build a resilient and adaptive team ready to tackle the most unpredictable challenges."

Think about some of the most well-known companies in the world. Their value in part depends on availability.

Netflix, a pioneer in chaos engineering, developed Chaos Kong to simulate the failure of an entire Amazon Web Services (AWS) region. This experiment helps Netflix ensure its systems can seamlessly fail over to other regions without impacting user experience. While not exclusively security-focused, it demonstrates resilience against large-scale outages that could result from cyberattacks.
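The idea behind a region-level experiment can be shown with a toy model. This is not Netflix's implementation; it is a minimal sketch in which the `Region` class, the region names, and the routing logic are all hypothetical. One region is taken offline, requests are replayed, and the region is restored afterward.

```python
class Region:
    """Hypothetical region that either serves requests or is offline."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        return f"{request} served by {self.name}"


def route(request, regions):
    """Try regions in priority order and return the first successful response."""
    for region in regions:
        try:
            return region.handle(request)
        except ConnectionError:
            continue
    raise RuntimeError("no healthy region available")


def region_outage_experiment(regions, victim_name, requests):
    """Take one region offline, replay requests, then restore the region."""
    victim = next(r for r in regions if r.name == victim_name)
    victim.healthy = False
    try:
        return [route(req, regions) for req in requests]
    finally:
        # Always restore the region, even if routing failed.
        victim.healthy = True


if __name__ == "__main__":
    regions = [Region("us-east"), Region("us-west")]
    for line in region_outage_experiment(regions, "us-east", ["GET /home"]):
        print(line)
```

If `route` raises instead of returning a response, the experiment has found a gap in failover before a real outage does.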

Twilio has integrated chaos engineering principles into its application security testing processes, simulating various security events to uncover vulnerabilities and improve incident response.

Finally, LinkedIn uses SCE techniques to test the resilience of its authentication and authorization systems against potential attacks or failures.

As organizations move from an “if” to a “not if, but when” mentality, they should look for partners who build resiliency into their products and services. Pen tests are not going away, but what we expect from our consulting partners should change to include tests that expose weaknesses beyond the OWASP Top 10 and that behave dynamically, as a real-world attack would.

Sources:

https://www.securitychaoseng.com

https://www.amazon.com/Security-Chaos-Engineering-Sustaining-Resilience/dp/1098113829

https://kellyshortridge.com/blog/posts/security-chaos-engineering-sustaining-software-systems-resilience-cliff-notes/

https://www.oreilly.com/library/view/security-chaos-engineering/9781492080350/

https://www.mitigant.io/en/blog/security-chaos-engineering-101-fundamentals

https://www.linkedin.com/pulse/chaos-testing-software-engineering-examples-tools-templates-ghodke-tcxlc

https://maddevs.io/blog/chaos-engineering/

https://www.gremlin.com/community/tutorials/chaos-engineering-the-history-principles-and-practice
