Public Digital

Chaos testing: Build like it’s already broken

    Before transformation, there is often disruption. Before innovation, there is usually a crisis that forces organisations to change. When things go wrong – whether through systemic failure, external attack or sudden market shifts – organisations are forced to re-examine their ways of working. This isn’t merely about fixing what’s broken; it’s about adapting to ensure the same mistakes aren’t repeated.

    Chaos testing is the software engineering practice of deliberately introducing failures into a system to test its resilience. Companies like Netflix pioneered the concept with tools like Chaos Monkey, which randomly terminates production instances to expose weaknesses before a real outage does. However, the same principles can apply to organisational design as a whole – and not just to resilience.
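
    To make the software idea concrete, here is a minimal sketch of a chaos experiment in Python. It is an illustration only, not Chaos Monkey's implementation: the Instance and Service classes, the inject_failure and run_experiment functions and the server names are all hypothetical, and everything runs in memory. Each round switches off one randomly chosen server and then checks whether the service can still answer a request.

```python
import random


class Instance:
    """A stand-in for one server in a pool (hypothetical, in-memory only)."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class Service:
    """Routes each request to the first instance that responds."""

    def __init__(self, instances):
        self.instances = instances

    def handle(self, request):
        for instance in self.instances:
            try:
                return instance.handle(request)
            except ConnectionError:
                continue  # fail over to the next instance
        raise RuntimeError("no healthy instances left")


def inject_failure(service, rng):
    """Switch off one randomly chosen live instance; return it, or None if all are down."""
    live = [i for i in service.instances if i.alive]
    if not live:
        return None
    victim = rng.choice(live)
    victim.alive = False
    return victim


def run_experiment(service, rounds, seed=0):
    """Kill one instance per round and record whether the service still answers."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    results = []
    for n in range(rounds):
        victim = inject_failure(service, rng)
        if victim is None:
            break  # nothing left to break
        try:
            service.handle(f"request-{n}")
            survived = True
        except RuntimeError:
            survived = False
        results.append((victim.name, survived))
    return results


if __name__ == "__main__":
    pool = Service([Instance(f"server-{i}") for i in range(3)])
    for name, survived in run_experiment(pool, rounds=3):
        print(f"killed {name}: service {'survived' if survived else 'FAILED'}")
```

    With three servers and three rounds, the service survives the first two failures and fails on the third, when no server is left. The useful result is the threshold: the experiment shows how much failure the system tolerates before users notice, which is the same question the organisational version asks about customers, leaders and data.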

    For example, a company can test the strength of its strategy by asking what would happen if its largest customer suddenly walked away. It can assess the stability of its leadership by evaluating how the business would fare if key senior team members were to leave. It is also vital to test a company’s ethical framework by considering the reputational risks associated with emerging technologies such as AI: could a poorly implemented AI system – one rolled out without upskilling workers, for instance – seriously damage the company’s reputation? Chaos testing allows businesses to identify vulnerabilities, improve risk management and enhance their ability to adapt in the face of uncertainty.

    True organisational chaos testing isn’t just about infrastructure – it’s cultural. It asks uncomfortable questions. What if your leadership team were publicly compromised? What if your product went viral for the wrong reason? What if your AI tools discriminated without you knowing? How reliable is the data you use to inform decisions? How quickly can you respond to change?

    These questions aren’t hypothetical – they’re scenarios playing out across industries right now. The best-prepared organisations are those that simulate disruption, forcing themselves to respond to worst-case scenarios before they occur in reality.

    In an age of AI hallucinations, deepfakes and climate-induced migration, the real risk is assuming that business-as-usual will carry you through. With AI likely to bring severe disruption over the next few years, organisations need to act now if they want to survive.

    Why crisis is a catalyst

    As Public Digital’s Andrew Greenway writes in his book Digital Transformation at Scale, most organisations do not change unless they have to. In the commercial sector, existential threats demand action: delay means customers, revenue or relevance may be lost. In government, where institutions can often weather crises without immediate existential risk, inertia is a greater challenge. Yet, as the digital era reshapes services and expectations, even the public sector must acknowledge that ‘the way we’ve always done things’ is no longer good enough.

    Some crises stem from technology failures, such as when Canada’s payroll system left 80,000 employees underpaid or when the NHS faced a ransomware attack that resulted in a blood shortage. Others result from policy implementation breakdowns, like the Post Office scandal or Centrelink’s controversial debt recovery programme in Australia. And in the corporate world, even household names have crumbled under complacency – Nokia, Thomas Cook, Debenhams, Woolworths and HMV serve as cautionary tales of failing to embrace digital change in time. Barclays, Tesco and Aviva, meanwhile, show how other companies got it right.

    Breaking the cycle of complacency

    Crises reveal the flaws in existing processes, but the response determines whether an organisation learns and improves. Too often, organisations patch the immediate problem and move on, rather than addressing the underlying structural weaknesses. This is particularly true in government, where the pressures of day-to-day firefighting can make long-term reform seem unattainable.

    As Greenway writes, in the commercial world, crises tend to focus the mind because they can be genuinely existential. “Fail to respond, and all of a sudden your company name is no more than the punchline to a bad joke. Sony's reluctance to develop a competent digital Walkman left space for Apple’s iPod. Video rental giant Blockbuster airily dismissed Netflix, then went bankrupt when it couldn't compete. Many companies don’t heed the call – often those that have become so big they can’t imagine a world without them in it. All too often, the rest of the world has no such difficulties.”

    Companies tend to be most at risk when they are enjoying comfortable profitability, Greenway reminds us. With good profits and a rosy outlook, the need for dramatic change is obscured. This sort of complacency is only ever justifiable if your organisation doesn’t use technology and is immune to fundamental social changes, which is exceedingly rare today.

    “Delay is a mistake,” Greenway writes. “As disruptors begin eating into profitability, companies find they have reduced manoeuvre for making the investments they need to pivot into new markets and digital ways of working. The opportunity has been lost. Talented people have moved elsewhere. Margins get thinner. All a company can do now is cross its fingers and hope the tide turns. Usually, it doesn’t.”

    Real transformation happens when leaders seize moments of disruption as opportunities to implement lasting change. This means:

    • Investing in resilience – ensuring digital systems are flexible, modular and built to withstand disruption.

    • Encouraging experimentation – creating safe spaces for testing new approaches and behaviours, even if they challenge traditional norms.

    • Learning from failure – embedding a culture that analyses missteps constructively, rather than defensively.

    • Allowing time to transform – embedding new ways of working doesn't happen overnight; employees need space and time to change their habits and behaviours.

    A crisis is too good to waste

    As history shows, crises force organisations to move faster than they ever thought possible. Whether it’s adapting to remote work overnight during a pandemic or rapidly deploying digital services when legacy systems fail, necessity drives action. The challenge is to ensure these moments don’t result in short-term fixes but lead to long-term change.

    Chaos testing is not about inviting failure – it’s about preparing for the inevitable. Organisations that embrace disruption as an opportunity, rather than a threat, will be the ones that thrive in an era of continuous change. Chaos testing is vital for organisations wanting to flourish today.
