The CrowdStrike Outage Is the Canary in the Coal Mine for the Entire Tech Industry

Manohar Goli

CTO, TRUNDL


Chirp, chirp🐤

We should be grateful to CrowdStrike for that update.

If you are not perpetually online or chained to a computer for work, you may be blissfully unaware of the catastrophic failure of one defective software update that crashed millions of computers and shut down swaths of the world on Friday, July 19th. CrowdStrike, a cybersecurity software company for enterprises, released an update to their Windows users that pushed computers around the globe into a reboot death spiral. The looping blue screen of death impacted air travel, hospitals, banks and so much more. The scale of the shutdown matches what was predicted for the Y2K bug back in 2000. Anyone old enough to remember those days knows this was a big, freaking deal. A thing, if you will.

Excellent recaps of the details of CrowdStrike’s outage are available elsewhere. Here, we’re going to examine the bigger picture and discuss what the industry should learn.

The consequences of an increasingly globalized world

It’s mind-boggling that one update from one company forced hospitals to reschedule surgeries and revert to pen and paper to stay functional, to say nothing of the chaos it created for airlines forced to cancel flights around the world. This isn’t about humans being overly reliant on technology. It highlights that a handful of companies are responsible for powering and maintaining the world’s digital infrastructure, and we’re at their mercy.

Grounding flights is one thing, but this outage dropped 911 calls, which means it’s not unreasonable to imagine that some people may not have gotten help in time. CrowdStrike vividly demonstrated how high the stakes of technology are for humans and just how rickety our underlying digital infrastructure is in the hands of these tech giants.

When culture and capitalism collide

It’s unfair to solely blame CrowdStrike. They are part of a tech culture that has been moving fast and breaking things for decades now. It was only a matter of time until the ramifications of this reckless mantra coupled with the monopolistic nature of big tech, which demands the prioritization of profit over people, detonated. Don’t get it twisted. There’s nothing inherently wrong with striving to make profit—that’s business, baby. It becomes an issue when companies are incentivized to cut corners in the short term rather than invest in the future, which ultimately calls into question the trustworthiness of their services. Can we really trust and depend on big tech to properly maintain their systems when they continue to lay off workers in droves to deliver shareholder value? Who’s at the wheel behind the scenes?


This isn’t an attack on the staff at CrowdStrike. They’re getting squeezed like everyone else in the tech ecosystem. With fewer employees facing more work to complete, it’s easy to ask ChatGPT for code in a moment of desperation or let bugs slide to meet an aggressive deadline. Two things are clear: tech giants need regulation, and they need to prioritize people over profit. That’s how we can avoid the next outage. Easier said than done, I know.

I, human

The irony of the CrowdStrike outage is that it requires a significant amount of human effort to fix. Since the computers are stuck in a reboot loop, many need a human to manually remove the broken code and get them working again. No over-the-air software updates for you!

Quality control used to be a job that took developers weeks to complete well, but it has become increasingly automated. Of course, automation can be useful for many tasks, but it cannot replace human insight. Nor should it, when the stakes are this high for us. It likely won’t surprise you to learn that CrowdStrike has pushed automatic updates for years now.

During the years I’ve spent in tech, I recall that many development teams adhered to the “don’t f*ck it up Friday” philosophy, which meant they never pushed updates before the weekend, just in case something went wrong. Even in PR, publicists know to dump news they want buried on Friday afternoon because journalists have left the office to start their weekend. Depending on your time zone, CrowdStrike’s update was pushed out late Thursday night or early Friday. Yikes. In addition to breaking this cardinal rule of update releases, CrowdStrike pushed the update to everyone at once, instead of staggering it to flush out any bugs that may crop up while live. It’s a miracle this type of outage hasn’t happened sooner.
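For readers who want to see what "staggering" an update looks like in practice, here is a minimal sketch of a staged (canary, fittingly) rollout. Everything in it is an assumption for illustration: `staged_rollout`, `apply_update`, the wave sizes and the failure threshold are hypothetical names and numbers, not CrowdStrike's actual release process.

```python
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], bool],
    stage_fractions: Sequence[float] = (0.01, 0.10, 0.50, 1.0),
    max_failure_rate: float = 0.02,
) -> dict:
    """Push an update in widening waves and halt if a wave fails too often.

    `apply_update` is a hypothetical callback that returns True when a host
    comes back healthy after the update and False otherwise.
    """
    updated = 0  # number of hosts that have received the update so far
    failed: list[str] = []

    for fraction in stage_fractions:
        # Cumulative number of hosts that should be updated after this wave.
        target = max(1, int(len(hosts) * fraction))
        wave = hosts[updated:target]
        if not wave:
            continue

        wave_failures = [host for host in wave if not apply_update(host)]
        failed.extend(wave_failures)
        updated = target

        # Stop before the next, larger wave if this one looks unhealthy.
        if len(wave_failures) / len(wave) > max_failure_rate:
            return {"halted": True, "attempted": updated, "failed": failed}

    return {"halted": False, "attempted": updated, "failed": failed}


if __name__ == "__main__":
    fleet = [f"host-{i}" for i in range(100)]
    # A defective update that breaks every machine it touches.
    print(staged_rollout(fleet, lambda host: False))
```

With a fleet of 100 machines and an update that breaks every one it touches, this sketch stops after the first wave of a single host rather than taking down all 100. The point is not the specific numbers. It is that a small first wave gives a human the chance to notice the canary before the whole mine fills with gas.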

Beyond tomorrow

Have you ever wondered why the Y2K bug never materialized? It’s thanks to the hard work of tech pros (actual people!) who understood the assignment and worked to ensure humanity wasn’t pushed back to the Stone Age. Twenty-four years later, this lesson is more relevant than ever. The trillion-dollar enterprises that underpin our entire lives have a moral and societal duty to take a long, hard look at their digital infrastructure, given what is at stake for us. With great power comes great responsibility, right?

Once upon a time, tech made our lives easier and better by solving real problems. More human intervention in tech, proper regulation, less power concentrated in the hands of a few companies, and choosing people over profits are how we can avoid the next outage.

CrowdStrike is only the beginning. After decades of unchecked recklessness in the tech industry, we are now watching the inevitable consequences. Our chickens have come home to roost, and thanks to CrowdStrike, we know we aren’t the slightest bit ready. However, it doesn’t have to unfold this way if the tech industry spots the canary and learns from the experience.

Do you think CrowdStrike’s outage will change anything in the tech industry?
