Celebrating Failure
More and more companies are moving away from stigmatizing failure, saying things like “We have a safe-to-fail culture” or “Only those who don’t work make no mistakes”. I like to take this a step further because I don’t think failure should only be tolerated. It’s not just a necessary part of life that an empathetic manager should overlook and forgive. No, I believe that building a culture where we can honestly celebrate failures is better for individuals, teams, and the whole organization.
Why Celebrate Failure?
The answer becomes more obvious if we look at the opposite: a culture where failure is feared and frowned upon. Where can this lead?
- Change aversion: If failure is seen only as a mistake caused by bad behaviour, then there’s less opportunity to examine and improve the environment that allowed it to happen.
- Risk aversion: If decisions are over-indexed for safety in fear of failure, it can lead to increased time, cost, and system complexity.
- Low innovation: Being afraid of failure strangles creativity and innovation because ideas that don't work out are seen as mistakes.
- Increased issue severity: Fear of repercussions or humiliation leads to hiding mistakes, increasing their negative impact. (“Damn, I brought down this production microservice! Let me try to fix it quickly before someone notices! … Arrgh, I can’t fix it, now I have to tell someone!” By then we have lost half an hour, and the person trying random things in a panic can make the situation even worse.)
- Low trust: When people are incentivized to keep secrets, transparency will be low, impacting all communications, decreasing alignment, and most importantly, trust.
- Knowledge stagnation: Trial and error is a good way to learn about something. If mistakes are not in the open and discussed, then this learning opportunity is missed.
- Low responsibility: If the incentives are set up to hide failure, people will avoid responsibility, point fingers, and shift blame.
Opposing this, in a culture that celebrates failures, you have a chance to see:
- High adaptability: Looking candidly at the root cause of failures, discovering their true reasons, and improving the system to make them less likely in the future increases organizational adaptability.
- Healthy risk-taking: The chance of failure does not paralyze decision-makers, who can weigh risk-mitigating efforts against other considerations like cost, time, and system complexity.
- High innovation: Not fearing to come up with a bad idea increases creativity and freedom to innovate.
- High resilience: People and teams can focus better on quick and efficient issue mitigation if they are freed from the stigma of failure and can communicate and collaborate without judgment.
- High trust: Being comfortable discussing shortcomings and working together to get through them increases team health, belonging, and trust.
- Culture of Learning: Failures are analyzed to find their root causes, focusing on finding what could be learned from them. This increases the knowledge of the individual, team, or organization — and prepares everyone to have a better chance to overcome similar challenges in the future.
- Accountability: When the personal impact of the failure is decreased, people are not incentivized to avoid ownership.
These are two opposite extremes. More and more companies are leaning towards the second one. Still, failure avoidance is a natural behavior that can lead to the consequences described for the first culture, so organizations need to go one step further than simply tolerating failure. Saying explicitly that “We Celebrate Failure” compensates for the reflexes to avoid, hide, be ashamed of, and quickly move on from mistakes, reflexes that make us miss the valuable aspects of learning, growth, and potential systematic improvements.
How to Celebrate Failure?
Set up the environment
Within your remit, set up the processes and opportunities to candidly discuss failures. You need mutual trust to encourage people to open up, so start with 1:1s, where hopefully you have this safe space already created. If someone on your team makes a mistake, misses a project deadline, or causes an incident, use this as an opportunity to be grateful for the chance to learn, grow, and improve processes. People naturally want to be seen as good performers, so they should be trained to embrace the benefits of failures. This can be a good topic in the safety of 1:1 discussions first.
Once people on your team are comfortable talking objectively about failures with you, extend this approach to team processes. Retrospectives and postmortems are obvious candidates to include failure celebration, but you can add candid, objective analysis of failures to various other ceremonies, from planning to sprint reviews. The same goes for written communication: status updates, internal newsletters, blog posts, etc. are great platforms for this. Consider including a “What did we learn?” section where it makes sense, in which you can honestly explain what you did, why, what led to it being a failure, and how you and the team are in better shape for the future because of this experience.
Start with yourself
A quote often attributed to Gandhi says, “Be the change that you wish to see in the world”. The best way to change behavior in others is to model it yourself. Next time you have an opportunity, open up candidly about a mistake you’ve made, and what you’ve learned from it. It’s a balance: make sure you don’t seem like you’re asking for the pardon or pity of others, but allow yourself to be seen as vulnerable. Your behavior should be credible and honest, not out of character.
This is especially hard if your failure impacted the team negatively, for example, if you couldn’t get approval for tech debt work or missed a deadline for submitting the team building budget. Still, these can be key leadership moments to set an example of remaining humble, calm, objective, and focused on learning and systematic improvements.
Framework for failure analysis
Failure can be frustrating, upsetting, scary, and demotivating. These emotions make analysis and the search for learning harder. Distance feelings from the events, be objective, focus on the future, and look at the following aspects:
- What were the circumstances, and the situation in which the mistake happened?
  E.g.: Our CMS doesn’t allow post URLs to be changed via the interface. We only have one or two requests for this every month, and the team takes care of them manually.
- What were our actions, what did or didn’t we do?
  E.g.: A recently hired Support Engineer was handling a URL change ticket. They issued a manual SQL command with a bug in the condition part, and as a result, they overwrote multiple URLs.
- Why? What made us choose this path?
  E.g.: The Support Engineer came from a different company where working on the database directly was part of the day-to-day. They didn’t know we had a script for these URL changes and wanted to be helpful. The query seemed simple and solved the ticket.
- What happened, exactly? What was the objective impact of our actions?
  E.g.: The SQL filtered for the unique ID of the post, but the engineer didn’t know that these IDs are unique within the owner’s scope only, so their SQL overwrote post URLs from every owner in the database that had the same post ID. As a result, hundreds of URLs changed across various publications. We could quickly revert the change thanks to the engineer immediately spotting the problem and asking for help.
- What did we learn from this experience?
  E.g.: We discovered that Support Engineers are not properly onboarded to our helper scripts and database structures.
- What will we change in our systems and processes to decrease the chance of similar failures?
  E.g.: An Engineering Manager will review the Support team’s onboarding documents and work together with them to add information about our product architecture and existing maintenance tools. They will collect a list of scenarios where Support Engineers still need direct database access, and propose developing helper scripts for those remaining few cases.
- How will we capture and share all the above, and follow up after the actions?
  E.g.: Support Engineering will share the incident with the rest of the company during their quarterly update, and they’ll follow up with the conclusion of the above action items.
The Blameless Postmortem process can be a good inspiration to create your failure analysis framework.
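To make the URL-change example above more concrete, here is a minimal Python sketch of how such a bug can happen and what a helper script might guard against. Everything in it is an assumption made for illustration: the `posts` table, its `owner_id`, `post_id`, and `url` columns, the SQLite database, and both function names are hypothetical, since the example does not describe the real CMS schema or script.

```python
import sqlite3


def make_db():
    """Build a hypothetical in-memory posts table for the illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        " owner_id INTEGER NOT NULL,"
        " post_id INTEGER NOT NULL,"
        " url TEXT NOT NULL,"
        " PRIMARY KEY (owner_id, post_id))"  # post_id is unique per owner only
    )
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [
            (1, 42, "/a/old"),
            (2, 42, "/b/other"),
            (3, 42, "/c/other"),
            (1, 43, "/a/second"),
        ],
    )
    conn.commit()
    return conn


def change_url_buggy(conn, post_id, new_url):
    """Filters on post_id alone, so every owner's post with that ID is hit."""
    cur = conn.execute(
        "UPDATE posts SET url = ? WHERE post_id = ?", (new_url, post_id)
    )
    return cur.rowcount  # left uncommitted so the caller can roll back


def change_url_scoped(conn, owner_id, post_id, new_url):
    """Filters on owner and post, and refuses to commit unless one row changed."""
    cur = conn.execute(
        "UPDATE posts SET url = ? WHERE owner_id = ? AND post_id = ?",
        (new_url, owner_id, post_id),
    )
    if cur.rowcount != 1:
        conn.rollback()
        raise RuntimeError(f"expected 1 updated row, got {cur.rowcount}")
    conn.commit()
    return cur.rowcount
```

In this sketch, the unscoped version touches one row per owner that happens to share the post ID, while the scoped version checks the affected row count before committing. A check like that is one possible guardrail a helper script could offer over ad hoc SQL, not necessarily what the team in the example built.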
Watch out for risks
- Avoid fake positivity. Your excitement about a failure should be credible and rooted in the possibilities this unlocks in improved systems and deeper learnings. Celebrating a failure shouldn’t mean that you only focus on the rosy future. If you stay superficial about what happened to avoid ruminating too much on a failure, you miss a chance to truly understand what allowed it to happen, and your conclusions and actions will also stay on the surface.
- Separate failure from underperformance. If the failure happened because of negligence, lack of effort, not having the skills required for a position, or personal reasons that have been discussed before, then this is probably a low-performance issue, not a failure to celebrate. It’s a hard balance to master, but if you delay handling low performance appropriately, the person will have less chance to grow in their role, or to move on to a job that better suits their motivation and preferences. This is detrimental to the team and the organization too, because it builds a culture of no consequences.
- Be mindful of the personal impact. Failure is stressful, especially for someone less senior. Take extra care to give appropriate support so that the experience sends the right message to the person, and they find the balance between the extremes of being paralyzed and acting carelessly.
Key takeaways
Here’s a short cheatsheet you can rely on that summarizes the above in a few bullet points:
Celebrate Failure to
- Increase adaptability;
- Avoid risk-aversion;
- Foster innovation;
- Strengthen resilience;
- Build trust;
- Encourage learning.
Celebrate Failure by
- Relying on mutual trust;
- Being honest, credible and objective;
- Using a failure analysis framework to capture what happened and why, what you learned, and what you will do in the future.
Watch out to avoid:
- Fake positivity;
- Overlooking underperformance;
- Missing the personal impact.