Ever caused a ripple in the production pond? I found myself inadvertently starring in a real-life drama, causing a high-impact error for our partner (cue this GitLab déjà vu link 🤯:).
Now, how do we bounce back from such a situation? Here are some crucial factors:
How to Overcome an Error in Production
Understanding that Mistakes Happen
The first and most important step is to accept that these things happen, whether you're an experienced developer or just starting out in this field. At some point in our careers, most of us will face a similar situation (I'm not wishing any harm upon you). Don't blame or judge yourself; take responsibility with your head held high and contribute to the solution.
Shared Responsibility
An error is not the responsibility of a single person. Even when one person may seem "responsible", the entire team is accountable. In my team, code reviews were conducted, validations were performed, tests were run, and still the error occurred. Developing a blameless culture is important for getting past an incident and for understanding that an error usually points to deeper things that may not be right. Handled that way, a serious incident can bring more good than bad.
Familiarize Yourself with Company Processes
Find out how incidents are handled in your company so that you can act appropriately. A good process should include proactive monitoring that alerts on potential issues before they significantly affect the end user, effective logs and tracing, a rapid response team, and good documentation, among other aspects. If your company doesn't have a process, this can be a great opportunity for you to propose one.
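To make "proactive monitoring" a bit more concrete, here is a minimal sketch of the idea: keep track of the most recent request outcomes and raise a flag when the error rate crosses a threshold. The class name `ErrorRateMonitor`, the window size, and the threshold are all hypothetical choices for illustration, not something taken from a particular monitoring tool; in practice you would most likely rely on whatever alerting stack your company already runs.

```python
from collections import deque


class ErrorRateMonitor:
    """Hypothetical helper: tracks recent request outcomes and flags a high error rate."""

    def __init__(self, window_size: int = 100, threshold: float = 0.05):
        self.threshold = threshold
        # A deque with maxlen drops the oldest outcome automatically,
        # so we always look at the last `window_size` requests only.
        self.outcomes = deque(maxlen=window_size)

    def record(self, success: bool) -> None:
        """Store the outcome of one request (True = success, False = failure)."""
        self.outcomes.append(success)

    def error_rate(self) -> float:
        """Fraction of failed requests in the current window (0.0 if empty)."""
        if not self.outcomes:
            return 0.0
        failures = sum(1 for ok in self.outcomes if not ok)
        return failures / len(self.outcomes)

    def should_alert(self) -> bool:
        """True when the error rate is strictly above the threshold."""
        return self.error_rate() > self.threshold


if __name__ == "__main__":
    monitor = ErrorRateMonitor(window_size=10, threshold=0.2)
    # Made-up traffic: 8 successful requests followed by 3 failures.
    for ok in [True] * 8 + [False] * 3:
        monitor.record(ok)
    if monitor.should_alert():
        print(f"Alert: error rate is {monitor.error_rate():.0%}")
```

The point of the sliding window is that the alert reacts to what is happening now instead of being diluted by hours of healthy traffic, which is what lets you notice a problem before the end user is significantly affected.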
Conduct a Post-Mortem
There are many post-mortem formats you can use. My recommendation is to focus on determining the root cause without blaming anyone, include a section for lessons learned and for things that can be done to avoid such incidents in the future, and define follow-up tasks, each with a responsible party, to work on the agreed-upon improvements.
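If it helps to see that structure written down, here is one possible way to model it as data. This is only a sketch: the names `PostMortem` and `FollowUpTask`, their fields, and the placeholder values are hypothetical, and a shared document or wiki template works just as well.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FollowUpTask:
    """One agreed-upon improvement and the party responsible for it."""
    description: str
    owner: str
    done: bool = False


@dataclass
class PostMortem:
    """Blameless post-mortem: describes what failed, not who failed."""
    title: str
    root_cause: str
    lessons_learned: List[str] = field(default_factory=list)
    follow_ups: List[FollowUpTask] = field(default_factory=list)

    def open_tasks(self) -> List[FollowUpTask]:
        """Follow-up tasks that still need work."""
        return [task for task in self.follow_ups if not task.done]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    report = PostMortem(
        title="Example incident",
        root_cause="Example root cause",
        lessons_learned=["Example lesson"],
        follow_ups=[
            FollowUpTask("Example improvement A", owner="Team A", done=True),
            FollowUpTask("Example improvement B", owner="Team B"),
        ],
    )
    for task in report.open_tasks():
        print(f"Still open: {task.description} (owner: {task.owner})")
```

Keeping an explicit owner on each follow-up task is the part I would not skip: without it, the agreed-upon improvements tend to stay on paper.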

