Door 3: Dying well: How to conduct an effective post-mortem (failure analysis)
Being a software developer drives one fundamental lesson into you: software is not reliable. Now, by and large, the services that we use every day are extremely reliable. As far as I can remember, Google Search has essentially never been “down”, and when Facebook went down it was national news. Yet each of these services still experiences regular failures. […]