The most recent annual “Causes and Impacts of Data Centre Outages” report by the Uptime Institute gives a fascinating snapshot of the industry. Key findings include:
- The frequency of data centre outages continues to be a source of concern for both customers and operators;
- Power issues remain the most common cause of disruption but constitute a declining share of outages; and
- There were fewer “severe” outages in 2020 than in 2019.
The overall frequency of data centre outages appears largely unchanged, with 76% of organisations reporting some form of outage in the last three years (compared with 74% in the previous report). As regards publicly reported outages, there was a fall of roughly 27%, from 163 in 2019 to 119 in 2020. This may, in part, reflect a reduction in the impact of outages, with only 6% of organisations reporting a “severe” incident in the last three years (compared with 11% in the previous report).
Looking in more detail at the distribution of impacts, the picture over the last few years is somewhat confusing: the median duration, that is, the length of a typical outage, has increased considerably, but the likelihood of an extended outage has fallen. For example, there was a 16% chance of an outage lasting more than 24 hours in 2018, but only an 11% chance in 2020.
Power issues are, once again, the most common cause of data centre outages at 37%, but this is well down on the historical average (since 1994) of 80%. Within this overall category, failure of UPSs is the single biggest cause. Software and IT system errors are now the second largest cause of disruptions at 22%. Whilst the recent major fire at OVH’s site in Strasbourg attracted much publicity, fires account for only a tiny number of disruptions overall.
Coming on the back of the OVH disruption, this is a further reminder to all of us of the need to manage the risk of a data centre outage, even if our systems are hosted in a top-tier facility.