Today proxycheck.io suffered what we believe to be our first total service outage, and it was caused by Cloudflare, our Content Delivery Network (CDN) partner.
The outage affected not only us but every website that uses Cloudflare, which means around a third of all websites worldwide were affected too. That is an unfathomable number, but it is how many websites trust and use Cloudflare.
This outage lasted 27 minutes and we're incredibly sorry that it happened. Unfortunately, because Cloudflare's own website and web-based API were also affected by the outage, we were unable to move our services to a different content delivery network, or to expose one of our own servers to the internet directly as a stop-gap measure, until their infrastructure started working correctly again.
This is a highly unusual and unexpected point of failure in our infrastructure design. Although we had considered that this could happen, we deemed the risk so small that we did not put a mitigation strategy in place. Cloudflare is such a professional, large and trusted company, with (as noted above) a third of all the world's websites using them, that we did not think an outage of this magnitude was ever likely to happen.
Clearly we were wrong about that, and we must diversify our entire infrastructure to be resilient against these kinds of failures. We had already built redundancy into every pillar of our own core infrastructure but didn't do so for our CDN, which is the last mile, so to speak, between our servers and our customers. It's also the only piece of our infrastructure that we pay another company to handle solely on our behalf.
We hope that all of our customers can accept our sincere apologies for this outage. We take full responsibility for it occurring: no one forced us to use Cloudflare. We chose to do so believing they were the best way to bring our product to you, and we still believe that, but clearly having no redundancy at the CDN level was a big mistake, which we will rectify.
When Cloudflare releases an official statement about this outage we will link it within this blog post.
EDIT [ Cloudflare has now released a blog post here. ]
Thanks.