Cloudflare outage on November 18, 2025

This outage caused a lot of websites to fail, including two that I accessed that day.

https://blog.cloudflare.com/18-november-2025-outage/

Summary:

  • A permissions change in ClickHouse caused metadata queries to return rows from both the “default” and “r0” databases, roughly doubling the rows returned to the feature file generator.

  • The sudden increase in features in the file pushed it over the 200-feature limit in the core proxy Bot Management code, triggering repeated failures.

  • The system tried to handle and propagate new files rapidly, but panics and unhandled errors in Rust code (Result::unwrap() on Err values) caused the widespread 5xx errors.
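To make the failure chain in the summary concrete, here is a minimal Rust sketch. It is not Cloudflare's code: `MAX_FEATURES`, `LoadError`, `dedup_features`, `load_features`, and `reload_or_keep` are hypothetical names invented for illustration, and the row shape (database name, feature name) is an assumption. It shows duplicate rows tripping a hard limit, and one way to avoid panicking when that happens.

```rust
use std::collections::HashSet;

/// Hypothetical cap, mirroring the 200-feature limit mentioned in the summary.
const MAX_FEATURES: usize = 200;

#[derive(Debug, PartialEq)]
enum LoadError {
    TooManyFeatures { got: usize, max: usize },
}

/// Collapse rows that differ only by database (e.g. "default" vs "r0"),
/// keeping the first occurrence of each feature name.
fn dedup_features(rows: &[(&str, &str)]) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for (_db, name) in rows {
        if seen.insert(*name) {
            out.push(name.to_string());
        }
    }
    out
}

/// Validate a candidate feature list against the hard limit.
fn load_features(names: Vec<String>) -> Result<Vec<String>, LoadError> {
    if names.len() > MAX_FEATURES {
        return Err(LoadError::TooManyFeatures {
            got: names.len(),
            max: MAX_FEATURES,
        });
    }
    Ok(names)
}

/// Calling `load_features(candidate).unwrap()` here would panic on an
/// oversized file. Matching on the Result keeps the last good list instead.
fn reload_or_keep(current: Vec<String>, candidate: Vec<String>) -> Vec<String> {
    match load_features(candidate) {
        Ok(new) => new,
        Err(e) => {
            eprintln!("rejecting feature file: {:?}; keeping previous", e);
            current
        }
    }
}
```

With 120 features each listed under both databases, the raw list has 240 entries and `load_features` returns an `Err`. After `dedup_features` it has 120 and loads fine. Whether falling back to the previous file is the right policy depends on the system; the point is only that the error is handled rather than unwrapped.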

Cloudflare’s Bot Management is a system designed to detect, score, and manage automated traffic (bots) across its network. It uses machine learning models that analyze incoming web requests to determine whether each request is likely generated by a human or by an automated script (bot).

This is a common problem – we implement something to protect something else, and it ends up causing a problem that compromises the original service. In hardware, we sometimes see this with watchdog circuits that go awry in various edge conditions.

Was the GitHub downtime related to this as well?

I think this was a separate incident.

Even using Rust is not a panacea.

I am increasingly wondering why I bother with Cloudflare.

Initially it’s so good, but if you build your architecture such that you don’t need load balancing, you can work around the Anycast pricing that Cloudflare provides.

It’s a deal with the devil if you proxy. They see all traffic unencrypted on the way from them to your servers and can do what they will with it. You get free global load balancing to your regions and bot protection.

It’s not a great long-term approach, is what I am saying.

Let’s Encrypt and the Grateful Dead dude should create a BGP/Anycast network. It’s the perfect next step, because they are already doing the first step anyway.
