Cloudflare outage on November 18, 2025

This outage took down a large number of websites, including two that I tried to access that day.

https://blog.cloudflare.com/18-november-2025-outage/

Summary:

  • A permissions change in ClickHouse caused a metadata query to return columns from both the “default” and “r0” databases, doubling the rows returned to the feature file generator.

  • The resulting jump in the number of features pushed the file over the 200-feature limit in the core proxy’s Bot Management code, triggering repeated failures.

  • The system kept generating and propagating new files rapidly, but the Rust code did not handle the resulting error: Result::unwrap() was called on an Err value, and the panics that followed caused widespread 5xx errors.
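
To make the failure chain in the summary concrete, here is a minimal Rust sketch. It is not Cloudflare's code. The names (query_metadata, load_features, reload_or_keep, LoadError), the row shape, and the 120-column count are all hypothetical; only the 200-feature limit and the “default”/“r0” duplication come from the summary. It shows how a doubled row count turns into an Err, where an unwrap() would panic, and one possible alternative: keep the last known-good feature list.

```rust
/// Cap on features, mirroring the 200-feature limit mentioned in the summary.
const MAX_FEATURES: usize = 200;

/// One row of column metadata: (database, column name). Hypothetical shape.
type MetadataRow = (&'static str, String);

#[derive(Debug, PartialEq)]
enum LoadError {
    TooManyFeatures { found: usize, limit: usize },
}

/// Simulate the metadata query. When `include_r0` is true, every column
/// shows up twice: once under "default" and once under "r0".
fn query_metadata(n_columns: usize, include_r0: bool) -> Vec<MetadataRow> {
    let mut rows = Vec::new();
    for i in 0..n_columns {
        rows.push(("default", format!("feature_{i}")));
        if include_r0 {
            rows.push(("r0", format!("feature_{i}")));
        }
    }
    rows
}

/// Build the feature list from the rows, rejecting anything over the limit.
fn load_features(rows: &[MetadataRow]) -> Result<Vec<String>, LoadError> {
    if rows.len() > MAX_FEATURES {
        return Err(LoadError::TooManyFeatures {
            found: rows.len(),
            limit: MAX_FEATURES,
        });
    }
    Ok(rows.iter().map(|(_, name)| name.clone()).collect())
}

/// Handle the error instead of unwrapping: on failure, keep the
/// last known-good feature list and log the rejection.
fn reload_or_keep(current: Vec<String>, rows: &[MetadataRow]) -> Vec<String> {
    match load_features(rows) {
        Ok(new_features) => new_features,
        Err(e) => {
            eprintln!("rejecting new feature file: {e:?}");
            current
        }
    }
}

fn main() {
    let good = query_metadata(120, false); // 120 rows, under the limit
    let doubled = query_metadata(120, true); // 240 rows, over the limit

    // Safe to unwrap here only because 120 <= MAX_FEATURES.
    let current = load_features(&good).unwrap();

    // Calling load_features(&doubled).unwrap() at this point would panic
    // with "called `Result::unwrap()` on an `Err` value".
    assert!(load_features(&doubled).is_err());

    // Falling back keeps the process running on the old feature list.
    let kept = reload_or_keep(current, &doubled);
    println!("still serving with {} features", kept.len());
}
```

Whether falling back to the previous file is the right policy depends on the system. The sketch only illustrates that the over-limit case can be treated as an ordinary error value rather than a panic.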

Cloudflare’s Bot Management is a system designed to detect, score, and manage automated traffic (bots) across its network. It uses machine learning models that analyze incoming web requests to determine whether each request is likely generated by a human or by an automated script (bot).

This is a common problem – we implement something to protect something else, and it ends up causing a problem that compromises the original service. In hardware, we sometimes see this with watchdog circuits that go awry in various edge conditions.

Was the GitHub downtime related to this as well?