Cloudflare's Cloudbleed: Small Risk, But Data Lingers

Exhaustive Cloudflare Postmortem Should Staunch Most Worries

Jeremy Kirk (jeremy_kirk) • March 3, 2017

Cloudflare's recent data breach - dubbed "Cloudbleed" - was highly unusual. A software bug caused its servers to regurgitate random data from memory, potentially exposing passwords, cookies and chat logs. The breach stemmed from a small coding error: a single character entered incorrectly in an HTML parsing program.

Although some of the data is still lingering on the web, it appears that organizations that had information leaked as a result of the breach face little real-world risk. As with anything involving computer security, however, there are no 100 percent guarantees.

Even so, a new postmortem blog post from Cloudflare CEO Matthew Prince should assuage victims' concerns. In the post, Prince likens the leak to someone eavesdropping on a conversation, in that most of what anyone might overhear wouldn't be useful, although there was an outside chance that they might hear a really juicy tidbit.

"The good news is the amount of information for any conversation that's eavesdropped is limited," Prince says. "The bad news is you can't know exactly what the stranger may have heard, including potentially sensitive information about your company."

Bug Details

As I've detailed, Google Project Zero's Tavis Ormandy stumbled on the bug after encountering strange data on the web (see Cloudflare Coding Error Spills Sensitive Data).

The bug, a buffer overrun, was triggered when Cloudflare's parser choked on a bit of broken HTML. Among other functions, the parser is designed to rewrite web pages to accommodate HTTPS encryption and to mask exposed email addresses from bots. To be vulnerable to the flaw, a website also had to have specific Cloudflare features enabled.

Ironically, the web page that triggered the bug didn't experience a data leak. Instead, the buffer overrun caused a regurgitation of data - stored in the adjacent server memory - from other Cloudflare customers. While that data was normally protected by SSL/TLS encryption, the bug caused it to be dumped in unencrypted form on the web page. Sometimes, the leaks involved an unintelligible jumble of data in Asian characters, as shown in a screenshot Prince published in his blog post. Other times, however, the leaks were much more serious, for example revealing cookies, which could then be used to log on to some services using someone else's identity.

Most data dumped on pages as a result of Cloudbleed featured random, binary data, which a browser would attempt to interpret as largely Asian characters, followed by a number of internal Cloudflare headers, as seen in this screenshot published by Cloudflare.
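To make the mechanics concrete, here is a minimal sketch of how a one-character mistake in an end-of-buffer check can spill adjacent memory. It is a toy model, not Cloudflare's code: the `scan` function, the quote-skipping rule and the sample bytes are all hypothetical. The `==` versus `>=` distinction is an assumption based on Cloudflare's public write-up of the bug rather than on anything stated in this article.

```python
# Toy model of a parser end-of-buffer check. Hypothetical; not Cloudflare's code.

def scan(memory: bytes, start: int, end: int, strict_equality: bool) -> bytes:
    """Copy bytes from memory[start:] until the end-of-buffer check fires.

    strict_equality=True  -> stop only when pos == end (the fragile check)
    strict_equality=False -> stop when pos >= end (the robust check)
    """
    out = bytearray()
    pos = start
    while pos < len(memory):  # stand-in for the edge of mapped memory
        at_end = (pos == end) if strict_equality else (pos >= end)
        if at_end:
            break
        byte = memory[pos]
        out.append(byte)
        # Toy rule: an opening quote consumes itself plus the next byte,
        # so a page that ends on a quote pushes the cursor past `end`.
        pos += 2 if byte == ord('"') else 1
    return bytes(out)


PAGE = b'<a href="'                  # broken HTML: the attribute never closes
OTHER = b"Cookie: session=abc123"    # made-up data from "another customer"
MEMORY = PAGE + OTHER                # the two sit side by side in memory

safe = scan(MEMORY, 0, len(PAGE), strict_equality=False)
leaky = scan(MEMORY, 0, len(PAGE), strict_equality=True)

print(safe)   # only the page's own bytes
print(leaky)  # the page plus most of the neighboring data
```

In this model the cursor jumps from the last byte of the page to one position past the end of the buffer. The equality check never matches, so the loop keeps copying whatever follows in memory.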

Damage Assessment

Because Cloudflare has such a large customer base, once Ormandy found the problem, the race was on to figure out which websites triggered the bug and then contain the damage. The bad news is that the faulty parser was activated on Sept. 22, and Cloudflare has now determined that the bug was triggered 1.2 million times across 6,457 websites. The good news is that most of these sites are small and infrequently accessed.

Attackers also don't appear to have picked up on the bug, says Cloudflare, which looked for signs of repeated requests to a susceptible page that might have revealed related exploit attempts. "For the last twelve days we've been reviewing our logs to see if there's any evidence to indicate that a hacker was exploiting the bug before it was patched," Prince says in his March 1 blog post. "We've found nothing so far to indicate that was the case."
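The kind of log review described above can be sketched roughly as follows. This is only an illustration of the idea of looking for repeated requests to a susceptible page. The log format (client address and path separated by whitespace), the `flag_repeat_requesters` helper, the threshold and the sample entries are all assumptions, not details of Cloudflare's actual analysis.

```python
from collections import Counter


def flag_repeat_requesters(log_lines, susceptible_paths, threshold):
    """Return {(client, path): count} for clients that requested a
    susceptible path at least `threshold` times.

    Each log line is assumed to look like "<client> <path>".
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip lines that don't match the assumed format
        client, path = parts
        if path in susceptible_paths:
            counts[(client, path)] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


# Made-up sample data using documentation-reserved IP addresses.
SAMPLE_LOG = (
    ["198.51.100.7 /broken.html"] * 4
    + ["203.0.113.9 /broken.html"]
    + ["198.51.100.7 /index.html"] * 5
)

suspects = flag_repeat_requesters(SAMPLE_LOG, {"/broken.html"}, threshold=3)
print(suspects)
```

A real review would also have to consider time windows, distributed clients and sampling in the logs, so an empty result from a check like this is suggestive rather than conclusive.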

Even if the flaw had been exploited, the data returned would have been unpredictable, and in most cases completely useless. But in other cases the results "would contain very sensitive information," Prince acknowledges.

But most of the leaked information comprised internal Cloudflare headers and customer cookies, he says. And the company has not found examples of exposed passwords, credit card numbers or health records.

The Search Engine Problem

Another disconcerting kink with Cloudbleed, however, was that search engines had cached exposed data. So if attackers had clocked the vulnerability, they may have launched a data mining effort to scoop up the digital crumbs, and such a project would be well within the capabilities of a nation-state.

Once Cloudflare learned of the flaw, however, it raced to contact search engines and get the data removed. That involved the usual big players - Google, Bing, Yahoo - as well as other players, including Baidu, Yandex and DuckDuckGo.

"We were able to remove the majority of cached pages before the disclosure of the bug last Thursday," Prince says.

So far, 80,000 cached pages have been taken down. But Prince says that is not the total number of pages, "because we've requested search engines purge and re-crawl entire sites in some instances." Hence some sensitive data is likely still at large. Cloudflare has created an email address, parserbug@cloudflare.com, to accept reports on live leaked data.

A redacted sample of the leaked data. Source: Tavis Ormandy.

Our Fragile, Fragile Web

Cloudflare should be commended for its quick action and detailed public outreach, and Prince was appropriately contrite. "We know we disappointed you and we apologize," he says.

But there are remarkable attributes of Cloudbleed that underscore the fragility of connected systems. As other vulnerabilities such as the Heartbleed OpenSSL vulnerability have shown, a flaw in one widely used component has ramifications for many others (see Cloudflare Coding Error Spills Sensitive Data).

Furthermore, Cloudbleed victims did nothing more than simply use Cloudflare's service. The nature of the bug meant that the commingling of data in memory on some servers led to some information being leaked. But that is a rare and perhaps never-before-seen data breach scenario.

From an information security perspective, of course, Cloudbleed is also a clear and present reminder that no matter how well an organization thinks it has architected and put in place data control and management procedures, outside risks or mistakes that might shred those procedures can never be fully predicted or mitigated.