A bug in Cloudflare’s HTML parsing system leaked memory contents into web pages that were then cached by search engines.
For months, a bug in Cloudflare’s content optimization systems exposed sensitive information sent by users to websites that use the company’s content delivery network. The data included passwords, session cookies, authentication tokens and even private messages.
Cloudflare acts as a reverse proxy for millions of websites, including those of major internet services and Fortune 500 companies, for which it provides security and content optimization services behind the scenes. As part of that process, the company’s systems modify HTML pages as they pass through its servers in order to rewrite HTTP links to HTTPS, hide certain content from bots, obfuscate email addresses, enable Accelerated Mobile Pages (AMP) and more.
The bug that exposed user data was in an older HTML parser that the company had used for many years. However, the flaw was not triggered until a newer HTML parser was added last year, which changed how internal web server buffers were used when certain features were active.
As a result, internal memory containing potentially sensitive information was being leaked into some of the responses returned to users as well as to search engine crawlers. Web pages with the sensitive data were cached and made searchable by search engines like Google, Yahoo and Bing.
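How a parser can run past the end of its buffer can be shown with a small simulation. This is a simplified sketch, not Cloudflare's code: the memory layout, the `scan` function and the two-byte skip after `=` are all hypothetical. It assumes the kind of flaw Cloudflare's incident report describes, an end-of-buffer check that tests for equality and can therefore be jumped over.

```python
# Simulated process memory: one page buffer followed by unrelated data.
PAGE = b"<script type="            # page that ends in an unfinished attribute
SECRET = b"password:hunter2"       # hypothetical data from another request
MEMORY = PAGE + SECRET
END = len(PAGE)                    # where the parser is supposed to stop


def scan(memory: bytes, end: int, safe: bool) -> bytes:
    """Copy bytes to the output until the parser believes it hit `end`."""
    out = bytearray()
    p = 0
    while p < len(memory):         # guard for the simulation only
        out.append(memory[p])
        # Hypothetical rule: '=' consumes two bytes, so p can jump past `end`.
        p += 2 if memory[p] == ord("=") else 1
        reached_end = (p >= end) if safe else (p == end)
        if reached_end:
            break
    return bytes(out)


leaky = scan(MEMORY, END, safe=False)   # equality check: skips over END
fixed = scan(MEMORY, END, safe=True)    # >= check: stops at or past END
print(leaky)
print(fixed)
```

With the equality check, the pointer steps from one byte before the end to one byte after it, never equals the end marker, and the output goes on to include bytes that belong to the neighboring data. With the `>=` check, the output is the page alone.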
The leak was discovered almost by accident by Google security engineer Tavis Ormandy while he was working on an unrelated project. As soon as he and his colleagues understood what the strange data was and where it was coming from, they alerted Cloudflare.
This happened on February 18th. Cloudflare immediately assembled an incident response team and, within hours, killed the feature that was causing most of the leakage. A complete fix was in place by February 20th. The rest of the time, until the incident was publicly disclosed Thursday, was spent working with search engines to scrub the sensitive data from their caches.
“With the help of Google, Yahoo, Bing and others, we found 770 unique URIs that had been cached and which contained leaked memory,” said John Graham-Cumming, Cloudflare’s CTO, in a blog post. “Those 770 unique URIs covered 161 unique domains.” A URI (Uniform Resource Identifier) is a character string that identifies a resource on the web, and is sometimes used interchangeably with the term URL (Uniform Resource Locator).
According to Graham-Cumming, the leakage might have been going on since September 22, but the period of greatest impact was between February 13 and February 18, when the email obfuscation feature was migrated to the new parser. Cloudflare estimates that around one in every 3.3 million HTTP requests that passed through its system potentially resulted in memory leakage. That’s about 0.00003 percent of all requests.
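The percentage follows directly from the one-in-3.3-million estimate. The short calculation below reproduces it; the function name is only illustrative.

```python
def leak_rate_percent(requests_per_leak: float) -> float:
    """Convert a '1 in N requests' rate into a percentage of all requests."""
    return 100.0 / requests_per_leak


rate = leak_rate_percent(3_300_000)
print(f"{rate:.5f} percent")  # 0.00003 percent
```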
Even so, because of the nature of the exposed data, the incident was very serious, and Cloudflare customers might decide to take action, such as forcing users to change their passwords.
“I’m finding private messages from major dating sites, full messages from a well-known chat service, online password manager data, frames from adult video sites, hotel bookings,” Ormandy wrote in an entry on Google Project Zero’s bug tracker during the incident. “We’re talking full https requests, client IP addresses, full responses, cookies, passwords, keys, data, everything.”
This bug is similar in its effect to the Heartbleed vulnerability in OpenSSL, which could have allowed attackers to force HTTPS servers to leak potentially sensitive memory contents. In fact, Ormandy even said that it “took every ounce of strength not to call this issue CloudBleed.”
But unlike Heartbleed, which had the potential to expose SSL/TLS private keys, no such keys have been affected in the Cloudflare incident.
“Cloudflare runs multiple separate processes on the edge machines and these provide process and memory isolation,” Graham-Cumming said. “The memory being leaked was from a process based on NGINX that does HTTP handling. It has a separate heap from processes doing SSL, image re-compression, and caching, which meant that we were quickly able to determine that SSL private keys belonging to our customers could not have been leaked.”
One private key that was leaked, however, had been used to secure connections between Cloudflare machines.
To be on the safe side, internet users might want to consider changing their online passwords, something they should do on a regular basis anyway to keep ahead of data breaches.
“Cloudflare is behind many of the largest consumer web services (Uber, Fitbit, OKCupid, …), so rather than trying to identify which services are on Cloudflare, it’s probably most prudent to use this as an opportunity to rotate ALL passwords on all of your sites,” security researcher Ryan Lackey said in a blog post.
source: pcworld.com