The BREACH attack (Browser Reconnaissance and Exfiltration via Adaptive Compression of Hypertext) is similar to the CRIME attack. Both are compression side-channel attacks, but CRIME targets information compressed in HTTP requests through TLS compression, whilst BREACH targets information compressed in HTTP responses through HTTP compression.
HTTP compression is normally performed with the deflate algorithm, a data compression algorithm that combines Huffman coding and LZ77 compression. When data is compressed with this algorithm, repeated byte sequences in the input are detected and are not repeated in the output. Instead, the sequence is stored only once, and each later occurrence is replaced by a pointer back to the earlier one. This reduces the number of bytes sent, and therefore also the time taken to send the data.
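The effect of repeated byte sequences on the output size can be illustrated with a minimal sketch. It assumes Python's standard `zlib` module as a stand-in for a server's deflate implementation; the helper name `deflate_size` and both sample inputs are illustrative only.

```python
import zlib

def deflate_size(data: bytes) -> int:
    """Return the number of bytes zlib produces for `data` at level 9."""
    return len(zlib.compress(data, 9))

# A 22-byte string repeated 20 times: after the first copy, LZ77 can
# replace the repeats with back-references.
repetitive = b"token=bbfb30b5771b9f91" * 20

# 256 distinct byte values: nothing repeats, so there is nothing to point back to.
distinct = bytes(range(256))

print(len(repetitive), deflate_size(repetitive))
print(len(distinct), deflate_size(distinct))
```

The repetitive input should shrink to a small fraction of its original size, while the input without repeats should not shrink at all.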
However, even when the response is encrypted, the length of the compressed data is still visible, and this is one of the fundamental elements that make the BREACH attack possible. For an application to be vulnerable to the BREACH attack, it must be served from a server that uses HTTP compression, and it must include both user input and a secret, such as a CSRF token, in the HTTP response body. An attacker exploiting a BREACH vulnerability needs a means to view the victim’s traffic and the ability to cause the victim to send HTTP requests to the vulnerable server, which could be done by persuading the victim to visit a malicious site controlled by the attacker. This site would be crafted so as not to arouse the victim’s suspicion.
The BREACH attack uses the server as an oracle to gain information about secrets in a compressed and encrypted response: the attacker causes a number of requests to be sent to the vulnerable web server, observes the lengths of the responses, and deduces from them a secret that the server never intended to disclose. For example, if a response contains a CSRF token after the known prefix ‘token=’, and user input is also reflected in the response body, then BREACH can be used to extract the value of that token byte-by-byte. The victim is first tricked into sending a number of requests that each try a guess for the first byte of the ‘token=’ secret. This can be done by persuading the victim to visit a site with an embedded script that creates a number of hidden iframes, each performing a request to the server. The attacker then measures the lengths of the HTTP responses to these requests. Because of the deflate compression, the response with the smallest length comes from the request whose payload contained the correct first character of the secret. From here the attack proceeds byte-by-byte to guess the other characters in the secret.
For example, if the CSRF token string is bbfb30b5771b9f916587d41d41d46671, the BREACH attack would start by sending various requests, each with a payload such as ‘token=a’, ‘token=b’, etc. The attack would then compare the lengths of the responses to these requests. Because of the HTTP compression taking place, the response with the smallest length comes from the request whose payload contained the correct guess. In this case, that is the request with the ‘token=b’ payload. Since the attack relies on LZ77 compression, the first character must be guessed correctly before the second character can be attempted, and so on. The attack continues byte-by-byte until all the characters in the secret are revealed.
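The guessing procedure above can be simulated locally as a rough sketch. Everything except the token value is an assumption made for illustration: the page template in `render_page`, the `/logout` URL, the hexadecimal alphabet, and the use of Python's `zlib` in place of a real server. No network traffic or encryption is involved; the compressed length simply stands in for what an eavesdropper would observe. The sketch also forces fixed Huffman codes (`zlib.Z_FIXED`) so that size differences come from LZ77 matching alone, which is a simplification rather than how servers are normally configured.

```python
import zlib

SECRET = "bbfb30b5771b9f916587d41d41d46671"  # CSRF token from the example above
HEX_DIGITS = "0123456789abcdef"                # assumed token alphabet

def render_page(user_input: str) -> bytes:
    """Hypothetical vulnerable page: reflects user input and embeds the token."""
    return (
        "<p>You searched for: " + user_input + "</p>"
        '<a href="/logout?token=' + SECRET + '">Log out</a>'
    ).encode("ascii")

def response_length(user_input: str) -> int:
    """Length of the compressed response, the only thing the attacker observes."""
    # Assumption: Z_FIXED (fixed Huffman codes) keeps this sketch deterministic.
    compressor = zlib.compressobj(level=9, strategy=zlib.Z_FIXED)
    body = render_page(user_input)
    return len(compressor.compress(body) + compressor.flush())

def guess_next_char(known: str) -> str:
    """Pick the candidate whose reflected guess yields the shortest response."""
    return min(HEX_DIGITS, key=lambda c: response_length("token=" + known + c))

def recover_prefix(n_chars: int) -> str:
    """Recover the first n_chars of the token, one character at a time."""
    known = ""
    for _ in range(n_chars):
        known += guess_next_char(known)
    return known

print(recover_prefix(4))
```

Only the first four characters are requested here. On longer runs this naive shortest-response rule can tie or pick a wrong candidate, for example because of Huffman coding and rounding to whole bytes, so a practical attack needs additional techniques to stay reliable.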
Most BREACH attacks can be completed in 30 seconds or less, with only a few thousand requests to the vulnerable application.
Acunetix Web Vulnerability Scanner can scan for and identify BREACH vulnerabilities on your website, and also suggests ways to resolve them.
As discussed by the researchers who first uncovered this vulnerability, the techniques for mitigating this attack are:
- Disabling HTTP compression
- Separating secrets from user input
- Randomizing secrets per request
- Masking secrets (effectively randomizing by XORing with a random value per request)
- Protecting vulnerable pages with CSRF protection
- Length hiding (by adding a random number of bytes to the responses)
- Rate-limiting the requests
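
As an illustration of the secret-masking item above, here is a minimal sketch that assumes the token is a byte string and that the server can unmask it when the form is submitted. The function names `mask_token` and `unmask_token` and the hex encoding are hypothetical choices, not part of any particular framework.

```python
import secrets

def mask_token(token: bytes) -> str:
    """Return a per-request representation of `token`: random pad + (pad XOR token)."""
    pad = secrets.token_bytes(len(token))            # fresh random pad per response
    masked = bytes(p ^ t for p, t in zip(pad, token))
    return (pad + masked).hex()

def unmask_token(masked_hex: str) -> bytes:
    """Recover the original token from a value produced by mask_token."""
    raw = bytes.fromhex(masked_hex)
    half = len(raw) // 2
    pad, masked = raw[:half], raw[half:]
    return bytes(p ^ m for p, m in zip(pad, masked))

token = bytes.fromhex("bbfb30b5771b9f916587d41d41d46671")
first = mask_token(token)
second = mask_token(token)
print(first != second, unmask_token(first) == token)
```

Because the bytes that appear in the response body change on every request, a guess reflected in one response has no stable string to match in the next, while the server can still verify the submitted token after unmasking it.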