Encrypting compressed data is susceptible to CRIME

Beyond being a cool backronym, reading about the CRIME attack led me on a journey of revisiting what I (should have) learned back in college.

Posted Apr 1, 2025

By Hrvoje

3 min read

Encrypting compressed data is susceptible to CRIME – Long version

WARNING

This is work in progress…

When I was in college, one of the classes that intrigued me the most was certainly cryptography. We were introduced to many (frightening) terms and concepts, some of which I wasn’t able to fully understand (or memorize) at the time. The class showed how something seemingly impossible is in fact possible and how something seemingly possible is in fact impossible, and it demonstrated one of the most interesting uses of mathematics. I had fun figuring out the concepts, but it was difficult.

So what is cryptography?

I won’t spend much time explaining what cryptography is because the internet is full of great and simple explanations. I advise you to look for them and learn what “symmetric key cryptography” is.

I’ll keep it simple and incomplete (this is Older Brother explaining stuff to younger siblings after all). Cryptography class taught us how two people (let’s call them Bob and Alice) can communicate in a way that no other person (let’s call her Eve) can tell what they are talking about – even if Eve is right there with them, listening! It is something like establishing a secret language which only Bob and Alice can understand. In cryptography, establishing and communicating with such a language is called an encryption scheme.

The simplest encryption scheme is one where Bob and Alice have the same “key” and Eve doesn’t possess it. We assume that there is some message written in a language that Bob, Alice and Eve can all understand, and that Bob wants to send it to Alice without Eve knowing what it is about. With the key, Bob can “lock” the message. Locking turns it into a message written in a different language (scrambled data), and we call this kind of message a ciphertext. Bob sends the ciphertext to Alice, and Alice unlocks it with the key and gets the original message back. Even if Eve intercepts the data being sent, she can only read the ciphertext, which she cannot understand because she doesn’t have the key.
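Here is a toy sketch of that lock/unlock idea. It assumes a one-time pad (XOR with a random key as long as the message) as a stand-in for a real cipher, and the names `xor_bytes`, `message` and `key` are made up for illustration. Real systems use vetted ciphers, not this.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching byte of key."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"Meet me at noon"
key = secrets.token_bytes(len(message))  # known to Bob and Alice only

ciphertext = xor_bytes(message, key)     # Bob "locks" the message
recovered = xor_bytes(ciphertext, key)   # Alice "unlocks" it with the same key

print(ciphertext.hex())  # what Eve sees on the wire
print(recovered)         # what Alice reads
```

Notice that locking and unlocking are the same operation with the same key, which is what makes this a symmetric scheme. Also notice that the ciphertext is exactly as long as the message.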

So what?

To keep the story short, we were learning about what makes encryption schemes “safe” when we were introduced to the term (prepare your brain for not reading the next 6 words) “Ciphertext indistinguishability under chosen plaintext attack”, abbreviated to IND-CPA. Basically, IND-CPA says that for an encryption scheme to be safe, the adversary (Eve) must not be able to tell what the original text is, even if she can control parts of the original text. And I thought “Yeah… Like that would ever happen…”. Boy, was I wrong…

Years later (a couple of months ago) I was thinking about how compressing encrypted data doesn’t make sense, and then I wondered “Well, what about encrypting compressed data?”. That’s where CRIME showed up in a Stack Overflow answer.

You should also know how to compress…

Wikipedia says that “In information theory, data compression […] is the process of encoding information using fewer bits than the original representation.” So after compressing some data (like text, an image or audio), it should take less space on our hard drive or take less time to send over the network.

If data quality is preserved, we talk about lossless compression, and it is the kind that is always applied to text files. Lossless compression takes advantage of patterns – like the repetition of words or sentences in a text.
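To see the “takes advantage of patterns” part in action, here is a small sketch using Python’s built-in `zlib` module (DEFLATE is my choice for the example, nothing above depends on it). It compresses a very repetitive text and the same amount of random bytes, which have no pattern to exploit.

```python
import os
import zlib

repetitive = b"the quick brown fox " * 50   # 1000 bytes with an obvious pattern
random_like = os.urandom(len(repetitive))   # 1000 bytes with no pattern

packed = zlib.compress(repetitive)
packed_random = zlib.compress(random_like)

# Lossless: decompressing gives back exactly the original bytes.
assert zlib.decompress(packed) == repetitive

print(len(repetitive), "->", len(packed))
print(len(random_like), "->", len(packed_random))
```

The repetitive text should shrink to a small fraction of its size, while the random bytes should barely change in size. Good ciphertext looks a lot like those random bytes, which is why compressing already encrypted data doesn’t buy much.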

If data quality is lowered, we talk about lossy compression. It relies on discarding some information from the data without pissing off users. A trivial example is removing frequencies outside the human hearing range from an audio file.

CRIME – Compression Ratio Info-leak Made Easy

“Compression Ratio Information Leak” suggests that information (which should’ve been secret) was leaked through the use of compression.
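A rough sketch of how such a leak can look, under some loud assumptions: the secret, the request layout and the function name `observed_length` are all hypothetical, and encryption is left out entirely because I assume the cipher keeps the ciphertext about as long as the compressed plaintext. Eve controls one part of the message, cannot read the rest, and only observes the length of what goes over the wire.

```python
import zlib

SECRET = b"sessionid=7f3a9c"  # hypothetical secret Eve wants to learn


def observed_length(attacker_input: bytes) -> int:
    """Length of the compressed message, the only thing Eve gets to see."""
    request = b"GET /?" + attacker_input + b" HTTP/1.1\r\nCookie: " + SECRET
    return len(zlib.compress(request))


# Both guesses have the same length, only their content differs.
right_len = observed_length(b"sessionid=7f3a9c")  # repeats the whole secret
wrong_len = observed_length(b"sessionid=XQZWKJ")  # repeats only the prefix

print("right guess:", right_len)
print("wrong guess:", wrong_len)
```

The right guess repeats more of the secret, so the compressor finds a longer pattern and the output comes out shorter. That length difference is the leak. As far as I understand, the real attack guesses one character at a time, where the difference is much smaller and harder to measure than in this sketch.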