We love end-to-end encryption for a reason! It protects your data so that no one other than you can access it, which keeps it private. Even with end-to-end encryption, though, your metadata can still leak a lot of information about you.
Here are a few examples of what (personal) information metadata can leak:
IP Address
Once it has your Internet Protocol (IP) address, the server learns plenty about you, for example your approximate location and your Internet service provider, and the people running the server can then ask that provider to identify exactly who you are.
One popular way of preventing this is to use software such as Tor or a VPN to hide where your traffic originates.
Social Graph
The social graph represents all of your social relations. One way messaging apps can build such a graph is by checking who you communicate with and who communicates with you. This means that even when you use end-to-end encrypted messengers like Signal, the server can learn who you know, and get information on people it didn’t even know existed simply because you have them on your contact list.
Let’s use the above graph as an example and assume that A is a journalist currently working with a group of dissidents (B, C and D) and an unrelated source (E). A, B and C have already been flagged by the government and are being specifically targeted. Even though D is not connected to the journalist, it’s easy to infer that D is a part of the group whose other members are in touch with the journalist, which makes D a person of interest. Source E becomes a target just by their association with the journalist. Until the journalist’s social graph leaked, the government didn't even know D and E existed.
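The inference described above can be sketched in a few lines of Python. This is only an illustration: the edge list is an assumption (the text states only that D is not directly connected to A while the other group members are), and the names `OBSERVED_CONTACTS`, `build_graph` and `newly_exposed` are made up for this example.

```python
from collections import defaultdict

# Hypothetical edge list: who has been observed communicating with whom.
# The exact edges are an assumption made for this sketch.
OBSERVED_CONTACTS = [
    ("A", "B"), ("A", "C"), ("A", "E"),  # the journalist's contacts
    ("B", "C"), ("B", "D"), ("C", "D"),  # the dissident group
]


def build_graph(contacts):
    """Turn undirected contact pairs into an adjacency map."""
    graph = defaultdict(set)
    for left, right in contacts:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def newly_exposed(graph, flagged):
    """Return people not yet flagged who talk to at least one flagged person."""
    exposed = set()
    for person in flagged:
        exposed |= graph[person] - flagged
    return exposed


graph = build_graph(OBSERVED_CONTACTS)
print(sorted(newly_exposed(graph, {"A", "B", "C"})))  # ['D', 'E']
```

Nothing in this sketch touches message contents. The list of who talks to whom is enough to surface D and E.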
Here is a post by Signal illustrating how hard this problem is to solve.
Access Patterns
Simply from when we access certain data, an observer can deduce where we live, when we are at work, and so on. What’s worse, accessing our data rarely and only at specific times can itself get us targeted.
In this example, A and D are, again, the journalist and the careful dissident. Even if the journalist’s social graph hadn't previously leaked, the dissident could have been flagged by the government because of their access pattern. The journalist (like most people) has their phone turned on at all times, while D mostly keeps their burner phone off to avoid being tracked. D turns on their phone only around 6 am, 10 am and 7 pm for short periods of time to talk to other dissidents. Because of the unusual access pattern, D sticks out and is easily identifiable as a person of interest.
Be aware, therefore, that even precautions such as turning your phone off when you're not using it can sometimes make you stick out.
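To make the idea concrete, here is a minimal sketch of how an observer might single out a device with an unusual access pattern. All of the data is invented, the extra devices `P1` to `P3` are hypothetical ordinary users, and the cutoff `ratio` is an arbitrary choice for this example rather than a known detection rule.

```python
from statistics import median

# Hypothetical observations: hours of the day (0-23) in which each device
# was seen online. All values are made up for illustration.
ALWAYS_ON = set(range(24))
ONLINE_HOURS = {
    "A": ALWAYS_ON,           # journalist, phone always on
    "P1": ALWAYS_ON,          # ordinary users (assumed)
    "P2": set(range(6, 24)),
    "P3": ALWAYS_ON,
    "D": {6, 10, 19},         # dissident: 6 am, 10 am and 7 pm only
}


def unusual_devices(online_hours, ratio=0.25):
    """Flag devices online for far fewer hours than the typical device.

    `ratio` is an arbitrary cutoff chosen for this sketch: a device is
    flagged when its online hours fall below `ratio` times the median.
    """
    typical = median(len(hours) for hours in online_hours.values())
    return {
        device
        for device, hours in online_hours.items()
        if len(hours) < ratio * typical
    }


print(unusual_devices(ONLINE_HOURS))  # {'D'}
```

The more people keep their phones off most of the day, the lower the median becomes and the less D stands out.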
Usage Volume
The good news is that metadata leakage can also work in your favor, since the tracking entity leaks metadata too while tracking you. Just knowing the size of some content, even though it’s encrypted, can be enough to tell whether you’re being tracked.
Imagine that the careful dissident D didn’t become a person of interest in the previous examples but still has a hunch that they are being tracked. They suspect that bugs are planted somewhere in the room. If they could monitor radio signals, even encrypted ones, they could tell whether they are being listened to from the amount of data transmitted over the air. The signal would look very different when the room is noisy than when it is silent. The dissident can therefore test this by alternating between being loud and being silent, and then correlating their actions with the amount of data transmitted, even though they can’t read the data itself.
Using this method, the dissident can at least establish whether they are being listened to.
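The correlation test can be sketched as follows, assuming the dissident has logged, for each time interval, whether they were loud and how many bytes were observed on the air. The byte counts below are invented, and a real measurement would be far noisier. A plain Pearson correlation is just one simple way to compare the two series.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# 1 = the dissident was loud during that interval, 0 = silent.
LOUD = [1, 0, 1, 0, 1, 0, 1, 0]

# Hypothetical bytes observed on the air per interval (invented numbers).
BYTES_IF_BUGGED = [9800, 1200, 10100, 900, 9500, 1100, 10300, 1000]
BYTES_IF_UNRELATED = [5000, 5100, 5100, 5000, 4900, 5000, 5000, 4900]

print(round(pearson(LOUD, BYTES_IF_BUGGED), 2))
print(round(pearson(LOUD, BYTES_IF_UNRELATED), 2))
```

A coefficient close to 1 means the traffic rises and falls with the noise in the room, which suggests a listening device. A coefficient close to 0 suggests the transmissions are unrelated to what the dissident is doing.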
Closing Words
These are some of the ways in which metadata may indirectly leak private information. Just remember: the more careful everyone is with their (meta)data, the less any one person sticks out and risks being targeted.
That's why we also need more privacy-respecting solutions, designed from the ground up to protect their users!
Thanks to David Anakin Visuals for the graphics.
Text corrected thanks to mrkoot from Reddit.