A thread on 1st-party vs. 3rd-party cookies, and how tracking companies are actively trying to undermine users' privacy preferences.

First, some basics (this is simplified):

A cookie is a key/value pair (a variable) that allows websites to save data in users' web browsers. 1/
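To make the key/value idea concrete, here is a minimal sketch using Python's standard `http.cookies` module. The cookie name, value, and domain are made up for illustration; they don't come from any real site.

```python
from http.cookies import SimpleCookie

# Parse a Set-Cookie header value, roughly as a browser would.
# All values here are hypothetical.
parsed = SimpleCookie()
parsed.load("uid=abc123; Domain=example.com; Path=/; Max-Age=3600")

morsel = parsed["uid"]
cookie_name = morsel.key           # the key
cookie_value = morsel.value        # the value
cookie_domain = morsel["domain"]   # the domain the cookie is bound to
cookie_max_age = morsel["max-age"] # lifetime in seconds, kept as a string
```

The attributes after the first `name=value` pair (domain, path, lifetime) are metadata that tell the browser when the cookie should be sent back.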
Cookies aren't always bad: they allow settings to be saved and enable things like shopping carts. But they can be, and are, used for tracking people: a website can assign a user a unique identifier, so that it knows which pages that user visits on the site over time. 2/
Each cookie is bound to a domain name: every time the browser requests a URL from that domain name, it automatically sends that web server all previously-set cookies (that haven't expired or been manually deleted) for that domain name. 3/
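A toy model of that behavior, as a sketch only: `ToyCookieJar` is a hypothetical class, it matches on the exact domain name, and it ignores subdomains, paths, and expiry, all of which real browsers handle.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCookieJar:
    # Maps a domain name to the {name: value} cookies that domain has set.
    cookies: dict = field(default_factory=dict)

    def set_cookie(self, domain: str, name: str, value: str) -> None:
        self.cookies.setdefault(domain, {})[name] = value

    def cookie_header(self, request_domain: str) -> str:
        """Build the Cookie header sent with a request to request_domain."""
        stored = self.cookies.get(request_domain, {})
        return "; ".join(f"{name}={value}" for name, value in stored.items())


jar = ToyCookieJar()
jar.set_cookie("news.example", "uid", "abc123")
jar.set_cookie("news.example", "theme", "dark")
jar.set_cookie("tracker.example", "tid", "xyz789")

# Only cookies bound to the requested domain go out with the request.
header_for_news = jar.cookie_header("news.example")
header_for_tracker = jar.cookie_header("tracker.example")
header_for_other = jar.cookie_header("other.example")
```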
A website can contain content from other domains, each of which can set its own cookies when that content is loaded. For example, cnn.com embeds an image from facebook.com. When the browser loads that image, facebook.com sets a cookie. 4/
These are known as "third-party" cookies: cookies set by domains other than the one that appears in the browser's URL bar. 5/
When a user returns to cnn.com, CNN knows that it's seen this user before (based on its first-party cookie), and Facebook also knows that this user previously visited CNN (via its third-party cookie). 6/
If a company has its 3rd-party content embedded in many different websites (as Facebook/Google/Adobe do), they can track users across all of them, allowing them to build a detailed profile of a user's behavior.

Since this happens automatically, it's mostly invisible to users. 7/
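A rough simulation of how one third-party cookie turns into a cross-site profile. `ToyTracker` and the site names are hypothetical, and the sketch assumes the tracker learns which page embedded it (in practice, for example, from the Referer header or from the embedded URL itself).

```python
import uuid
from collections import defaultdict


class ToyTracker:
    """Hypothetical third-party server embedded on many sites."""

    def __init__(self):
        # Maps a tracking id to the list of sites where it was seen.
        self.profiles = defaultdict(list)

    def handle_request(self, cookie_id, embedding_site):
        """Log the visit and return the id to store as the tracker's cookie."""
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex  # first sighting: mint a new id
        self.profiles[cookie_id].append(embedding_site)
        return cookie_id


tracker = ToyTracker()
browser_cookie = None  # the browser starts with no cookie for the tracker
for site in ["cnn.com", "shop.example", "blog.example"]:
    # Each page load fetches the tracker's content and sends its cookie along.
    browser_cookie = tracker.handle_request(browser_cookie, site)

profile = tracker.profiles[browser_cookie]
```

After three page loads on three unrelated sites, the tracker holds a single profile listing all of them.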
Worse, embedded 3rd-party content often isn't part of the user experience (e.g., useful images and widgets), but exists explicitly for tracking: images that are a single transparent pixel, so that they (1) are invisible to users and (2) allow the 3rd party to set tracking cookies. 8/
To give users some modicum of control over this, browsers allow users to block third-party cookies, if they so choose.

Until very recently, third-party cookies were stored by browsers by default, and a user had to take proactive steps to configure their browser to block them. 9/
Similarly, this is how a lot of browser privacy extensions work: they simply don't allow third-party cookies to be set. Many will use lists of known tracking domains and block cookies being set by those domains (e.g., this is one of the things that @disconnectme does). 10/
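A list-based check along these lines might look like the sketch below. The blocklist entries are made up, and this is not how any particular extension is implemented; real lists and matching rules are far more elaborate.

```python
# Hypothetical blocklist of known tracking domains.
BLOCKLIST = {"tracker.example", "ads.example"}


def is_blocked(host: str) -> bool:
    """True if host, or any parent domain of host, is on the blocklist."""
    labels = host.lower().split(".")
    # Check "a.b.c", then "b.c", then "c".
    return any(".".join(labels[i:]) in BLOCKLIST for i in range(len(labels)))


blocked_subdomain = is_blocked("pixel.tracker.example")
blocked_lookalike = is_blocked("nottracker.example")
blocked_unlisted = is_blocked("news.example")
```

Matching on whole labels matters: `nottracker.example` merely ends with the same characters as a listed domain and must not be blocked.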
This brings us to "cookie syncing":

For security, cookies are only sent to the domain that set them: a browser won't send cookies set by cnn.com to facebook.com. 11/
Thus, if companies want to "share" identifiers they've created to track users, they need to be creative.

If CNN wants to automatically and uniquely identify its users to Facebook, they can't simply share the same cookie. They need to do what's known as "cookie syncing." 12/
There's a good explanation of the details of cookie syncing by @s_englehardt here: freedom-to-tinker.com/2014/08/07/the…

13/
This brings us to syncing of first-party cookies:

Because many third-party cookies are blocked by privacy tools when users don't want to be tracked, many of the biggest players have shifted to colluding with websites to set first-party cookies instead. 14/
The way this works is:
1) CNN sets its own first-party cookie to uniquely identify a user.

cookie name: "_fbp"
cookie value: "fb.1.1574792864436.1680904631"
cookie domain: "*.cnn.com"

("_fbp" is the name of Facebook's tracking cookie.)

15/
2) CNN appends the value of this cookie to a URL pointing to Facebook's web server. (The extraneous information after the "?" in a URL is known as a "query string"; it allows additional information to be embedded in a URL, readable by a web server and scripts on the page.) 16/
3) This URL, which points to a 1x1 transparent pixel hosted by facebook.com, is then embedded in CNN's website. The browser sees this as an image it needs to retrieve, and in so doing, transmits the query string containing the value of CNN's cookie to Facebook. 17/
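The three steps above can be sketched as follows. The cookie value is the one shown earlier in the thread; the endpoint `tracker.example/pixel` and the query parameter name `fbp` are assumptions for illustration, not Facebook's actual URL format.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Step 1: the first-party cookie value (from the example in the thread).
first_party_cookie_value = "fb.1.1574792864436.1680904631"

# Hypothetical pixel endpoint and parameter name.
PIXEL_ENDPOINT = "https://tracker.example/pixel"


def build_pixel_url(cookie_value: str) -> str:
    """Step 2: append the cookie value to the tracker URL as a query string."""
    return PIXEL_ENDPOINT + "?" + urlencode({"fbp": cookie_value})


def read_synced_id(url: str) -> str:
    """What the tracker's server does: read the id back out of the query string."""
    return parse_qs(urlsplit(url).query)["fbp"][0]


pixel_url = build_pixel_url(first_party_cookie_value)

# Step 3: the URL is embedded in the page as an invisible 1x1 image.
img_tag = f'<img src="{pixel_url}" width="1" height="1" alt="">'

received_id = read_synced_id(pixel_url)
```

No third-party cookie is involved: the identifier travels inside the URL, so blocking third-party cookies doesn't stop it.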
Previously, sites would embed trackers (often 3rd-party images), so that when browsers loaded them, the 3rd-party server would send a "Set-Cookie" HTTP header to set a 3rd-party cookie.

Now, tracking companies use 1 of 2 ways to set these as 1st-party cookies (less likely to be blocked): 18/
1) JavaScript on a webpage can set cookies for that webpage's domain. So if a tracking company convinces a website operator to include their JavaScript, they can set 1st-party cookies without having them appear in HTTP traffic. Facebook offers this:
facebook.com/business/help/… 19/
2) If the 3rd party's server appears to actually be within the 1st-party domain, the browser will be tricked into treating cookies from the 3rd party as 1st-party cookies. This is what Adobe is asking websites to do, using a CNAME (a DNS alias).

20/
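A sketch of why the CNAME trick works against a hostname-based check. The DNS records are a made-up dictionary rather than real lookups, and `site_of` crudely takes the last two labels of a hostname; real software would use the Public Suffix List.

```python
# Hypothetical DNS CNAME records: a first-party subdomain aliased to a tracker.
CNAME_RECORDS = {"metrics.news.example": "news-example.tracker.example"}


def site_of(host: str) -> str:
    """Very rough 'registrable domain': the last two labels of the hostname."""
    return ".".join(host.split(".")[-2:])


def looks_first_party(page_host: str, request_host: str) -> bool:
    """The check as seen from the URL alone."""
    return site_of(page_host) == site_of(request_host)


def resolve_cname(host: str) -> str:
    """Follow CNAME aliases in the toy record table, guarding against loops."""
    seen = set()
    while host in CNAME_RECORDS and host not in seen:
        seen.add(host)
        host = CNAME_RECORDS[host]
    return host


def is_first_party_after_dns(page_host: str, request_host: str) -> bool:
    """The same check, applied to the hostname the alias really points to."""
    return site_of(page_host) == site_of(resolve_cname(request_host))


by_hostname = looks_first_party("www.news.example", "metrics.news.example")
by_dns = is_first_party_after_dns("www.news.example", "metrics.news.example")
```

Judged by hostname, the request looks first-party; only after following the alias does the tracker's domain show up.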
So, by setting 1st-party cookies instead of 3rd-party cookies, trackers are able to more easily evade users' preferences to block them. 21/
However, when cookies are set by 1st parties, cookie syncing is needed, because the browser only sends those cookies to the 1st-party domain. That is, these 1st-party cookies will get shared with the 3rd party via the method I described above (e.g., including their values as query strings for 3rd-party URLs embedded in the page). 22/
That is, Facebook's JavaScript will set a new 1st-party cookie for every 1st-party website the user visits. All of these will get sent to Facebook as query strings, but Facebook needs a way of determining that they all came from the same user. 23/
There are multiple ways that this can be accomplished:

1) The 1st-party website can append additional identifying information to the query string that gets sent to Facebook. For example, an email address, name, etc. 24/
In fact, Facebook explicitly instructs website operators to do this:
developers.facebook.com/ads/blog/post/…

However, the websites setting the cookies may not have this information. 25/
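A sketch of how an extra identifier lets a tracker link per-site cookies to one person. The thread doesn't specify how the identifier is encoded; this example assumes the email is normalized and hashed with SHA-256 before being sent, and the events, cookie values, and field names are hypothetical.

```python
import hashlib
from collections import defaultdict


def join_key(email: str) -> str:
    """Assumed encoding: trim, lowercase, then SHA-256 the email address."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical events arriving at the tracker from two different websites.
# Each site set its own, unrelated first-party cookie for the same person.
events = [
    {"site": "news.example", "cookie": "fb.1.111.aaa", "email": "Alice@Example.com "},
    {"site": "shop.example", "cookie": "fb.1.222.bbb", "email": "alice@example.com"},
]

# Group first-party cookie values by the shared identifier.
linked = defaultdict(set)
for event in events:
    linked[join_key(event["email"])].add(event["cookie"])

alice_cookies = linked[join_key("alice@example.com")]
```

Hashing doesn't make the identifier anonymous here: the same email always hashes to the same value, which is exactly what makes it usable as a join key.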
2) Another way that they can do it is via "browser fingerprinting," which basically just means uniquely identifying users based on the characteristics of their web browsers. The EFF provides a great demo of how this works here: panopticlick.eff.org

26/
A fingerprint generated in the browser can have relatively high entropy (i.e., be uniquely identifying). In conjunction with an IP address, more so.

We previously showed that users can be identified based on the fonts they have installed: guanotronic.com/~serge/papers/… 27/
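A back-of-the-envelope sketch of what "entropy" means here. The frequencies are invented for illustration, and adding the bits together assumes the attributes are statistically independent, which real browser attributes often are not.

```python
import math


def surprisal_bits(fraction_sharing: float) -> float:
    """Bits of identifying information in a value shared by this fraction of browsers."""
    return -math.log2(fraction_sharing)


# Hypothetical: the fraction of browsers sharing this user's value for each attribute.
attribute_frequencies = {
    "user_agent": 1 / 1024,
    "installed_fonts": 1 / 4096,
    "screen_size": 1 / 16,
}

# Under the independence assumption, the bits add up.
total_bits = sum(surprisal_bits(p) for p in attribute_frequencies.values())

# Roughly how many browsers this many bits can tell apart.
distinguishable_browsers = 2 ** total_bits
```

With these made-up numbers, three attributes give 26 bits, enough to distinguish tens of millions of browsers from one another.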
3) Finally, trackers can use their own 1st-party cookies to do this cookie syncing:

When software blocks 3rd-party cookies, it generally does so by preventing cookies from being *set* by 3rd parties; it often doesn't prevent existing cookies from being *sent* to 3rd parties. 28/
That is, imagine going to Google and logging in. This will result in a 1st-party cookie being set. If you're blocking 3rd-party cookies, that won't prevent this.

29/
Once set in the 1st-party context, this cookie could later be transmitted in the 3rd-party context (i.e., another website may embed URLs pointing to google.com; when the browser loads them, it automatically sends the previously set google.com cookie). 30/
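The set-versus-send distinction can be sketched with a toy browser. `SetBlockingBrowser` is hypothetical and models only the policy described above: refuse cookies set in a third-party context, but apply no such check when sending.

```python
class SetBlockingBrowser:
    """Toy browser that blocks third-party cookie *sets* but not *sends*."""

    def __init__(self):
        self.jar = {}  # domain -> {name: value}

    def receive_set_cookie(self, page_domain, cookie_domain, name, value):
        """Store the cookie only if it is set in a first-party context."""
        if cookie_domain != page_domain:
            return False  # a third party tried to set a cookie: blocked
        self.jar.setdefault(cookie_domain, {})[name] = value
        return True

    def cookies_sent_to(self, request_domain):
        """No first-/third-party check on send: existing cookies always go out."""
        return dict(self.jar.get(request_domain, {}))


# 1) The user visits tracker.example directly and logs in: a first-party set.
browser = SetBlockingBrowser()
login_stored = browser.receive_set_cookie(
    "tracker.example", "tracker.example", "session", "user-42"
)

# 2) Later, news.example embeds a URL on tracker.example; the old cookie is sent.
sent_while_embedded = browser.cookies_sent_to("tracker.example")

# 3) A browser that never visited tracker.example directly gets no cookie.
fresh = SetBlockingBrowser()
third_party_stored = fresh.receive_set_cookie(
    "news.example", "tracker.example", "session", "new-id"
)
sent_by_fresh = fresh.cookies_sent_to("tracker.example")
```

Only companies whose sites users visit directly benefit from step 1, which is the asymmetry the next tweet describes.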
This is why Google and Facebook are pushing for using 1st-party cookies and have no qualms about blocking 3rd-party ones: it's a competitive advantage for them since users are likely to have their cookies stored in 1st-party contexts (i.e., many users log into their sites). 31/
While I pick on Google, Facebook, and Adobe here—and I'm not sure "picking on" is the correct term, since this is all empirical—there are likely many others doing this. This is an industry-wide shift. 32/
What's galling about all of this is that it's being done *explicitly* to prevent users from exercising control over their privacy.

In most of these cases, the companies' marketing materials make this very clear. 33/
For example, from Google's documentation: support.google.com/analytics/answ…

Here, "more reliable" means making it harder for users to opt out.

34/
Facebook writes about using first-party cookies to "reach more customers": facebook.com/business/help/…

...using only third-party cookies is "less effective." 35/
But they care very deeply about the privacy of their users!

Fin.

Keep Current with Serge Egelman
