Fah Soh Lati Doh

15 Mar, 35 tweets, 15 min read

https://twitter.com/isocroft/status/1371424560617304068

Okay, let's begin: We're about to enter the technical details of ad publishing and ad targeting/marketing which is of course the business that Facebook is into. Now, Facebook needs to learn more about the user (likes, choices, behavior) if they are ever going to target ads...

1\

...to users that will be valuable to their advertisers. The method employed by Facebook is called "Intelligent Web Tracking" and is based on HTTP Cookies. Now, cookies are those little snippets of data stored on a computer via the browser in use, and each one is associated with...

2\

...the website url that initiated the storage of the piece of data. Now, there are 2 broad categories of cookies from a lifespan perspective - Session cookies and Persistent cookies. Session cookies are created by your fav back-end frameworks like Django/Laravel/Express/Nestjs

3\

for the purposes of storing a session id for auth management: detecting the currently logged-in user or even a guest user. The easiest way to create a session cookie is to not set it to expire, so the browser drops it when the browsing session ends. So, now you know that HTTP cookies expire. Persistent cookies...

4\

...on the other hand are set to expire many years or months or weeks or days in the future. We will be seeing examples soon (on Facebook itself). Now cookies have some other settings that are crucial to understand. These include "domain", "path", "secure". Let's look at...

5\

... "domain" and "path"; now these settings make it possible to restrict access to the value of the cookie according to which url wishes to access it. There are two ways to handle the "domain".

1. host-only
2. domain-wide

host-only setup is as follows: leave "domain" out entirely, and the cookie sticks to the exact host that set it.

6\

domain-wide setup is as follows: "domain=.facebook.com". Now, you would see that i omitted the "www"; also notice the "." that precedes "facebook.com". Browsers that follow RFC 6265 ignore that leading dot, so "domain=facebook.com" behaves the same way. This means that "facebook.com" and any subdomain of "facebook.com" can access the cookie

7\

A host-only cookie set by "facebook.com" can only be accessed by "facebook.com". so "web.facebook.com" cannot access the cookie and "developers.facebook.com" can't access it either.

Now, "path" takes it further. By default "path" setting has a value of "/"...

8\

...which has no restrictions at all. But you can set it to "/photo", further restricting access to that path. So, "domain=facebook.com" and "path=/photo" means the cookie can only be accessed by urls under facebook.com/photo. A request to plain "facebook.com/" won't get it
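
To make the "domain" and "path" rules concrete, here is a rough Python sketch of the matching a browser does before it hands over a cookie. It follows the RFC 6265 rules as i understand them (a leading dot is ignored, and leaving "domain" out gives the host-only behavior). The function names `domain_matches` and `path_matches` are made up for this example and are not any browser's real API.

```python
from typing import Optional


def domain_matches(request_host: str, set_by_host: str, domain_attr: Optional[str]) -> bool:
    """Decide whether request_host may receive the cookie (simplified)."""
    request_host = request_host.lower()
    if domain_attr is None:
        # No "domain" setting: host-only cookie, exact host match required.
        return request_host == set_by_host.lower()
    # A leading dot is ignored, so ".facebook.com" and "facebook.com" are equal.
    domain = domain_attr.lstrip(".").lower()
    return request_host == domain or request_host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str = "/") -> bool:
    """Decide whether request_path falls under cookie_path (simplified)."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end on a "/" boundary: "/photo" must not match "/photos".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


if __name__ == "__main__":
    print(domain_matches("web.facebook.com", "facebook.com", ".facebook.com"))
    print(domain_matches("web.facebook.com", "facebook.com", None))
    print(path_matches("/photo/1", "/photo"))
    print(path_matches("/", "/photo"))
```

A cookie is only sent when both checks pass for the url being requested.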

9\

"secure" setting basically binds the cookie to a secure url (i.e. the protocol must be "https" before the cookie can be sent). In the raw "Set-Cookie" header it is a bare flag with no value: "Secure". Back-end frameworks usually expose it as a boolean option.

Okay, we have the preliminaries out of the way. Let's move ahead!

10\
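
To recap the preliminaries, here is a small Python sketch (standard library only) that builds the "Set-Cookie" header value for a session cookie and for a persistent cookie. The cookie names, values and the two-year lifetime are made up for illustration; they are not what Facebook actually sends.

```python
from http.cookies import SimpleCookie

TWO_YEARS = 2 * 365 * 24 * 60 * 60  # assumed lifetime, in seconds


def session_cookie(name: str, value: str) -> str:
    """Build a Set-Cookie value with no expiry, so it lasts for the session only."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["path"] = "/"
    jar[name]["secure"] = True  # emitted as the bare "Secure" flag
    return jar[name].OutputString()


def persistent_cookie(name: str, value: str, domain: str, max_age: int = TWO_YEARS) -> str:
    """Build a Set-Cookie value that survives browser restarts until max_age runs out."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["domain"] = domain
    jar[name]["path"] = "/"
    jar[name]["max-age"] = max_age
    jar[name]["secure"] = True
    return jar[name].OutputString()


if __name__ == "__main__":
    print("Set-Cookie:", session_cookie("sessionid", "abc123"))
    print("Set-Cookie:", persistent_cookie("tracker", "xyz789", ".facebook.com"))
```

The only thing that separates the two is the expiry setting ("Max-Age" here; "Expires" works too).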

If you'd like to know more about HTTP cookies, see this article: valentinog.com/blog/cookies/ (We only went over the need-to-know aspects of HTTP cookies as it concerns the topic on ground - How Facebook tracks you).

Okay enough theory, let's look at Facebook cookies (live).

11\

Here you go. These are actually the cookies set when you log into FB. This is my information and as you can see Facebook stores a lot of cookies including my Facebook ID (c_user)

Can you see the "domain" and "path" columns ? can you see the expiry of the highlighted cookie ?

12\

It expires in 2023 (This is an example of a persistent cookie). Others expire this year (2021) and next year. In the image, you can also see a session cookie named "presence"

Now look at this particular cookie (x-src). This cookie stores facebook urls i have visited...

13\

...while logged-in. As you can see, the value is URL encoded (a.k.a Percent encoding) and this is the value when decoded:

As you see, cookies can store just about anything in any format. you can store a JSON string inside a cookie if you like. cs.mcgill.ca/~rwest/wikispe…

14\
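
Here is a quick Python sketch of percent encoding and decoding a cookie value, including a JSON string stored inside a cookie. The url and the JSON payload are invented for the example; they are not the real contents of the "x-src" cookie.

```python
import json
from urllib.parse import quote, unquote


def encode_cookie_value(raw: str) -> str:
    """Percent-encode every reserved character so the value is cookie-safe."""
    return quote(raw, safe="")


def decode_cookie_value(encoded: str) -> str:
    """Reverse the percent encoding."""
    return unquote(encoded)


# Made-up values, for illustration only.
visited = "https://www.facebook.com/photo?id=1"
prefs = {"theme": "dark", "lang": "en"}

encoded_url = encode_cookie_value(visited)
encoded_json = encode_cookie_value(json.dumps(prefs))

if __name__ == "__main__":
    print(encoded_url)
    print(decode_cookie_value(encoded_url))
    print(json.loads(decode_cookie_value(encoded_json)))
```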

Now, i need to mention a particular behavior of browsers as it concerns cookies. Whenever a browser loads a url that matches the "domain" and "path" that were set when the cookie was created, the values of the matching cookies are sent along with the HTTP request for that url...

15\

...via the HTTP request header "Cookie" to the server. The "bogus-english" name for this browser behavior is #Ambient #Authority.

Like a mindless robot, the browser doesn't check the identity of who initiates the url HTTP request and will send cookies for that url.

16\
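
The "mindless robot" behavior can be sketched as a toy browser in Python. The `Browser` and `Cookie` classes, the cookie value and the shop.example site are all made up, and the matching is deliberately simplified; the point is only that the page that initiates the request plays no part in deciding which cookies get attached.

```python
from dataclasses import dataclass
from typing import Dict, List
from urllib.parse import urlsplit


@dataclass
class Cookie:
    name: str
    value: str
    domain: str  # stored without a leading dot
    path: str = "/"


class Browser:
    """Toy cookie jar that mimics ambient authority (not a real browser API)."""

    def __init__(self) -> None:
        self.jar: List[Cookie] = []

    def store(self, cookie: Cookie) -> None:
        self.jar.append(cookie)

    def request_headers(self, url: str, initiator: str) -> Dict[str, str]:
        # `initiator` is the page that triggered the request. It is accepted
        # here only to show that it is never consulted.
        parts = urlsplit(url)
        host = parts.hostname or ""
        path = parts.path or "/"
        matching = [
            c for c in self.jar
            if (host == c.domain or host.endswith("." + c.domain))
            and path.startswith(c.path)  # simplified path check
        ]
        headers: Dict[str, str] = {}
        if matching:
            headers["Cookie"] = "; ".join(f"{c.name}={c.value}" for c in matching)
        return headers


browser = Browser()
browser.store(Cookie(name="fr", value="abc", domain="facebook.com"))

from_facebook = browser.request_headers("https://www.facebook.com/tr?id=123", "https://www.facebook.com/")
from_shop = browser.request_headers("https://www.facebook.com/tr?id=123", "https://shop.example/")
```

Both calls produce the same "Cookie" header, even though one of them was started by a page that has nothing to do with Facebook.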

Facebook exploits the SHIT out of this browser behavior to make off-site tracking (Facebook calls it "Off-Facebook Activity") possible using their Intelligent Web Tracking product called Facebook Pixel. You know about Facebook Pixel ? Well, if you don't, here you go: facebook.com/business/learn…

17\

So, just to reiterate, here is the combo that makes Facebook tracking possible (and makes it look like magic):

1. Persistent Cookies (with "domain=.facebook.com" and "path=/")
2. #Ambient #Authority (browser behavior)
3. Facebook Pixel JS tracking libraries/techniques

18\

BTW, Facebook is not the only entity that takes advantage of #Ambient #Authority. White and Black Hats take advantage of it too, to attack websites/webapps (using CSRF - Cross-Site Request Forgery). There are ways to mitigate but that's beyond the scope of our topic for today.

19\

Let's talk about Facebook Pixel tracking for a bit. But before we do, I found a short medium article about the set of cookies Facebook creates: techexpertise.medium.com/facebook-cooki…

The "fr" and "sb" cookies are the ones used for ad tracking purposes. See the description of the 2 cookies

20\

"fr" cookie: Used by Facebook to deliver series of advertisement products such as real time bidding from third party advertisers. See: cookiedatabase.org/cookie/faceboo…

"sb" cookie: Used by Facebook to store user browser activity across Facebook services. See: cookiedatabase.org/cookie/faceboo…

21\

Here is an excerpt from facebook.com/policies/cooki…

So, we have dealt with the #Ambient #Authority (sorry for the big english) and Persistent Cookie parts of the equation. Let's move to the last piece: Facebook Pixel tracking; let's see how it works (under the hood) and other things

22\

This is the part i am excited to explain cos this is where the "magic" finally happens. Now, Facebook Pixel is basically JavaScript that website owners copy and paste into their webpages to track stuff (the same way you'd use something like Google Tag Manager).

See code:

23\

It's important to note that if the website doesn't use the Facebook pixel code, then the tracking will not work at all.

Now, you see that <img> tag in the <noscript> section (in the former tweet) that points to facebook.com/tr?id={pixel_id} ?

That's the koko of Pixel!

24\

That image loads a transparent pixel that is 1 px by 1 px, but in the process the HTTP request for that image sends all the persistent cookies stored for Facebook (remember "domain=.facebook.com", "path=/"), even after i log out.

Persistent cookies don't get deleted when logged out of Facebook

25\

Remember that the persistent cookies are sent when the pixel image is requested using the <img> tag due to #Ambient #Authority.

Now, ad-blockers could block access to the cookies, or on a browser like Safari, ITP (Intelligent Tracking Prevention) may kick in

26\

When ad-blockers or ITP (only on Safari browser) kick in, they cut off #Ambient #Authority for the tracker and hence make this kind of tracking much harder.

Safari introduced a JavaScript API (the Storage Access API) for seeking permission before Intelligent Web Tracking cookies could be accessed: webkit.org/blog/8124/intr…

27\
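
The idea behind this kind of blocking can be sketched in a few lines of Python: before attaching cookies, compare the site of the page you are on with the site being requested. This is only an illustration. Real ITP is far more nuanced, ad-blockers often block the request to the tracker outright, and the "last two labels" trick in `site_of` is a shortcut i am assuming for the example (real browsers use the Public Suffix List).

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Rough 'site' of a url: the last two labels of its host (an assumption)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def should_attach_cookies(request_url: str, top_level_url: str, block_third_party: bool) -> bool:
    """Return False when the request is third-party and blocking is switched on."""
    is_third_party = site_of(request_url) != site_of(top_level_url)
    return not (block_third_party and is_third_party)


if __name__ == "__main__":
    pixel = "https://www.facebook.com/tr?id=123"  # hypothetical pixel id
    print(should_attach_cookies(pixel, "https://shop.example/shoes", block_third_party=True))
    print(should_attach_cookies(pixel, "https://web.facebook.com/", block_third_party=True))
```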

When these persistent (Intelligent Web Tracking) cookies ("fr" and "sb") reach the Facebook server (facebook.com/tr) from the Pixel code (<img>), they're interpreted and associated with you and the browser used to login to Facebook (even though you're now logged out)

28\
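
What happens on the receiving end can be pictured with this Python sketch of a tracking endpoint. Everything in it is hypothetical (the `PROFILES` lookup table, the cookie values, the use of the "Referer" header to learn which page you were on); i have no view of Facebook's actual server code. It only shows how a cookie value arriving with the pixel request can be tied back to a known user.

```python
from http.cookies import SimpleCookie
from typing import Dict, List, Optional

# Hypothetical server-side data: cookie value -> user it was issued to at login.
PROFILES: Dict[str, str] = {"fr-value-123": "user-42"}
# Hypothetical log of off-site pages seen per user.
VISITS: Dict[str, List[str]] = {}


def handle_pixel_request(headers: Dict[str, str]) -> Optional[str]:
    """Associate one pixel request with a known user, if the cookie allows it."""
    jar = SimpleCookie()
    jar.load(headers.get("Cookie", ""))
    fr = jar.get("fr")
    if fr is None:
        return None  # no tracking cookie arrived (blocked or never set)
    user = PROFILES.get(fr.value)
    if user is None:
        return None  # cookie value not recognised
    # Assumption: the page that embedded the pixel is read from "Referer".
    VISITS.setdefault(user, []).append(headers.get("Referer", "unknown"))
    return user


if __name__ == "__main__":
    print(handle_pixel_request({
        "Cookie": "fr=fr-value-123; sb=sb-value-456",
        "Referer": "https://shop.example/shoes",
    }))
    print(VISITS)
```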

And so completes the cycle of tracking but there's more....

Now Facebook Pixel drops cookies of its own too on your browser. These are called third-party cookies and, unlike the "fr" and "sb" cookies, they are session cookies.

They're used when you get back to Facebook

29\

So, what are third-party cookies again, you might ask (sorry - i won't go into details). Well, read about them here (if you like): clearcode.cc/blog/differenc…

Just know that "fr", "sb" and "tr" cookies when on a website other than Facebook are third-party else they're first-party

30\

Okay, it's important to note that the "tr" cookie is created by Facebook Pixel (and not by Facebook itself) but with "domain=facebook.com".

The "tr" cookie is used to advertise to you the service/product on which Facebook Pixel has just tracked you.

31\

This usually happens when you visit a website using Facebook Pixel, then come back and log in to Facebook, and you find an ad placed in front of you that matches the website you just visited.

No wonder some call it AI or ML but it's not. Just good old cookies

32\

In certain instances, graph.facebook.com (think graph data structure - which is how Facebook represents all friendships/associations on Facebook) does utilize the "fr" and "sb" cookies to show an ad you have seen OR might see to your friend(s)

33\

So, now you know how tracking happens. Fun Fact: back in the day before Facebook Pixel became a thing, the LIKE button (on several websites) was the official tracking tool of Facebook.

I hope you enjoyed the read ? If you have questions, just tag me

34\

THE END.
