✨ Thread about UNIX, open() & the filesystem ✨

I think everyone can benefit from a more thorough understanding of FS, even if not applicable to Windows because Windows is shit

you can avoid races, bugs & edge cases
so. you might have heard how in UNIX "everything is a file". there are many kinds of files. what we usually understand by 'file' is called a 'regular file', but directories are another kind of file
files do NOT have names. directories give names to files.

just like a regular file is a "tape" of bytes that allows random access, a directory is a table of *pointers* to other files, where each pointer has a name
💡 the name can be an arbitrary sequence of bytes... as long as it does not contain the 0x2F byte (/) or the 0x00 byte (this is a consequence of the OS interface, which uses nul-terminated strings)

also the names "" (empty), "." and ".." are reserved and can't be used
💡 the bytes are expected to contain valid UTF-8 text. it's just a convention, you do you, but please don't be a bad person. also, please don't put control characters in there
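here's a minimal sketch of those naming rules in C. is_valid_entry_name() is a hypothetical helper made up for illustration, not a real libc call. it takes a length so it can see 0x00 bytes, it deliberately doesn't enforce the UTF-8 convention, and it doesn't check the limits real filesystems add on top (like a maximum name length)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* hypothetical helper: true if the `len` bytes at `name` could be used as
   the name of a directory entry, following the rules described above */
bool is_valid_entry_name(const char *name, size_t len)
{
    assert(name != NULL);

    if (len == 0)
        return false;                 /* "" is reserved */
    if (len == 1 && name[0] == '.')
        return false;                 /* "." is reserved */
    if (len == 2 && name[0] == '.' && name[1] == '.')
        return false;                 /* ".." is reserved */

    for (size_t i = 0; i < len; i++) {
        if (name[i] == '/' || name[i] == '\0')
            return false;             /* 0x2F and 0x00 are forbidden */
    }
    return true;                      /* any other byte sequence is allowed */
}
```

so is_valid_entry_name("...", 3) is true (only "." and ".." are special), while is_valid_entry_name("a/b", 3) is false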
anyway. directories are, conceptually, a dictionary mapping these names into pointers to other files. these files can also be directories, and that lets you have a hierarchy of names
kernels like Linux will make sure these pointers never form loops, so it won't let you point a directory to itself (for instance)
directories let you *refer* to a file by name, and that's useful for exactly one thing: getting a reference to a file

once you get that reference (a number that is called a File Descriptor, or FD) directories become entirely useless, you'll just use that number
in particular, directories let you tell the Operating System: "hey, given this directory, can you get me a reference to the file with this name?"

this is what the openat() syscall does. it takes the FD of a directory, the name, and some flags saying how you want to open the file (read-only, for instance), and returns a new FD that refers to the file
now, you may need to first access an intermediate directory (say foo) and then a file from that directory (bar)

you could do that with:

foo = openat(dir, "foo", O_RDONLY | O_DIRECTORY);
file = openat(foo, "bar", O_RDONLY);
you first get a reference to the "foo" directory, and use it to get a reference to the "bar" file at that directory

but that's a bit clumsy, so the OS lets you do it in one operation by joining both names with a "/" byte:

file = openat(dir, "foo/bar", O_RDONLY);
but make no mistake: the OS is doing pretty much those 2 operations internally, it's just an API convenience
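to make that equivalence concrete, here's a rough C sketch that opens foo/bar both ways and checks that the two FDs refer to the same file. the helper names (same_file, open_two_steps, open_one_step) are made up for illustration, and comparing device + inode numbers from fstat() is just one way to check that two FDs point to the same file

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* hypothetical helper: true if both FDs refer to the same file
   (same device number and same inode number) */
bool same_file(int a, int b)
{
    struct stat sa, sb;
    if (fstat(a, &sa) != 0 || fstat(b, &sb) != 0)
        return false;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* opens "bar" inside "foo" inside `dir`, one name at a time.
   returns the new FD, or -1 on error */
int open_two_steps(int dir)
{
    int foo = openat(dir, "foo", O_RDONLY | O_DIRECTORY);
    if (foo < 0)
        return -1;
    int bar = openat(foo, "bar", O_RDONLY);
    close(foo);   /* the intermediate reference is no longer needed */
    return bar;
}

/* same lookup, letting the OS walk both names for us */
int open_one_step(int dir)
{
    return openat(dir, "foo/bar", O_RDONLY);
}
```

given a `dir` that contains foo/bar, same_file(open_two_steps(dir), open_one_step(dir)) should be true, even though the two FD numbers are different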
hold on. to get a reference to a file, we already need a reference to a file (and it has to be a directory)

so where do we get that first reference from? when a process has just started, it has no FDs to directories, right?
it does! processes always have a reference to a directory, which is called the Current Working Directory (CWD). when a process spawns another process, the new process inherits that reference

on Linux that reference is the number -100, and it's exposed as the AT_FDCWD constant (other systems may pick a different value, so always use the constant)
the CWD reference is important because it's always there, and it makes sure that processes can always open files

in fact people usually know the open() call, which is equivalent to openat(AT_FDCWD, ...)
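a small sketch of that equivalence, assuming a file with the given name exists in the CWD. open_matches_openat_cwd() is a hypothetical helper: it opens the same name with both calls and compares what the two FDs point to

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* hypothetical helper: opens `name` with open() and with
   openat(AT_FDCWD, ...), and reports whether both FDs refer to the
   same file (same device and inode number) */
bool open_matches_openat_cwd(const char *name)
{
    assert(name != NULL);

    int a = open(name, O_RDONLY);
    int b = openat(AT_FDCWD, name, O_RDONLY);

    bool same = false;
    struct stat sa, sb;
    if (a >= 0 && b >= 0 && fstat(a, &sa) == 0 && fstat(b, &sb) == 0)
        same = sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;

    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    return same;
}
```

this returns false when the name can't be opened, so a false result doesn't by itself mean the two calls disagree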
💡 that -100 reference isn't really a File Descriptor like the rest, because you can't drop it (through close()), though you can use the special operation fchdir() to replace it with another directory

fchdir(x) is conceptually like
close(AT_FDCWD);
dup2(x, AT_FDCWD);
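here's what that looks like with the real call. a minimal sketch: open_via_cwd() is a made-up helper name, and it assumes you already hold an FD to some directory

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* hypothetical helper: replaces the CWD with `dir`, then opens `name`
   relative to the new CWD. returns the new FD, or -1 on error */
int open_via_cwd(int dir, const char *name)
{
    assert(name != NULL);

    if (fchdir(dir) != 0)   /* on success the CWD reference points at `dir` */
        return -1;
    return openat(AT_FDCWD, name, O_RDONLY);
}
```

closing `dir` afterwards doesn't affect the CWD, because the CWD holds its own reference to the directory (that's the dup2() part of the analogy)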
okay, so far we've learnt about
➡️ 2 kinds of files: regular files & directories
➡️ directories give names to other files, including directories, thus creating a hierarchical name structure
➡️ references to files are numbers called FDs (File Descriptors)
➡️ to get FDs to new files, we need an FD of a directory + a name
➡️ all processes have a special FD called the CWD (AT_FDCWD, which is -100 on Linux), which always points to a directory
In a future thread, we'll see:

❓ How to mutate directories. So far we've only seen how to query an existing entry of a directory (through openat())

❓ The "." and ".." special names that magically exist in all directories.

❓ File permissions. What are they & how do they work
