Thread by Nick Schrock, 15 tweets
1/ Yesterday I tweeted about this paper: www-cs.stanford.edu/~matei/papers/… I want to explain why I think it’s so interesting. It’s about gg, a framework for executing “everyday apps” (e.g. distributed compilation, video encoding) on FaaS platforms (e.g. AWS Lambda).
2/ They lead with well-known workloads amenable to distributed computation, like build systems and video encoding. These are familiar and good for benchmarking. But this is a much more general system/technique that should be applicable across a bunch of different domains.
3/ The core of the system is their cloud function IR. The primitive unit is a “thunk”: a lightweight container with an executable and the arguments to that executable. All files/objects required by the thunk are embedded by reference as content-addressable hashes.
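A rough sketch of the idea in Python (field names and hashing scheme are my own illustration, not gg's actual IR format): a thunk is a record of an executable hash, its argv, and the hashes of its inputs, and is itself content-addressable.

```python
import hashlib
import json

def content_hash(data: bytes) -> str:
    """Content-address a blob by its SHA-256 digest."""
    return hashlib.sha256(data).hexdigest()

class Thunk:
    """Hypothetical, simplified thunk record, in the spirit of gg's IR."""

    def __init__(self, executable_hash, args, input_hashes):
        self.executable_hash = executable_hash  # hash of the binary to run
        self.args = args                        # argv for that binary
        self.input_hashes = input_hashes        # hashes of required files/objects

    def hash(self) -> str:
        """A thunk is itself content-addressable: hash its own description.

        Identical descriptions always hash identically, which is what
        makes memoization/incremental recomputation possible.
        """
        desc = json.dumps({
            "exe": self.executable_hash,
            "args": self.args,
            "inputs": sorted(self.input_hashes),
        }, sort_keys=True).encode()
        return content_hash(desc)
```

The key property: two thunks describing the same computation over the same inputs get the same hash, so the result of one run can be reused for the other.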
4/ Thunk outputs are themselves referred to by hash; this is how gg forms its dependency graph. And since thunks are content-addressed too, a thunk’s output can be another thunk. This is how gg expresses dynamic/control-flow constructs, via tail recursion.
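The tail-recursion mechanic can be sketched as a trampoline (this is my own illustrative model, not gg's implementation): keep executing thunks until one yields a concrete value instead of another thunk.

```python
def force(store, thunk_hash, run):
    """Trampoline: execute thunks until one yields a concrete value.

    `store` maps hashes to thunk descriptions; `run` executes one thunk
    and returns either a value or a reference to the next thunk.
    """
    while True:
        result = run(store[thunk_hash])
        if result["kind"] == "value":
            return result["data"]          # concrete output: done
        thunk_hash = result["thunk"]       # tail call: force the next thunk

# Toy demo: a "countdown" computation that re-spawns itself until n == 0,
# the way a dynamic gg job might emit follow-up thunks at runtime.
store = {f"t{n}": {"n": n} for n in range(4)}

def run(thunk):
    n = thunk["n"]
    if n == 0:
        return {"kind": "value", "data": "done"}
    return {"kind": "thunk", "thunk": f"t{n - 1}"}
```

Because each step hands back a *reference* rather than recursing in-process, no invocation has to outlive its own step, which is exactly what short-lived FaaS workers need.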
5/ The tail-recursion piece gets *very* light treatment in the paper. Would love to hear more. I’m sure there are lots of practical issues; debugging/monitoring could be very challenging.
6/ These graphs can be executed on arbitrary compute engines: Lambda, Google Cloud Functions, Azure Functions. Intermediate outputs/artifacts (referred to by hash) can be stored in arbitrary cloud storage (object stores like S3, GCS, Azure Blob Storage, etc).
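The pluggable-backend idea reduces to a tiny content-addressed interface. A minimal sketch under my own assumptions (gg's real storage abstraction is surely richer): any backend that can `put` a blob and `get` it back by hash will do.

```python
import hashlib
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    """Hypothetical storage interface. Any content-addressed blob store
    (S3, GCS, Azure Blob Storage, local disk) could implement it."""

    @abstractmethod
    def put(self, data: bytes) -> str:
        """Store a blob and return its content hash."""

    @abstractmethod
    def get(self, h: str) -> bytes:
        """Fetch a blob by its content hash."""

class InMemoryStore(ObjectStore):
    """Toy backend standing in for a real cloud object store."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        h = hashlib.sha256(data).hexdigest()
        self._blobs[h] = data
        return h

    def get(self, h: str) -> bytes:
        return self._blobs[h]
```

Because artifacts are addressed by hash rather than by location, swapping S3 for GCS (or anything else) changes nothing about the dependency graph itself.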
7/ They do some *real* witchcraft to make their Chromium build example work by doing a “dry run” of sorts. Very impressive, but I think this method will be difficult to generalize. Bazel, Buck, or any build system with an explicit dep graph would seemingly be easier to deal with.
8/ One thing I love about this is that it demonstrates proper use of these compute engines. “Functions” in FaaS implies the wrong level of granularity: a normal function in a programming language. These are process invocations, which is the appropriate level of granularity.
9/ Related to the above, a lot of people seem to think the purpose of FaaS is an even more fine-grained version of microservices. That just doubles down on microservices madness (don't get me started). By contrast, gg distributes the computation of a single monolithic app.
10/ The huge infrastructure difficulty here is the slowness of object stores when passing intermediates between thunks. They try to mitigate it with some scheduling/data-locality cleverness. It's clear that a fast object store, a "memory for the datacenter," is going to be critical.
11/ This issue was discussed in some detail in the excellent Berkeley serverless paper (www2.eecs.berkeley.edu/Pubs/TechRpts/…). Methinks these two research groups talk a lot.
12/ This is base technology for thinking of a cloud service as one big computer. Thomas Watson, the IBM chief who in 1943 supposedly said "I think there is a world market for maybe five computers," may end up being right if you replace “computer” with “hosted cloud.” Just a few generations off.
13/ I’m interested in high-level programming models stacked on these systems. Dagster (see medium.com/@schrockn/intr…) is a species of this. We currently do more static graphs meant for coarser-grained batch compute, but targeting gg as a compute engine is an interesting possibility.
14/ Reflow by @marius is similar. He noted that Reflow does dynamism via continuations, not tail recursion. Reflow is more coarse-grained, but its content-addressable object substrate with referential transparency/incremental compute is essentially identical: github.com/grailbio/reflow
15/ This ecosystem is really exciting, and gg is a *super* interesting system within it. Thanks so much to Stanford SNR for publishing!