Real developer problem that gets solved by #graph. Today's thread: JSON document nesting

🧵
It's common to have data elements like a "food" with some related information (here, "nutrition information"). When using a document database, you basically have two choices:

- Store the two documents separately and look up the other when needed
- Store them together
Storing them together is great for efficient read operations, but it's bad for updates. If many foods share the same nutrition facts and you need to update the facts, you have many updates to do to change one simple thing, because you duplicated the data in sub-documents.
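As a rough illustration of that update fan-out, here is a minimal Python sketch that uses plain dicts as stand-ins for JSON documents. The food names, field names, and numbers are all made up for the example; they are not from any real dataset or database API.

```python
import copy

# Hypothetical nutrition facts shared by several foods (made-up values).
shared_facts = {"calories": 52, "sugar_g": 10}

# Embedded design: every food document carries its own copy of the facts.
foods = [
    {"name": "gala apple", "nutrition": copy.deepcopy(shared_facts)},
    {"name": "fuji apple", "nutrition": copy.deepcopy(shared_facts)},
    {"name": "honeycrisp apple", "nutrition": copy.deepcopy(shared_facts)},
]


def read_calories(food):
    # Reads are cheap: everything needed is already inside the document.
    return food["nutrition"]["calories"]


def update_calories(documents, old, new):
    # Updates fan out: every document holding a copy must be rewritten.
    touched = 0
    for doc in documents:
        if doc["nutrition"]["calories"] == old:
            doc["nutrition"]["calories"] = new
            touched += 1
    return touched


touched = update_calories(foods, 52, 55)
print(touched, read_calories(foods[0]))
```

One logical change costs one write per food that embeds the facts, which is the duplication problem described above.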
What if you store them separately and then do lookups?

Well, that makes updates easy, but then you pay the price on the reads.
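The opposite trade-off can be sketched the same way. In this toy Python version, foods hold a hypothetical `nutrition_id` field that points into a separate collection, and a counter tracks how many extra fetches the reads need. All names and values are assumptions for illustration only.

```python
# Separate "collections", joined by a hypothetical nutrition_id field.
nutrition_facts = {"n1": {"calories": 52, "sugar_g": 10}}
foods = {
    "f1": {"name": "gala apple", "nutrition_id": "n1"},
    "f2": {"name": "fuji apple", "nutrition_id": "n1"},
    "f3": {"name": "honeycrisp apple", "nutrition_id": "n1"},
}

extra_lookups = 0  # counts the second fetch each read has to make


def read_calories(food_id):
    global extra_lookups
    food = foods[food_id]
    extra_lookups += 1  # resolving the ID is paid again on every read
    return nutrition_facts[food["nutrition_id"]]["calories"]


# The update is now a single write...
nutrition_facts["n1"]["calories"] = 55

# ...but every read pays for a lookup.
values = [read_calories(food_id) for food_id in foods]
print(values, extra_lookups)
```

The update touches one document, while the number of lookups grows with the number of reads.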

So to some extent, you're in a bit of a fix either way.
Note that we're in #graph territory here, because fundamentally the question is about *the relationship between two things*, and that's where graphs excel: relationship management. So how would a graph do this?

(:Food)-[:HAS]->(:NutritionFacts)
If you need to update nutrition facts, you only update one element (all foods are connected to it). And of course reads are fast too, because optimizing for relationship traversal is the whole point of a graph database. Best of both worlds.
The schema (i.e. what properties we store about foods and nutrition facts) can evolve independently, unlike in a JSON database, where the two tend to be coupled either by ID or by structure.
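To make the shared-node idea concrete, here is a small in-memory Python sketch of the `(:Food)-[:HAS]->(:NutritionFacts)` shape. The `Node` class is a hypothetical stand-in, not how any real graph database stores data, and the property names and values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    props: dict
    # Outgoing relationships: relationship type -> target Node (toy model).
    out: dict = field(default_factory=dict)


# One NutritionFacts node, shared by every Food that HAS it.
facts = Node("NutritionFacts", {"calories": 52})
foods = [
    Node("Food", {"name": name}, {"HAS": facts})
    for name in ("gala apple", "fuji apple", "honeycrisp apple")
]


def calories(food):
    # A read follows the relationship directly; no ID is resolved here.
    return food.out["HAS"].props["calories"]


# A single update is visible from every connected food.
facts.props["calories"] = 55

# Each side can gain properties without touching the other.
facts.props["fiber_g"] = 2.4

print([calories(f) for f in foods])
```

Because every food points at the same node, there is one place to update, and adding `fiber_g` to the facts does not change any food.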

Why does this work? Because relationships are "pre-materialized lookups".
When you write a relationship to a graph database, it's similar to doing a lookup or a join, because you had to know what to connect.

The difference is that the cost is paid one time, up front, and never again. Contrast that with paying every time you query or every time you update.
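A toy way to see the "pay once" point is to count how often an index is consulted. In the Python sketch below, the `Index` class is a hypothetical stand-in for an ID lookup; real databases are far more involved, so read this as an illustration of the idea rather than a benchmark.

```python
class Index:
    """Toy ID index that counts how often it is consulted."""

    def __init__(self, items):
        self._items = items
        self.hits = 0

    def get(self, key):
        self.hits += 1
        return self._items[key]


facts_index = Index({"n1": {"calories": 52}})

# Lookup at query time: the food stores an ID and resolves it on every read.
doc_food = {"name": "gala apple", "nutrition_id": "n1"}
for _ in range(1000):
    facts_index.get(doc_food["nutrition_id"])["calories"]
query_time_hits = facts_index.hits

# Relationship: resolve once at write time, then keep the direct reference.
facts_index.hits = 0
graph_food = {"name": "gala apple", "HAS": facts_index.get("n1")}
for _ in range(1000):
    graph_food["HAS"]["calories"]
write_time_hits = facts_index.hits

print(query_time_hits, write_time_hits)
```

In this model the ID-based design consults the index once per read, while the relationship-based design consults it once when the relationship is written.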
The meta-pattern here is that when the problems you're struggling with are *about the relationships*, #graph will likely help; when the problems are about dealing with large numbers of homogeneous, not necessarily connected individuals, graph probably isn't the fit.

Thread by 𝔻𝕒𝕧𝕚𝕕 𝔸𝕝𝕝𝕖𝕟

