Example thread on how to load #JSON into #Neo4j Aura, working up from simple to more complex. Let's use the Hacker News public API to load a mini-feed of stories.

First: the endpoint with best stories, and the simplest JSON load.
The apoc.load.json call always returns "value" with whatever came back. Hacker News is sending back results: an array of post IDs.
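
A minimal sketch of what that first call might look like. The original query was in a screenshot that did not survive, so the URL is an assumption: it is the best-stories endpoint of the public Hacker News Firebase API.

```cypher
// Assumed endpoint: best stories from the public Hacker News API
CALL apoc.load.json('https://hacker-news.firebaseio.com/v0/beststories.json')
YIELD value
RETURN value
```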

We can extract just the post IDs with a bit of extra Cypher, which gives a nice clean array of long values.
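
A hedged sketch of that extraction step. It assumes the array of IDs sits under a `result` key inside `value`; check the output of the plain load first to confirm the key name, since the original screenshot is not available.

```cypher
// Assumption: the ID array is exposed as value.result
CALL apoc.load.json('https://hacker-news.firebaseio.com/v0/beststories.json')
YIELD value
RETURN value.result AS storyIds
```
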
One step further: now we UNWIND the array, turning the list into one row per item, and then build the URL we'll request from Hacker News to get the detail of each story. We build the URLs one by one by concatenating each story ID into a URL string.
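
The UNWIND and URL-building step could be sketched as follows. The item URL pattern and the names `storyId` and `storyUrl` are assumptions for illustration, as is the `result` key.

```cypher
CALL apoc.load.json('https://hacker-news.firebaseio.com/v0/beststories.json')
YIELD value
// One row per story ID instead of a single row holding the whole list
UNWIND value.result AS storyId
// Concatenate the ID into the (assumed) per-item endpoint
RETURN 'https://hacker-news.firebaseio.com/v0/item/' + toString(storyId) + '.json' AS storyUrl
```
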
Now let's add an apoc.load.json call for each of those URLs we constructed, and see what the resulting JSON of each story is. This gives us a result set with the JSON payload of every story in the best-stories list.

#BuildingStepByStep #neo4j
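
A sketch of chaining the second load onto each constructed URL, under the same assumptions as before (endpoint URLs, `result` key, and illustrative variable names). The WITH clause drops the first `value` from scope, so the second call can yield `value` again for the story payload.

```cypher
CALL apoc.load.json('https://hacker-news.firebaseio.com/v0/beststories.json')
YIELD value
UNWIND value.result AS storyId
WITH 'https://hacker-news.firebaseio.com/v0/item/' + toString(storyId) + '.json' AS storyUrl
// One HTTP request per story; value is now the JSON of a single story
CALL apoc.load.json(storyUrl)
YIELD value
RETURN value
```

This makes one request per story ID, so adding a LIMIT after the UNWIND is a reasonable way to keep a first test run small.
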
The simplest graph model we can do here is just that a user submitted a story. Something like:

(:User)-[:SUBMITTED]->(:Story)

So let's do that.
The load code adds MERGE statements that put our Users and Stories into the graph. And it works!
[Image: resulting snapshot of the graph in #Neo4jAura]
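
A sketch of what a load with MERGE statements could look like for the (:User)-[:SUBMITTED]->(:Story) model. This is not the original query from the screenshot: the property names (`name`, `id`, `title`, `url`, `time`) are assumptions, and `by`, `id`, `title`, `url`, and `time` are the fields the Hacker News item payload is expected to carry.

```cypher
CALL apoc.load.json('https://hacker-news.firebaseio.com/v0/beststories.json')
YIELD value
UNWIND value.result AS storyId
WITH 'https://hacker-news.firebaseio.com/v0/item/' + toString(storyId) + '.json' AS storyUrl
CALL apoc.load.json(storyUrl)
YIELD value
// MERGE keeps reruns from creating duplicate users or stories
MERGE (u:User {name: value.by})
MERGE (s:Story {id: value.id})
  ON CREATE SET s.title = value.title,
                s.url = coalesce(value.url, 'REMOVED'),
                s.time = datetime({ epochSeconds: value.time })
MERGE (u)-[:SUBMITTED]->(s)
RETURN count(s) AS stories
```
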
Some slightly fancy things in that MERGE statement, explained:

(1) Hacker News reports times in UNIX time (seconds since the epoch), so

datetime({ epochSeconds: value.time })

turns that into a Neo4j datetime.
(2) It's possible for the URL to be missing from a story! So:

coalesce(value.url, 'REMOVED')

returns the first non-null value; if value.url is missing, you get a url of 'REMOVED'.
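
Both expressions can be tried on their own, without calling the API. The map below is a made-up stand-in for a story payload that has no url key.

```cypher
// Hypothetical payload: time is 0 seconds since the epoch, url is absent
WITH {id: 1, time: 0} AS value
RETURN datetime({ epochSeconds: value.time }) AS submitted,
       coalesce(value.url, 'REMOVED') AS url
```

Here `submitted` comes back as 1970-01-01T00:00:00Z, and `url` comes back as 'REMOVED' because reading a missing key from a map gives null.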
