The #julialang Twitter data network was supposed to be part of this lecture, but unfortunately I didn't have enough time -- so here's a thread about it.

How I built it: (1) take the #julialang tweets with more than 5 likes, (2) get the usernames, (3) find who they follow and build a network.
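A minimal sketch of those three steps, not the code from the repo. The input shapes (`tweets` as `(username, likes)` pairs, `following` as a dict of follow lists) and the function name are hypothetical, and keeping every followed account as a node is an assumption the thread doesn't confirm.

```python
def build_follow_network(tweets, following, min_likes=5):
    """Return (nodes, edges) of a directed follow network.

    tweets:    iterable of (username, likes) pairs (hypothetical input shape)
    following: dict mapping username -> iterable of accounts they follow
    """
    # Steps (1) and (2): authors of tweets with more than `min_likes` likes.
    authors = {user for user, likes in tweets if likes > min_likes}

    # Step (3): one directed edge (follower, followed) per follow relationship.
    edges = set()
    for user in authors:
        for followed in following.get(user, ()):
            if followed != user:  # ignore self-follows
                edges.add((user, followed))

    # Assumption: followed accounts become nodes even if they never tweeted.
    nodes = authors | {followed for _, followed in edges}
    return nodes, edges


tweets = [("a", 10), ("b", 6), ("c", 2), ("a", 7)]
following = {"a": ["b", "x"], "b": ["a"], "c": ["a"]}
nodes, edges = build_follow_network(tweets, following)
```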
When I first visualized the network, I noticed an apparent separation into clusters, so the next thing I did was color-code all the nodes based on whether the words "julia", "python", or "rlang"/"rstats" appear in their bios. The resulting figure is pretty amazing!
Now if you're like me, you'll probably wonder what's going on with the "two arms" branching out from the julia cluster. So here is an annotated figure with high-degree nodes... Fun observation: everyone I manually inspected in the first group (top in the figure) has a Japanese bio.
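The bio check can be sketched as a case-insensitive substring match. The label names and the choice to return every matching label (rather than picking one color per node) are assumptions here; the thread doesn't say how bios matching several keywords were handled.

```python
# Keywords taken from the thread; the label names are illustrative.
KEYWORDS = {
    "julia": ("julia",),
    "python": ("python",),
    "r": ("rlang", "rstats"),
}


def bio_labels(bio):
    """Return the set of labels whose keywords appear in the bio."""
    text = bio.lower()
    return {
        label
        for label, words in KEYWORDS.items()
        if any(word in text for word in words)
    }


labels = bio_labels("I love #JuliaLang and #rstats")
```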
So far, I haven't really given any *numbers*... one thing I was very curious about was the clustering coefficients. Here's what I found: the global CC was 0.43, but when I extracted the julia subgraph, the CC jumped to 0.7! Figure: markers are larger for nodes with local CC > 0.5.
Here is another local clustering coefficient figure where the marker size is proportional to the local clustering coefficient value.
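For reference, a small sketch of how local clustering coefficients can be computed, assuming the follow edges are treated as undirected. "Global CC" is taken here to mean the average of the local values, which is an assumption; the thread may have used a different definition (for example transitivity). The helper names are illustrative.

```python
from itertools import combinations


def to_undirected(edges):
    """Build an adjacency dict of sets, ignoring direction and self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


def subgraph(adj, keep):
    """Restrict the graph to the nodes in `keep` (e.g. the julia-labeled ones)."""
    keep = set(keep)
    return {n: adj[n] & keep for n in adj if n in keep}


adj = to_undirected([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
whole = average_clustering(adj)
triangle_only = average_clustering(subgraph(adj, {"a", "b", "c"}))
```

In this toy graph the triangle on its own has a higher average than the whole graph, which is the same kind of jump the thread reports for the julia subgraph.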
And last but not least... PageRank! Of course, I had to run PageRank on this network. Here is the PageRank visualization with node sizes proportional to the PageRank value... I guess it's not so surprising that a bunch of the bigger circles were purple. 💜
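A plain power-iteration sketch of PageRank on a directed follow graph, for readers who want to see the mechanics. The damping factor of 0.85, the fixed iteration count, and the edge direction (follower to followed) are assumptions, not settings taken from the thread or the repo.

```python
def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power-iteration PageRank; every edge endpoint must appear in `nodes`."""
    nodes = list(nodes)
    out_links = {n: [] for n in nodes}
    for u, v in edges:
        out_links[u].append(v)

    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no out-links is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Every other node splits its rank equally among the accounts it follows.
        for u in nodes:
            if out_links[u]:
                share = damping * rank[u] / len(out_links[u])
                for v in out_links[u]:
                    new_rank[v] += share
        rank = new_rank
    return rank


# Toy example: "a" and "b" both follow "c".
ranks = pagerank({"a", "b", "c"}, [("a", "c"), ("b", "c")])
```

With edges pointing from follower to followed, rank accumulates on accounts that many others follow.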
Finally, just some references:
- Code available here: github.com/nassarhuda/MIT…

- Visualization method used: GLANCE -- honestly I was very proud of the visualizations our method, GLANCE, produced (cc: @austinbenson @dgleich). Here's a link to the paper: cs.cornell.edu/~arb/papers/GL…
