As you may know, I have been scraping & compiling tweets about Donald Trump weekly.
#Vader Sentiment Analysis classified the majority of this latest batch of tweets, based on the same keyword list, as POSITIVE:
pos 61608
neg 53446
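For scale, a quick back-of-the-envelope check of that split, using only the two counts above (tweets with any other label, if there were any, are not counted here):

```python
# Counts taken directly from the thread.
pos, neg = 61608, 53446

total = pos + neg
pos_share = pos / total

print(f"{pos_share:.1%} positive, {1 - pos_share:.1%} negative of {total:,} labelled tweets")
```

So the majority is real but fairly narrow: roughly 53.5% positive.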
This bucks the trend
The averages are interesting: a sort of inverted bell curve, where most tweets have a few retweets, views, etc., and then at the opposite end of the spectrum a minority have a TON of views, likes, etc., or none at all.
Average (mean)
Likes: 12
Retweets: 3
Replies: 1
Views: 2,828
The replies seem weirdly low, I'll double check that. Views are easily the data point with the highest volatility.
most replies: 1,334
most retweets: 21,321
most likes: 72,795
most views: 133,790,475
So it still looks like replies are the rarest interaction, maybe.
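The gap between the means and the maximums is what a heavy right tail tends to look like: a handful of viral tweets drag the mean far above what a typical tweet gets. A minimal sketch with made-up view counts (hypothetical numbers, not from the scraped data) comparing mean and median:

```python
import statistics

# Hypothetical view counts: mostly small, one viral outlier.
views = [1, 1, 3, 5, 8, 12, 20, 40, 150, 1_000_000]

mean_views = statistics.mean(views)
median_views = statistics.median(views)

print(f"mean:   {mean_views:,.0f}")
print(f"median: {median_views:,.0f}")
print(f"max:    {max(views):,}")
```

On data shaped like this, the median is probably a better "typical tweet" number than the mean.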
I was wondering how there were so many tweets that had 1 view, no likes, etc. - and I think the answer is deleted tweets :)
So 5.3% of the tweets came from verified users.
I thought something was wrong with my data collection techniques. No, but there are many accounts like this that all tweet *exactly* the same thing on the same day, etc.
Most popular locations?
United States 3172
USA 1746
#Florida 952
Washington, DC 878
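"United States" and "USA" are the same place typed two ways, since the location field is free text. A rough sketch of merging such variants before counting; the alias table is a hypothetical starting point, not something that was applied to the numbers above:

```python
from collections import Counter

# Counts as reported in the thread.
raw_counts = {
    "United States": 3172,
    "USA": 1746,
    "#Florida": 952,
    "Washington, DC": 878,
}

# Hypothetical alias table; a real one would need many more entries.
ALIASES = {"usa": "United States", "united states": "United States"}

def normalize(location):
    """Strip whitespace and a leading '#', then map known aliases to one name."""
    cleaned = location.strip().lstrip("#")
    return ALIASES.get(cleaned.lower(), cleaned)

merged = Counter()
for location, count in raw_counts.items():
    merged[normalize(location)] += count

for location, count in merged.most_common():
    print(location, count)
```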
And then there are accounts that quote tweet and / or reply the exact same thing repeatedly...
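One way to surface both patterns (many accounts posting identical text on the same day, and one account repeating itself) is to group tweets by day and normalized text. This is only a sketch: the rows, account names, dates, and the cutoff of 3 are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (account, day, text).
tweets = [
    ("acct_a", "2023-01-02", "Same slogan here"),
    ("acct_b", "2023-01-02", "same slogan here "),
    ("acct_c", "2023-01-02", "Same slogan here"),
    ("acct_d", "2023-01-02", "Something original"),
    ("acct_e", "2023-01-03", "Reply spam"),
    ("acct_e", "2023-01-03", "Reply spam"),
    ("acct_e", "2023-01-03", "Reply spam"),
]

MIN_COPIES = 3  # arbitrary cutoff, an assumption to tune

def normalize(text):
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(text.lower().split())

accounts_by_post = defaultdict(list)
for account, day, text in tweets:
    accounts_by_post[(day, normalize(text))].append(account)

# Identical text posted by several different accounts on one day.
copy_paste = {
    key: sorted(set(accounts))
    for key, accounts in accounts_by_post.items()
    if len(set(accounts)) >= MIN_COPIES
}

# A single account posting the same text over and over on one day.
repeaters = {
    key: accounts[0]
    for key, accounts in accounts_by_post.items()
    if len(set(accounts)) == 1 and len(accounts) >= MIN_COPIES
}

print(copy_paste)
print(repeaters)
```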
In case anyone wanted to remove all the emojis from a #pandas dataframe (this casts every column to str and drops all other non-ASCII characters too):
df = df.astype(str).apply(lambda x: x.str.encode('ascii', 'ignore').str.decode('ascii'))
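If accented letters or non-Latin text need to survive (the ASCII round-trip strips anything outside ASCII), a narrower filter is one option. The sketch below uses a regex over a few common emoji code point ranges; the ranges are an approximation I picked, not a complete emoji definition, and `strip_emoji` is a hypothetical helper name.

```python
import re

# Approximate emoji ranges (an assumption, not exhaustive):
# pictographs and flags, misc symbols and dingbats, misc arrows/stars,
# plus the variation selector and zero-width joiner used in emoji sequences.
EMOJI_RE = re.compile(
    "[\U0001F000-\U0001FAFF"
    "\u2600-\u27BF"
    "\u2B00-\u2BFF"
    "\uFE0F\u200D]"
)

def strip_emoji(text):
    """Remove characters in the ranges above, keep everything else."""
    return EMOJI_RE.sub("", text)

sample = "caf\u00e9 \U0001F680 ok"  # "café", a rocket emoji, "ok"

regex_result = strip_emoji(sample)
ascii_result = sample.encode("ascii", "ignore").decode("ascii")

print(repr(regex_result))  # keeps the accented e
print(repr(ascii_result))  # loses it
```

On a dataframe this could be applied per column with something like `df["text"] = df["text"].map(strip_emoji)`, assuming a column named `text` that holds only strings.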
I was wondering how good the Vader sentiment analysis was, as it was hard to tell from the 10,000 ft. view. But it's easy to see how good it is when you sort by it: