Why did I do it and what did the students (and the computer) learn?
[THREAD]
A prominent example of such data are "Militarized Interstate Disputes" or MIDs
journals.sagepub.com/doi/10.1177/00…
tandfonline.com/doi/abs/10.108…
correlatesofwar.org/data-sets/MIDs
Well, 2,315 MIDs in version 4.3 of the data
correlatesofwar.org/data-sets/MIDs…
That is why @dmgibler and his team at @UofAlabama have been hard at work correcting and cleaning these data
dmgibler.people.ua.edu/mid-data.html
For instance, consider just the number of "events" recorded each day by the @gdeltproject
gdeltproject.org/data.html
But could computers help? I mean, they're dumb, but they work hard!
That's what led us to talk about Machine Learning
Instead, I love how @kozyrkov describes machine learning in this @hackernoon post
hackernoon.com/the-simplest-e…
Maybe they could do this by hand. But life would be easier if they could somehow tell a computer to "figure it out" for them!
1) Training data (i.e. lots and lots of examples of events where we know the MID level).
For that, we used all MIDs from 1816 to 1998.
The data came from here:
correlatesofwar.org/data-sets/MIDs…
For that, we used the MID data from 1999
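A rough sketch of that split, in Python. The rows below are invented toy records (not real MID data), and the field names `year`, `countries`, and `high_level` are assumptions made for illustration. Only the year cutoffs come from the thread.

```python
# Hypothetical toy records, invented for illustration. NOT real MID data.
events = [
    {"year": 1816, "countries": {"Country A", "Country B"}, "high_level": False},
    {"year": 1950, "countries": {"Country A", "Country C"}, "high_level": True},
    {"year": 1998, "countries": {"Country B", "Country C"}, "high_level": False},
    {"year": 1999, "countries": {"Country A", "Country B"}, "high_level": True},
    {"year": 1999, "countries": {"Country C", "Country D"}, "high_level": False},
    {"year": 2005, "countries": {"Country A", "Country D"}, "high_level": False},
]

TRAIN_START, TRAIN_END = 1816, 1998  # years used for training (from the thread)
CHECK_YEAR = 1999                    # year held back for checking (from the thread)

# The computer learns from the older events...
training_data = [e for e in events if TRAIN_START <= e["year"] <= TRAIN_END]
# ...and is then checked on events it has never seen.
checking_data = [e for e in events if e["year"] == CHECK_YEAR]

print(len(training_data), "training events,", len(checking_data), "checking events")
```

The point of holding back 1999 is that the computer gets graded on events it did not get to study.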
Even though the machine is "learning", it still needs to be told what it's looking at and how to "think" about it. Again, computers can learn, but they are dumb!
Yep, Russia is involved in A LOT of high level MIDs!
To keep it simple, we told the computer to assume that every event involving Russia was a high-level conflict.
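Here is roughly what that rule looks like as code. This is a minimal sketch, not the class's actual script: the function name `predict_high_level`, the field names, and the toy events are all made up for illustration (the events are not real MID data).

```python
# Hypothetical toy events, invented for illustration. NOT real MID data.
checking_data = [
    {"countries": {"Russia", "Country A"}, "high_level": True},
    {"countries": {"Russia", "Country B"}, "high_level": False},
    {"countries": {"Country A", "Country B"}, "high_level": False},
    {"countries": {"Country A", "Country C"}, "high_level": True},
]


def predict_high_level(event):
    """The 'dumb' rule: guess high level if and only if Russia is involved."""
    return "Russia" in event["countries"]


# Check the rule: count how often its guess matches what really happened.
correct = sum(predict_high_level(e) == e["high_level"] for e in checking_data)
accuracy = correct / len(checking_data)

print(f"The rule got {correct} of {len(checking_data)} right ({accuracy:.0%})")
```

On these made-up events the rule is wrong in both directions: it flags a Russia event that stayed low level, and it misses a high-level event without Russia.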
We discussed the possible lessons:
1) Maybe don't assume that ALL events involving Russia end in a high-level MID (maybe just some fraction)
Doing so would require
training -> checking -> training -> checking ... over and over again.
Again, the computer will work hard.
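One way to picture that loop, as a sketch under assumptions: the computer tries a guess for the fraction, checks how badly the guess misses, and keeps the best one. The outcomes below are invented (not real MID data), and the scoring rule (mean squared error, often called the Brier score) is a choice made for this example, not something the thread specifies.

```python
# Hypothetical outcomes for events involving Russia:
# 1 = ended as a high-level MID, 0 = did not. Invented numbers, NOT real data.
checking_outcomes = [1, 1, 0, 1, 0]


def brier_score(p, outcomes):
    """Mean squared gap between the guessed fraction p and what happened."""
    return sum((p - y) ** 2 for y in outcomes) / len(outcomes)


best_p, best_score = None, float("inf")

# Try a guess, check it, try the next guess, check it ... over and over.
for step in range(11):                          # guesses 0.0, 0.1, ..., 1.0
    p = step / 10
    score = brier_score(p, checking_outcomes)   # checking
    if score < best_score:                      # keep the guess that misses least
        best_p, best_score = p, score

print(f"Best guess: {best_p:.0%} of Russia events end in a high-level MID")
```

No contemplation involved: the computer just tries every candidate and keeps score.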
So machine learning isn't really "learning" in the sense of deeply contemplating the issue before making a decision.
It's more "brute force"
...and are no longer intimidated by the phrase “Machine Learning”
[END]