The claim by @EcoSenseNow that Google has removed him from their results for "founders of Greenpeace" raises a number of interesting (and epistemologically important) issues about the representation of facts in a knowledge graph. 1/31
Typically - and this is almost certainly the case here - the facts provided by the Google Knowledge Graph in response to a query weren't created by Google, but derived from other sources (most notably Wikipedia and Wikidata). 2/31
That is to say, in this case, the probability that any human at Google has reviewed the literature pertaining to who may be legitimately considered a founder of Greenpeace, made a determination, and populated the Knowledge Graph with that information is essentially zero. 3/31
So what do their sources say? Wikidata does not list Moore as a founder. wikidata.org/wiki/Q81307 4/31
Wikipedia, by contrast, provides rich information on the question of founders, enumerating the competing claims and providing citations. en.wikipedia.org/wiki/Greenpeace 5/31
There are technical differences here worth noting. To satisfy the query "who are the founders of Greenpeace", the Knowledge Graph needs to extract a triple (subject-predicate-object) here along the lines of "[Organization] founded by [Person]". 6/31
Wikidata provides that precisely: "Greenpeace founded by [Irving Stowe, Dorothy Stowe]". 7/31
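To make the triple idea concrete, here is a minimal sketch of how a "founded by" lookup might work over subject-predicate-object data. The `Triple` type and `objects_for` helper are hypothetical names invented for illustration; this is not how the Google Knowledge Graph or Wikidata is actually implemented, and the data simply mirrors the Wikidata statement quoted above.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object assertion."""
    subject: str
    predicate: str
    obj: str


# Illustrative data only: mirrors the Wikidata statement quoted in the thread.
TRIPLES = [
    Triple("Greenpeace", "founded by", "Irving Stowe"),
    Triple("Greenpeace", "founded by", "Dorothy Stowe"),
]


def objects_for(triples, subject, predicate):
    """Return every object asserted for (subject, predicate), in stored order."""
    return [
        t.obj
        for t in triples
        if t.subject == subject and t.predicate == predicate
    ]


founders = objects_for(TRIPLES, "Greenpeace", "founded by")
print(founders)
```

The point of the sketch is that structured data answers the query with a plain filter, whereas prose (a Wikipedia paragraph, a blog post) would first need a triple extracted from it.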
It's obviously more of a challenge for a machine to extract that same information from Wikipedia's paragraphs on the subject, and Wikipedia's structured data representation, DBpedia, has no "founded by"-type property. dbpedia.org/page/Greenpeace 8/31
What Moore asserts, of course, is that he *is* a founder, and he details his claim in a 2012 post (now only accessible on the Internet Archive) bit.ly/2Fp6Knd 9/31
The machine challenge of extracting "[Organization] founded by [Person]" from this source is, needless to say, similar to the challenge of deriving that triple from the Wikipedia article. 10/31
In any case we now have two competing assertions. Wikidata (baldly, being just data) asserts Moore was not a founder, as does (currently) Greenpeace, as recorded in Wikipedia. Moore asserts he was a founder, as recorded in Wikipedia and his 2012 post. 11/31
The $64K question for Google is seemingly "which assertion is factually correct?" But it's much more complicated than that. 12/31
To wit, a bigger underlying question is whether we should consider "facts" provided by the Google Knowledge Graph to be factually correct (and whether Google itself considers Graph-provided "facts" to be factually correct). 13/31
While that question may seem counter-intuitive, there is a good case to be made that the Google Knowledge Graph, as a web-scale semantic knowledge base, is operating under the Open World Assumption. bit.ly/2CsNL9m 14/31
As per the framing of that piece's author, @juansequeda, if the OWA applies to the Knowledge Graph then it operates under "the assumption that what is not known to be true is simply unknown". 15/31
In other words there's a sense in which "facts" provided by the Knowledge Graph may be regarded only as assertions. "Hey", says the Graph, "here's an assertion that seems to satisfy your query, but I don't know whether it's factually correct". 16/31
Which leads us to the real $64K questions (yes, there are two of them :) raised by this dispute: which specific assertion should the Graph return for a query, and how should that assertion be represented? 17/31
In the first question I say "specific assertion" because the way Google uses the Graph is to provide direct answers to queries - in this case, providing values for [X] in the equation "Greenpeace has founders [X]". 18/31
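One way to picture the difference between the Open World Assumption and its closed-world counterpart is a lookup that returns three possible answers instead of two. This is a rough sketch under stated assumptions: the function names and the use of `None` for "unknown" are choices made here for illustration, not anything documented about the Knowledge Graph.

```python
# Statements the (hypothetical) knowledge base holds as true.
KNOWN_TRUE = {
    ("Greenpeace", "founded by", "Irving Stowe"),
    ("Greenpeace", "founded by", "Dorothy Stowe"),
}


def closed_world(statement, known=KNOWN_TRUE):
    """Closed world: anything not recorded as true is treated as false."""
    return statement in known


def open_world(statement, known=KNOWN_TRUE):
    """Open world: anything not recorded as true is unknown (None), not false."""
    return True if statement in known else None


moore = ("Greenpeace", "founded by", "Patrick Moore")
print(closed_world(moore))  # False
print(open_world(moore))    # None
```

Under the open-world reading, a missing statement is a gap in the data rather than a denial.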
This is because Knowledge Panels and other Graph-provided query responses are, by design, almost always a list of values or property/value pairs. By design because this is an efficient response: think mobile, think voice. 19/31
The efficiency of this response then informs the limits of how a "fact" can be represented. Knowledge Panels don't do nuance. In a "founders of Greenpeace" type carousel, Google's UI doesn't account for "entities in dispute". Values are either present or absent. 20/31
So the Graph needs to lean on the trustworthiness of its sources in determining what assertions to make. So much so that Google has mulled over methods of improving on how they determine the trustworthiness of a source. bit.ly/1zRQtdG 21/31
Ultimately the provided assertion is "from somewhere." And this becomes problematic when the payload of a response (especially voice) doesn't include provenance information. 22/31
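A small sketch of what "leaning on source trustworthiness" and "losing provenance in the payload" could look like in code. Everything here is hypothetical: the `Assertion` type, the source names, the placeholder entities, and especially the trust scores are invented for illustration and say nothing about how Google actually scores or ranks sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    values: tuple
    source: str
    trust: float  # hypothetical 0..1 score, invented for this example


# Two competing assertions from placeholder sources.
ASSERTIONS = [
    Assertion("Org X", "founded by", ("Person A", "Person B"), "source-a", 0.9),
    Assertion("Org X", "founded by", ("Person A", "Person B", "Person C"), "source-b", 0.6),
]


def best_assertion(assertions, subject, predicate):
    """Pick the matching assertion from the most trusted source, or None."""
    candidates = [
        a for a in assertions
        if a.subject == subject and a.predicate == predicate
    ]
    return max(candidates, key=lambda a: a.trust, default=None)


def one_shot_answer(assertion):
    """The bare payload: values only, provenance dropped."""
    return ", ".join(assertion.values)


def answer_with_provenance(assertion):
    """The same payload, but saying where the assertion came from."""
    return f"{one_shot_answer(assertion)} (according to {assertion.source})"


best = best_assertion(ASSERTIONS, "Org X", "founded by")
print(one_shot_answer(best))
print(answer_with_provenance(best))
```

Note that the competing assertion from the less trusted source disappears entirely from both outputs, which is the "present or absent" behaviour described earlier in the thread.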
As @jamesvlahos details, the demands of providing a "one-shot answer" where, as here, "there are multiple legitimate perspectives" raise a number of epistemological questions and societal conundrums. bit.ly/2C1GVqY 23/31
All of this long thread to say, first, that whether or not Google "removed" Moore as a founder is deserving of some debate: it's probably more correct to say "the Google Knowledge Graph no longer lists Moore as a founder." 24/31
That's not to absolve Google of responsibility here (as Vlahos says, "Whether or not these companies wish to play the role of Fact-Checker to the World, they’re backing themselves into it."), but rather to put an asterisk next to "removal." 25/31
That is that in all likelihood there was no manual action taken by Google to "remove" Moore from this list, but rather that Google's founders' list was modified on the basis of what data their most trusted source(s) provided. 26/31
Second, that both knowledge aggregators (here Google) and knowledge users (here searchers) need to work at improving how we process these questions and answers. 27/31
This means, on one hand, putting pressure on those aggregators to provide better solutions around questions of provenance (and other aspects of a Graph-based response), especially as they pertain to obviously disputed facts. 28/31
On the other hand, it also means better educating technology users on these issues, and - as I've attempted to do here - dispelling the notion that "facts" returned by search engine knowledge graphs are typically minted by partisan knowledge managers. 29/31
By and large these "facts" are programmatically extracted from heterogeneous data sources. It doesn't mean that there aren't important issues concerning those sources and how these "facts" are represented in query responses. 30/31
But it does mean that if you peek behind the curtain what you'll normally find there is a computer running routines, and not a person making edits. 31/31
Thinking further on this (in part in light of comments), I now think the probability of human editing is much higher than zero, and may even be likely. That is, Google may act in response to events (e.g. Greenpeace's disavowal of Moore) when it's important to them that...
... Graph results are current (which is not to say the reasons for those edits, if they happened, were ideological and/or political). I hold to my main point, which is that Graph facts aren't in the vast majority of cases minted by Google, but there may be times where Google...
... *does* need to make a very specific determination, as in "do we list Moore as a Greenpeace founder or not?" As doge would say, wow - much minefield, so danger.
Ye olde big cop-out for Google here, and I may have seen this before, is to avoid controversy by throttling Graph results altogether and simply returning 10 blue links (which may or may not include a featured snippet, which is NOT a Graph result). Don't know if anyone...
... else who has been following this thread sees this change from yesterday but for me, anyway, I'm no longer getting a carousel for the query "who are the founders of Greenpeace" (and variations). :)
ADDENDUM and UPDATE - in which I reconsider the possibility of hand editing, and provide what is seemingly an update to the query result that started this whole thread.
UPDATE 2: What do you know, Snopes has a fact check on this - and as you'd expect the answer is equivocal (which brings us full circle about just how Google might best handle such claims) > FACT CHECK: Did Patrick Moore ... Co-Found Greenpeace? bit.ly/2UMDHz6
And, by the way, this is data that is literally designed to be readily digestible and understandable by search engines (fact check publishers provide fact check information with structured data markup, using schema.org/ClaimReview).
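For readers who haven't seen it, here is roughly what ClaimReview markup looks like, built as a Python dictionary and serialized to JSON-LD. The property names (`claimReviewed`, `reviewRating`, `alternateName`, and so on) come from the schema.org vocabulary, but every value below is a placeholder: the URL, publisher name, and rating label are invented for illustration and are not the contents of the Snopes fact check.

```python
import json

# Placeholder values throughout; only the schema.org property names are real.
claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "url": "https://example.org/fact-check/placeholder",
    "claimReviewed": "Patrick Moore co-founded Greenpeace.",
    "author": {
        "@type": "Organization",
        "name": "Example Fact Checker",
    },
    "reviewRating": {
        "@type": "Rating",
        # A textual verdict label; the real label is not reproduced here.
        "alternateName": "PLACEHOLDER RATING LABEL",
    },
}

# Serialize to the JSON-LD text a publisher would embed in the page.
markup = json.dumps(claim_review, indent=2)
print(markup)
```

Because the claim and the verdict sit in named fields, a search engine can read them directly instead of extracting them from paragraphs of prose.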
Subscribe to Aaron Bradley