Taking @vgr’s challenge:

1 like = 1 opinion (actually, fact :) on #MachineLearning and the nature of knowledge.
(Not to mention the hype machine...)
All machine learning algorithms are biased; the only question is: do we know what the biases are, and do we care?
All statistical machine learning algorithms are essentially devices for interpolating between data points already seen. They can’t generalize to novel situations.
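A minimal sketch of this claim, using scikit-learn's MLPRegressor as a stand-in for any statistical learner: fit on inputs from [0, 1], and the model tracks the target inside that range while falling apart outside it.

```python
# Sketch: interpolation works, extrapolation doesn't.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=(500, 1))
y_train = np.sin(2 * np.pi * x_train).ravel()

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
model.fit(x_train, y_train)

x_inside = np.linspace(0.0, 1.0, 5).reshape(-1, 1)   # within the seen range
x_outside = np.linspace(1.5, 2.5, 5).reshape(-1, 1)  # a novel range
for name, xs in [("inside", x_inside), ("outside", x_outside)]:
    err = np.abs(model.predict(xs) - np.sin(2 * np.pi * xs).ravel()).mean()
    print(f"mean abs error {name} the training range: {err:.3f}")
```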
85% of the progress in machine learning over the past 15 years is due to increases in hardware performance and availability.
The true science of machine learning is the science of figuring out which biases are needed to learn for which tasks.
Nobody knows how to build systems with common sense. We don’t even have the equivalent of alchemical theory about this yet.
Knowledge = justified true belief.

But justification requires communication, truth can be a matter of degree, and belief is a matter of the causal structure of an agent, not sentences in its head.
#MachineLearning on one foot: Use an appropriate representation and bias to update your priors from the data. All else is commentary. Now go and learn.
Yes, Bayesianism is correct.
So is minimum description length.
12. All good theoretical frameworks for machine learning are equivalent to Bayes, or approximately so.
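One way to see 12 in miniature (a toy sketch, with made-up coin-bias hypotheses): the hypothesis that maximizes prior × likelihood is exactly the one that minimizes the two-part code length -log2 P(h) - log2 P(D|h).

```python
# Toy Bayes/MDL equivalence: Bayesian score vs. description length in bits.
import math

data = [1, 1, 0, 1, 1, 1, 0, 1]              # coin flips, 1 = heads
hypotheses = {0.5: 0.6, 0.7: 0.3, 0.9: 0.1}  # coin bias -> prior probability

def likelihood(bias, flips):
    return math.prod(bias if f else 1 - bias for f in flips)

for bias, prior in hypotheses.items():
    score = prior * likelihood(bias, data)                # Bayes, unnormalized
    bits = -math.log2(prior) - math.log2(likelihood(bias, data))  # MDL
    print(f"bias={bias}: posterior score={score:.5f}, code length={bits:.2f} bits")
# The hypothesis with the highest Bayesian score is exactly the one with
# the shortest two-part code, since -log2 is monotone decreasing.
```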
13. Connectionist methods will need to either incorporate or simulate symbolic processing to get beyond perception/response tasks.
14. We will need fundamentally new conceptual breakthroughs to get beyond current "interpolative" #ArtificialIntelligence. These will be very different from anything currently conceived of, and will not achieve #SOTA on any known tasks for some time.
15. SOTA- and publication-chasing are bad for science and bad for scientists.
16. Knowledge (such as it is) inheres in the system as a whole, not any particular representations or algorithms.
17. A "brain in a vat" knows nothing, as it is causally connected to nothing.
18. Imagine what we could accomplish using GOFAI with the knowledge-base equivalent of the computing power we can now devote to backpropagation learning! Incredible!
19.
#Statistics is about models.
#DataMining is about patterns.
#MachineLearning is about prediction.
#DataScience is about the data.

Can you figure out what the elephant really is?
20. Understanding is a triad:

        Interpretation
         /         \
    Action ------- Explanation
21. Replace the phrases "artificial intelligence" and "machine learning" in any news article with the phrase "computer program", remove the phrase "learns like a person/baby/brain", and see if the achievement still seems as cool or impressive.
22. Then read the original research article to see what was actually accomplished.
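For the mechanically inclined, 21 is nearly a one-liner. A toy sketch (the phrase lists are just the ones above):

```python
# Sketch of the de-hyping heuristic in 21.
import re

def dehype(text: str) -> str:
    text = re.sub(r"artificial intelligence|machine learning",
                  "computer program", text, flags=re.IGNORECASE)
    text = re.sub(r"learns like a (person|baby|brain)", "", text,
                  flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", text).strip()

print(dehype("This artificial intelligence learns like a baby to spot cats."))
# -> "This computer program to spot cats."  Less impressive already.
```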
23. Language learning is not just machine learning applied to sequences. Nor is automated genomics. Nor is time series analysis. Ad infinitum.
24. You can get 80-plus percent of the possible accuracy by applying advanced machine learning to a problem without knowing anything about the domain. You can also do that using logistic regression.
25. Does deep learning solve problem X? First, compare it to logistic regression and naïve Bayes. Then you might get a clue.

(If the authors didn’t, be suspicious, very suspicious.)
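A minimal baseline harness in the spirit of 24 and 25 (the dataset here is a stand-in; substitute your own split):

```python
# Before trusting a fancy model's number, see what the cheap baselines get.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [("logistic regression", LogisticRegression(max_iter=5000)),
                  ("naive Bayes", GaussianNB())]:
    clf.fit(X_tr, y_tr)
    print(f"{name}: {clf.score(X_te, y_te):.3f}")
# If the deep model only edges these out, the headline is mostly baseline.
```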
26. As a field, #MachineLearning is stuck in a local minimum. Not all learning is function approximation, and I warrant most of the really interesting kinds of learning are not. Let’s go back to exploring the full space of learning tasks and methods.
Bonus.

STOP CHASING SOTA!

</rant>
27. #MachineLearning is the science of finding the right bias for the problem. So… KNOW YOUR DOMAIN!
28. If your fancy #MachineLearning system gets 99.5% accuracy (wow!!), either:

a. You have a bug in your evaluation procedure, or

b. Logistic regression will get 99.3% accuracy.
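A sketch of one common version of bug (a), assuming your splits live in pandas DataFrames: the same rows leaking into both train and test.

```python
# Quick check for train/test leakage via verbatim duplicate rows.
import pandas as pd

def split_overlap(train: pd.DataFrame, test: pd.DataFrame) -> int:
    """Count test rows that also appear verbatim in the training set."""
    return len(test.merge(train.drop_duplicates(), how="inner"))

# Example with a deliberately leaky split:
train = pd.DataFrame({"x1": [1, 2, 3, 4], "y": [0, 0, 1, 1]})
test = pd.DataFrame({"x1": [3, 5], "y": [1, 0]})
print(split_overlap(train, test), "leaked rows")  # -> 1
```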
29. #DataScience is sensemaking.

Thus, if you create a black box model with perfect prediction accuracy, you have FAILED.
30. For any #MachineLearning or #ArtificialIntelligence application, the “system” includes the people that use it, and the organizational matrix they are embedded in. Any analysis that does not account for that is woefully inadequate.
31. "Neural networks" are not brain-like. Unless you take your glasses off and squint. After a couple of beers.
32. The problems of "bias" and "fairness" in #MachineLearning are mainly problems of specifying implicit assumptions, and have no purely technical solutions. The real issue is a version of the old "is/ought" conundrum.
33. Most “impressive” accomplishments of #MachineLearning for #NaturalLanguageProcessing are advanced applications of the Eliza Effect. (I’m looking at you, GPT-2...)
34. Don’t be afraid of the algorithms getting too smart, be afraid of people giving them too much power before they do.
35. You know something if you can do something with it. That might involve action or communication, but beware: It is very easy to convince insufficiently critical observers that you know something, even inadvertently. Even yourself.
The first principle is that you must not fool yourself – and you are the easiest person to fool.

--Richard Feynman
36. Your #MachineLearning algorithm works - congratulations! Excellent accuracy on out-of-sample data - wonderful!

But do you know if it learned what you wanted it to learn? Does it recognize stop signs, or large-enough-red-regions-with-certain-specific-other-colors-nearby? Hm.
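One cheap way to probe this (a sketch, assuming a float image in [0, 1] and any callable `model` mapping an image to class probabilities): occlude patches and watch where the confidence drops.

```python
# Occlusion probe: if confidence only drops when the patch covers the red
# region, not the lettering or the octagon, you have your answer.
import numpy as np

def occlusion_map(model, image: np.ndarray, target: int, patch: int = 16):
    h, w = image.shape[:2]
    base = model(image)[target]
    drops = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.5  # gray patch
            drops[i // patch, j // patch] = base - model(occluded)[target]
    return drops  # large drops mark regions the prediction depends on
```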
37. What statistical assumptions does your method make? More importantly, what assumptions does your evaluation procedure make? And do they match reality? (Answer: No.)

The proof of the model is in the eating.
38. Statistical #MachineLearning is algorithmic demagoguery - it is winner-take-all for (often subtle) patterns with a slight majority (plurality).

That is its power, and its danger.
39. As a general rule on #MachineLearning, representations matter more than algorithms.
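The classic illustration: XOR is out of reach for a linear model on raw coordinates, and trivial once the representation includes the product feature. A sketch:

```python
# Same algorithm, two representations.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

raw = LogisticRegression(C=100.0).fit(X, y)
print("raw features:", raw.score(X, y))  # <= 0.75: no line separates XOR

X_rep = np.hstack([X, X[:, :1] * X[:, 1:]])  # add the x1*x2 feature
rep = LogisticRegression(C=100.0).fit(X_rep, y)
print("with x1*x2:", rep.score(X_rep, y))  # 1.0 once linearly separable
```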
40. The proper function to optimize in #MachineLearning is task-dependent utility, even though this is almost never done.
41. The results of a single study, however rigorous, never generalize on their own. Validity depends upon statistical assumptions, and until you replicate, you don’t know whether those assumptions match reality.
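A small sketch of why one number from one split is fragile: rerun the same pipeline over many random splits and look at the spread (the dataset is again a stand-in).

```python
# A single evaluation is one draw from a distribution of possible results.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
scores = []
for seed in range(30):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=seed)
    scores.append(LogisticRegression(max_iter=5000)
                  .fit(X_tr, y_tr).score(X_te, y_te))
print(f"accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
# Any single split could have landed high or low; replication is what
# tells you whether the assumptions behind one result match reality.
```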
42. Forty-two.
44. Unless your #MachineLearning team includes both #ML and domain expertise, don't trust the results.

IOW, don't just use out-of-the-box machine learning "solutions". They aren't.