What is the difference between a predictive problem and a causal inference problem? This is an essential distinction for data scientists, and even very smart people botch the answer. Two very smart authors did just that. Let me explain.
1/13 #DataScience #MachineLearning
They proposed 2 questions:
1. Should I hire more college graduates?
2. Should I subsidize college degrees for my employees?
2/13 #DataScience #MachineLearning
In the article, they said question 1 is a prediction problem and question 2 is a causal inference problem. In fact, both are causal inference problems because each asks the data scientist to prescribe a policy.
3/13 #DataScience #MachineLearning
By answering yes or no to either question, I tell the decision maker that I have evidentiary support for the decision.
4/13 #DataScience #MachineLearning
Rephrase question 1 to “Are college graduates better (as defined by some set of metrics) employees?” and the question is now a predictive one. As a data scientist, I can use correlation to support my assertions. What’s the difference?
5/13 #DataScience #MachineLearning
In the rephrased question, I must support a relationship between variables. In the original questions, I must support a relationship between a decision and an outcome. Those carry two distinctly different reliability requirements.
6/13 #DataScience #MachineLearning
Why? The relationship between variables implies my model accurately describes the data, and I have a confidence level that the description will hold in the future.
7/13 #DataScience #MachineLearning
The relationship between decision and outcome implies that my model is as accurate or more accurate than a person’s heuristics.
8/13 #DataScience #MachineLearning
The relationship between variables is me proposing a set of KPIs or metrics and claiming, “If you make your decision based on these, your decision quality will be high.”
9/13 #DataScience #MachineLearning
The relationship between decision and outcome is me claiming, “This decision is high quality.” To support that assertion, I must be confident that the model makes a better decision than a person using the best available KPIs.
10/13 #DataScience #MachineLearning
That’s rarely true with simple correlation, unless the number of variables involved in the decision is so high that people given that data make worse decisions than people who are not given it.
11/13 #DataScience #MachineLearning
Otherwise, I need to establish causal relationships to support the claim that the model understands the system better than a decision maker does.
12/13 #DataScience #MachineLearning
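To make the gap between the two kinds of evidence concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the hidden `ability` variable, the `TRUE_EFFECT` of 0.2, and the `simulate` helper are hypothetical and do not come from the article being discussed. It compares the performance gap you would observe between graduates and non-graduates with the gap you would get if degrees were handed out at random, which is closer to what a subsidy decision needs to know.

```python
import random
import statistics


def simulate(n, true_effect, randomize_degree, seed=0):
    """Return mean performance with a degree minus mean performance without."""
    rng = random.Random(seed)
    with_degree, without_degree = [], []
    for _ in range(n):
        # Hypothetical unobserved confounder that drives performance.
        ability = rng.gauss(0.0, 1.0)
        if randomize_degree:
            # Intervention: the degree is assigned by coin flip,
            # independent of ability.
            has_degree = rng.random() < 0.5
        else:
            # Observation: higher-ability people are more likely
            # to hold a degree.
            has_degree = ability + rng.gauss(0.0, 1.0) > 0.0
        performance = ability + true_effect * has_degree + rng.gauss(0.0, 1.0)
        (with_degree if has_degree else without_degree).append(performance)
    return statistics.mean(with_degree) - statistics.mean(without_degree)


TRUE_EFFECT = 0.2  # assumed causal effect of the degree itself

# Gap seen in historical data (a relationship between variables).
observed_gap = simulate(20_000, TRUE_EFFECT, randomize_degree=False)
# Gap produced by acting on the degree (decision and outcome).
intervention_gap = simulate(20_000, TRUE_EFFECT, randomize_degree=True)

print(f"observed gap:     {observed_gap:.2f}")
print(f"intervention gap: {intervention_gap:.2f}")
```

In this toy setup the observed gap is several times larger than the intervention gap, because most of it comes from `ability` rather than from the degree. The observed gap is a fair answer to "Are college graduates better employees?" but it overstates what a policy that changes who holds a degree would deliver.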
#LinkedIn is the Botox of #socialmedia. It's all fake. My timeline's filled with corporate propaganda and reposts. I feel like I'm in a library and someone will tell me to keep it down if I post what I'm really thinking.
1/7
"Proud to be joining Google!" No, you're proud of that paycheck and you want your old company who wouldn't spring for a raise to feel it.
"No promotion, huh? Not good enough for a raise? Funny, GOOGLE thought I was!!!!! HAHAHAHA!!!!" Full send. 2/7
"It was a tough decision to leave my old company." No, you loved every second of writing your resignation and sending it to your idiot boss. The video I want to see is of you writing up that resignation email with a long slow-motion shot of your face when you hit send. 3/7
MIT Sloan - “The survey also found that AI yields strategic benefits, but they mostly accrued to companies that use AI to explore new ways of creating value rather than cutting costs.” Let me explain why that's critical.
1/11 #DataScience #ArtificialIntelligence #Strategy
Translation: Our field is transitioning from cost savings to revenue generation. The business is looking for Data Scientists to lead the discovery of opportunities and deployment of new products.
2/11 #DataScience #ArtificialIntelligence #Strategy
“Those that used AI primarily to create new value were 2.5 times more likely to feel that AI is helping their company competitively compared with those that said they are using AI primarily to improve existing processes”
3/11 #DataScience #ArtificialIntelligence #Strategy
If a Data Scientist has a GitHub with 3 Python projects, you don't need to give them a technical interview. If they've been working as a Data Scientist for 3+ years, they don't need a take-home project. 1/9 #DataScience #MachineLearning #Hiring
Do they have a blog with 1 or 2 years' worth of posts on Machine Learning Engineering? Published research? A YouTube channel with tons of Data Science educational content? Significant open source contributions? 2/9 #DataScience #MachineLearning #Hiring
I get a better sense of a candidate's capabilities from those sources. In my experience, the generic methods have lower predictive value for employee performance. 3/9 #DataScience #MachineLearning #Hiring
Data Scientist Job Openings On LinkedIn:
March - 138K
Now - 134K
Hiring is slowing for mid- to junior-level roles. That's the first sign of tightening budgets, and more changes will come quickly. Let me explain what comes next.
1/14 #DataScience #MachineLearning #Leadership
Higher costs are compressing margins for businesses across industries. Revenue growth has stagnated. Both factors mean businesses must find ways to cut costs or they are in danger.
2/14 #DataScience #MachineLearning #Leadership
Missing revenue projections or lowering guidance for the rest of the year is a death sentence for share prices. The C-suite is measured by share price, so they're moving quickly to cut costs.
3/14 #DataScience #MachineLearning #Leadership
If the job description asks for a minimum of 5 years of experience, it needs to include an explanation of why 4 years isn’t enough.
2/11 #DataScience #MachineLearning #Hiring
After 2 rounds of interviews, the company needs to explain what additional information they expect to get from this round and why they didn’t get it during the last round.
3/11 #DataScience #MachineLearning #Hiring
Data Scientists looking for a new role and Recruiters looking for candidates speak 2 different languages. Miscommunication is the most common reason candidates disengage, drop out of the interview process, and reject offers. Why?
1/12 #DataScience #Recruiting #Hiring
Candidates eventually find out the role isn’t what they expected, and there's no way to keep them involved in the process after that.
2/12 #DataScience #Recruiting #Hiring
Explaining a role is a different conversation for a Machine Learning Engineer vs. a Data Engineer vs. an Applied Researcher vs. a Generalist Data Scientist vs. a Data Analyst.
3/12 #DataScience #Recruiting #Hiring