Given the interest in PyScript and some of the common questions coming up, I figured I'd share a few of my slides that may answer them.
For context: each square is 1 million people. The gray squares represent the population of earth.
For those who don't want to do the math, it's about 0.3%.
(The number of people who can be said to "know" DS/ML/AI is much, much smaller - maybe a few million?)
The last twenty years of software development have gotten away with not giving a flip about security, really.
From the horror stories I've read on HN, Reddit, and other forums, it's clear that the entire industry shamelessly plows ahead despite horrible tech debt.
This level of blithe carelessness should give all AI & ML practitioners pause.
The lesson of the past is this: Business decision makers will gleefully roll forward with whatever garbage you threw at a wall that happens to stick just long enough for bonuses to be paid out.
Today's tech architectures are more and more being built by teams that are amalgams of multiple specialists: data scientist, data engineer, software dev, infra ops, PMs, ML researchers,... Lots of Franken-elves piecing together a beast that no single party is responsible for.
@widenka @RealSexyCyborg The long explanation is that Westerners are raised in a cultural tradition built on Enlightenment-era concepts of individualism and liberty as the paramount concerns. This has become even more pronounced in the post-war consumer society of the last 50 years.
@widenka @RealSexyCyborg Eastern cultures only started interacting with this philosophical mode of individualism about 100 years ago, with the advent of industrialization and labor economics.
So in China, India, and other heavily-socialized countries, ppl can’t just “do what they want”.
@widenka @RealSexyCyborg Americans are generally blind to the fact that our model of liberty and individual egocentrism is a privilege we earned through centuries of oppression, limitless resources, and the good luck to be bordered by oceans and nice Canadians & Mexicans.
The day is here—we’re excited to share findings on the 2020 State of Data Science!
Every year, we check in with the data science community with a survey to see what’s on their minds re: responsibilities, challenges, & processes.
The TL;DR from this year is… (1/5)
Hype is cooling down, but there’s still work to be done to help data science achieve business maturity.
Here are my top 3 takeaways:
1. Turning data into value is difficult. Data scientists report that almost half of their days are spent on data loading/cleansing. (2/5)
2. Data bias and privacy are top of mind. Yet, only 15% of surveyed universities include courses in ethics and 15% of data science teams are actively addressing issues of bias. (3/5)
My thesis is that since the value prop of ML/AI efforts is a complex-valued function ƒ(Software component, Data component), the value of ML/AI companies is extremely non-linear with respect to either partial derivative, ∂/∂Software or ∂/∂Data.
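Roughly formalized (my notation; the min/power shape below is just one illustrative choice, not a claim about the true form of ƒ):

```latex
% V: value of the ML/AI effort, S: software component, D: data component.
% The thesis only says V is highly non-linear in each input; the concrete
% min/power shape here is an assumed illustration, not the claimed form.
\[
  V = f(S, D), \qquad
  \frac{\partial f}{\partial S} \ \text{and} \ \frac{\partial f}{\partial D}
  \ \text{both far from constant,}
\]
\[
  \text{e.g. } f(S, D) = \min(S, D)^{k}, \quad k > 1 .
\]
```

Under a shape like that, value stays near zero until both components mature, then jumps - which is the "lightning has to strike twice" point below.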
The disruptive potential of an ML/AI startup depends a LOT on whether the hurdles to predictive success within a problem domain are primarily technological or organizational. Conway's Law applies to data systems even more than to software; and bad data arch can doom DS/ML/AI.
So a successful ML/AI startup requires lightning to strike TWICE. Virtually every ML/AI startup in B2B will suffer the organizational/GTM challenges of B2B software, IN ADDITION to needing to earn a bunch of early-stage revenue that looks an awful lot like services ("bad") ARR.
@hawkieowl @glyph Well, I think I understand what you're saying, but would be interested to learn more. Those libs provide intrinsically parallelized data structures, and encourage thinking vectorially. This may be less natural for those who have mostly programmed imperatively, (1/N)
@hawkieowl @glyph but it's actually a better way to express math/quantitative code. Explicit for-loops are a code smell in most cases when it comes to this stuff.
For the minority of use cases that do fall outside the "vectorized-is-better" space, things like @numba_jit do a great job. (2/N)
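To make the for-loop point concrete, here's a rough sketch (the toy RMS computation, array size, and function names are mine, not from the thread):

```python
import numpy as np

x = np.random.default_rng(0).standard_normal(1_000_000)

# Imperative style: an explicit Python-level loop over every element.
def rms_loop(values):
    total = 0.0
    for v in values:
        total += v * v
    return (total / len(values)) ** 0.5

# Vectorized style: the same math expressed on the whole array at once,
# so the inner loop runs in numpy's compiled code instead of Python.
def rms_vectorized(values):
    return float(np.sqrt(np.mean(values ** 2)))

assert np.isclose(rms_loop(x), rms_vectorized(x))

# For the minority of cases that don't vectorize cleanly, numba's @njit
# can compile the explicit loop (uncomment if numba is installed):
# from numba import njit
# rms_jitted = njit(rms_loop)
```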
@hawkieowl @glyph @numba_jit So I want to make sure I'm understanding you correctly: for you, does the "weirdness" of the numeric stack (from a dev experience/idiomatic usage perspective) arise because of its vectorial nature, or b/c of specific design choices in numpy, scipy, pandas, etc.?
@khinsen I think the vast majority of language development is too fixated on the execution of virtual machines, extensibility of the runtime, and expressiveness for higher-level modularity. These are 1980s concerns, and they remain unchanged; Java & Python are 30 yrs old now.
@khinsen There is an entirely different, under-appreciated branch of potential development centered around data transformation. This basically stopped with Excel. (Anything more complex got sucked into "visual programming" hinterlands.)
@khinsen (This is not the same thing as numerical algorithm development.)
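For a sense of what "programming as data transformation" looks like in today's tooling, a toy sketch (pandas and the sample table are my stand-ins; the thread names no specific library):

```python
import pandas as pd

# The whole "program" is a pipeline of reshaping steps, much like chained
# spreadsheet operations - no concern for VMs, runtimes, or modularity.
sales = pd.DataFrame({
    "region":  ["east", "east", "west", "west"],
    "month":   ["jan", "feb", "jan", "feb"],
    "revenue": [100, 120, 90, 130],
})

summary = (
    sales
    .pivot(index="region", columns="month", values="revenue")  # wide, Excel-style layout
    .assign(total=lambda df: df.sum(axis=1))                   # derived column
    .sort_values("total", ascending=False)
)
print(summary)
```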