Sean J. Taylor
Building @MotifAnalytics. Formerly @Lyft and @Facebook. Keywords: Experiments, Causal Inference, Statistics, Machine Learning, Economics.
Oct 11, 2022 8 tweets 4 min read
Professional news! I’ve joined @MotifAnalytics as co-founder and chief scientist. Building a company and a product is a new challenge for me. Just a few weeks in I can tell it’s going to be a ton of fun.

Thanks to @mikpanko and Theron Ji for letting me join their awesome team! I wrote my first post about what we’re doing at Motif here: motifanalytics.medium.com/bringing-more-…
I encourage you to read it, and please give feedback; I know people are going to have opinions about it :D

If you just want the TL;DR, here's a short thread.
May 27, 2022 4 tweets 1 min read
It's 2022, language models can generate infinite streams of text that is indistinguishable from sophisticated human writing, and somehow autocorrect still sucks ass. 2015:
May 26, 2022 4 tweets 2 min read
I used to think datadriven.club was @mcfunley's best talk but now I think it's a tie with: boringtechnology.club

"Choose Boring Technology" generalizes beyond software eng. In all of my slow-moving projects, I tried to combine too many novel ideas together too quickly. @mcfunley A long time ago I was a novice web developer building a Yelp-like website. Every Python web framework launched with a shiny demo, causing me to decide to rewrite everything. I switched to a NoSQL database. When AppEngine launched, I migrated to that. No one ever used the site.
Aug 26, 2021 4 tweets 2 min read
Being able to hire myself on @Upwork is going to unlock a lot of productivity for me. There's a market solution for this.
Aug 25, 2021 6 tweets 1 min read
Probably the most important skill you can have is the ability to get a big group of people to agree to work on the same thing for a prolonged period of time. The second most important skill is coordinating the effort of everyone you convinced so that it scales well with the number of people.
Jul 1, 2021 6 tweets 2 min read
This post rips Prophet (a forecasting package I helped create) to shreds, and I agree with most of it 🥲 I always suspected the positive feedback was mostly from folks who'd had good results; conveniently, the author has condensed many bad ones into one place. microprediction.com/blog/prophet

It's really freakin' hard to make statistical software that generalizes across many problems. Forecasting (extrapolation) is among the hardest statistical problems. I don't think anyone who's seen me present Prophet would think I've misrepresented these facts.
Feb 20, 2021 16 tweets 4 min read
I really liked Paul Graham's recent essay "What I Worked On" [1], so I'm going to write my version in thread form. It's obviously incomplete.

[1] paulgraham.com/worked.html

12 yo: I work an informal summer job, helping a friend of my parents assemble generic PCs and install software (load/eject a bunch of floppies and babysit). I learned how to not give up in despair when computers don't do what you want.
Jul 21, 2020 7 tweets 2 min read
Short thread on GPT-3.

I haven't worked on text models in a long time because (TBH) I find them boring. I had been ignoring progress in that space because you could kind of see where it was heading. I don't feel *that* surprised by GPT-3, but it illustrates some useful ideas. To me, the big deal is challenging the current status quo of many specialized single-task models with one general multi-task model. Expensive, pre-trained embeddings are common at large companies, but they're mostly used as features for specific learning tasks, and multi-task models cover only a small number of tasks.
Jun 5, 2020 7 tweets 2 min read
I'm procrastinating tonight, so I'll share a quick management tool I use. It's close to the end of H1, so performance reviews are coming. I tell my reports: "Your work is going to be distilled into a story; please help me tell a good one so I can represent your work well." A good story must be easy to understand and compelling. At all times you should think about what story you'll tell about your work. It helps you do good work, and it helps me (your manager) get you the credit you deserve. A story has three parts: a beginning, a middle, and an end.
Mar 28, 2020 13 tweets 3 min read
I think I had a tough time communicating with @yudapearl today. It's worth sharing where I think we ended up misunderstanding each other. I don't think he's likely to agree with me, but it's useful for me to articulate it here.

Here’s the seed tweet: I shared the Meng paper because it’s a nice discussion of how greater sample size doesn’t solve estimation problems. This is part of a strong opinion I have that collecting adequate data is the key challenge in most empirical problems. Some people will not agree with this.
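
To make the sample-size point concrete, here is a toy simulation of my own (not from the thread or the paper): when inclusion in the sample is correlated with the outcome, collecting more data shrinks the variance of the estimate but not its bias.

```python
# Toy illustration (not from the thread): non-random selection biases the mean
# estimate by roughly the same amount no matter how large the sample gets.
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(0.0, 1.0, 1_000_000)
true_mean = population.mean()

# People with larger outcomes are more likely to end up in the sample.
base_inclusion = 1.0 / (1.0 + np.exp(-population))

for scale in [0.01, 0.1, 0.5]:
    include = rng.random(population.size) < base_inclusion * scale
    sample = population[include]
    print(f"n={sample.size:>7,}  estimate={sample.mean():+.3f}  true={true_mean:+.3f}")
```

Every row shows roughly the same bias; only the sample size changes, which is the sense in which more data doesn't rescue a bad collection mechanism.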
Oct 19, 2019 27 tweets 10 min read
I think this is an interesting topic, but I found this visualization hard to follow (no surprise if you've been reading my complaints about animated plots).

I have nothing to do tonight so I'm going to try to re-visualize this data. Starting a THREAD I'll keep updated as I go. The original data is from the ACS. Nathan used a tool called IPUMS to download the data set: usa.ipums.org/usa/

Looks like there's a variable called TRANTIME, which is "Travel time to work." The map uses PUMAs (Public Use Microdata Areas) as the geography; each one contains roughly 100K people.
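
For reference, here's a minimal pandas sketch of the kind of summary I mean. It assumes an IPUMS USA extract saved as a CSV with TRANTIME, PUMA, STATEFIP, and PERWT (person weight) columns; the file name and exact workflow are my assumptions, not what Nathan used.

```python
# Sketch: person-weighted mean travel time to work (minutes) by PUMA from an
# IPUMS extract. The CSV path and column choices are assumptions for illustration.
import pandas as pd

acs = pd.read_csv("acs_extract.csv", usecols=["STATEFIP", "PUMA", "TRANTIME", "PERWT"])

commute = (
    acs.groupby(["STATEFIP", "PUMA"])
       .apply(lambda g: (g["TRANTIME"] * g["PERWT"]).sum() / g["PERWT"].sum())
       .rename("mean_trantime_minutes")
       .reset_index()
)
print(commute.sort_values("mean_trantime_minutes", ascending=False).head())
```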
Sep 6, 2019 8 tweets 3 min read
This is a pretty short and unpolished <thread> on launch criteria for experiments. Hoping for feedback!

Background: one heuristic people use to decide whether to "ship" in an A/B test setting is p-value < 0.05 (or maybe 0.01). How important is statistical significance for maximizing expected value? I simulated 10,000 A/B tests with effects drawn from Laplace(0, 0.05) (so most effects are close to zero), Normal(0, 1) noise, and N = 2000. I'm going to ignore the costs of "shipping" and assume effects are additive, both huge assumptions. The distribution of effects is a sharply peaked Laplace centered at zero.
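
A minimal sketch of that simulation, under two assumptions of mine that the tweet doesn't pin down (N = 2000 means 2,000 users per arm, and each test is evaluated with a two-sample t-test):

```python
# Sketch of the simulation described above: 10,000 A/B tests, true effects from
# Laplace(0, 0.05), unit-level Normal(0, 1) noise, 2,000 users per arm (assumed).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_tests, n_per_arm = 10_000, 2_000
effects = rng.laplace(0.0, 0.05, size=n_tests)  # most true effects are near zero

p_values = np.empty(n_tests)
for i, effect in enumerate(effects):
    control = rng.normal(0.0, 1.0, n_per_arm)
    treatment = rng.normal(effect, 1.0, n_per_arm)
    p_values[i] = stats.ttest_ind(treatment, control).pvalue

# Compare a "ship if p < 0.05" rule to shipping everything (effects are additive,
# shipping is free -- the same huge assumptions as above).
for rule, mask in [("p < 0.05", p_values < 0.05), ("ship all", np.ones(n_tests, bool))]:
    print(f"{rule}: ship {mask.mean():.1%} of tests, total effect {effects[mask].sum():+.2f}")
```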
Apr 30, 2019 17 tweets 6 min read
📈Long thread on how Prophet works📈

- Instead of sharing slides I'm transcribing my QCon.AI talk to tweets with lots of GIFs :)
- Thanks to @_bletham_ for all his help.

Prophet motivation:

- there are many forecasting tasks at companies
- they are not glamorous problems and most people aren't trained well to tackle them
- 80% of these applications can be handled by a relatively simple model that is easy to use
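
To give a sense of the "easy to use" part, here is a minimal usage sketch (my example, not from the talk; the data file is a placeholder, though Prophet does expect a frame with ds and y columns):

```python
# Minimal Prophet usage sketch: fit a daily history and forecast a year ahead.
# "daily_metric.csv" stands in for any history with ds (date) and y (value) columns.
import pandas as pd
from prophet import Prophet

history = pd.read_csv("daily_metric.csv")  # columns: ds, y

m = Prophet()                  # piecewise trend + seasonality defaults
m.fit(history)

future = m.make_future_dataframe(periods=365)   # extend 365 days past the history
forecast = m.predict(future)                    # yhat, yhat_lower, yhat_upper, trend, ...
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```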
Jan 18, 2019 4 tweets 2 min read
2nd and Short, a litmus test for NFL coaches [a short thread]
- On second and short, an NFL coach can call a run play or a pass play.
- Run plays are better for getting first downs and pass plays are better for getting yards. A coach's tendency in this game situation tells us a lot about what they're optimizing for. (Credit to @bburkeESPN for this idea!) Since the goal of football isn't to produce first downs but to score, you might expect a pass rate above 50%. Most coaches don't pass that often unless they're losing and it's late in the game.
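
As a sketch of how you could run the litmus test yourself (my example, not the original analysis): compute each offense's pass rate on 2nd-and-short from play-by-play data. The file and column names below are assumptions.

```python
# Sketch: pass rate on 2nd-and-short by offense. Column names (down, ydstogo,
# play_type, posteam) mirror common play-by-play exports and are assumed here.
import pandas as pd

pbp = pd.read_csv("play_by_play.csv")

second_and_short = pbp[
    (pbp["down"] == 2)
    & (pbp["ydstogo"] <= 2)
    & (pbp["play_type"].isin(["run", "pass"]))
]

pass_rate = (
    second_and_short.groupby("posteam")["play_type"]
    .apply(lambda s: (s == "pass").mean())
    .sort_values(ascending=False)
)
print(pass_rate.head(10))  # teams near or above 0.5 look like they're optimizing to score
```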
Dec 14, 2018 7 tweets 2 min read
A couple of days ago another team asked me to speak about Bayesian data analysis. Instead of doing a nuts-and-bolts walkthrough of how to fit and use Bayesian models, I decided to describe "Bayesian analysis: The Good Parts". <potentially controversial thread>

Good Part 1: Explicitly modeling your data generating process

Bayesian analysis means writing down complete models that could have plausibly generated your data. This forces you to think carefully about your assumptions, which are often implicit in other methods.
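
As a concrete (and deliberately tiny) example of writing the data generating process down explicitly, here is a sketch in PyMC; the conversion counts are made up, and the Beta-Binomial choice is just one plausible model, not a prescription.

```python
# Sketch: an explicit generative model for conversion data in PyMC.
# The observed numbers are hypothetical; the point is that the model spells out
# exactly how the data could have been generated (prior + likelihood).
import pymc as pm

conversions, trials = 42, 1_000  # made-up data

with pm.Model():
    rate = pm.Beta("rate", alpha=1, beta=1)                      # prior on the conversion rate
    pm.Binomial("obs", n=trials, p=rate, observed=conversions)   # how the data arise given the rate
    idata = pm.sample(1_000, tune=1_000, chains=2, random_seed=0)

print(float(idata.posterior["rate"].mean()))  # posterior mean conversion rate
```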
Nov 23, 2018 6 tweets 3 min read
What are the most interesting parts of building a ML classifier?
1. feature engineering
2. label engineering
3. experimenting with different models
4. hyperparameter tuning
5. summarizing/explaining model performance
6. automating training and deployment
7. <something I missed>

I have my own opinions about this that I'll share later, but I don't want to influence responses.