This post rips Prophet (a forecasting package I helped create) to shreds and I agree with most of it🥲 I always suspected the positive feedback was mostly from folks who’d had good results—conveniently the author has condensed many bad ones into one place. microprediction.com/blog/prophet
It’s really freakin hard to make statistical software that generalizes across many problems. Forecasting (extrapolation) is among the hardest statistical problems. I don’t think anyone who’s seen me present Prophet would think I’ve misrepresented these facts.
The website might not do the best job of explaining all the limitations and it’s honestly been surprising to me how much sophistication people attribute to the underlying model. I hope a little cold water will help! There are often better approaches and packages out there.
If I could build it again, I’d start with automating the evaluation of forecasts. It’s silly to build models if you’re not willing to commit to an evaluation procedure. I’d also probably remove most of the automation of the modeling. People should explicitly make these choices.
The lesson here is important and underrated: models and packages are just tools. We attribute magical powers to them, as if their creators have somehow anticipated all the idiosyncrasies of your particular problem. It’s unlikely they have and there’s no substitute for evaluation.
I haven't worked on text models in a long time, because (TBH) I find them boring. I had been ignoring progress in that space because you could kind of see where it was heading. I don't feel *that* surprised by GPT-3 but it illustrates some useful ideas.
To me, what's big is challenging the current status quo of many specialized single-task models with one general multi-task model. Expensive, pre-trained embeddings are common at large cos, but mostly used as features for specific learning tasks. Multi-task models have a small # of tasks.
As @sh_reya points out, the big challenge becomes "how do you explain to the model what task it should be working on?" There's probably a large design space here, and it may require an entirely new "meta-query" language. It's also challenging to formally evaluate a model like this, and hard to quantify its value.
I'm procrastinating tonight so I'll share a quick management tool I use. It's close to the end of H1 so performance reviews are coming. I tell this to my reports: "Your work is going to be distilled into a story, please help me tell a good one so I can represent your work well"
A good story must be easy to understand and compelling. At all times you should think about what story you'll tell about your work. It helps you do good work and it helps me (your manager) get you the credit you deserve. A story has three parts: a beginning, a middle, and an end.
Beginning of the story:
- Your work is well motivated. It addresses a clear need that you were smart to identify.
- Help me by being deliberate with project choice, finding good opportunities, and not chasing shiny objects. Generate buy-in and excitement before starting.
I think I had a tough time communicating with @yudapearl today. It’s worth sharing where I think we ended up misunderstanding each other. I don’t think he is likely to agree with me, but it's useful for me to articulate here.
I shared the Meng paper because it’s a nice discussion of how greater sample size doesn’t solve estimation problems. This is part of a strong opinion I have that collecting adequate data is the key challenge in most empirical problems. Some people will not agree with this.
Most folks thought I was talking about causal inference from the start. I was actually talking about the tool of *randomization*. IMO, Meng’s paper is an example of measuring the value of randomization for an estimation problem. Randomness is a complement to sample size.
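To make "randomness is a complement to sample size" concrete, here's a toy simulation of my own (not from Meng's paper, and the numbers are arbitrary): a huge self-selected sample estimates a population mean worse than a tiny random one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Population of 1M people; the outcome is correlated with the propensity to respond.
N = 1_000_000
propensity = rng.uniform(0, 1, N)
outcome = propensity + rng.normal(0, 1, N)
true_mean = outcome.mean()

# "Big data": ~500K self-selected responses, where inclusion depends on the outcome.
selected = rng.uniform(0, 1, N) < propensity
big_biased_est = outcome[selected].mean()

# Small but randomized: a simple random sample of 1,000 people.
srs = rng.choice(N, size=1_000, replace=False)
small_random_est = outcome[srs].mean()

print(true_mean, big_biased_est, small_random_est)
# The huge self-selected sample is badly biased; the tiny random sample is not.
```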
I think this is an interesting topic but found this visualization hard to follow (no surprise if you've been reading my complaints about animated plots).
I have nothing to do tonight so I'm going to try to re-visualize this data. Starting a THREAD I'll keep updated as I go.
The original data is from the ACS. Nathan used a tool called IPUMS to download the data set: usa.ipums.org/usa/
Looks like there's a variable called TRANTIME that is "Travel time to work." The map uses PUMA as the geography, which are areas with ~100K people each.
IPUMS is pretty annoying to use. You need an account and you create a dataset to add to your "data cart"(!!!). But I was able to download a file with the 2017 ACS responses for TRANTIME, along with PUMA, and STATEFIP. The latter two fields uniquely identify the geographic region.
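Sketching how I'd load the extract (assuming I requested a CSV; the file name is a placeholder for whatever IPUMS assigns, and using PERWT person weights is my assumption, the other columns are the ones named above):

```python
import pandas as pd

# Hypothetical extract file name; STATEFIP, PUMA, TRANTIME come from the
# variable selection above, PERWT (person weight) is an assumed extra.
acs = pd.read_csv("usa_00001.csv.gz", usecols=["STATEFIP", "PUMA", "TRANTIME", "PERWT"])

# Person-weighted average commute time per PUMA.
acs["weighted_time"] = acs["TRANTIME"] * acs["PERWT"]
grouped = acs.groupby(["STATEFIP", "PUMA"])
commute = (
    (grouped["weighted_time"].sum() / grouped["PERWT"].sum())
    .rename("mean_trantime")
    .reset_index()
)
```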
This is a pretty short and unpolished <thread> on launch criteria for experiments. Hoping for feedback!
Background: one heuristic people use to decide to "ship" in an A/B test setting is p-value < 0.05 (or maybe 0.01). How important is "stat sig" for maximizing expected value?
I simulated 10,000 A/B tests with effects drawn from Laplace(0, 0.05) (most effects are close to zero) with Normal(0,1) noise and N=2000. I'm going to ignore costs of "shipping" and assume effects are additive, both huge assumptions. Here's the distribution of effects:
Since it's simulated data, I know the true effects. I order the experiments left to right by one-sided p-value (H0: effect <= 0). The p < 0.05 criterion would catch a lot of good tests, but ignore a lot of other positive ones. We have high precision but low recall.
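Here's roughly how to set up that simulation (a sketch: I'm assuming N=2000 per arm and modeling each test as a simple difference in means, which is an interpretation of the setup above rather than the exact code I ran):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_tests, n = 10_000, 2_000

# True effects: mostly near zero, drawn from Laplace(0, 0.05).
effects = rng.laplace(0, 0.05, n_tests)

# Each test: treatment and control arms with Normal(0, 1) noise, n units per arm.
control = rng.normal(0.0, 1.0, (n_tests, n))
treatment = rng.normal(effects[:, None], 1.0, (n_tests, n))
diff = treatment.mean(axis=1) - control.mean(axis=1)
se = np.sqrt(2.0 / n)  # standard error of a difference in means with unit-variance arms

# One-sided p-value for H0: effect <= 0, then the p < 0.05 ship rule.
pvals = stats.norm.sf(diff / se)
ship = pvals < 0.05

print("precision:", (effects[ship] > 0).mean())  # shipped tests that truly help
print("recall:", ship[effects > 0].mean())       # truly helpful tests we'd ship
```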
Instead of sharing slides I'm transcribing my QCon.AI talk to tweets with lots of GIFs :)
Thanks to @_bletham_ for all his help.
Prophet motivation:
- there are many forecasting tasks at companies
- they are not glamorous problems and most people aren't trained well to tackle them
- 80% of these applications can be handled by a relatively simple model that is easy to use
We approach time series forecasting as a *curve fitting problem.* This has some benefits:
- curves are easy to reason about and you can decompose them
- the parameters you fit have straightforward interpretations
- curve fitting is very fast so you can iterate quickly
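Here's what that workflow looks like in code (nothing beyond the standard Prophet quick start; the CSV is a placeholder for any series with `ds` and `y` columns):

```python
import pandas as pd
from prophet import Prophet  # the package was called fbprophet in older releases

# Prophet expects a dataframe with columns `ds` (dates) and `y` (values).
df = pd.read_csv("example_series.csv")  # placeholder file

m = Prophet()  # trend + seasonality + holidays, fit as one curve
m.fit(df)

future = m.make_future_dataframe(periods=365)
forecast = m.predict(future)

# Decompose the fitted curve into interpretable pieces: trend, weekly, yearly.
fig = m.plot_components(forecast)
```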