1.1 Churn | Definition
"Far too often businesses define churn as no purchases after N days; typically N is a multiple of 7 or 30 days. Because of this time-limit, it arbitrarily buckets customers into two states"
But different customers buy at different frequencies, so no single N fits everyone.
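A toy sketch of why a fixed cutoff misfires. The two customers, their usual buying gaps, and the 30-day N are all made up for illustration:

```python
def churned_by_cutoff(days_since_last_purchase, n_days=30):
    """Naive rule: a customer is 'churned' once they are silent for more than N days."""
    return days_since_last_purchase > n_days


# Hypothetical customers: (usual gap between purchases, days since last purchase)
customers = {
    "weekly_buyer": (7, 28),      # four missed cycles, probably gone
    "quarterly_buyer": (90, 45),  # only halfway through a normal cycle
}

labels = {
    name: churned_by_cutoff(silent_days)
    for name, (_, silent_days) in customers.items()
}
print(labels)
```

The weekly buyer is labelled active and the quarterly buyer is labelled churned, which is the opposite of what their own buying rhythm suggests.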
1.2 Churn | Probability
Instead, Shopify represents churn as a probability and infers it per customer from a few distributional assumptions and only three inputs: recency, frequency, and age.
This avoids the problem of choosing N.
1.3 Churn | Estimating The Probability
Let lambda be the rate of purchasing, and p be the probability of a churn event.
Then you can use BG/NBD to infer p, provided the assumptions in the next tweet hold for your data.
Even with seasonality though, BG/NBD does quite well.
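Once the model is fitted, a per-customer "still active" probability has a closed form. Below is a minimal sketch of the P(alive) expression published by Fader, Hardie and Lee for BG/NBD; r, alpha, a, b are the population-level parameters from the BG/NBD assumptions, and the numeric values used here are placeholders, not fitted estimates. Churn probability is then 1 - P(alive).

```python
def p_alive(frequency, recency, T, r, alpha, a, b):
    """P(customer is still active) under BG/NBD.

    frequency: number of repeat purchases (x)
    recency:   time of the last purchase, measured from the first purchase (t_x)
    T:         customer age, measured from the first purchase
    """
    if frequency == 0:
        # With no repeat purchase the model has had no chance to drop the customer.
        return 1.0
    ratio = (alpha + T) / (alpha + recency)
    return 1.0 / (1.0 + (a / (b + frequency - 1)) * ratio ** (r + frequency))


# Placeholder parameters, for illustration only.
params = dict(r=0.25, alpha=4.5, a=0.8, b=2.5)

recent = p_alive(frequency=2, recency=38, T=39, **params)
stale = p_alive(frequency=2, recency=5, T=39, **params)
print(f"recent buyer: {recent:.2f}, stale buyer: {stale:.2f}")
```

Two customers with the same number of purchases get different scores because the one who bought long ago has been silent far longer than their history would predict.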
1.4 Churn | BG/NBD Assumptions
1. Time between transactions ~ exp(lambda)
2. lambda ~ gamma(r, alpha)
3. After any transaction a customer becomes inactive with prob p
4. p ~ Beta(a,b)
5. lambda and p vary independently across customers
brucehardie.com/papers/018/fad…
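The five assumptions read naturally as a generative story, so here is a rough simulation sketch using only the standard library. The function name, the horizon, and the parameter values are hypothetical. It assumes alpha is the rate of the gamma distribution and that the clock starts at the customer's first purchase, so only repeat purchases are generated.

```python
import random


def simulate_repeat_purchases(r, alpha, a, b, horizon, rng):
    """Draw one customer's repeat-purchase times under the BG/NBD assumptions."""
    # Assumption 2: lambda ~ gamma(shape=r, rate=alpha); gammavariate takes a scale.
    lam = rng.gammavariate(r, 1.0 / alpha)
    # Assumption 4: p ~ Beta(a, b). Assumption 5: drawn independently of lambda.
    p = rng.betavariate(a, b)
    if lam <= 0.0:
        return []  # guard against numerical underflow of the purchase rate

    t, purchases = 0.0, []
    while True:
        # Assumption 1: time between transactions ~ exp(lambda).
        t += rng.expovariate(lam)
        if t > horizon:
            break
        purchases.append(t)
        # Assumption 3: after any transaction the customer goes inactive w.p. p.
        if rng.random() < p:
            break
    return purchases


rng = random.Random(42)
cohort = [
    simulate_repeat_purchases(r=0.25, alpha=4.5, a=0.8, b=2.5, horizon=39.0, rng=rng)
    for _ in range(1000)
]
share_with_no_repeat = sum(1 for c in cohort if not c) / len(cohort)
print(f"share of customers with no repeat purchase: {share_with_no_repeat:.2f}")
```

Simulating a cohort like this and comparing it with your real purchase histogram is one informal way to sanity-check whether the assumptions are plausible for your data.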
1.5 Churn | BG/NBD Evaluation (CD Purchases)
We can see that BG/NBD performs quite well.
Particularly impressive is the conditional expectation, which demonstrates how well this model tracks the expected # of transactions in weeks 40-78 given # in 1-39
Pay close attention to how your input data needs to be structured, and use the L2 penalty.
Also be sure to visualize the decision boundaries and the fit.
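On the input structure: the model wants one row per customer, not one row per transaction. A minimal stdlib sketch of that reshaping follows. The function name and the sample data are hypothetical, and the conventions (frequency counts repeat purchase days only, monetary value averages repeat purchases only) are my understanding of what the lifetimes package expects, so verify them against its docs before relying on this.

```python
from datetime import date


def rfm_summary(transactions, observation_end):
    """Collapse (customer_id, purchase_date, amount) rows into one record per customer."""
    spend_by_day = {}
    for customer_id, day, amount in transactions:
        days = spend_by_day.setdefault(customer_id, {})
        days[day] = days.get(day, 0.0) + amount  # same-day orders count once

    summary = {}
    for customer_id, days in spend_by_day.items():
        ordered = sorted(days)
        first, last = ordered[0], ordered[-1]
        repeat_days = ordered[1:]
        frequency = len(repeat_days)
        summary[customer_id] = {
            "frequency": frequency,                 # repeat purchase days
            "recency": (last - first).days,         # age at last purchase
            "T": (observation_end - first).days,    # age at end of observation
            "monetary_value": (
                sum(days[d] for d in repeat_days) / frequency if frequency else 0.0
            ),
        }
    return summary


transactions = [
    ("a", date(2024, 1, 1), 20.0),
    ("a", date(2024, 1, 15), 30.0),
    ("a", date(2024, 2, 1), 50.0),
    ("b", date(2024, 1, 10), 15.0),
]
summary = rfm_summary(transactions, observation_end=date(2024, 3, 1))
print(summary)
```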
1.7 Churn | Why is Churn Important?
Using churn "merchants are in a position to drive smarter marketing campaigns, order fulfillment prioritization, and customer support."
"50% probability are at risk of churning, so targeted campaigns could be made to entice them back"
2.1 CLV | Motivation
CLV is one of the most important metrics, if not the most important, to understand about your customers.
And it's very common to calculate it wrong.
Usually because the calculation doesn't properly account for churn in non-contractual settings.
After fitting the churn model above, we can now incorporate the economic value.
Be sure to verify the independence assumption between monetary_value and frequency.
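A quick first check on that independence assumption is the Pearson correlation between the two columns, computed over repeat customers. This is only a sketch: the sample numbers are invented, and a correlation near zero is a necessary sign rather than proof of independence.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical repeat customers: purchase counts and average order values.
frequency = [1, 2, 2, 3, 5, 8]
monetary_value = [40.0, 22.0, 35.0, 51.0, 28.0, 33.0]

corr = pearson(frequency, monetary_value)
print(f"correlation between frequency and monetary_value: {corr:.3f}")
```

If the value is far from zero on your data, modelling spend separately from purchase counts becomes harder to justify.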
You should also discount the future cashflows.
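Discounting here is the usual present-value sum. A minimal sketch, assuming you already have expected purchases per future period from the churn model and an average order value per customer; the function name, the per-period rate, and the numbers are all illustrative:

```python
def discounted_clv(expected_purchases, avg_order_value, discount_rate):
    """Present value of expected future revenue.

    expected_purchases: expected number of purchases in period 1, 2, 3, ...
    discount_rate:      rate per period (e.g. monthly if periods are months)
    """
    return sum(
        n * avg_order_value / (1.0 + discount_rate) ** period
        for period, n in enumerate(expected_purchases, start=1)
    )


# Hypothetical customer: purchase expectations decay as churn risk accumulates.
clv = discounted_clv([0.9, 0.7, 0.5, 0.4], avg_order_value=60.0, discount_rate=0.01)
undiscounted = sum([0.9, 0.7, 0.5, 0.4]) * 60.0
print(f"discounted: {clv:.2f}, undiscounted: {undiscounted:.2f}")
```

The gap between the two numbers grows with the horizon and the rate, which is why skipping the discount overstates long-lived customers the most.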
2.3 CLV | How Can This Be Used?
1. Prioritizing customers (ex: contacting users with churn risk)
2. Targeted marketing (ex: how much am I willing to spend on this customer?)
3. Predict high CLV potential customers to acquire
4. Understand (current and future) customer base
2.4 CLV | How Can This Inform Recommendations?
1. Ensure the cost of the recommendation < CLV
2. Recommend specific benefits to high CLV customers
Forgot to mention that @Cmrn_DP was the one who wrote the exceptional post I stole most of this content from.
"How Shopify Merchants can Measure Retention"
And was a major contributor to the lifetimes package.
Thanks again for making great DS content Cam!
In this thread I'll highlight some important pieces from a variety of @ShopifyData & @ShopifyEng blogs where they discuss applications they've built that I think would benefit Data Scientists.
Contents:
1. How Shopify Capital Uses Quantile Regression To Help Merchants Succeed
2. How to Build an Experiment Pipeline from Scratch
3. How to Use Quasi-experiments and Counterfactuals to Build Great Products
4. Categorizing Products at Scale
Other Threads On Shopify DS Applications:
- How Shopify Uses Recommender Systems to Empower Entrepreneurs: