Luckily, all these methods converge on the same result: adopting the descriptive dashboard increases revenues, the diversity of products sold, the number of repeat customers, and the number of transactions.
As usual, causality disclaimers apply.
>>
But can we say more about whether the descriptive dashboard causes these effects? And if so, how?
A cool and unique feature of our data is that we observe whether retailers actually _use_ the dashboard - do they log in to look at reports, and when?
>>
By comparing the results of dashboard users to non-users, we see that only users reap the benefit from the dashboard.
This allows us to rule out improved performance due to an unobserved and unrelated mechanism.
>>
Next, we investigate whether retailers learn directly from the dashboard.
We initially hypothesized that retailers are most likely to change pricing and advertising strategies based on the dashboard KPIs.
Surprisingly, we find that retailers do not change these strategies.
>>
Instead, descriptive analytics help retailers monitor additional marketing technologies (martech) and amplify their value.
Most retailers adopt additional technologies, but only the retailers that use the dashboard are able to benefit from them.
>>
For example, we see that many retailers adopt CRM, personalization, and prospecting martech. But only those who use the dashboard experience the benefits: greater diversity of products sold, more transactions, and more revenue from repeat customers.
>>
Why are descriptive analytics so popular then?
Although they often leave users to generate their own insights, they provide a simple way to assess different decisions, enabling managers to extend the range of actions they can take and to integrate new technologies.
>>
We would like to especially thank @avicgoldfarb who led a very constructive review process for the paper.
• • •
Same question was also asked by a reviewer. This is where peer review improves a paper, IMO.
So we did two types of analyses (in Section 5.3): 1. We estimated the power of these experiments (spoiler: not so low). 2. We asked what the FDR (false discovery rate) would be with 100% power.
>>
For the effective power of the experiments, the table below shows it is 50-80%, depending on the significance level used.
50% sounds low, but the following analysis shows that even with more power you can't improve the FDR much.
>>
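To make the FDR-vs-power logic concrete, here is a minimal sketch (not code from the paper) of how the expected false discovery rate can be computed from a significance level, a power level, and an assumed share of experiments with no real effect. The function name `fdr`, the prior `pi0`, and the power grid are illustrative assumptions, not the paper's estimates.

```python
# Hedged sketch: how the false discovery rate (FDR) depends on power,
# given a significance level alpha and an assumed share pi0 of A/B tests
# whose true effect is zero. All numbers below are illustrative only;
# they are NOT the paper's estimates.

def fdr(alpha: float, power: float, pi0: float) -> float:
    """Expected FDR: share of significant results that are false positives."""
    false_pos = pi0 * alpha          # true nulls that come out significant
    true_pos = (1 - pi0) * power     # true effects that are detected
    return false_pos / (false_pos + true_pos)

if __name__ == "__main__":
    alpha = 0.05   # significance level (assumption)
    pi0 = 0.7      # assumed share of experiments with no real effect (illustrative)
    for power in (0.5, 0.8, 1.0):  # roughly the 50-80% range plus the 100% benchmark
        print(f"power={power:.0%}  ->  FDR={fdr(alpha, power, pi0):.1%}")
```

Sweeping `power` in the loop shows how much (or how little) the FDR moves as power rises toward 100%, which is the comparison the two analyses above are making.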
How are effects of online A/B tests distributed? How often are they not significant? Does achieving significance guarantee meaningful business impact?
We answer these questions in our new paper, “False Discovery in A/B Testing”, recently out in Management Science >>
The paper is co-authored with Christophe Van den Bulte and analyzes over 2,700 online A/B tests that were run on the @Optimizely platform by more than 1,300 experimenters.
A big draw of the paper is that @Optimizely has graciously allowed us to publish the data we used in the analysis. We hope this will be valuable to other researchers as well.
>>