
How do we determine whether a change (in traffic or user behavior) has had a negative impact on the overall system? For example: if we introduce a new line of shoes, the older ones might see a decline in sales. But how do we know whether that decline is purely due to cannibalization?

Similarly, suppose an e-commerce company sees a change in user behavior followed by a drop in revenue. How do we know the drop is actually caused by the change in user behavior? How do we rule out coincidence?

  • You probably want to use "causal inference". Coincidence and randomness can never be ruled out, but they can be shown to be improbable. You will likely want a model to estimate the causal effect of whatever the change is, and there will likely be a statistical test associated with the estimate. Commented Aug 19, 2022 at 0:40
  • Could you please share examples? It is a new concept to me. Commented Aug 19, 2022 at 16:50

1 Answer


Proving that observed A/B results are unaffected by cannibalization is almost impossible, but there are ways to make cannibalization effects less likely. To achieve that, we focus on the main source of cannibalization: interference between treatment and control units (in causal-inference terms, a violation of the Stable Unit Treatment Value Assumption, SUTVA). The obvious ways to mitigate that interference are:

  1. Perform cluster-based randomization. The working assumption here is that clusters have minimal interaction effects. This is more akin to "standard" design-of-experiments (DOE) work; it is a strong assumption but can work reasonably well, especially if we have strong communities/clustering in our samples. It is also easy to implement and has a huge body of literature in epidemiology/public health. (e.g. see How to design efficient cluster randomised trials (2017) by Hemming et al.) A minimal code sketch of cluster-level assignment and analysis appears after this list.
  2. A budget-split (instead of a member-split) design. While not immediately obvious, this is more akin to the contextual-bandits methodology, where the tracker service forms a feedback loop with ad serving. This feedback loop adjusts ad serving such that each arm of the A/B test is "almost independent" of the other, minimizing their interactions. (e.g. see Trustworthy Online Marketplace Experimentation with Budget-split Design (2020) by Liu et al.) A toy budget-split simulation appears after this list.
  3. Use propensity scores to estimate each user's sensitivity to treatment, and then readjust the sample. Propensity scores are used to effectively "turn off the ads to users who are likely to be cannibalized". This is more akin to "standard" causal-inference work. (e.g. see User-Based Cannibalization Mitigation in an Online Marketplace (2018) by Guo and Qu.) A propensity-trimming sketch appears after this list.
  4. Switchback experiments. (Not strongly recommended, but included for completeness.) This is similar to what is known as crossover trials. Assuming that applying the treatment on one day has no effect on the outcome on another day, we can randomize the order in which each user receives the treatment and take multiple measurements per user. (e.g. see Crossover experiments (2010) by Johnson.) A toy switchback sketch appears after this list.
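
To make option 1 concrete, here is a minimal sketch of cluster-based randomization and cluster-level analysis. The data, cluster counts, and effect size are all invented for illustration; none of it comes from the Hemming et al. paper.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical data: 10,000 users spread across 50 clusters
# (e.g. cities or social communities).
users = pd.DataFrame({
    "user_id": np.arange(10_000),
    "cluster_id": rng.integers(0, 50, size=10_000),
})

# Randomize at the cluster level, not the user level, so treated and
# control users rarely interact within the same cluster.
clusters = users["cluster_id"].unique()
treated = rng.choice(clusters, size=len(clusters) // 2, replace=False)
users["treatment"] = users["cluster_id"].isin(treated).astype(int)

# Simulated outcome with a small treatment lift (illustrative only).
users["revenue"] = rng.gamma(2.0, 5.0, size=len(users)) + 1.0 * users["treatment"]

# Analyze at the cluster level too: aggregate per cluster first, so
# the test reflects the true number of independent units.
cluster_means = users.groupby("cluster_id").agg(
    treatment=("treatment", "first"), revenue=("revenue", "mean")
)
t, p = stats.ttest_ind(
    cluster_means.loc[cluster_means["treatment"] == 1, "revenue"],
    cluster_means.loc[cluster_means["treatment"] == 0, "revenue"],
)
print(f"cluster-level t = {t:.2f}, p = {p:.3f}")
```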
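
Option 2 can be illustrated with a toy simulation: each arm draws from its own budget pool, so spend in one arm can never starve the other. The budget size, bid distribution, and conversion rates below are invented assumptions, not details from the Liu et al. paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each arm owns an independent half of the total ad budget: the core
# budget-split idea.
TOTAL_BUDGET = 10_000.0
budgets = {"control": TOTAL_BUDGET / 2, "treatment": TOTAL_BUDGET / 2}
spend = {"control": 0.0, "treatment": 0.0}
conversions = {"control": 0, "treatment": 0}

def serve_ad(arm: str, bid: float, conv_rate: float) -> None:
    """Charge the arm's own budget pool; pools are never shared."""
    if spend[arm] + bid > budgets[arm]:
        return  # this arm is exhausted; the other arm is unaffected
    spend[arm] += bid
    if rng.random() < conv_rate:
        conversions[arm] += 1

for _ in range(100_000):
    arm = "treatment" if rng.random() < 0.5 else "control"
    bid = rng.uniform(0.05, 0.50)
    # Illustrative assumption: the treatment slightly lifts conversion.
    conv_rate = 0.035 if arm == "treatment" else 0.030
    serve_ad(arm, bid, conv_rate)

for arm in ("control", "treatment"):
    print(f"{arm}: spend = {spend[arm]:.0f}, conversions = {conversions[arm]}")
```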
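
For option 3, the sketch below fits a simple propensity model and holds out the users who are most likely to be cannibalized before serving ads. The covariates, the exposure mechanism, and the 0.8 trimming threshold are all illustrative assumptions, not details from Guo and Qu.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical covariates: past purchase count and ad-click rate.
X = np.column_stack([
    rng.poisson(3, size=n),   # past purchases
    rng.beta(2, 8, size=n),   # ad-click rate
])
# Simulated label: heavier ad clickers are more likely to be exposed
# to (and cannibalized by) the new surface.
exposed = (rng.random(n) < 0.2 + 0.6 * X[:, 1]).astype(int)

# Estimate each user's propensity to be cannibalized.
model = LogisticRegression().fit(X, exposed)
propensity = model.predict_proba(X)[:, 1]

# Option A (the paper's spirit): hold high-propensity users out of ad
# serving. Option B: reweight by inverse propensity instead.
keep = propensity < 0.8
print(f"kept {keep.mean():.0%} of users after trimming high-propensity ones")
```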
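
Finally, a toy switchback sketch for option 4: randomize which days receive the treatment and compare daily outcomes. This bakes in exactly the no-carryover assumption stated above; the day count, lift, and noise level are made up.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# 28 days, half randomly assigned to treatment.
n_days = 28
treated_day = rng.permutation(np.repeat([0, 1], n_days // 2))

# Simulated daily revenue: baseline + day-level noise + a small
# treatment lift on treated days.
daily_revenue = 100 + 5 * rng.standard_normal(n_days) + 3 * treated_day

t, p = stats.ttest_ind(
    daily_revenue[treated_day == 1],
    daily_revenue[treated_day == 0],
)
print(f"switchback t = {t:.2f}, p = {p:.3f} (n = {n_days} days)")
```

A proper crossover analysis would pair measurements within the same unit (each user under both conditions), which is usually more powerful than the simple between-day comparison shown here.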
