Thursday, May 26, 2016

Is it possible to evaluate a campaign without a control group?

Evaluating a campaign without a control group is normally the domain of so-called "quasi-experimental studies". But all of these quasi-experiments need a time dimension. Can we design an experimental study that uses neither a control group nor multiple measurements in time?

If we assume that campaign effects are additive, all we need is as many independent equations as there are unknown variables. For example, let's imagine we have two campaigns, the first one adding an uplift A to the response rate and the second one adding an uplift B. Let's also imagine that a user who is not exposed to any campaign responds at the baseline rate N. And let's imagine that we can combine the campaigns, either because each uses a different channel or because each promotes a different product. Then we can set up 3 groups of users (response rates in percent):

  N+A = 5
  N+B = 7
  N+A+B = 9

The first group of users is exposed to the first campaign and its response rate is 5%. The second group of users is exposed to the second campaign and its response rate is 7%. And the last group of users is exposed to both campaigns and its response rate is 9%.

Since we have 3 independent equations and 3 unknown variables, we can calculate that the first campaign has an uplift of 9-7 = 2 percentage points and that the second campaign has an uplift of 9-5 = 4 percentage points. The baseline response rate is then N = 5-2 = 3%.
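The arithmetic above can be written down as a tiny script. This is only a sketch of the additive two-campaign case; the function name `solve_additive` and its parameter names are made up for illustration.

```python
def solve_additive(rate_a, rate_b, rate_ab):
    """Recover baseline N and uplifts A, B from three group response rates.

    Assumes pure additivity (no interaction term):
        rate_a  = N + A
        rate_b  = N + B
        rate_ab = N + A + B
    """
    uplift_a = rate_ab - rate_b   # (N + A + B) - (N + B) = A
    uplift_b = rate_ab - rate_a   # (N + A + B) - (N + A) = B
    baseline = rate_a - uplift_a  # (N + A) - A = N
    return baseline, uplift_a, uplift_b


# Response rates in percent, taken from the example above.
baseline, uplift_a, uplift_b = solve_additive(5.0, 7.0, 9.0)
print("N =", baseline, "A =", uplift_a, "B =", uplift_b)
```

Note that none of the three groups is a control group: the baseline N is inferred from the equations, not measured.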

Of course, this approach neglects interactions between the campaigns. But we can model interactions as well. In this example, let's consider 3 campaigns with the following response rates:

 N+A = 5
 N+B = 7
 N+A+B+AB = 9
 N+C = 8
 N+A+C = 3

where AB represents the interaction between the first two campaigns. We can put the equations into matrix form, with the columns ordered as N, A, B, AB, C:

 matrix = [1 1 0 0 0; 1 0 1 0 0; 1 1 1 1 0; 1 0 0 0 1; 1 1 0 0 1] 
 response = [5; 7; 9; 8; 3] 
 x = [?; ?; ?; ?; ?]

where x is the vector of the unknown variables. It holds that:

 matrix*x = response

With linear algebra:

 x = inv(matrix) * response

This gives x = [10; -5; -3; 7; -2], that is N = 10, A = -5, B = -3, AB = 7 and C = -2.
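For readers without a matrix tool at hand, the same system can be solved in plain Python. The sketch below uses Gauss-Jordan elimination with exact fractions rather than an explicit inverse; the helper `solve_linear_system` is a name invented here, and in practice a library routine such as NumPy's `numpy.linalg.solve` would be the usual choice.

```python
from fractions import Fraction


def solve_linear_system(matrix, response):
    """Solve matrix * x = response by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(matrix)
    # Augmented matrix [matrix | response] with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, response)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the equations are not independent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Columns: N, A, B, AB, C (same order as in the text).
matrix = [
    [1, 1, 0, 0, 0],  # N + A
    [1, 0, 1, 0, 0],  # N + B
    [1, 1, 1, 1, 0],  # N + A + B + AB
    [1, 0, 0, 0, 1],  # N + C
    [1, 1, 0, 0, 1],  # N + A + C
]
response = [5, 7, 9, 8, 3]

x = solve_linear_system(matrix, response)
for name, value in zip(["N", "A", "B", "AB", "C"], x):
    print(name, "=", value)
```

Solving the system directly instead of forming `inv(matrix)` gives the same answer here and is generally the more numerically robust habit.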

To get higher confidence in the estimates, it is possible to use more independent equations than there are unknown variables and to fit the unknown variables with a least-squares estimate (or another method of your choice).
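As a rough illustration of the overdetermined case, the sketch below fits the additive two-campaign model by least squares through the normal equations. The function `least_squares` is a hypothetical helper written for this post, and the last three response rates are invented for illustration (each campaign combination is assumed to have been run on a second group of users); they are not measured data.

```python
def least_squares(design, response):
    """Least-squares fit via the normal equations (D^T D) x = D^T y."""
    k = len(design[0])
    # Build the augmented normal-equation matrix [D^T D | D^T y].
    aug = []
    for i in range(k):
        row = [sum(r[i] * r[j] for r in design) for j in range(k)]
        row.append(sum(r[i] * y for r, y in zip(design, response)))
        aug.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("the design does not identify all unknowns")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Columns: N, A, B. Six groups, three unknowns.
design = [
    [1, 1, 0], [1, 0, 1], [1, 1, 1],  # groups from the first example
    [1, 1, 0], [1, 0, 1], [1, 1, 1],  # HYPOTHETICAL repeated groups
]
# Response rates in percent; the last three values are invented.
response = [5.0, 7.0, 9.0, 5.2, 6.8, 9.0]

n, a, b = least_squares(design, response)
print("N = %.2f, A = %.2f, B = %.2f" % (n, a, b))
```

Because every combination is simply repeated, the fit here is the same as solving the original three equations on the averaged response rates (5.1, 6.9 and 9.0), which gives approximately N = 3.0, A = 2.1 and B = 3.9.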
