Uplift modelling

During our weekly innovation day on Wednesday, we started working on a new technique called uplift modelling. This data mining technique predicts for which individual customers a marketing action is going to have an effect. We are currently testing this technique in collaboration with scooter-sharing company Check.

How does it work?

Uplift modelling is a data mining technique used in digital marketing to predict the response to a marketing action. A marketing action can be a newsletter, a promotional email or, for example, the promotion of a discount code. The idea is to classify the current customers into several groups based on their characteristics and behaviour. For every group we can determine the best action to take, and thus decide whether or not the marketing action should be sent to a specific customer. But why don’t we just send the marketing action to all of our customers? We will explain this with the help of the following diagram:

As you can see, customers are divided into four categories: persuadables, sure things, lost causes and sleeping dogs:

  • Persuadables: customers who only respond to a marketing action because they were targeted.
  • Sure Things: customers who would respond to the marketing action, whether they were targeted or not.
  • Lost Causes: customers who would not respond to the marketing action, whether they were targeted or not.
  • Sleeping Dogs: customers on whom a marketing action has a negative effect. They are more likely to respond if they are not targeted, and they might even unsubscribe from your services if they are.

We only want to target the Persuadables, because sending a marketing action to any of the other groups will have no effect, or even a negative one.

But how do we decide which group a customer belongs to? We divide the customers into groups based on two simple questions:
1. Will he or she buy the item or service with targeting?
2. Will he or she buy the item or service without targeting?
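
The two questions map directly onto the four groups described earlier. As a minimal sketch, assuming we somehow knew both answers for a customer (in practice we can only estimate them), the mapping could look like this. The function name `segment` is our own choice for illustration:

```python
def segment(buys_if_targeted: bool, buys_if_not_targeted: bool) -> str:
    """Map the answers to the two questions onto the four customer groups."""
    if buys_if_targeted and not buys_if_not_targeted:
        return "Persuadable"   # responds only because of the targeting
    if buys_if_targeted and buys_if_not_targeted:
        return "Sure Thing"    # responds either way
    if not buys_if_targeted and not buys_if_not_targeted:
        return "Lost Cause"    # does not respond either way
    return "Sleeping Dog"      # responds only when left alone


print(segment(True, False))   # Persuadable
print(segment(False, True))   # Sleeping Dog
```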


If we can predict the answers to the two questions above for every customer, we know whether targeting that customer will have an effect. To achieve this, we compute two separate predictions:
1. The probability that a customer will convert, given that he or she received a marketing action.
2. The probability that the customer will convert, given that he or she did not receive a marketing action.

For both predictions we calculate a probability, a number between 0 and 1. The difference between these probabilities (the first minus the second) is the effect of the marketing campaign, also called the uplift prediction. This prediction is based on all kinds of customer characteristics, such as frequency, recency, monetary value and age group. The algorithm bases its prediction on customers with similar characteristics. The more characteristics are available about the users, the better the prediction tends to be.
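
To make the two-prediction idea concrete, here is a deliberately simple sketch in Python. It is not the model we use with Check: it assumes a made-up campaign history and "predicts" by taking the conversion rate of customers with exactly the same profile, once among targeted customers and once among untargeted ones. A real project would most likely replace these lookups with a trained classifier. All names and data below are hypothetical.

```python
from collections import defaultdict

# Hypothetical campaign history: (profile, was_targeted, converted).
# A profile is a tuple of customer characteristics: (frequency, age_group).
HISTORY = (
    [(("frequent", "30-35"), True, c) for c in (True, True, True, False)]
    + [(("frequent", "30-35"), False, c) for c in (True, False, False, False)]
    + [(("occasional", "18-25"), True, c) for c in (True, False)]
    + [(("occasional", "18-25"), False, c) for c in (True, False)]
)


def conversion_rates(history, targeted):
    """Conversion rate per profile, using only the targeted (or only the untargeted) customers."""
    counts = defaultdict(lambda: [0, 0])  # profile -> [conversions, customers]
    for profile, was_targeted, converted in history:
        if was_targeted == targeted:
            counts[profile][0] += int(converted)
            counts[profile][1] += 1
    return {profile: conv / n for profile, (conv, n) in counts.items()}


def predict_uplift(history, profile):
    """Uplift = P(convert | targeted) - P(convert | not targeted) for similar customers."""
    p_targeted = conversion_rates(history, targeted=True)[profile]
    p_not_targeted = conversion_rates(history, targeted=False)[profile]
    return p_targeted - p_not_targeted


print(predict_uplift(HISTORY, ("frequent", "30-35")))    # 0.5
print(predict_uplift(HISTORY, ("occasional", "18-25")))  # 0.0
```

Note that this sketch raises a `KeyError` for a profile that never appears in one of the two groups, which is one reason why a model that generalises across similar profiles is preferable in practice.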

For example:

Sarah is a frequent customer who rides a Check scooter very often. She lives in Rotterdam and is between 30 and 35 years old. The probability that Sarah will ride a Check scooter is determined based on all other users who have used a Check scooter. Correlations are computed between Sarah and other customers. As a result, we get a predicted likelihood that she will ride, both with and without targeting.

The results for Sarah are: a probability of 0.8 that she will use a Check scooter if she is targeted in a campaign, against a probability of 0.5 if she is not targeted in a campaign. The uplift score of Sarah is then 0.8 – 0.5 = 0.3.
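
The same calculation as a small Python sketch. The function name `uplift_score` is our own; the two probabilities are Sarah's numbers from the example. Rounding is only there to hide floating-point noise in the subtraction.

```python
def uplift_score(p_targeted: float, p_not_targeted: float) -> float:
    """Difference between the conversion probability with and without targeting."""
    for p in (p_targeted, p_not_targeted):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_targeted - p_not_targeted


sarah = uplift_score(0.8, 0.5)
print(round(sarah, 2))  # 0.3
```

Because both inputs lie between 0 and 1, an uplift score always lies between -1 and 1.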

What is the result?

If we compute the probabilities for all of our customers, we can plot the outcomes in a histogram. The x-axis shows the probabilities, and the y-axis shows the number of customers with that probability. In the graph on the right, you can see the uplift predictions.
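
As an illustration of how such a histogram is built, the sketch below counts customers per uplift bin. The scores are invented for the example, and the bin width of 0.1 is an arbitrary assumption; a plotting library would normally do this binning for you.

```python
import math
from collections import Counter


def uplift_histogram(scores, bin_width=0.1):
    """Count customers per uplift bin; each key is the lower edge of a bin."""
    counts = Counter()
    for score in scores:
        # Round before flooring so that e.g. 0.3 / 0.1 lands in bin 3, not bin 2.
        index = math.floor(round(score / bin_width, 9))
        counts[round(index * bin_width, 10)] += 1
    return dict(sorted(counts.items()))


# Hypothetical uplift scores for six customers.
scores = [0.3, 0.05, 0.0, -0.2, 0.32, 0.0]
print(uplift_histogram(scores))  # {-0.2: 1, 0.0: 3, 0.3: 2}
```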

If the uplift score is higher than zero, sending out a marketing campaign will have a positive effect on the likelihood that the customer buys an item or uses a service. The customers in this part of the chart are the Persuadables. If the uplift prediction is equal to zero, the marketing action will have no effect on the customer’s decision. This can be because they are either Lost Causes or Sure Things. Lastly, there is the group with an uplift score below zero, the Sleeping Dogs. A marketing campaign affects this group negatively: they become less likely to buy your item or service, or they may even unsubscribe.
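
In code, this reading of the uplift score could be sketched as follows. One assumption is added here that is not part of the description above: a small `tolerance` around zero, because predicted scores are rarely exactly 0. The function names and the example customers (other than Sarah's score of 0.3) are hypothetical.

```python
def label_by_uplift(uplift: float, tolerance: float = 0.01) -> str:
    """Label a customer based on the sign of the uplift score."""
    if uplift > tolerance:
        return "Persuadable"
    if uplift < -tolerance:
        return "Sleeping Dog"
    # The score alone cannot tell these two groups apart.
    return "Sure Thing or Lost Cause"


def customers_to_target(uplift_by_customer, tolerance=0.01):
    """Return only the customers for whom the campaign is expected to help."""
    return [
        name
        for name, uplift in uplift_by_customer.items()
        if label_by_uplift(uplift, tolerance) == "Persuadable"
    ]


scores = {"Sarah": 0.3, "customer_b": 0.0, "customer_c": -0.2}
print(customers_to_target(scores))  # ['Sarah']
```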

How can you implement this at your company?

We are currently testing this technique with Check, and it looks very promising. Do you want to know more? Please let us know; we would love to schedule a call to discuss the possibilities at your company!

  • Koen Haenen, Lead Data Scientist
  • Ellen Mik, Senior Data Scientist
  • Roy Gomersbach, Data Scientist