Retention Metrics Explained: Customer Future Value (CFV) Pt. 2 [RS Labs]

In the ‘Retention Metrics Explained’ series, our data science team goes into detail about the metrics and methods that are essential for data-driven retention marketing. Customer Future Value is one of the predictive metrics that determine which customers to nurture, based on their future impact on a business. In this post, learn how our team applies Customer Future Value and how we validate its accuracy. To learn how we calculate Customer Future Value (CFV), read Part 1.

Applications of CFV

We’ve found CFV useful in a number of ways. For instance, it can split your users into two cohorts: those who are projected to be “big spenders” and those who are not. A company can then target each cohort differently.
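As a rough illustration of the cohort split, here is a minimal Python sketch. The function name `split_by_cfv`, the $50 threshold, and the customer values are all hypothetical; in practice the cutoff would depend on the business and on how CFV is distributed across its customers.

```python
def split_by_cfv(cfv_by_customer, threshold):
    """Split customers into two cohorts by predicted CFV (in dollars)."""
    big_spenders = {c for c, v in cfv_by_customer.items() if v >= threshold}
    others = set(cfv_by_customer) - big_spenders
    return big_spenders, others


# Hypothetical predicted CFV values, for illustration only.
cfv = {"alice": 120.0, "bob": 15.5, "carol": 64.0, "dave": 0.0}

big, rest = split_by_cfv(cfv, threshold=50.0)
print(sorted(big))   # ['alice', 'carol']
print(sorted(rest))  # ['bob', 'dave']
```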

CFV also allows for deep, actionable audience segmentation at a more granular level. For instance, we can break down CFV by state, as shown in the figure below. In this figure, the x-axis shows the customer’s state, the left y-axis shows the number of customers from that state (represented by the dotted line), and the right y-axis shows the average CFV for customers from that state (represented by the blue bar).


Average CFV ($) by State

 

Here it is very clear that California not only has the largest number of customers, but its customers are also predicted to be the biggest spenders by far compared to the rest of the states. Rounding out the top states are New York, Texas, Illinois, and Florida. So if there are choices to be made about where to place marketing dollars or which content to create (say, for engagement), it might be best to focus on those areas.

We could also compare CFV by the customers’ registration source, as shown in the figure below. It is just like the figure above, except that the x-axis represents the registration source. A chart like this lets marketers pinpoint their ad spend, allocating more resources to the registration sources that produce both a large number of customers and customers who are predicted to spend more.


Average CFV ($) by Registration Source
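Both charts above come down to the same aggregation: group customers by an attribute, then count them and average their CFV. The sketch below shows one way to do that in plain Python. The record layout (`state`, `source`, `cfv`) and the sample values are assumptions made for illustration, not our production schema.

```python
from collections import defaultdict


def summarize_cfv(customers, key):
    """Return {group: (customer_count, average_cfv)} for the given attribute."""
    totals = defaultdict(lambda: [0, 0.0])  # group -> [count, sum of CFV]
    for customer in customers:
        entry = totals[customer[key]]
        entry[0] += 1
        entry[1] += customer["cfv"]
    return {group: (count, total / count) for group, (count, total) in totals.items()}


# Hypothetical customer records, for illustration only.
customers = [
    {"state": "CA", "source": "email", "cfv": 80.0},
    {"state": "CA", "source": "ads", "cfv": 120.0},
    {"state": "NY", "source": "email", "cfv": 60.0},
]

by_state = summarize_cfv(customers, "state")
by_source = summarize_cfv(customers, "source")
print(by_state)   # {'CA': (2, 100.0), 'NY': (1, 60.0)}
print(by_source)  # {'email': (2, 70.0), 'ads': (1, 120.0)}
```

The count corresponds to the dotted line in each chart and the average to the blue bars, so a single function covers both the state and the registration source views.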

 

RS CFV Validation Report

Although we update the CFV values every day, we can’t validate a score on the day it is produced, since CFV is a prediction of the future. Therefore, to validate the accuracy of our CFV predictions, we compare the amount of spend we predicted against what those customers actually spent, usually looking out 90 days. For instance, for predictions made on Jan 1st, we would look at the actual spend through April 1st and see how closely our predictions match what actually happened. Because validation requires spending data from after the prediction date, we can also run it against historical data, in a process known as “back testing”: we make predictions as of a past date and compare them with the spend that has since been recorded.
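To make the validation window concrete, here is a small Python sketch that totals each customer’s actual spend in the 90 days after a prediction date. The function name, the tuple layout of `purchases`, and the choice to exclude the prediction date itself while including day 90 are assumptions for illustration; the sample purchases are made up.

```python
from datetime import date, timedelta


def actual_spend_in_window(purchases, prediction_date, horizon_days=90):
    """Sum spend per customer for purchases dated after prediction_date
    and on or before prediction_date + horizon_days."""
    window_end = prediction_date + timedelta(days=horizon_days)
    totals = {}
    for customer_id, purchase_date, amount in purchases:
        if prediction_date < purchase_date <= window_end:
            totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return totals


# Hypothetical purchase history: (customer, purchase date, amount in dollars).
purchases = [
    ("alice", date(2014, 12, 20), 30.0),  # before the prediction date: ignored
    ("alice", date(2015, 2, 10), 45.0),
    ("bob", date(2015, 4, 1), 20.0),      # day 90: included
    ("bob", date(2015, 5, 5), 99.0),      # after the window: ignored
]

actuals = actual_spend_in_window(purchases, date(2015, 1, 1))
print(actuals)  # {'alice': 45.0, 'bob': 20.0}
```

Back testing uses the same calculation, only with a `prediction_date` far enough in the past that the full window is already available in historical data.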

Below is an example of our CFV validation report. It shows CFV predictions made from data up to 1/1/2015 and validated against actual spend on 4/1/2015. The results demonstrate how well our CFV predictions held up on the validation date.

CFV Validation Report (predicted on 1/1/2015, validated on 4/1/2015)

The validation report shows the actual revenue during that period (“Site Actual CFV”) alongside the revenue we predicted on 1/1/2015 (“Site Predicted CFV”).

The Site Level Mean Absolute Accuracy shows, as a percentage, how close our revenue prediction was to the actual revenue. In this particular example, we predicted the company’s future revenue with 92% accuracy (we were off by only $292K).
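One common way to express this kind of site-level accuracy is one minus the absolute error as a fraction of actual revenue. The sketch below assumes that definition, which may differ from the exact formula behind our report, and the revenue figures in it are hypothetical rather than taken from the report.

```python
def site_level_accuracy(predicted_total, actual_total):
    """Accuracy as 1 - |predicted - actual| / actual (an assumed definition)."""
    if actual_total <= 0:
        raise ValueError("actual_total must be positive")
    return 1.0 - abs(predicted_total - actual_total) / actual_total


# Hypothetical totals in dollars, not the figures from the report.
predicted_revenue = 920_000.0
actual_revenue = 1_000_000.0

accuracy = site_level_accuracy(predicted_revenue, actual_revenue)
print(f"{accuracy:.0%}")  # 92%
```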

We also present the User Level Mean Absolute Error CFV, which shows how well we predict the future value (e.g., future spend) of each individual user. While we do a good job for the company as a whole (the site level), it’s much tougher to predict each individual’s spend. In this particular example we were off by $1.75 per customer, on average.
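Mean absolute error is the average of the absolute differences between each customer’s predicted and actual spend. The sketch below uses made-up numbers, and it assumes that a customer missing from either dictionary has a value of $0. It also shows why the user level is harder: over- and under-predictions cancel out in the site total but not in the per-user error.

```python
def user_level_mae(predicted, actual):
    """Mean absolute error across all customers seen in either dict.
    A customer missing from one side is treated as $0 (an assumption)."""
    customers = set(predicted) | set(actual)
    if not customers:
        raise ValueError("no customers to evaluate")
    total_error = sum(
        abs(predicted.get(c, 0.0) - actual.get(c, 0.0)) for c in customers
    )
    return total_error / len(customers)


# Hypothetical per-customer values in dollars, for illustration only.
predicted = {"alice": 10.0, "bob": 5.0, "carol": 2.0, "dave": 3.0}
actual = {"alice": 12.0, "bob": 4.0, "dave": 4.0}  # carol spent nothing

mae = user_level_mae(predicted, actual)
print(mae)                                            # 1.5
print(sum(predicted.values()), sum(actual.values()))  # 20.0 20.0
```

Here the site totals match exactly, yet the average per-customer miss is still $1.50.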

CFV is a key metric for our predictive analytics. We are proud of it, and hope you find it as interesting as we do. Stay tuned for our next post where we cover our Welcome Purchase Probability model.

—————-

About the Author

Sang Su Lee is a data scientist at Retention Science. He is interested in solving less scientific problems in a scientific way. He received his M.S. and Ph.D. in Computer Science from the University of Southern California and B.S. in Electrical Engineering from Yonsei University.