Understanding Retention

Simply put, retention metrics indicate how often people return to your product. With few exceptions, retention is one of the most important indicators of whether your product addresses a real market need, and a great starting point for further research.

Let’s take a messaging app as an example. On January 1, I acquired 100 new signups, and the following day 10 of them reopened the app. This means that my Day 1 retention is 10%. January 1 is Day 0, i.e. the day when I acquired the users. January 2, when 10 people returned to my app, is Day 1, which answers the question: how many people opened my app one day after signing up?

On January 8, which is Day 7 when January 1 is counted as Day 0, 15 of the original 100 signups opened the app, hence my Day 7 retention rate would be 15%. There are two things we need to clarify here:

a) N Day vs. Unbounded Retention

b) Cohorts 

N Day retention simply means that we’re not concerned with how many users opened our app on the previous days; we’re only looking at the number of users from the original batch of signups that returned on Day N. That also explains why, in our example above, Day 7 retention is higher than Day 1. This isn’t an anomaly: perhaps something in the product brought users back organically after 7 days, or it could be a push notification, an email campaign, etc.
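To make the definition concrete, here is a minimal sketch of an N Day calculation for the messaging-app example. The function name, the shape of the `activity` mapping and the year are assumptions made for illustration only; a real analytics tool will store events differently.

```python
from datetime import date, timedelta


def n_day_retention(signup_day, activity, n):
    """Fraction of a cohort that was active exactly n days after signup_day.

    activity maps each user id in the cohort to the set of dates on which
    that user opened the app (a hypothetical data shape for this sketch).
    """
    if not activity:
        return 0.0
    target = signup_day + timedelta(days=n)
    returned = sum(1 for days in activity.values() if target in days)
    return returned / len(activity)


# Illustrative data: 100 signups on Day 0 (the year is an assumption).
day0 = date(2020, 1, 1)
activity = {user_id: set() for user_id in range(100)}
for user_id in range(10):       # 10 users return on Day 1
    activity[user_id].add(day0 + timedelta(days=1))
for user_id in range(5, 20):    # 15 users return on Day 7
    activity[user_id].add(day0 + timedelta(days=7))

day1_rate = n_day_retention(day0, activity, 1)
day7_rate = n_day_retention(day0, activity, 7)
print(f"Day 1: {day1_rate:.0%}, Day 7: {day7_rate:.0%}")
```

Note that the users who return on Day 7 do not have to be the same ones who returned on Day 1, which is why the Day 7 figure can be the higher of the two.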

Unlike N Day retention, which looks at Day N only, unbounded retention answers the question of how many users were retained (or performed a given action) over a certain period of time, typically by counting anyone who came back on Day N or at any point after it. In our example, it makes sense to use N Day retention because, generally speaking, messaging apps are used daily. However, if we’re looking at a product with a much lower frequency of use, such as a travel booking app, it may make sense to use unbounded retention, which will help us understand how many people booked a second flight 6 months after their first booking with us.

In other words, N Day retention is best suited for products which are used daily, such as games or messaging apps. Unbounded retention is better suited for products which do not have daily engagement, like Airbnb or a food delivery app.
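For comparison, unbounded retention can be sketched as follows. This assumes one common definition, namely that a user counts as retained if they were active on Day N or on any later day; tools may define it differently, and the function name and sample bookings below are hypothetical.

```python
from datetime import date, timedelta


def unbounded_retention(first_day, later_activity, n):
    """Fraction of a cohort active on day n or on any later day.

    later_activity maps each user id to the set of dates (after first_day)
    on which the user performed the tracked action, e.g. a repeat booking.
    """
    if not later_activity:
        return 0.0
    threshold = first_day + timedelta(days=n)
    returned = sum(
        1 for days in later_activity.values()
        if any(day >= threshold for day in days)
    )
    return returned / len(later_activity)


# Hypothetical travel-booking cohort: four users with a first booking on Day 0.
first_booking = date(2020, 1, 1)
repeat_bookings = {
    "u1": {first_booking + timedelta(days=30)},
    "u2": {first_booking + timedelta(days=200)},
    "u3": set(),
    "u4": set(),
}

from_day_30 = unbounded_retention(first_booking, repeat_bookings, 30)    # u1 and u2 count
from_day_180 = unbounded_retention(first_booking, repeat_bookings, 180)  # only u2 counts
```

Because a user only has to come back at some point on or after Day N, an infrequent but loyal customer still counts as retained, which is the behaviour we want for low-frequency products.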

Cohorts are an essential tool for analysing user behaviour. They are created by grouping users by one or more properties: the date they signed up, their device, an action they performed, demographic data, etc. Cohorts allow you to isolate a particular group of users and track their behaviour over time to understand how they are using your product. In our example above, our cohort was the users who signed up on January 1.
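As a rough illustration of the grouping step, the sketch below builds cohorts from a small list of user records. The record fields (`id`, `signup`, `device`) and the `build_cohorts` helper are made up for this example.

```python
from collections import defaultdict
from datetime import date


def build_cohorts(users, key):
    """Group user ids by the value that key(user) returns."""
    cohorts = defaultdict(list)
    for user in users:
        cohorts[key(user)].append(user["id"])
    return dict(cohorts)


# Hypothetical user records.
users = [
    {"id": 1, "signup": date(2020, 1, 1), "device": "ios"},
    {"id": 2, "signup": date(2020, 1, 1), "device": "android"},
    {"id": 3, "signup": date(2020, 1, 2), "device": "ios"},
]

by_signup_date = build_cohorts(users, key=lambda user: user["signup"])
by_device = build_cohorts(users, key=lambda user: user["device"])
```

The same helper works for any property, so a cohort can just as easily be "users who signed up on January 1" as "users on iOS".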

The table below shows N Day retention of our fictional app. From what we’ve talked about so far, we can read it as follows: on Jan 31, 2020, a total of 2,448 users signed up, and they make up our cohort. Seven days later, 9 of those Jan 31 signups came back and used our app.

Data kept in absolute numbers can be difficult to analyse and to compare against industry standards and/or competitors, so we may want to convert it into percentages. To do so, we treat our Day 0 column (“Users” in the table above) as 100% of the population and divide each subsequent day’s count by that total, which should give us the following:
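The conversion itself is a single division per cell. A minimal sketch, using only the two Jan 31 figures quoted earlier (2,448 signups and 9 returning users on Day 7); the function name and dictionary layout are assumptions:

```python
def to_percentages(cohort_size, returning_by_day):
    """Convert absolute returning-user counts into percentages of the cohort."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return {
        day: round(100 * count / cohort_size, 2)
        for day, count in returning_by_day.items()
    }


# Jan 31 cohort: Day 0 is the full cohort, Day 7 is the 9 returning users.
jan_31 = to_percentages(2448, {0: 2448, 7: 9})
print(jan_31)
```

For this cohort, Day 7 works out to roughly 0.37%.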

When we talk about retention, we’re often referring to a retention curve, which is a visual representation of a cohort’s performance over time. Plotted from our table above, it looks like this:

We could also use charts to show how different cohorts performed over a given period of time, which is particularly useful if there’s a major difference between the cohorts, such as the channel they came from (organic vs. paid, for example).

In this example, we can see that users who signed up on January 29th had higher retention than all other cohorts (note that I’ve changed the data from the table above for this example). 

What if we wanted to take the average retention rate for the period January 23-31? The most intuitive approach may be to take the simple average of Day 1 retention across cohorts, which in our case would be 12.87%. This, however, may sometimes be misleading, as our cohorts differ in size. To correct for that, we can take the weighted average: multiply each cohort’s Day 1 retention by the size of that cohort, add up the results, and divide by the total number of users across all cohorts.

In other words, we’d need to calculate:

(12.16% x 1,784 + 12.69% x 1,750 + 13.12% x 1,860 + 13.46% x 1,991 + 14.01% x 1,942 + 11.87% x 2,274 + 15.00% x 2,156 + 12.66% x 2,275 + 10.91% x 2,448) / 18,480 

Which is equal to 12.83%. 
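The same calculation can be sketched in a few lines. The rates and cohort sizes are the ones from the formula above; the function names are our own. Because the inputs here are already rounded to two decimals, the simple average may differ from the figure quoted in the text in its last digit.

```python
def simple_average(rates):
    """Unweighted mean: every cohort counts equally, regardless of size."""
    return sum(rates) / len(rates)


def weighted_average(rates, sizes):
    """Mean of the rates weighted by cohort size."""
    if len(rates) != len(sizes):
        raise ValueError("rates and sizes must have the same length")
    total_users = sum(sizes)
    return sum(rate * size for rate, size in zip(rates, sizes)) / total_users


# Day 1 retention (%) and cohort sizes, in the order used in the formula.
day1_rates = [12.16, 12.69, 13.12, 13.46, 14.01, 11.87, 15.00, 12.66, 10.91]
cohort_sizes = [1784, 1750, 1860, 1991, 1942, 2274, 2156, 2275, 2448]

simple = simple_average(day1_rates)
weighted = weighted_average(day1_rates, cohort_sizes)
print(f"simple: {simple:.2f}%  weighted: {weighted:.2f}%")
```

The weighted figure is equivalent to dividing the total number of Day 1 returning users by the total number of signups across all cohorts.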

This may not seem like much of a difference, but see what happens if your cohort sizes vary more widely:

If we take the simple average of the Day 1 retention column, we still get the same number as before, 12.87%. However, if we take the weighted average to account for the difference in cohort sizes, using the same calculation as above, we get 13.22%, which is quite different.

The ultimate goal of looking at retention rates is to understand how your product is being used. If we go back to our fictitious messaging app, it is evident that we’re not doing a very good job if fewer than 1% of our users return to the app by Day 7. Make no mistake, this is bad, but it’s much better to discover and diagnose it early on, so we can investigate further.

Retention rates differ greatly depending on the industry, type of product, etc. Ultimately, what we’re looking for is a retention curve which flattens out over time, a sign that at least a certain percentage of users have found value in the product and continue to use it.

Product B has very low retention, and by Day 7 it has lost nearly 99% of the users it acquired. For that business to grow, it will need to continuously replace churned users with an ever-growing number of new ones, a task that will prove unsustainable over time. Product A, on the other hand, seems to flatten out at around 40%, which suggests that there are users who return to the product on a daily basis. Similarly, retention can be used to compare and analyse features within a product, to understand which features users come back to frequently and which ones less so.
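One rough way to check whether a curve is flattening out is to look at how much its last few points still move. The sketch below is only a heuristic under assumed thresholds, not an established method: the `flattens_out` name, the `window`, `tolerance` and `floor` parameters, and both sample curves are hypothetical and merely shaped like the Product A and Product B behaviour described above.

```python
def flattens_out(curve, window=3, tolerance=1.0, floor=5.0):
    """Heuristic: True if the last `window` retention values (in %) stay
    within `tolerance` percentage points of each other and none of them
    falls below `floor`."""
    if len(curve) < window:
        return False
    tail = curve[-window:]
    return min(tail) >= floor and max(tail) - min(tail) <= tolerance


# Hypothetical Day 0 to Day 7 retention curves, in percent.
product_a = [100, 62, 51, 45, 42, 41, 40.5, 40]
product_b = [100, 30, 15, 8, 4, 2.5, 1.5, 1]

a_is_flat = flattens_out(product_a)
b_is_flat = flattens_out(product_b)
```

The `floor` parameter is there so that a curve which has simply decayed to almost nothing is not mistaken for a healthy plateau.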

Understanding retention is crucial from day one, and it can completely change the focus and direction of a company’s strategy and roadmap.