Ryan Ashton Portfolio

What is Customer Churn?

Background

Businesses are interested in retaining their customers because retention keeps revenue consistent. The larger and more diverse a business is, the greater the variety of customers on its books. An important application of Machine Learning has been to identify the types of customer who are likely to stop buying a product or service. When you have data about customers and their behaviour (leaving or staying), you can predict whether a customer is a ‘flight risk’ for your business. From there, you can treat at-risk customers differently, with a customised approach intended to retain them for longer.

The following example uses fictitious data for a bank, which I generated, inspired by other datasets that are publicly available. The bank is trying to determine what characteristics a departing customer has and to develop a Machine Learning model that helps identify whether an individual customer will ‘churn’.

Because a model has been developed for this short project, we can simulate different types of customers using the form at the bottom of this page. The form sends the data to the Machine Learning model, which then ‘predicts’ whether the customer is likely to churn.



Performing Exploratory Data Analysis

To build a Machine Learning model correctly, it’s best to first conduct an exploratory data analysis (EDA) to determine which ‘features’ are suitable for training. The key here is to identify the main features of the customer dataset that influence whether a customer will churn (leave the bank). This is normally a long process performed in a Jupyter Notebook, so I have highlighted only the key findings below.

Kernel Density Estimation

We can try to find ‘hot spots’ in the data by using a Kernel Density Estimation (KDE) plot. When we visualise the density across two variables, we can see which combinations of customer characteristics indicate a high flight risk.

Age vs. Tenure

Balance at the Bank vs. Number of Products with the Bank

In the first chart above, we see two hot spots, indicating that people aged between 35 and 43 who have been with the bank for 1-2 years are likely to churn. The second chart indicates that customers who churn tend to hold only one product (likely a credit card), with varying balances.
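To make the idea of a ‘hot spot’ concrete, here is a minimal sketch of a two-dimensional Gaussian KDE written with only the Python standard library. It is not the code behind the charts above: the data is a synthetic stand-in generated inside the snippet, and the function name `gaussian_kde_2d`, the bandwidths and the grid are all assumptions chosen for illustration.

```python
import math
import random


def gaussian_kde_2d(points, bandwidths):
    """Return a function that estimates the density at (x, y).

    Uses a product of two Gaussian kernels with one fixed bandwidth per axis.
    """
    hx, hy = bandwidths
    norm = 1.0 / (len(points) * 2.0 * math.pi * hx * hy)

    def density(x, y):
        total = 0.0
        for px, py in points:
            zx = (x - px) / hx
            zy = (y - py) / hy
            total += math.exp(-0.5 * (zx * zx + zy * zy))
        return norm * total

    return density


# Synthetic (age, tenure) pairs standing in for churned customers.
# ASSUMPTION: this is illustrative data, not the bank dataset from this page.
rng = random.Random(0)
churned = [(rng.gauss(39, 3), rng.gauss(1.5, 0.5)) for _ in range(300)]
churned += [(rng.uniform(18, 80), rng.uniform(0, 10)) for _ in range(100)]

# Bandwidths are hand-picked assumptions (2 years of age, half a year of tenure).
density = gaussian_kde_2d(churned, bandwidths=(2.0, 0.5))

# Evaluate the density on an age x tenure grid and report the densest cell.
grid = [(age, half_years / 2) for age in range(18, 81) for half_years in range(0, 21)]
hot_age, hot_tenure = max(grid, key=lambda cell: density(*cell))
print(f"Densest region: age ~{hot_age}, tenure ~{hot_tenure} years")
```

A KDE plot draws this same density surface as shaded contours, so the darkest region of the chart corresponds to the grid cell found here.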

Do Customers who Churn have Credit Cards?

Are Customers who Churn Active?

The two charts above indicate that customers who have a credit card and are not actively using the bank's services are likely to churn.
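One simple way to check a pattern like this is to compare churn rates between groups. The sketch below does that for a handful of made-up rows; the column names `has_credit_card`, `is_active` and `churned`, and the helper `churn_rate_by`, are assumptions for illustration and not taken from the project's dataset.

```python
from collections import defaultdict


def churn_rate_by(rows, key):
    """Return {group value: share of customers in that group who churned}."""
    counts = defaultdict(lambda: [0, 0])  # group value -> [churned, total]
    for row in rows:
        bucket = counts[row[key]]
        bucket[0] += row["churned"]
        bucket[1] += 1
    return {value: churned / total for value, (churned, total) in counts.items()}


# Toy rows for illustration only (NOT the bank data from this page).
customers = [
    {"has_credit_card": 1, "is_active": 0, "churned": 1},
    {"has_credit_card": 1, "is_active": 0, "churned": 1},
    {"has_credit_card": 1, "is_active": 1, "churned": 0},
    {"has_credit_card": 1, "is_active": 1, "churned": 0},
    {"has_credit_card": 0, "is_active": 1, "churned": 0},
    {"has_credit_card": 0, "is_active": 0, "churned": 1},
    {"has_credit_card": 1, "is_active": 0, "churned": 0},
    {"has_credit_card": 0, "is_active": 1, "churned": 0},
]

print(churn_rate_by(customers, "is_active"))
print(churn_rate_by(customers, "has_credit_card"))
```

In these toy rows the inactive group churns far more often than the active group, which is the kind of gap the charts above are showing.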

Observations

A customer aged between 35 and 43, holding only a credit card and not actively using the bank's services, is likely at the bank only for a balance transfer. That is, they may have had no intention of staying with the bank from the start, and only wanted to 'park' some credit at a lower interest rate.

Because I used the 'Random Forest' Machine Learning algorithm, which combines many decision trees, I have an interesting option to view how one of those trees is formulated. Please note that this is a truncated visual: there are many more decisions made (indicated by the grey (...) boxes).

What does the Machine Learning Model think is Important?

The 'features' are the data columns that the Machine Learning model needs in order to make a prediction. We now know that the most important features are age (35-43), tenure (1-2 years) and number of products (1), which are the strongest predictors of whether a customer will churn. Of course, the prediction can also be swayed by the other important features, such as balance, estimated salary and whether the customer is actively using the bank's services.
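As a rough illustration of how a Random Forest reports feature importance, the sketch below trains scikit-learn's `RandomForestClassifier` on synthetic data and ranks its `feature_importances_`. It is not the model from this project. The labelling rule (churn is likely when age is 35-43 and the customer holds one product) is an assumption built into the fake data, so tenure and estimated salary act as noise here, and the hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 2000

# Synthetic customers (ASSUMPTION: illustrative only, not the bank dataset).
age = rng.integers(18, 71, size=n)
tenure = rng.integers(0, 11, size=n)
num_products = rng.choice([1, 2, 3, 4], size=n, p=[0.5, 0.4, 0.08, 0.02])
estimated_salary = rng.uniform(20_000, 150_000, size=n)

# Assumed labelling rule for the fake data: high churn probability inside the
# "aged 35-43 with a single product" segment, low probability elsewhere.
in_segment = (age >= 35) & (age <= 43) & (num_products == 1)
churned = rng.random(n) < np.where(in_segment, 0.9, 0.02)

feature_names = ["age", "tenure", "num_products", "estimated_salary"]
X = np.column_stack([age, tenure, num_products, estimated_salary])

# Shallow trees keep the forest from memorising the random noise.
model = RandomForestClassifier(n_estimators=100, max_depth=4, random_state=0)
model.fit(X, churned)

# Impurity-based importances: one value per feature, summing to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name:18s} {importance:.3f}")
```

The features used by the labelling rule should come out on top, while the noise columns receive only a small share. These importances are relative scores, not causal effects.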



The Machine Learning Model in Action

Based on the analysis above, we can now experiment with different features to see what the prediction will be. Please feel free to 'simulate' a customer's characteristics and determine whether the customer is likely to stay or leave the bank.
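For readers curious how a form submission might reach a model, here is a minimal sketch under stated assumptions. The field names, `FEATURE_ORDER`, `form_to_features` and `predict_churn` are hypothetical, and `StandInModel` is a hand-written placeholder with a scikit-learn style `predict` method. It only mimics the segment described above and is not the trained model behind this page.

```python
# Hypothetical column order; a real model needs the order it was trained with.
FEATURE_ORDER = [
    "age", "tenure", "balance", "num_products",
    "has_credit_card", "is_active", "estimated_salary",
]


def form_to_features(form):
    """Convert raw form strings into one numeric row in FEATURE_ORDER."""
    missing = [name for name in FEATURE_ORDER if name not in form]
    if missing:
        raise ValueError(f"missing form fields: {missing}")
    return [float(form[name]) for name in FEATURE_ORDER]


def predict_churn(model, form):
    """Return True if the model predicts that the customer will churn."""
    row = form_to_features(form)
    return bool(model.predict([row])[0])


class StandInModel:
    """Placeholder rule, NOT a trained model: flags the segment from the EDA."""

    def predict(self, rows):
        # Indices follow FEATURE_ORDER: 0 = age, 3 = num_products, 5 = is_active.
        return [int(35 <= r[0] <= 43 and r[3] == 1 and r[5] == 0) for r in rows]


submitted = {
    "age": "40", "tenure": "2", "balance": "12000", "num_products": "1",
    "has_credit_card": "1", "is_active": "0", "estimated_salary": "65000",
}
print("Likely to churn:", predict_churn(StandInModel(), submitted))
```

Keeping the column order in one place matters because a trained model has no way to detect that two numeric columns were swapped.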