Customer segmentation is not merely a tool in the dynamic world of contemporary business; it is the compass guiding success. It’s analogous to painting a masterpiece, with customers serving as the colorful palette. By effectively categorizing them based on shared characteristics and behaviors, you infuse your marketing with vitality. It is akin to composing personalized experiences for each individual consumer – a symphony of customization.
This procedure is a pursuit of connection, not just of efficiency. Imagine offering customized solutions to your most loyal customers and rekindling your relationship with those who have drifted away. This is your treasure map to uncharted markets, your GPS for data-driven decisions, and the formula for enduring relationships.
In essence, customer segmentation empowers businesses to adapt to changing market dynamics, boost revenue, and create more meaningful interactions with their diverse customer base.
Table of Contents:
1. What is RFM Analysis?
2. Understanding RFM: Recency, Frequency, and Monetary Value
3. The Power of Data in Customer Segmentation
4. Creating an RFM Score
5. Interpreting RFM Segments
6. Benefits and Challenges of RFM Segmentation
7. Conclusion: The Future of Customer Segmentation
RFM Analysis, short for Recency, Frequency, Monetary Analysis, is a powerful marketing and customer segmentation technique used by businesses to gain insights into their customer base. It involves the evaluation of three key dimensions:
- Recency (R) assesses how recently a customer has interacted with the business. Recent interactions earn higher recency scores.
- Frequency (F) measures how often a customer engages with the business. Frequent interactions lead to higher frequency scores.
- Monetary Value (M) quantifies the total amount a customer has spent on transactions. Larger spending results in higher monetary value scores.
Each dimension is assigned numerical scores, which are combined to create an RFM score for each customer. RFM Analysis enables businesses to:
- Identify high-value customers who are more likely to make repeat purchases.
- Tailor marketing strategies to specific customer segments.
- Re-engage with inactive customers by targeting them with relevant offers.
- Optimize resource allocation by focusing marketing efforts on segments with the highest potential ROI.
- Gain insights into customer behavior and preferences.
By leveraging RFM Analysis, businesses can make data-driven decisions to enhance customer relationships, increase revenue, and improve overall marketing effectiveness.
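To make the three dimensions concrete before touching the real dataset, here is a minimal sketch on a made-up table of purchases. The column names (`customer`, `date`, `amount`) and all values are assumptions chosen for illustration, and the latest purchase date in the table stands in for "today".

```python
import pandas as pd

# Hypothetical toy transactions: one row per purchase
tx = pd.DataFrame({
    "customer": ["A", "A", "B", "C", "C", "C"],
    "date": pd.to_datetime([
        "2024-01-05", "2024-03-01", "2023-11-20",
        "2024-02-10", "2024-02-25", "2024-03-10",
    ]),
    "amount": [20.0, 35.0, 200.0, 10.0, 15.0, 12.0],
})

# Reference date for recency: here, the latest purchase in the data
snapshot = tx["date"].max()

rfm_toy = tx.groupby("customer").agg(
    last_purchase=("date", "max"),   # most recent purchase per customer
    frequency=("date", "count"),     # number of purchases
    monetary=("amount", "sum"),      # total spend
)
# Recency in days: smaller means more recent
rfm_toy["recency"] = (snapshot - rfm_toy["last_purchase"]).dt.days
print(rfm_toy[["recency", "frequency", "monetary"]])
```

In this toy table, customer C bought most recently and most often, while customer B spent the most but has not bought for 111 days. RFM scoring is a way of ranking customers along each of these three axes.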
Let’s work through an example and build up the concepts step by step. An e-commerce business wants to divide its customers into segments and shape its marketing strategy around them. To do this, we will define the buying behaviors of its customers and group the customers according to those behaviors. We will code in Python and emphasize the concepts throughout the article. You are welcome to explore the data further or develop your own features.
Let’s import the Python libraries required for data manipulation.
#Data manipulation and linear algebra
import pandas as pd
import numpy as np
#import visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
#for datetime operations
import datetime as dt
# Settings
import warnings
Let’s read the dataset CSV file into a dataframe and perform some basic functions to understand the data and features in the dataset.
#read the csv file and store in the df
df = pd.read_csv('online_retail_II.csv')
#take a look at first 5 rows in dataset
df.head()
#take a look at features in dataset
df.info()
Let’s look at the features available in the dataset and understand what they mean.
Invoice number (Invoice): A unique number is assigned to each transaction, or invoice. If this code begins with the letter C, the transaction has been canceled.
StockCode: Identifier unique to each product.
Description: Name of the product
Quantity: It specifies the number of invoiced items that have been sold.
InvoiceDate: Date and time of the invoice.
Price: Price of a single unit.
Customer ID: Unique customer number
Country: Country designation. The country in which the customer resides.
We will do some basic data cleaning here. You can perform additional steps and come up with a more granular version of the data. It’s completely your choice.
# Drop the records that have no Customer ID
df = df[df['Customer ID'].notna()].copy()
# Drop the records whose Price is zero or negative
index_name = df[df['Price'] <= 0].index
df.drop(index_name, inplace = True)
# Drop the records with zero or negative Quantity; this is how cancelled invoices (Invoice starts with C) are removed here
index_name = df[df['Quantity'] <= 0].index
df.drop(index_name, inplace = True)
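The feature list says that cancelled transactions have an invoice code starting with C, while the cleaning step relies on the quantity being zero or negative. If you prefer to filter on the invoice code explicitly, a minimal sketch could look like the following. The sample rows are made up, and the column names are assumed to match the dataset.

```python
import pandas as pd

# Hypothetical rows mimicking the dataset's columns
sample = pd.DataFrame({
    "Invoice": ["489434", "C489449", "489435", 489436],
    "Quantity": [12, -1, 3, 5],
    "Price": [6.95, 2.10, 1.25, 0.85],
    "Customer ID": [13085.0, 16321.0, None, 13078.0],
})

# Invoice can be read as mixed types, so cast to str before checking the prefix
is_cancelled = sample["Invoice"].astype(str).str.startswith("C")

# Keep rows that are not cancelled, have a customer, and have positive quantity and price
cleaned = sample[
    ~is_cancelled
    & sample["Customer ID"].notna()
    & (sample["Quantity"] > 0)
    & (sample["Price"] > 0)
].copy()
print(cleaned)
```

Combining the conditions into one boolean mask also avoids the repeated drop calls, at the cost of a slightly longer expression.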
After the data cleaning, let’s determine the recency value of each customer. To do so, we set the date of the most recent invoice in the dataset as the current date and subtract the date of each customer’s most recent purchase from it.
# Customer ID needs to be an integer for the following commands
df["Customer ID"] = df["Customer ID"].astype(int)
# InvoiceDate needs to be a datetime for the following commands
df["InvoiceDate"] = pd.to_datetime(df["InvoiceDate"])
# Check the last invoice date in the dataset
df["InvoiceDate"].max()
# The last invoice timestamp is assigned to today_date; taking it from the data keeps recency non-negative for purchases made on the last day
today_date = df["InvoiceDate"].max()
# Group the last invoice dates by Customer ID, subtract them from today_date, and assign the result as recency
recency = (today_date - df.groupby("Customer ID").agg({"InvoiceDate":"max"}))
# Rename column name as Recency
recency.rename(columns = {"InvoiceDate":"Recency"}, inplace = True)
# Change the values to day format
recency_df = recency["Recency"].apply(lambda x: x.days).reset_index()
recency_df.head()
To determine the frequency value for each customer, we count the number of times they made purchases.
# Counting the unique invoice dates per Customer ID and assigning the result to freq_df
freq_df = df.groupby("Customer ID").agg({"InvoiceDate":"nunique"}).reset_index()
# Rename column name as Frequency
freq_df.rename(columns={"InvoiceDate": "Frequency"}, inplace=True)
freq_df.head()
In order to calculate the monetary worth of each customer, we first need to figure out how much money each customer spends on their individual purchases.
# Multiplying the prices and quantities of purchased products and assigning them to the TotalPrice variable
df["TotalPrice"] = df["Quantity"] * df["Price"]
# Grouping and summing the total prices by Customer ID
monetary_df = df.groupby("Customer ID").agg({"TotalPrice":"sum"}).reset_index()
# Rename Total Price column as Monetary
monetary_df.rename(columns={"TotalPrice":"Monetary"}, inplace=True)
monetary_df.head()
Let’s merge recency_df, freq_df and monetary_df into a single dataframe for the operations that follow.
rfm = pd.merge(recency_df, freq_df, on='Customer ID')
rfm = pd.merge(rfm, monetary_df, on='Customer ID')
rfm.head()
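`pd.merge` performs an inner join by default, so a customer missing from any one table would silently disappear from the result. As a hedged sketch of how to guard against that, the snippet below merges three made-up tables shaped like recency_df, freq_df and monetary_df, using the `validate` argument of `merge` and a row-count check.

```python
import pandas as pd

# Hypothetical per-customer tables shaped like recency_df, freq_df and monetary_df
recency_toy = pd.DataFrame({"Customer ID": [1, 2, 3], "Recency": [5, 40, 200]})
freq_toy = pd.DataFrame({"Customer ID": [1, 2, 3], "Frequency": [7, 2, 1]})
monetary_toy = pd.DataFrame({"Customer ID": [1, 2, 3], "Monetary": [950.0, 120.5, 30.0]})

# validate="one_to_one" raises pandas.errors.MergeError if a customer ID repeats on either side
rfm_toy = (
    recency_toy
    .merge(freq_toy, on="Customer ID", how="inner", validate="one_to_one")
    .merge(monetary_toy, on="Customer ID", how="inner", validate="one_to_one")
)

# An inner join drops unmatched customers, so compare row counts
assert len(rfm_toy) == len(recency_toy)
print(rfm_toy)
```

In this walkthrough all three tables come from the same cleaned dataframe, so the check should pass; it becomes more useful once the metrics are built from different sources.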
Let’s pause for a minute and look at the role that data plays in RFM analysis and how AI and data analytics can improve it.
Role of Data in RFM Analysis:
- Data Collection: RFM analysis begins with collecting customer data, including purchase history, transaction details, and customer identifiers.
- Segmentation Basis: Data serves as the foundation for customer segmentation in RFM analysis. Recency, Frequency, and Monetary Value are derived directly from this data, providing insights into customer behavior.
- Score Calculation: Data is used to calculate RFM scores for each customer. These scores are determined by assigning weights to the three dimensions based on business objectives, allowing for the categorization of customers into segments.
- Segment Interpretation: The segments generated through RFM analysis are data-driven and offer valuable insights into customer behavior, preferences, and value to the business. This information informs marketing strategies, product offerings, and customer engagement tactics.
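The Score Calculation point mentions weighting the three dimensions according to business objectives. One possible way to do that is a weighted sum of the individual scores. The weights and sample scores below are made up for illustration; they are not taken from the dataset or prescribed by RFM itself.

```python
import pandas as pd

# Hypothetical 1-5 scores for three customers
scores = pd.DataFrame({
    "Recency_Score": [5, 3, 1],
    "Frequency_Score": [4, 3, 1],
    "Monetary_Score": [5, 2, 1],
}, index=["cust_a", "cust_b", "cust_c"])

# Hypothetical weights; they sum to 1 and should reflect business priorities
weights = {"Recency_Score": 0.5, "Frequency_Score": 0.3, "Monetary_Score": 0.2}

# Weighted sum of the three scores, staying on the same 1-5 scale
scores["Weighted_RFM"] = sum(scores[col] * w for col, w in weights.items())
print(scores)
```

A business that cares most about reactivation might weight recency more heavily, as here, while one focused on revenue might shift weight toward the monetary score.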
Enhancements through AI and Data Analytics:
- Advanced Data Processing: AI and data analytics tools efficiently handle large datasets, ensuring data accuracy and reliability through cleaning and preprocessing.
- Segmentation Algorithms: Machine learning algorithms automate the segmentation process, identifying complex patterns to create precise customer segments. This reduces human bias and enhances accuracy.
- Predictive Analytics: AI predicts future customer behavior based on historical data, allowing businesses to proactively target customers with personalized offers, increasing the likelihood of conversions.
- Recommendation Engines: AI-driven recommendation systems analyze customer behavior to suggest products or services, boosting cross-selling and upselling opportunities.
- Real-time Insights: AI and data analytics offer real-time insights into customer behavior, enabling agile marketing strategies and immediate adjustments based on changing customer preferences.
- Personalization: AI tailors marketing messages, content, and product recommendations to individual customers, increasing engagement and conversion rates.
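To give a flavor of what algorithmic segmentation can mean, here is a minimal sketch of k-means clustering on RFM values. K-means is just one possible algorithm and is my choice for illustration; the implementation is written in plain NumPy to stay self-contained, and the six customers are invented. In practice you would more likely reach for a library implementation such as scikit-learn's `KMeans`.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance of every point to every centroid, shape (n_points, k)
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster ends up empty
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical RFM values: recency (days), frequency (orders), monetary (spend)
rfm_values = np.array([
    [3, 20, 2500.0], [5, 18, 2200.0], [7, 22, 2700.0],   # recent, frequent, high spend
    [200, 1, 40.0], [250, 2, 60.0], [300, 1, 35.0],      # lapsed, rare, low spend
])

# Log-transform and standardize so no single dimension dominates the distance
logged = np.log1p(rfm_values)
scaled = (logged - logged.mean(axis=0)) / logged.std(axis=0)

labels, _ = kmeans(scaled, k=2)
print(labels)
```

The first three customers end up in one cluster and the last three in the other. Unlike the rule-based scoring used in this article, the clusters come without names, so interpreting them is still a human task.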
Let’s make an RFM score, shall we? To do this, we split the recency, frequency, and monetary values into five categories each, with 1 being the lowest possible score and 5 the highest. Note that recency is scored in reverse: the most recent customers (those with the lowest recency values) get a 5.
# Dividing the recency values into recency scores such that the lowest recency value gets 5 and the highest gets 1
rfm["Recency_Score"] = pd.qcut(rfm["Recency"], 5, labels = [5, 4 , 3, 2, 1])
# Dividing the frequency values into frequency scores such that the lowest frequency value gets 1 and the highest gets 5
rfm["Frequency_Score"]= pd.qcut(rfm["Frequency"].rank(method="first"),5, labels=[1,2,3,4,5])
# Dividing the monetary values into monetary scores such that the lowest monetary value as 1 and the highest as 5
rfm["Monetary_Score"] = pd.qcut(rfm['Monetary'], 5, labels = [1, 2, 3, 4, 5])
rfm.head()
# Combining Recency, Frequency, and Monetary Scores in a string format
rfm["RFM_SCORE"] = (rfm['Recency_Score'].astype(str) +
                    rfm['Frequency_Score'].astype(str) +
                    rfm['Monetary_Score'].astype(str))
# Customers with best scores
rfm[rfm["RFM_SCORE"]=="555"].head()
# Customers with worst scores
rfm[rfm["RFM_SCORE"]=="111"].head()
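You may have wondered why the frequency score is computed on `rank(method="first")` rather than on the raw values. The short sketch below, on made-up frequencies, shows the likely reason: when many customers share the same value, the quantile edges collide and `pd.qcut` refuses to build the bins.

```python
import pandas as pd

# Hypothetical frequencies where most customers bought only once
frequency = pd.Series([1, 1, 1, 1, 1, 1, 2, 2, 3, 10])

try:
    pd.qcut(frequency, 5, labels=[1, 2, 3, 4, 5])
except ValueError as err:
    # Several quantile edges equal 1, so the bin edges are not unique
    print("qcut failed:", err)

# Ranking with method="first" breaks ties by order of appearance,
# giving ten distinct values that split cleanly into five bins of two
ranked = frequency.rank(method="first")
scores = pd.qcut(ranked, 5, labels=[1, 2, 3, 4, 5])
print(scores.tolist())  # [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
```

The trade-off is visible in the output: customers with the same frequency of 1 receive scores of 1, 2 and 3 depending only on their position in the table. That is acceptable for a rough segmentation, but worth keeping in mind when interpreting borderline customers.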
I hope everything has been clear so far. We now have an RFM score for each customer and want to turn those scores into segments. For that, we need definitions for all the segments.
1. Hibernating: Inactive customers who have not made recent purchases and require re-engagement efforts.
2. At Risk: Customers who were once active but have shown a decline in engagement, signaling a need for retention strategies.
3. Can’t Lose: Highly valuable and loyal customers who contribute significantly to revenue and should be carefully retained.
4. About to Sleep: Previously active customers displaying signs of decreased engagement, requiring proactive measures to prevent churn.
5. Need Attention: Customers with sporadic activity or declining purchase behavior, indicating a need for reactivation and attention.
6. Loyal Customers: Dedicated, repeat buyers who consistently contribute to revenue and brand advocacy.
7. Promising: Customers showing potential for increased engagement and loyalty with the right nurturing strategies.
8. New Customers: Recently acquired customers who are in the early stages of their relationship with the brand.
9. Potential Loyalists: Customers with positive but not yet consistent engagement, representing an opportunity for growth in loyalty.
10. Champions: Top-tier customers who consistently engage, spend, and advocate for the brand, making them valuable assets.
These definitions help clarify the characteristics and needs of each customer segment, guiding marketing and retention efforts effectively.
Next, we will look at the code that maps scores to segments. The mapping is built from the recency and frequency scores only. It is also possible to include the monetary score in the mapping; that’s an assignment for you.
seg_map = {
r'[1-2][1-2]': 'Hibernating',
r'[1-2][3-4]': 'At Risk',
r'[1-2]5': "Can't Lose",
r'3[1-2]': 'About to Sleep',
r'33': 'Need Attention',
r'[3-4][4-5]': 'Loyal Customers',
r'41': 'Promising',
r'51': 'New Customers',
r'[4-5][2-3]': 'Potential Loyalists',
r'5[4-5]': 'Champions'
}
# Recency and Frequency scores are turned into string format, combined and assigned to Segment
rfm['Segment'] = rfm['Recency_Score'].astype(str) + rfm['Frequency_Score'].astype(str)
# Segments are replaced with the definitions in seg_map
rfm['Segment'] = rfm['Segment'].replace(seg_map, regex=True)
rfm.head()
# Mean, median, count statistics of different segments
rfm[["Segment","Recency","Frequency", "Monetary"]].groupby("Segment").agg(["mean","median","count"])
From the output of the last command, you can see which segment contains the most customers and which segments we need to focus on so that we do not lose those customers.
Different customer segments lend themselves to different marketing approaches. Here, I will concentrate on a few segments and suggest effective strategies for retaining current customers and growing the business.
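Since the segment map is a set of regular expressions, it is worth checking that every combination of recency and frequency score lands in exactly one segment. The sketch below repeats the map so that it runs on its own; the helper name `segment_for` is made up for this check.

```python
import re
from itertools import product

seg_map = {
    r'[1-2][1-2]': 'Hibernating',
    r'[1-2][3-4]': 'At Risk',
    r'[1-2]5': "Can't Lose",
    r'3[1-2]': 'About to Sleep',
    r'33': 'Need Attention',
    r'[3-4][4-5]': 'Loyal Customers',
    r'41': 'Promising',
    r'51': 'New Customers',
    r'[4-5][2-3]': 'Potential Loyalists',
    r'5[4-5]': 'Champions',
}

def segment_for(r_score, f_score):
    """Return the names of all segments whose pattern matches the two-digit R/F code."""
    code = f"{r_score}{f_score}"
    return [name for pattern, name in seg_map.items() if re.fullmatch(pattern, code)]

# All 25 combinations of recency score (1-5) and frequency score (1-5)
coverage = {(r, f): segment_for(r, f) for r, f in product(range(1, 6), repeat=2)}

# Every combination should match exactly one segment
assert all(len(names) == 1 for names in coverage.values())
print(coverage[(5, 5)], coverage[(1, 1)], coverage[(3, 3)])
```

A check like this is handy if you take up the assignment of adding the monetary score, because overlapping or missing patterns are easy to introduce once the codes have three digits.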
1. Hibernating Customers:
a. Reactivation Campaigns: Send targeted email campaigns with special offers, discounts, or incentives to entice hibernating customers to make a purchase and re-engage with your brand.
b. Personalized Recommendations: Use data-driven recommendations based on past purchases to suggest relevant products, rekindling their interest in your offerings.
c. Exclusive Loyalty Program: Offer hibernating customers exclusive access to a loyalty program with rewards and benefits to encourage repeat purchases.
2. At Risk Customers:
a. Win-Back Campaigns: Create win-back email series with compelling messaging and time-sensitive offers to re-attract at-risk customers who may have started disengaging.
b. Feedback Surveys: Solicit feedback to understand their concerns and preferences, then tailor your products or services accordingly to address their needs.
c. Personalized Engagement: Develop personalized content and communications that highlight new features or updates that cater to their interests and pain points.
3. Can’t Lose Customers:
a. VIP Treatment: Extend special privileges such as early access to new products, exclusive events, or premium customer support to solidify their loyalty.
b. Tailored Cross-Selling: Recommend complementary products or services based on their past purchases to increase their lifetime value.
c. Satisfaction Surveys: Regularly gather feedback to ensure their needs are met and address any concerns promptly, reinforcing their trust in your brand.
4. About to Sleep Customers:
a. Re-Engagement Campaigns: Create automated re-engagement email series with personalized content and limited-time promotions to rekindle their interest.
b. Product Updates: Inform them about new product releases or enhancements that align with their previous purchases or preferences.
c. Customized Offers: Send targeted offers based on their browsing history and past interactions to reignite their enthusiasm.
5. Need Attention Customers:
a. Drip Email Campaigns: Implement drip email campaigns to provide valuable content, educate them on your products or services, and nurture their interest.
b. Customer Support: Offer proactive customer support to address their concerns promptly, building trust and loyalty.
c. Loyalty Incentives: Introduce loyalty programs or tiered rewards to incentivize more frequent and higher-value purchases.
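To act on any of these strategies, you first need the list of customers in the relevant segments. As a rough sketch, the snippet below pulls a win-back list from a made-up slice of the segmented table; the customer IDs and amounts are invented, and the choice of segments is only an example.

```python
import pandas as pd

# Hypothetical slice of the segmented table
rfm_toy = pd.DataFrame({
    "Customer ID": [12346, 12347, 12348, 12349, 12350],
    "Monetary": [310.0, 4200.5, 95.0, 1500.0, 60.0],
    "Segment": ["At Risk", "Champions", "Hibernating", "At Risk", "Hibernating"],
})

# Customers to include in a win-back campaign, highest spenders first
target_segments = ["At Risk", "Hibernating"]
campaign = (
    rfm_toy[rfm_toy["Segment"].isin(target_segments)]
    .sort_values("Monetary", ascending=False)
    .loc[:, ["Customer ID", "Segment", "Monetary"]]
)
print(campaign)
```

Sorting by monetary value is one way to prioritize outreach when the budget does not cover the whole segment.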
Benefits:
- Personalization: RFM segmentation enables personalized marketing, increasing the relevance of offers and messages to specific customer segments.
- Resource Allocation: It optimizes resource allocation by focusing marketing efforts and budgets on high-value segments, leading to a higher ROI.
- Customer Retention: Identifies at-risk segments, allowing proactive retention strategies and improved customer satisfaction.
- Improved Targeting: Customized campaigns resonate better with segments, resulting in higher response and conversion rates.
- Market Expansion: Uncovers untapped markets by identifying new segments with unmet needs, potentially expanding the customer base.
- Data-Driven Decisions: RFM relies on data analytics, ensuring that marketing strategies are grounded in empirical evidence.
- Competitive Advantage: Businesses gain an edge by adapting to changing market dynamics and customer feedback more effectively than competitors.
Challenges:
- Data Quality: Accurate and complete customer data is crucial for RFM analysis; poor data quality can lead to inaccurate insights.
- Segment Size Variability: Varying segment sizes make it challenging to allocate resources proportionately.
- Privacy and Compliance: RFM analysis must comply with privacy regulations, necessitating data security and permissions.
- Omnichannel Consideration: Customer interactions occur across multiple channels, requiring comprehensive analysis beyond individual channels.
As businesses navigate the dynamic landscape of data-driven marketing, the future of customer segmentation shines brighter than ever. RFM segmentation, with its data-centric approach, is poised to play a pivotal role in shaping personalized customer experiences. With the advent of advanced analytics and AI-driven insights, companies can expect even more precise and real-time segmentations. The future promises deeper integration of omnichannel data, enabling a holistic view of customer behavior. As privacy concerns evolve, ethical and transparent data practices will be essential. In this era of hyper-personalization, RFM segmentation remains a cornerstone, empowering businesses to forge lasting customer relationships and drive sustained growth.
Thank you!
You can download the dataset here.