Machine Learning News Hubb

Sales prediction using deep learning — Rossmann pharmaceuticals | by Amanuel Zewdu | Sep, 2022

admin by admin
September 10, 2022
in Machine Learning


Rossmann headquarters in Burgwedel near Hanover

The finance team at Rossmann Pharmaceuticals wants to forecast sales in all of its stores, across several cities, six weeks ahead of time. Managers in individual stores currently rely on their years of experience and their personal judgement to forecast sales.

However, the data team identified factors such as promotions, competition, school and state holidays, seasonality and locality as necessary for predicting sales across the various stores. In this project, we built and served an end-to-end product that delivers predictions to analysts in the finance team.

The data sets with a sufficient description of the features can be found here.

Data cleaning

EDA is the most important analysis for understanding the data, but a good EDA requires clean data first. Cleaning involves building pipelines to detect and handle outliers and missing data. This is particularly important because outliers and missing values would otherwise skew the analysis.

As shown in the pictures below, after reading the given datasets we looked for columns with missing values and computed the percentage missing in each. Missing values were then filled with the respective column’s median. Next, we detected outliers and replaced them. Finally, we merged the store dataset, which contains important information about individual stores, with the training and testing datasets.

Outlier detection
Fixing outliers
Merging and saving cleaned data
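
The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration on toy data, not the project's actual pipeline: the column names (`Store`, `Sales`, `CompetitionDistance`) are assumed from the public Rossmann dataset, and because the article does not say how outliers were replaced, clipping to the IQR fences is used here as one common choice.

```python
import numpy as np
import pandas as pd


def missing_percentage(df: pd.DataFrame) -> pd.Series:
    """Percentage of missing values in each column."""
    return df.isna().mean() * 100


def fill_with_median(df: pd.DataFrame, columns) -> pd.DataFrame:
    """Fill missing values in the given columns with each column's median."""
    out = df.copy()
    for col in columns:
        out[col] = out[col].fillna(out[col].median())
    return out


def cap_outliers_iqr(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to those bounds (assumed strategy)."""
    out = df.copy()
    q1, q3 = out[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    out[column] = out[column].clip(lower=q1 - k * iqr, upper=q3 + k * iqr)
    return out


# Toy stand-ins for the store and training datasets.
store = pd.DataFrame({
    "Store": [1, 2, 3, 4, 5],
    "CompetitionDistance": [100.0, np.nan, 300.0, 200.0, 50000.0],
})
train = pd.DataFrame({
    "Store": [1, 2, 3, 4, 5, 1],
    "Sales": [5000, 6000, 5500, 5200, 5800, 5100],
})

pct_missing = missing_percentage(store)
store_clean = fill_with_median(store, ["CompetitionDistance"])
store_clean = cap_outliers_iqr(store_clean, "CompetitionDistance")

# Attach the per-store information to every training row.
merged = train.merge(store_clean, on="Store", how="left")
```

A left merge keeps every training row even if a store were missing from the store table, which makes such gaps visible as NaN values rather than silently dropping rows.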

EDA — Customer purchasing behavior

Exploratory data analysis is the lifeblood of every meaningful machine learning project. It helps us unravel the nature of the data and sometimes informs how we go about modelling. A careful exploration of the data involves checking all available features, their interactions and correlations, and their variability with respect to the target.

In the data exploration phase, we checked the distribution of promotions, compared sales behavior and looked for seasonal purchasing patterns. We also found a high correlation between sales and the number of customers. The effects of assortment type and competition distance were analyzed as depicted in the pictures below.

Promotion distribution for train dataset
Correlation — sales and number of customers
Sales per month per customers
Type of store affects sales
Correlation among all features
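
Checks like the ones listed above can be reproduced with a few pandas calls. The snippet below is only a sketch on made-up numbers; the column names `Sales`, `Customers` and `Promo` are assumed from the public Rossmann dataset.

```python
import pandas as pd

# Made-up rows standing in for the merged training data.
df = pd.DataFrame({
    "Sales": [5000, 6200, 4100, 7300, 5600],
    "Customers": [500, 640, 420, 710, 560],
    "Promo": [0, 1, 0, 1, 0],
})

# Pearson correlation between sales and number of customers.
corr_sales_customers = df["Sales"].corr(df["Customers"])

# Correlation among all numeric features (the input for a heatmap).
corr_matrix = df.corr()

# Average sales with and without a promotion running.
promo_mean_sales = df.groupby("Promo")["Sales"].mean()
```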

Predicting store sales with both machine learning and deep learning approaches was the central task of this project. We want to predict daily sales in various stores up to six weeks ahead of time. To do this effectively, we first preprocessed the data into a format that can be fed into a machine learning model. We also generated essential new features such as ‘season’ and ‘week day’, and scaled the data. The output of this preprocessing is shown below.

Model ready data
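
As a rough sketch of this preprocessing, the snippet below derives a week day and a season from the date and then scales the numeric columns. The month-to-season mapping, the choice of `StandardScaler`, and the column names are all assumptions made for illustration; the article does not state which scaler or which columns were used.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Assumed mapping: meteorological seasons, northern hemisphere.
MONTH_TO_SEASON = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def add_date_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add 'WeekDay' (Monday=0) and 'Season' columns derived from 'Date'."""
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    out["WeekDay"] = out["Date"].dt.dayofweek
    out["Season"] = out["Date"].dt.month.map(MONTH_TO_SEASON)
    return out


raw = pd.DataFrame({
    "Date": ["2015-01-05", "2015-04-15", "2015-07-31", "2015-10-10"],
    "Sales": [5000.0, 6200.0, 7300.0, 4100.0],
    "Customers": [500.0, 640.0, 710.0, 420.0],
})
featured = add_date_features(raw)

# Scale numeric columns to zero mean and unit variance.
numeric_cols = ["Sales", "Customers"]
scaler = StandardScaler()
featured[numeric_cols] = scaler.fit_transform(featured[numeric_cols])
```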

Building models with sklearn pipelines

A reasonable starting point is a random forest regressor built inside an sklearn pipeline. Pipelines make modeling modular and more reproducible. Working with pipelines also significantly reduces the workload when moving a setup into files for the next part of the project.

The next step is choosing loss functions. Loss functions indicate how well a model is performing, so the choice of loss function affects the overall sales predictions. Different loss functions have different use cases. For this project we used RMSE (root mean squared error), MAE (mean absolute error) and R-squared scores.
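
A minimal sketch of such a pipeline is shown below. It runs on synthetic data, and the step names and hyperparameters (`n_estimators=50`, the 80/20 split) are placeholders rather than the settings used in the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features and target standing in for the model-ready data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scaling and the regressor travel together as one estimator.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", RandomForestRegressor(n_estimators=50, random_state=0)),
])
pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)
```

Because the scaler is fitted inside the pipeline, it only ever sees the training split, and the same fitted object can later be serialized and served as a single unit.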

Evaluation metrics

* An RMSE of 0.53 suggests that the predictions are fairly concentrated around the line of best fit, so the model predicts the data with reasonable accuracy.

* An adjusted R-squared of 0.72, which is above the value we treated as acceptable (0.4), also indicates good accuracy.
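
For reference, the metrics mentioned above can be computed as in the sketch below. The input values are made up and do not reproduce the reported scores; `n_features` is the number of predictors, which the adjusted R-squared uses to penalise model size.

```python
import numpy as np


def regression_metrics(y_true, y_pred, n_features):
    """Return RMSE, MAE, R-squared and adjusted R-squared as a dict."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    rmse = np.sqrt(np.mean(errors ** 2))
    mae = np.mean(np.abs(errors))

    ss_res = np.sum(errors ** 2)                      # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    n = len(y_true)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    return {"rmse": rmse, "mae": mae, "r2": r2, "adj_r2": adj_r2}


metrics = regression_metrics(
    y_true=[3.0, 5.0, 2.5, 7.0],
    y_pred=[2.5, 5.0, 3.0, 8.0],
    n_features=1,
)
```

Note that RMSE and MAE are expressed in the units of the target, so they are only comparable between models evaluated on the same (scaled or unscaled) target.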

Finally, we carried out some post-prediction analysis and saved the serialized model under a file name that includes the timestamp.

Feature importance
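
Saving a model under a timestamped name could look like the sketch below. It uses `pickle` from the standard library; the file name pattern, the helper names `save_model` and `load_model`, and the dictionary used as a stand-in model are all hypothetical, since the article does not show the serialization code.

```python
import pickle
import tempfile
from datetime import datetime
from pathlib import Path


def save_model(model, directory, prefix="model"):
    """Serialize the model to '<prefix>-<timestamp>.pkl' and return the path."""
    timestamp = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
    path = Path(directory) / f"{prefix}-{timestamp}.pkl"
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path


def load_model(path):
    """Load a model previously written by save_model."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in object; a fitted sklearn pipeline would be saved the same way.
model = {"weights": [0.1, 0.2]}

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_model(model, tmp)
    saved_name = saved_path.name
    restored = load_model(saved_path)
```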

Deep learning techniques can be used to predict various outcomes, including but not limited to future sales. In this project we created a deep learning model based on LSTM (Long Short-Term Memory), which is a type of recurrent neural network (RNN). An RNN is commonly defined as a class of artificial neural networks where connections between nodes can create a cycle, allowing output from some nodes to affect subsequent input to the same nodes.

To build an LSTM regression model and predict the next sale, the following tasks are necessary.

1. Isolation into time series data and checking whether the data is stationary or not: using the ltsm_helper script prepared earlier, the data was prepared for modeling and scaled. The line plot of sales vs date showed that the data is stationary.

Preparation

2. Transformation of the time series data into supervised learning data by creating a new y (target) column using a sliding window over the time series.

Sales counts
Batch size and epochs
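
The sliding-window transformation in step 2 can be sketched as follows. The helper name `make_supervised` is hypothetical, and a window of 3 is used only to keep the example readable.

```python
import numpy as np


def make_supervised(series, window):
    """Turn a 1-D series into (X, y): each row of X holds `window`
    consecutive values and y holds the value that follows them."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y


sales = np.arange(10, dtype=float)   # toy series: 0, 1, ..., 9
X, y = make_supervised(sales, window=3)
```

Each target is aligned with the window that precedes it, so the first row pairs the values 0, 1, 2 with the target 3.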

3. Preparing the model and predicting one step ahead (the next sale)

The model was trained using the epoch setting shown above, and we created a method to reset all of the weights in case we want to re-train with different parameters. The history object stores the model loss for each epoch, which can be plotted to evaluate whether an adjustment is needed in the training process. The LSTM model was initialized with a window size of 45 and an 80/20 train/test split.

loss for the history object

4. Forecasting

This is the final step, where we predicted the next sale as presented in the following plot.

prediction
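
Putting the steps above together, a small Keras sketch might look like this. Only the window size of 45 and the 80/20 split come from the text; the layer sizes, `EPOCHS`, `BATCH_SIZE`, the synthetic series and the helper names are assumptions chosen so that the example runs quickly, not the project's actual configuration.

```python
import numpy as np
from tensorflow import keras

WINDOW = 45        # window size stated in the text
EPOCHS = 2         # hypothetical, kept tiny so the sketch runs quickly
BATCH_SIZE = 32    # hypothetical


def make_supervised(series, window):
    """Sliding window: `window` past values as input, the next value as target."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    return X, series[window:]


def build_model(window):
    """Return a freshly initialised LSTM regressor.
    Calling this again is a simple way to reset all weights before re-training."""
    model = keras.Sequential([
        keras.layers.Input(shape=(window, 1)),
        keras.layers.LSTM(16),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model


# Synthetic, already scaled daily sales with a weekly pattern.
t = np.arange(400)
sales = 0.5 + 0.4 * np.sin(2 * np.pi * t / 7)

X, y = make_supervised(sales, WINDOW)
X = X[..., np.newaxis]                 # (samples, timesteps, features)

split = int(0.8 * len(X))              # chronological 80/20 split
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = build_model(WINDOW)
history = model.fit(
    X_train, y_train,
    epochs=EPOCHS, batch_size=BATCH_SIZE,
    validation_data=(X_test, y_test),
    shuffle=False, verbose=0,
)

# history.history["loss"] holds one loss value per epoch and can be plotted.
# One-step-ahead forecast from the most recent window.
last_window = sales[-WINDOW:].reshape(1, WINDOW, 1)
next_sale = float(model.predict(last_window, verbose=0)[0, 0])
```

The split is chronological rather than random so that the test windows always come after the training windows in time.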

git clone https://github.com/Amanuel3065/pharmaceutical_sales_prediction.git

cd pharmaceutical_sales_prediction

sudo python3 setup.py install

https://amanuel3065-pharmaceutical-sales-prediction-app-0ldb9f.streamlitapp.com


