Machine Learning News Hubb

Pre-Approval Model — AI Explainability | by RG | Sep, 2022

by admin
September 11, 2022
in Machine Learning


The objective of this exercise is to develop a deep learning model (a multi-layer perceptron) that predicts the probability that a customer will default on a loan. A pre-approved loan means that the lender has already evaluated the applicant's financial standing and credit history, so the processing time is short and the disbursal is quick.

There are 3 stages in model explainability:

· Feature selection (Section 2)

· Model selection (Section 3)

· Explainable output (Section 4)

There are 68,294 observations and 10 columns (please refer to Appendix A): 4 categorical variables, 5 numerical variables, and 1 dependent variable.

The model is developed on 68,294 observations, of which 61,770 (90.4%) are non-defaults and 6,524 (9.6%) are defaults.

Categories with similar default rates are combined. The chart below shows the default rate for the combined categories.

Decision trees are used to bin the numerical variables (the maximum tree depth is 3 and the minimum number of samples per leaf is 1,360). Bins with similar default rates are then combined. The chart below shows the default rate for the combined bins.
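A minimal sketch of tree-based binning for a single numerical variable is given here, assuming scikit-learn's DecisionTreeClassifier with the depth and leaf-size settings quoted above. The helper name tree_bin_edges and the synthetic score data are illustrative assumptions, not the article's code or data.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tree_bin_edges(x, y, max_depth=3, min_samples_leaf=1360):
    """Return the sorted split thresholds a shallow tree learns on one feature."""
    tree = DecisionTreeClassifier(max_depth=max_depth,
                                  min_samples_leaf=min_samples_leaf,
                                  random_state=0)
    tree.fit(np.asarray(x, dtype=float).reshape(-1, 1), y)
    # Internal nodes have a feature index >= 0; leaves are marked with -2.
    return np.sort(tree.tree_.threshold[tree.tree_.feature >= 0])

# Synthetic data (an assumption): default risk falls as the score rises.
rng = np.random.default_rng(0)
score = rng.integers(300, 850, size=20000)
p_default = 1.0 / (1.0 + np.exp((score - 450) / 60.0))
default = rng.random(20000) < p_default

edges = tree_bin_edges(score, default)
bins = np.digitize(score, edges)  # bin index for every observation

# Default rate per bin; neighbouring bins with similar rates could be merged.
for b in np.unique(bins):
    print(b, int((bins == b).sum()), round(float(default[bins == b].mean()), 3))
```

With a depth of 3 the tree yields at most 8 bins per variable, and the leaf-size floor keeps every bin large enough for its default rate to be reasonably stable.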

Amount and Duration are not used in the model because their default rates are not monotonic across bins.

A correlation matrix is used to check whether the independent variables are correlated with each other. All pairwise correlations lie between -0.7 and 0.7, so there is no sign of strong multicollinearity.
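The correlation check can be sketched as follows, assuming pandas and Pearson correlation (the article does not say which coefficient was used). The function name highly_correlated_pairs and the random columns are hypothetical stand-ins for the real features.

```python
import numpy as np
import pandas as pd

def highly_correlated_pairs(df, threshold=0.7):
    """List column pairs whose absolute Pearson correlation is at least `threshold`."""
    corr = df.corr()
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = float(corr.iloc[i, j])
            if abs(r) >= threshold:
                pairs.append((cols[i], cols[j], round(r, 3)))
    return pairs

# Synthetic, independent columns (an assumption, not the loan data).
rng = np.random.default_rng(0)
features = pd.DataFrame({
    "score": rng.normal(size=2000),
    "month": rng.normal(size=2000),
    "acc_type": rng.normal(size=2000),
})
print(highly_correlated_pairs(features))  # no pair crosses the threshold

# Adding a noisy copy of one column produces a flagged pair.
features["score_copy"] = features["score"] + rng.normal(scale=0.1, size=2000)
print(highly_correlated_pairs(features))
```

An empty list corresponds to the situation described in the text, where every pairwise correlation stays inside the (-0.7, 0.7) band.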

Deep learning is a part of machine learning in which computers learn patterns from data, loosely in the way humans learn from examples. Machine learning includes supervised, unsupervised, and semi-supervised algorithms, and a specific set of these algorithms makes up deep learning. A deep learning model consists of a stack of layers, each made up of neurons and an activation function.

· Supervised algorithm: The algorithm is taught using inputs and outputs. Here the output is the label identifying default and non-default.

· Feature extraction: Extracting the most valuable features.

GridSearchCV is used to tune the hyper-parameters of an estimator by exhaustively considering all parameter combinations. The GridSearchCV instance implements the usual estimator API: when it is fitted on a dataset, all possible combinations of parameter values are evaluated and the best combination is retained.

The optimal hyper-parameters are determined using an iterative process (please refer to Appendix B). The best hyper-parameters are:
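As a rough illustration of such a search, the sketch below runs scikit-learn's GridSearchCV over an MLPClassifier. The candidate values in param_grid, the roc_auc scoring, the 3-fold split, and the synthetic data are all assumptions made for the example; the article does not list the grid it actually searched.

```python
import warnings
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data; the loan dataset is not reproduced here.
X, y = make_classification(n_samples=400, n_features=6, random_state=0)

# Illustrative grid only (assumed values).
param_grid = {
    "hidden_layer_sizes": [(4,), (4, 4, 4)],
    "activation": ["logistic", "relu"],
    "solver": ["lbfgs"],
}

search = GridSearchCV(
    MLPClassifier(max_iter=200, random_state=0),
    param_grid,
    scoring="roc_auc",  # assumed selection metric
    cv=3,               # assumed number of folds
)

# Small networks may stop before converging; silence that warning for the demo.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Every one of the 2 x 2 x 1 = 4 combinations is cross-validated, and best_params_ holds the retained combination.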

· hidden_layer_sizes: (4, 4, 4) (The ith element represents the number of neurons in the ith hidden layer). There are 3 hidden layers in the model and each layer has 4 neurons.

· activation: logistic (Activation function for the hidden layers. The logistic sigmoid function returns f(x) = 1 / (1 + exp(-x)))

· solver: lbfgs (The solver for weight optimization. The optimizer lbfgs is from the family of quasi-Newton methods)

· The cut-off is 0.55. It is chosen based on accuracy.

· The model has accuracy of 0.933 (Accuracy = (True positive + True negative) / All)

· The model has F1 score of 0.518 (F1 = 2 * (precision * recall) / (precision + recall))

· The model has AUROC of 0.803 (measures how well the model is able to distinguish between good and bad)
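The cut-off choice and the three metrics above can be reproduced in outline as follows. This is a sketch on synthetic scores, assuming a simple scan over a grid of candidate cut-offs; the helper best_cutoff_by_accuracy and the grid are hypothetical, and the printed numbers will not match the article's.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

def best_cutoff_by_accuracy(y_true, pd_hat, grid):
    """Return the cut-off in `grid` with the highest accuracy (first one on ties)."""
    accuracies = [accuracy_score(y_true, (pd_hat > c).astype(int)) for c in grid]
    return float(grid[int(np.argmax(accuracies))])

# Synthetic labels and predicted PDs (assumed): roughly 10% defaults.
rng = np.random.default_rng(0)
y_true = (rng.random(5000) < 0.1).astype(int)
pd_hat = np.clip(0.1 + 0.5 * y_true + rng.normal(0.0, 0.2, 5000), 0.0, 1.0)

grid = np.round(np.linspace(0.05, 0.95, 19), 2)  # candidate cut-offs
cutoff = best_cutoff_by_accuracy(y_true, pd_hat, grid)
y_pred = (pd_hat > cutoff).astype(int)

print("cut-off :", cutoff)
print("accuracy:", round(accuracy_score(y_true, y_pred), 3))
print("F1      :", round(f1_score(y_true, y_pred), 3))
print("AUROC   :", round(roc_auc_score(y_true, pd_hat), 3))
```

Accuracy and F1 depend on the chosen cut-off, while AUROC is computed from the predicted probabilities directly and does not. With about 90% non-defaults, accuracy alone can look high even when few defaults are caught, which is why F1 is reported alongside it.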

For each of the binned variables, the imputed value and the direction are shown below. It is observed that there is a monotonic trend for all the independent variables.

There are 56 combinations (7 x 2 x 2 x 2 = 56): Score (7 bins), Acc type (2 bins), Payment (2 bins) and Month (2 bins). For each combination the predicted PD is calculated (please refer to Appendix C).

Since categorical variables and binned numerical variables are used in the model, it is possible to enumerate every possible combination of the input data. For each combination of the input data, the predicted PD is calculated.
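Enumerating the input combinations is straightforward with itertools.product. In the sketch below only the bin counts (7, 2, 2, 2) come from the text; the column names and integer bin labels are placeholders.

```python
from itertools import product

import pandas as pd

# Placeholder bin labels; only the number of bins per variable is from the text.
levels = {
    "score_bin": list(range(1, 8)),  # 7 bins
    "acc_type_bin": [1, 2],          # 2 bins
    "payment_bin": [1, 2],           # 2 bins
    "month_bin": [1, 2],             # 2 bins
}

# One row per combination of bins.
combos = pd.DataFrame(list(product(*levels.values())), columns=list(levels))
print(len(combos))  # 7 * 2 * 2 * 2 rows
print(combos.head())
```

Passing the encoded rows to a fitted classifier's predict_proba would then give one predicted PD per combination, which is the kind of table Appendix C refers to.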

· Inputs: Score (TU) — numerical bin, Account type — category, Payment type — category and Month — numerical bin are selected from the drop down menu

· Output: The Probability of default (PD) and Decision. If PD > 0.55 then BAD else GOOD.

· Interpretability: Each observation gets its own predicted values. This helps to explain why a case receives its prediction and the contributions (direction) of the predictors.
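The input-to-decision step described in the list above can be sketched as a table lookup followed by the 0.55 rule. The PD values, the key layout, and the function names here are hypothetical; only the cut-off and the BAD/GOOD rule come from the text.

```python
def decide(pd_value, cutoff=0.55):
    """Apply the stated rule: PD above the cut-off is BAD, otherwise GOOD."""
    return "BAD" if pd_value > cutoff else "GOOD"

# Hypothetical PDs keyed by (score bin, acc type bin, payment bin, month bin).
pd_lookup = {
    (1, 1, 1, 1): 0.72,
    (7, 2, 2, 2): 0.03,
}

def score_applicant(key, lookup=pd_lookup):
    """Return the precomputed PD for a bin combination and the resulting decision."""
    pd_value = lookup[key]
    return pd_value, decide(pd_value)

print(score_applicant((1, 1, 1, 1)))
print(score_applicant((7, 2, 2, 2)))
```

Because the rule uses a strict inequality, a PD of exactly 0.55 is classified as GOOD.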

Categorical Variables:

· id — the account id of the applicant (not used in model development)

· Payment — the manner of payment code (1, 2, 3, 4, 5, 7, 8, 9, 8A, 8P, 9B, 9P and UR)

· Acc type — the account type code (A, B, C, D, F, G, H, I, L, M, N, P, R, S, T and U)

· Pay type — the payment type code (B and U)

Numerical Variables:

· Year — the year when the loan was taken (not used in model development)

· Month — the month when the loan was taken

· Score — the Trans Union (TU) score (external score)

· Amount — the loan amount in USD

· Duration — the duration of the loan in months

Hyperparameters control the over-fitting and under-fitting of the model. For each proposed hyperparameter setting the model is evaluated. The hyperparameters that give the best model are selected.

There are 56 combinations. For each combination, the predicted PD is calculated.


