
Building a model is tough; maintaining it and building a product around it is tougher. Several steps are involved in the life cycle of a model.

## TYPES OF MACHINE LEARNING MODELS

SUPERVISED LEARNING : Data with labels. It can use either classical machine learning or deep learning.

UNSUPERVISED LEARNING : Data without labels.

Example : Pass tweets to get topics

RECOMMENDATIONS : The goal is to present an item to a user such that the user will click, buy, or view it.

Example : YouTube

RANKING : The goal is to order items so that the most relevant ones appear first.

Example : TikTok

## TERMINOLOGIES

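Features and Observations : Observations are the rows of a dataset, and features are the columns that contain values from those observations. There are three major types of data

1. Continuous : Numeric values, such as price or temperature

2. Categorical : Labels with no inherent order

3. Ordinal : Ordered values like High/Low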
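As a rough illustration of the three data types, here is a minimal sketch of how each might be turned into numbers before modelling. The feature names (`height_cm`, `color`, `priority`) and the values are made up for this example, and one-hot encoding is just one common choice for categorical data.

```python
# Hypothetical observations: each dict is one observation, each key is a feature.
observations = [
    {"height_cm": 172.5, "color": "red", "priority": "Low"},
    {"height_cm": 160.0, "color": "blue", "priority": "High"},
]

# Ordinal: map to integers that preserve the order (Low < High).
PRIORITY_ORDER = {"Low": 0, "High": 1}

# Categorical: no order, so use one indicator column per label (one-hot).
COLORS = sorted({obs["color"] for obs in observations})  # ['blue', 'red']


def encode(obs):
    """Turn one observation into a flat list of numbers."""
    row = [obs["height_cm"]]  # continuous: used as-is
    row += [1 if obs["color"] == c else 0 for c in COLORS]
    row.append(PRIORITY_ORDER[obs["priority"]])
    return row


encoded = [encode(obs) for obs in observations]
print(encoded)
```

Mapping the categorical `color` to a single integer would wrongly suggest that one color is "greater" than another, which is why it gets separate indicator columns here.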

## PERFORMANCE

Knowing how well our model performs in real-world use is very important.

The user should decide which performance metric to use based on the need.

There are multiple ways to evaluate a model's performance. One way is to decide on the metrics to use, calculate them at different decision thresholds, and pick the threshold that works best for the business case.
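To make the threshold comparison concrete, here is a minimal sketch that scores a few thresholds by the cost of their mistakes. The labels, scores, thresholds, and the two cost values are all invented for illustration; a real business case would supply its own numbers and possibly a different metric.

```python
def total_cost(y_true, scores, threshold, fp_cost, fn_cost):
    """Cost of the mistakes made when predicting positive for score >= threshold."""
    cost = 0.0
    for actual, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and actual == 0:
            cost += fp_cost  # false alarm
        elif predicted == 0 and actual == 1:
            cost += fn_cost  # missed positive
    return cost


# Hypothetical labels (1 = positive) and model scores.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.6, 0.9]
thresholds = [0.2, 0.5, 0.7]

# Assumption: missing a positive is five times as costly as a false alarm.
costs = {t: total_cost(y_true, scores, t, fp_cost=1.0, fn_cost=5.0) for t in thresholds}
best_threshold = min(costs, key=costs.get)

print(costs)
print("best threshold:", best_threshold)
```

With these made-up costs the lowest threshold wins, because it avoids the expensive missed positives. Changing the cost ratio can change which threshold is best.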

VALIDATION : There are many procedures to validate a model using metrics. Common ones are

- Validating the best model on a holdout set
- Trying multiple models through hyperparameter tuning and evaluating each model with a predefined metric
- K-fold cross-validation. Two common variants are

1. Using the whole dataset for K-fold validation

2. Using the leave-one-out method
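The K-fold idea can be sketched in a few lines of plain Python. This is only an illustration of how the indices are split; the helper name `k_fold_indices` is made up, and the data is not shuffled here, although in practice it usually is. Leave-one-out is simply the case where K equals the number of observations.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# 3-fold split of 6 observations.
folds = list(k_fold_indices(6, 3))
for train, validation in folds:
    print("train:", train, "validation:", validation)

# Leave-one-out: as many folds as observations, one validation point each.
loo_folds = list(k_fold_indices(4, 4))
```

Each observation lands in the validation set exactly once, so the metric averaged over the folds uses every data point for evaluation.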

**SUPERVISED LEARNING METRICS**

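Compared to the actual result, each prediction falls into one of the following:

- True positive : predicted positive, actually positive
- True negative : predicted negative, actually negative
- False positive : predicted positive, actually negative
- False negative : predicted negative, actually positive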

**Confusion Matrix:**
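A confusion matrix is just these four counts arranged in a grid. Here is a minimal sketch for binary labels, assuming 1 means positive and 0 means negative; the labels and predictions are invented, and the row/column layout shown is one common convention, not the only one.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP and FN for binary labels (1 = positive, 0 = negative)."""
    counts = Counter({"TP": 0, "TN": 0, "FP": 0, "FN": 0})
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["TP"] += 1
        elif actual == 0 and predicted == 0:
            counts["TN"] += 1
        elif actual == 0 and predicted == 1:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts


# Hypothetical actual labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

c = confusion_counts(y_true, y_pred)

# Rows are actual classes, columns are predicted classes (negative first).
matrix = [[c["TN"], c["FP"]],
          [c["FN"], c["TP"]]]
print(matrix)
```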

**Accuracy** : correct predictions/total predictions

**Sensitivity/Recall** : Proportion of actual positives that are correctly classified

Formula : TP / (TP + FN)

**Specificity** : Proportion of actual negatives that are correctly classified

Formula : TN / (TN + FP)

**Precision** : Proportion of positive predictions that are actually positive

Formula : TP / (TP + FP)

**F1 Score** : Harmonic mean of precision and recall (sensitivity)

Formula : 2 * (Sensitivity * Precision) / (Sensitivity + Precision)
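The formulas above can be checked with a quick calculation. This is a minimal sketch using made-up counts (40 true positives, 10 false negatives, 20 false positives, 30 true negatives); it assumes none of the denominators are zero.

```python
# Hypothetical confusion-matrix counts.
TP, FN, FP, TN = 40, 10, 20, 30

accuracy = (TP + TN) / (TP + TN + FP + FN)   # correct predictions / total predictions
sensitivity = TP / (TP + FN)                 # also called recall
specificity = TN / (TN + FP)
precision = TP / (TP + FP)
f1 = 2 * (sensitivity * precision) / (sensitivity + precision)

print(f"accuracy={accuracy:.3f}")
print(f"sensitivity={sensitivity:.3f}")
print(f"specificity={specificity:.3f}")
print(f"precision={precision:.3f}")
print(f"f1={f1:.3f}")
```

Note that the F1 score ignores true negatives entirely, so it is most useful when the positive class is the one you care about.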

**ROC (Receiver Operating Characteristic) Curve** shows how sensitivity and specificity change as the decision threshold changes. The higher the area under the ROC curve, the better the model.

The ROC curve is plotted with sensitivity on the y-axis and 1 - specificity on the x-axis.

In the image below, the model plotted in blue is better than the one plotted in white.
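Here is a minimal sketch of how the ROC points and the area under the curve could be computed by hand. The labels and scores are invented, the helper names `roc_points` and `auc` are made up for this example, and the code assumes both classes are present. The area is approximated with the trapezoidal rule.

```python
def roc_points(y_true, scores):
    """Return (1 - specificity, sensitivity) pairs, from strictest to loosest threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels (1 = positive) and model scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
area = auc(points)
print(points)
print("AUC:", area)
```

A model that ranks every positive above every negative would reach an area of 1.0, while random scoring tends toward 0.5.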
