
In supervised learning, the learning approach uses labeled data to train a model. The algorithm is fed the input data features (X) and their corresponding target values (y) (the true input/output pairs) and is trained on this data. The learning algorithm generates a function (the hypothesis function) that can map new, unseen input features to a target value (or the probability of belonging to a class) with minimal error.

A hypothesis function is a function that takes in input features and maps them to a target value.

A model is simply a trained algorithm.

A good model accurately captures the patterns in the data and also generalizes well to new data. A common problem encountered when building a model is the daunting task of achieving both at once. This problem stems from two errors, the bias error and the variance error of the model, which are complementary to each other.

**What is bias?** Bias, in simple terms, means “an inclination towards a way of reasoning based on some assumptions/presuppositions”. Being biased affects your clear judgment of certain concepts.

In machine learning, bias is the set of assumptions the algorithm makes to make the hypothesis function easier to learn.

**It is the assumptions an algorithm makes to ease the process of generating a hypothesis function (a model).**

Fewer assumptions correspond to a less biased model (**low bias**).

A **high bias** model makes strict assumptions about the hypothesis function and fails to capture the particular relationships/patterns in the data. A high bias model performs poorly on the training set and also on the test set (it underfits).

Generally, linear models are high-bias models. They are easier to understand but make strong assumptions about the hypothesis function, preventing them from performing well on complex problems (e.g. the **assumption of linearity**). Examples are Linear and Logistic Regression models.

Non-linear models are usually low-bias models. They perform excellently on complex problems. Examples are Decision Trees, K-Nearest Neighbors, and Support Vector Machines.
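To make the contrast concrete, here is a minimal sketch (assuming scikit-learn is installed, with a synthetic dataset) of a high-bias linear model and a low-bias tree fitted to the same nonlinear pattern. The linear model cannot capture a sine curve even on its own training data:

```python
# Sketch: a high-bias linear model vs. a low-bias tree on a nonlinear problem.
# The dataset (noisy sine wave) is illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(-3, 3, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Assumes a straight-line relationship -> high bias, underfits the sine curve.
linear = LinearRegression().fit(X, y)
# Makes few assumptions about the shape of the function -> low bias.
tree = DecisionTreeRegressor(max_depth=5, random_state=0).fit(X, y)

print(f"Linear R^2 on training set: {linear.score(X, y):.2f}")
print(f"Tree   R^2 on training set: {tree.score(X, y):.2f}")
```

The tree scores far higher on the very data the linear model was trained on, which is the signature of the linear model's bias, not of noise.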

**What is variance?** Variance, in simple terms, “is the state or quality of being variable, changing”.

In machine learning, variance is the amount by which the predicted output of a model changes if the data it was trained on changes. In simpler terms, it is

**the sensitivity of the model to small changes in the training set**.

Ideally, for a dataset from a given domain/problem, training a model on different training sets drawn from that dataset should cause little or no change in the model’s predictions, i.e. the model does a good job of recognizing the underlying patterns in the dataset.

**Low variance** means a small change in the model's predictions when the training set changes (the goal).

**High variance** means a large change in the model's predictions when the training set changes (not favorable).

High variance occurs when the learning algorithm fits the random noise in the training set (it overfits).

Since the training and test sets come from the same distribution, a high variance model performs poorly at inference on new, unseen data (the test set).

Examples of low variance models are Linear and Logistic Regression models.

Examples of high variance models are Decision Trees, KNN, SVM, etc.
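Variance can be measured directly by retraining a model on many freshly drawn training sets and watching how much its predictions move at fixed test points. Below is a minimal sketch of that experiment, assuming scikit-learn and a synthetic noisy-sine dataset; the helper `prediction_spread` is a name introduced here for illustration:

```python
# Sketch: how much do a model's predictions move when the training set changes?
# A proxy for model variance: retrain on resampled data, measure prediction spread.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X_test = np.linspace(-3, 3, 50).reshape(-1, 1)  # fixed evaluation points

def prediction_spread(make_model, n_resamples=30):
    """Train on fresh noisy samples each round; return the mean variance
    of the predictions at the fixed test points."""
    preds = []
    for _ in range(n_resamples):
        X = rng.uniform(-3, 3, size=(100, 1))
        y = np.sin(X).ravel() + rng.normal(scale=0.3, size=100)
        preds.append(make_model().fit(X, y).predict(X_test))
    return np.var(preds, axis=0).mean()

low_var = prediction_spread(LinearRegression)              # stable predictions
high_var = prediction_spread(DecisionTreeRegressor)        # chases the noise

print(f"Linear model prediction variance: {low_var:.4f}")
print(f"Decision tree prediction variance: {high_var:.4f}")
```

The unpruned tree's predictions swing with every resample because it memorizes the noise, while the linear model barely moves.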

Now, you may wonder: why is it a daunting task to develop a model that recognizes the patterns in the data and also generalizes to new data (**low bias and low variance**)? What is the idea behind the concept of the bias-variance tradeoff?

The goal is a low bias, low variance model.

If you read carefully, the bias and variance errors complement each other: a model with high variance automatically has low bias, and a model with high bias has low variance.

For example, a linear model places strict assumptions on the hypothesis function (high bias), which results in a small change in the model's predictions when the training data changes (low variance). This relationship between bias and variance can't be avoided:

**Increasing variance leads to a decrease in bias.**

**Increasing bias leads to a decrease in variance.**

The concept of bias-variance tradeoff is a quest to produce a model that accurately captures the regularities in the data and generalizes well on new data.

Since it is typically impossible to achieve both simultaneously, the idea of the bias-variance tradeoff is to tune the tradeoff for the specific problem: how well do you want the model to generalize to new data?
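One common way to tune the tradeoff is to sweep model complexity and score each setting on held-out data. The sketch below (assuming scikit-learn; the polynomial-degree grid is illustrative) compares an underfitting, a balanced, and an overfit-prone polynomial regression via cross-validation:

```python
# Sketch: tuning the bias-variance tradeoff by sweeping model complexity
# (polynomial degree) and comparing held-out scores.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(80, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=80)

scores = {}
for degree in (1, 5, 15):  # underfit / balanced / overfit-prone
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    scores[degree] = cross_val_score(model, X, y, cv=5).mean()
    print(f"degree={degree:2d}  mean CV R^2 = {scores[degree]:.2f}")
```

The degree with the best cross-validated score is the practical answer to "how much bias should I trade for variance on this problem?".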

**The bias-variance tradeoff in some algorithms:**

In **K-Nearest Neighbors**, which is a high variance, low bias algorithm, the tradeoff can be tuned by varying the n_neighbors parameter; increasing/decreasing this parameter affects the bias and variance of the model.

Increasing n_neighbors increases the bias of the model, hence reducing the variance.
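A minimal sketch of that knob (assuming scikit-learn; the dataset and the particular n_neighbors values are illustrative): with n_neighbors=1 the training score is perfect but the model is noise-sensitive, while a very large n_neighbors smooths the boundary at the cost of more bias.

```python
# Sketch: n_neighbors as the bias-variance knob in KNN.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

train_scores, test_scores = {}, {}
for k in (1, 15, 101):  # low bias/high variance -> high bias/low variance
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    train_scores[k] = knn.score(X_train, y_train)
    test_scores[k] = knn.score(X_test, y_test)
    print(f"n_neighbors={k:3d}  train={train_scores[k]:.2f}  "
          f"test={test_scores[k]:.2f}")
```

The gap between training and test accuracy shrinks as n_neighbors grows, which is the variance being traded away for bias.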

In **Support Vector Machines**, the C parameter controls the tradeoff between a smooth decision boundary and classifying training points correctly. It is a regularization parameter. For large values of C, the optimization will choose a smaller-margin hyperplane if that hyperplane does a better job of getting all the training points classified correctly. Conversely, a very small value of C will cause the optimizer to look for a larger-margin separating hyperplane, even if that hyperplane misclassifies more points. For very tiny values of C, you should get misclassified examples, often even if your training data is linearly separable.
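The effect of C can be seen directly on training accuracy. A minimal sketch, assuming scikit-learn with an RBF kernel and a synthetic dataset (the specific C values are illustrative):

```python
# Sketch: the C parameter of an SVM as a bias-variance knob.
# Large C -> fit training points tightly (low bias, high variance);
# small C -> wider margin, more training mistakes tolerated (higher bias).
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.25, random_state=0)

train_acc = {}
for C in (0.01, 1.0, 1000.0):
    svm = SVC(kernel="rbf", C=C).fit(X, y)
    train_acc[C] = svm.score(X, y)
    print(f"C={C:>7}  training accuracy = {train_acc[C]:.2f}")
```

Training accuracy climbs with C because the optimizer is penalized more heavily for each misclassified training point; whether that helps on the test set depends on how much of the gain is noise.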

When building a model, it is advisable to address the high bias in the model first, because the model does not yet have a good understanding of the data; then address the high variance.

In conclusion, the bias-variance tradeoff is something to ponder when building a machine learning model.

Thanks for Reading.

Happy Learning!!!! Cheers.

Links for readings concerning this subject:

StatsStackExchange: Here

Wikipedia: Bias-variance Tradeoff

