Before diving into the various models in the ML field, let's take a few minutes to understand what Machine Learning is.
Machine Learning is the subfield of computer science that gives computers "the ability to learn without being explicitly programmed."
If we had to explicitly program a model, it would need a lot of rules, would be highly dependent on the current dataset, and would not generalize well to out-of-sample cases.
This is where ML comes in. Using ML, we build a model, or in simpler words a machine learning algorithm, to do the desired job: we feed the machine data, the model learns from that data, and it gives us the desired output.
Let’s talk about the various types of ML models now.
Broadly speaking, all ML models can be categorized as supervised or unsupervised.
Supervised Learning: In supervised learning, we first teach the model; then, with that knowledge, it can predict unknown or future instances.
How do we teach the model? We teach the model by training it with some data from a labeled dataset.
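The idea of training on labeled data and then predicting unseen instances can be sketched with a toy example. The dataset and the nearest-neighbour rule below are illustrative choices, not part of any particular library:

```python
# Toy illustration of supervised learning: "train" on labeled examples,
# then predict the label of an unseen point using the nearest labeled one.

def predict_nearest(training_data, x):
    """Return the label of the training point whose feature is closest to x."""
    nearest = min(training_data, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# Labeled dataset: (feature, label) pairs.
labeled = [(1.0, "cat"), (1.2, "cat"), (8.0, "dog"), (8.5, "dog")]

print(predict_nearest(labeled, 1.1))  # prints "cat"
print(predict_nearest(labeled, 9.0))  # prints "dog"
```

The model never saw the points 1.1 or 9.0 during "training"; it predicts their labels from the labeled examples it was given.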
With supervised learning, there are two subcategories: regression and classification.
Types of Regression Models:
1. Simple Linear Regression: A model that describes the relationship between one dependent variable and one independent variable using a straight line that fits the data.
Its extensions include:
a. Multiple linear regression, which finds a plane of best fit.
b. Polynomial regression, which finds a curve of best fit.
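Simple linear regression has a well-known closed-form solution: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. A minimal pure-Python sketch (real projects would typically use NumPy or scikit-learn):

```python
# Simple linear regression by the least-squares closed form:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x).

def fit_line(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    slope = cov / var
    intercept = my - slope * mx
    return slope, intercept

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]          # exactly y = 2x + 1
slope, intercept = fit_line(xs, ys)
print(slope, intercept)        # prints 2.0 1.0
```

Because the data lie exactly on a line, the fit recovers the true slope and intercept; with noisy data it returns the line minimizing the sum of squared errors.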
2. Decision Tree: A type of supervised machine learning where the data is continuously split according to a certain parameter.
Each split on a parameter is a node. Adding nodes lets the tree fit the training data more closely, but too many nodes can overfit, so a deeper tree is not always more accurate on new data.
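The smallest possible decision tree is a single node, often called a decision stump: one split on one parameter. The sketch below is illustrative only; real decision trees (e.g. scikit-learn's `DecisionTreeClassifier`) split recursively on many features using criteria such as Gini impurity:

```python
# A one-node decision tree ("decision stump"): pick the threshold on a
# single feature that minimizes misclassifications on the training data.

def fit_stump(xs, ys):
    """Return (threshold, left_label, right_label) minimizing errors."""
    best = None
    for t in sorted(set(xs)):
        for left, right in [(0, 1), (1, 0)]:
            preds = [left if x <= t else right for x in xs]
            errors = sum(p != y for p, y in zip(preds, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    return best[1], best[2], best[3]

def predict_stump(stump, x):
    threshold, left, right = stump
    return left if x <= threshold else right

xs = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
ys = [0, 0, 0, 1, 1, 1]
stump = fit_stump(xs, ys)
print(predict_stump(stump, 2.5))   # prints 0
print(predict_stump(stump, 8.5))   # prints 1
```

One node is enough here because the classes are separable by a single threshold; messier data is what motivates growing deeper trees.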
3. Random Forest: Let's start with a real-life analogy. A student named X wants to choose a course that will be beneficial for him in the future, but he is unsure which course suits his skill set. So he consults various people: his parents, degree students, and teachers. Finally, after consulting them all, he takes the course suggested by most of the people.
Random forest is a supervised machine learning algorithm that is widely used for classification and regression problems. It builds decision trees on different samples and takes their majority vote for classification, or their average for regression.
Before understanding how a random forest works, we must look at the ensemble technique. Ensemble simply means combining multiple models: a collection of models is used to make predictions rather than an individual model.
This technique generally yields higher predictive performance than any single model.
Ensemble uses two types of methods:
1. Bagging: It creates different training subsets from the sample training data with replacement, and the final output is based on majority voting. Example: Random Forest.
2. Boosting: It combines weak learners into a strong learner by building models sequentially, each correcting the errors of the previous one. Examples: AdaBoost, XGBoost.
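The bagging idea behind random forests can be sketched in a few lines: draw bootstrap samples (with replacement), fit a weak model to each, and combine them by majority vote. The "weak model" below is a trivial threshold rule on one feature, chosen for brevity; a real random forest grows full decision trees and also subsamples features at each split:

```python
# Bare-bones bagging: train simple models on bootstrap samples,
# then predict by majority vote across all models.

import random
from collections import Counter

def fit_threshold_rule(sample):
    """Predict 1 when x exceeds the midpoint between the two class means."""
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def bagging_predict(data, x, n_models=15, seed=0):
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]   # bootstrap sample
        if {y for _, y in sample} != {0, 1}:
            continue                                # skip one-class samples
        threshold = fit_threshold_rule(sample)
        votes.append(1 if x > threshold else 0)
    return Counter(votes).most_common(1)[0][0]      # majority vote

data = [(1.0, 0), (1.5, 0), (2.0, 0), (7.0, 1), (7.5, 1), (8.0, 1)]
print(bagging_predict(data, 1.2))  # prints 0
print(bagging_predict(data, 7.8))  # prints 1
```

Each bootstrap sample gives a slightly different threshold, and the majority vote smooths out those differences, which is exactly why bagging tends to be more stable than any single model.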
These are the main models you should know about, but of course there are many more, such as Lasso Regression, Ridge Regression, and Support Vector Regression.