This technique assumes that the column you are working with is (approximately) normally distributed.

About 95% of the population lies within μ ± 2σ.

About 99.7% of the population lies within μ ± 3σ.
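These coverage figures can be checked numerically. The short sketch below relies on the standard result that, for a normal distribution, the fraction of values within μ ± kσ equals erf(k / √2). The helper name `coverage` is made up for illustration.

```python
import math

def coverage(k: float) -> float:
    """Fraction of a normal distribution that lies within mu ± k*sigma."""
    return math.erf(k / math.sqrt(2))

for k in (2, 3):
    # Prints roughly 95.45% for k=2 and 99.73% for k=3
    print(f"within mu ± {k}*sigma: {coverage(k):.2%}")
```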

If a value lies outside the μ ± 3σ boundaries, you can treat it as an outlier.

First, check whether the data is normally distributed. If it is, compute the range μ ± 3σ and treat every row outside that range as an outlier.
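As a rough sketch of the second step, the snippet below computes the μ ± 3σ range and lists the values outside it. It assumes the normality check has already been done separately. The `ages` data is invented, and the use of the sample standard deviation (`statistics.stdev`) rather than the population one is an assumption.

```python
import statistics

# Hypothetical age column: 19 ordinary values plus one extreme value.
ages = list(range(25, 44)) + [95]

mu = statistics.mean(ages)
sigma = statistics.stdev(ages)  # sample std (assumption); pstdev gives the population std

lower = mu - 3 * sigma
upper = mu + 3 * sigma

# Every value outside [lower, upper] is treated as an outlier.
outliers = [x for x in ages if x < lower or x > upper]
print(f"range: [{lower:.2f}, {upper:.2f}]")
print("outliers:", outliers)  # outliers: [95]
```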

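You might be wondering why this technique is called the z-score technique. The formula for calculating the z-score is

z = (xi - μ) / σ

where xi is a value in the column, μ is the column mean, and σ is its standard deviation. A value outside μ ± 3σ is therefore one whose z-score is below -3 or above 3.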

Suppose you have an age column. You calculate the z-score for each value xi in the age column; that is how you Z-transform the entire column.
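A minimal sketch of this Z-transform, assuming the usual definition z = (x - mean) / std, could look like the following. The `z_scores` helper and the `ages` data are hypothetical.

```python
import statistics

def z_scores(values):
    """Return the z-score (x - mean) / std for every value."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample std, chosen for illustration
    return [(x - mu) / sigma for x in values]

# Hypothetical age column with one extreme value.
ages = list(range(25, 44)) + [95]
z = z_scores(ages)

# A value is outside mu ± 3*sigma exactly when its |z| exceeds 3.
flagged = [x for x, zi in zip(ages, z) if abs(zi) > 3]
print(flagged)  # [95]
```

After the transform, the column has mean 0 and standard deviation 1, so the outlier rule becomes a fixed threshold on |z| instead of a range that depends on the column.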

Once an outlier is detected, how do you treat it? There are two possibilities: trimming and capping.

Suppose 5 values do not lie within μ ± 3σ, i.e. 5 values are outliers. In the case of trimming, you remove all five rows.

The problem with trimming is that when there are many outliers, removing them discards a significant portion of your data. That is bad.
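Trimming can be sketched as a simple filter over the rows. The example below uses invented rows with a hypothetical `age` field and also reports the share of rows removed, which is one way to judge whether trimming costs too much data.

```python
import statistics

# Hypothetical rows; only the "age" field is used to detect outliers.
rows = [{"id": i, "age": age} for i, age in enumerate(list(range(25, 44)) + [95])]

ages = [row["age"] for row in rows]
mu = statistics.mean(ages)
sigma = statistics.stdev(ages)  # sample std (assumption)
lower, upper = mu - 3 * sigma, mu + 3 * sigma

# Trimming: keep only the rows whose age lies inside the range.
trimmed = [row for row in rows if lower <= row["age"] <= upper]

# Share of rows dropped by trimming.
removed_share = 1 - len(trimmed) / len(rows)
print(f"kept {len(trimmed)} of {len(rows)} rows, removed {removed_share:.0%}")
```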

In capping, you replace each of these 5 values with the boundary value on its side: the lower bound if the value falls below the range, the upper bound if it falls above.

Suppose μ + 3σ is 80 on the upper side and μ - 3σ is 60 on the lower side.

If 3 values are outliers (85, 0, and 90), how will you transform/cap them?

You change 85 to 80, 0 to 60, and 90 to 80. That's it: you replace each outlier with the maximum or minimum allowed value.
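The capping step can be sketched in a few lines using the bounds from the example (60 and 80). The `cap` helper is a made-up name, and the value 70 is added only to show that in-range values are left unchanged.

```python
def cap(values, lower, upper):
    """Replace values below `lower` with `lower` and values above `upper` with `upper`."""
    return [min(max(v, lower), upper) for v in values]

# Bounds from the example: 60 on the lower side, 80 on the upper side.
values = [85, 0, 90, 70]  # 70 is an added in-range value for contrast
capped = cap(values, lower=60, upper=80)
print(capped)  # [80, 60, 80, 70]
```

If the data lives in a pandas Series, `Series.clip(lower=..., upper=...)` should give the same result without a custom helper.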