In computer vision, many algorithms extract spatial features to identify objects using information about image gradients. HOG, or Histogram of Oriented Gradients, is one of these algorithms. A histogram is an approximate representation of the distribution of numerical data, and it looks like a bar graph. Each bar represents the data that falls within a certain range of values; these ranges are called bins. "Oriented" refers to the direction of an image gradient. HOG therefore produces a histogram of the gradient directions in an image.

The HOG algorithm is applied in the following steps:

- Calculate the magnitude and direction of the gradient at each pixel of the input image. The above figure shows the gradients of an 8×8 cell in the image. The change in intensity is represented by a vector at each pixel. The direction of the vector indicates the direction in which pixel intensity changes, and its magnitude tells us how strong that change is.
- Divide the image into cells of the same size (the blue windows in the animation below). The cell size is a configurable parameter. Choose it so that the features of interest fit within a cell at their scale.
- Group the gradient directions of all pixels in each cell into a specified number of orientation bins. Sum the gradient magnitudes that fall into each bin; these sums are the heights of the bins. The number of bins is usually set to 9, so each bin is 20 degrees wide.
- Group the cells into blocks of the same size (the red sliding window in the animation below). The distance the block window moves over the image at each step is called the stride. It is usually set to half the block size. The number of cells per block and the stride are free parameters that are set by the user.
- Normalize each cell histogram relative to the other cells in its block. The normalized histograms from all the blocks are then concatenated into a single feature vector. This feature vector is called the HOG descriptor.
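
The gradient and binning steps above can be sketched in plain Python for a single cell. This is a minimal illustration, not OpenCV's implementation: the function name `cell_histogram` is hypothetical, gradients are taken as simple central differences, border pixels are skipped, and each pixel votes for exactly one bin (production implementations typically interpolate votes between neighboring bins).

```python
import math


def cell_histogram(cell, nbins=9):
    """Return the orientation histogram of one grayscale cell.

    `cell` is a list of rows of pixel intensities. Orientations are
    unsigned (0 to 180 degrees), so opposite directions share a bin.
    """
    bin_width = 180.0 / nbins
    hist = [0.0] * nbins
    rows, cols = len(cell), len(cell[0])
    # Assumption of this sketch: skip the border so every pixel has two neighbors.
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal intensity change
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical intensity change
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Clamp guards against an angle that rounds up to exactly 180.
            index = min(int(angle // bin_width), nbins - 1)
            hist[index] += magnitude  # the bin height is a sum of magnitudes
    return hist


# An 8x8 cell whose intensity increases from left to right.
ramp = [[10 * x for x in range(8)] for _ in range(8)]
hist = cell_histogram(ramp)
# Every gradient points along the x axis (0 degrees), so only the first
# bin (0 to 20 degrees) receives any magnitude.
```

Weighting each vote by the gradient magnitude means that strong edges dominate the histogram, while flat regions contribute almost nothing.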

We will use OpenCV’s HOGDescriptor class to create the HOG descriptor. Its parameters are set up through the HOGDescriptor() constructor. The parameters and their default values are given below:

```
cv2.HOGDescriptor(win_size=(64, 128),
                  block_size=(16, 16),
                  block_stride=(8, 8),
                  cell_size=(8, 8),
                  nbins=9,
                  win_sigma=DEFAULT_WIN_SIGMA,
                  threshold_L2hys=0.2,
                  gamma_correction=True,
                  nlevels=DEFAULT_NLEVELS)
```

- **win_size:** Size of the detection window in pixels (*width*, *height*). Defines the region of interest. Must be an integer multiple of the cell size.
- **block_size:** Block size in pixels (*width*, *height*). Defines how many cells are in each block. Must be an integer multiple of the cell size and smaller than the detection window. The smaller the block, the finer the detail you will get.
- **block_stride:** Block stride in pixels (*horizontal*, *vertical*). Must be an integer multiple of the cell size. The block_stride defines the distance between adjacent blocks, for example, 8 pixels horizontally and 8 pixels vertically. A longer block_stride makes the algorithm run faster (because fewer blocks are evaluated), but the algorithm may not perform as well.
- **cell_size:** Cell size in pixels (*width*, *height*). Determines the size of each cell. The smaller the cell, the finer the detail you will get.
- **nbins:** Number of bins for the histograms. Determines the number of angular bins used to build the histograms. With more bins you capture more gradient directions. HOG uses unsigned gradients, so the angular bins cover values between 0 and 180 degrees.
- **win_sigma:** Gaussian smoothing window parameter. The performance of the HOG algorithm can be improved by smoothing the pixels near the edges of the blocks, which is done by applying a Gaussian spatial window to each pixel before computing the histograms.
- **threshold_L2hys:** Clipping threshold for the L2-Hys (Lowe-style clipped L2 norm) normalization method. L2-Hys is used to normalize the blocks and consists of an L2 normalization, followed by clipping, followed by a renormalization. The clipping caps the values in each block's descriptor vector at the given threshold (0.2 by default).
- **gamma_correction:** Flag that specifies whether gamma correction preprocessing is applied. Performing gamma correction slightly increases the performance of the HOG algorithm.
- **nlevels:** Maximum number of detection window increases.
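
To see how these parameters interact, the sketch below computes the length of the resulting feature vector and applies an L2-Hys style normalization to one block. Both helpers (`hog_descriptor_length` and `l2_hys`) are hypothetical names written for illustration in plain Python, not part of OpenCV. The small `eps` term and the check that the blocks tile the window exactly are assumptions of this sketch.

```python
import math


def hog_descriptor_length(win_size=(64, 128), block_size=(16, 16),
                          block_stride=(8, 8), cell_size=(8, 8), nbins=9):
    """Number of values in the descriptor for one detection window."""
    for axis in range(2):
        if block_size[axis] % cell_size[axis]:
            raise ValueError("block_size must be a multiple of cell_size")
        # Assumption: the sliding block must land exactly on the window edge.
        if (win_size[axis] - block_size[axis]) % block_stride[axis]:
            raise ValueError("blocks do not tile the window with this stride")
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    cells_per_block = ((block_size[0] // cell_size[0])
                       * (block_size[1] // cell_size[1]))
    return blocks_x * blocks_y * cells_per_block * nbins


def l2_hys(block_vector, threshold=0.2, eps=1e-6):
    """L2-normalize, clip each value at `threshold`, then renormalize."""
    def l2_normalize(values):
        norm = math.sqrt(sum(v * v for v in values) + eps * eps)
        return [v / norm for v in values]

    clipped = [min(v, threshold) for v in l2_normalize(block_vector)]
    return l2_normalize(clipped)


# Default parameters: 7 x 15 blocks, 4 cells per block, 9 bins per cell.
default_length = hog_descriptor_length()  # 7 * 15 * 4 * 9 = 3780

# Doubling the stride leaves fewer blocks, hence a shorter descriptor.
coarse_length = hog_descriptor_length(block_stride=(16, 16))  # 4 * 8 * 4 * 9 = 1152

# One dominant gradient no longer swamps the other values after clipping.
normalized = l2_hys([10.0, 0.1, 0.1, 0.1])
```

With the default parameters the result should agree with what OpenCV reports from `cv2.HOGDescriptor().getDescriptorSize()`, which is a convenient way to sanity-check a custom configuration before extracting features.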
