Human detection is used as a parameter in the goods-dropping process of the “Drone Delivery” system. Human detection is an image-processing method that focuses on detecting objects in the form of the human body; in this study, the object is the recipient of the goods, which the system detects using the Histogram of Oriented Gradients (HOG) algorithm. When the system has carried the goods to the destination, it descends to place the goods, using successful object detection as the trigger. If the system detects a human body, it delivers the goods it carries to the detected recipient; if it fails to detect a human in its surroundings, it returns to its starting point. The result of this research is a system that detects the presence of a human as the receiving object in real time and signals the system to descend and place the goods it has carried. The system thus performs the package-drop process well, detecting objects in real time with good accuracy up to a distance of 8 meters.
The HOG algorithm detects an object based on the outlines in an image. An outline is a region with large changes in light intensity, which distinguishes a border area from the other areas in the image. Each detection stage of the HOG method is explained as follows:
1. Pre-processing
Pre-processing is the stage of preparing the digital image data used in the subsequent stages. In this stage, the initial image is resized to 64×128 pixels. An example of an image resized to 64×128 pixels is shown in the corresponding figure.
2. Calculate the image gradient
The image gradient carries two important pieces of information: direction and magnitude. The gradient magnitude describes how large the changes in intensity are, and the gradient direction describes the direction of those intensity changes. The image gradient is therefore represented as a vector whose length is the gradient magnitude and whose orientation is the gradient direction.
3. Calculate histogram of oriented gradients
The first step in calculating the histogram in the HOG method is to divide the image into cells of 8×8 pixels. For each cell, the gradient magnitude and gradient direction values are calculated over its 8×8 pixels. An example of the calculated gradient magnitude and gradient direction values is shown in the following figure:
The gradient direction value determines the histogram bin, while the gradient magnitude value contributes to the frequency of that bin. Each gradient magnitude value is distributed over one or two adjacent histogram bins according to its gradient direction, so the frequency of a bin is the sum of all the gradient magnitude contributions assigned to it.
Then, from the histogram obtained for each 8×8 cell, a vector is formed; these histogram vectors become the feature vector of each 8×8 cell.
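The binning described above can be sketched as follows, assuming the standard HOG convention of 9 unsigned-orientation bins (0–180°, 20° per bin) with linear interpolation between the two nearest bins; the toy cell values are illustrative only.

```python
import numpy as np

def cell_histogram(magnitude, direction, nbins=9):
    """Build a 9-bin orientation histogram for one 8x8 cell.

    Each pixel's magnitude is split between the two nearest bins,
    so a bin's frequency is the sum of the contributing magnitudes.
    """
    bin_width = 180.0 / nbins                 # 20 degrees per bin
    hist = np.zeros(nbins)
    for mag, ang in zip(magnitude.ravel(), direction.ravel()):
        ang = ang % 180.0                     # unsigned gradient direction
        idx = ang / bin_width - 0.5           # fractional bin index (centers at 10, 30, ...)
        lo = int(np.floor(idx)) % nbins
        hi = (lo + 1) % nbins
        frac = idx - np.floor(idx)
        hist[lo] += mag * (1.0 - frac)
        hist[hi] += mag * frac
    return hist

# Toy 8x8 cell: every pixel has magnitude 2 and direction 30 degrees,
# so all the weight should land in the bin centered at 30 degrees.
mags = np.full((8, 8), 2.0)
dirs = np.full((8, 8), 30.0)
h = cell_histogram(mags, dirs)
```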
4. Block normalization
The block normalization stage is carried out after the feature vectors are obtained from the previous stage. This step prevents variations in light contrast from affecting the obtained feature vectors. To overcome the effect of light contrast, the feature vectors within one block are normalized together. A block of 2×2 cells is formed from the cells used in the gradient calculation step, so one block covers 16×16 pixels and its normalized feature vector contains 36 values (2×2 cells × 9 bins).
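Block normalization can be sketched as an L2 normalization over the 36 concatenated values of one block; the block values below are illustrative, not measured data.

```python
import numpy as np

# Hypothetical 2x2 block of cell histograms: 4 cells x 9 bins = 36 values.
block = np.arange(36, dtype=np.float64)

# L2 normalization over the whole block suppresses the effect of
# light-contrast differences on the feature values.
eps = 1e-6
normalized = block / np.sqrt(np.sum(block ** 2) + eps ** 2)
```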
5. Feature vectors
The features obtained are those that have passed through the block normalization stage. Based on the previous steps, the number of feature values in the HOG method is 3780. This value is the product 7×15×36, where 7×15 is the number of block positions in the image at the block normalization stage and 36 is the number of feature values per block.
6. Classification using SVM linear
The classifier stage separates the obtained features into 2 classes: a positive feature class and a negative feature class. Positive features are those that potentially belong to the human body, while negative features are those that are not part of the human body.
The Support Vector Machine is a classifier that separates the two classes with a boundary called a hyperplane. The following is an example of classification using the Support Vector Machine.
1. Testing the algorithm
The algorithm that has been created is tested to detect the presence of the human body and recognize human faces. The experiment is carried out by detecting objects at several fixed distances and recording whether the object is detected. The experimental results are presented in Table 1 below.
Table 1. Testing the algorithm based on distance
Based on these experiments, the following images show the object-detection results at each distance:
2. Calculate the distance of the object from the camera
The distance between the camera and a successfully detected object is calculated from the size of the bounding box obtained at detection. In general, the size of the bounding box varies with the distance at which the camera detects the object. The bounding box size is measured in image pixels using the equation for the perimeter of a rectangle:
Using equation 3, the bounding box perimeter is calculated manually for each real object distance from the camera. The bounding box values for several distances are as follows.
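Assuming equation 3 is the standard rectangle-perimeter formula, the calculation can be sketched as follows; the box size used is an illustrative value, not one of the paper's measurements.

```python
def bounding_box_perimeter(w, h):
    """Perimeter of a detected bounding box in image pixels: P = 2 * (w + h)."""
    return 2 * (w + h)

# Hypothetical bounding box of 64x128 pixels.
p = bounding_box_perimeter(64, 128)  # 2 * (64 + 128) = 384 pixels
```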
Table 2. Bounding box size at each distance
From the data that has been obtained, a regression can then be fitted. The polynomial regression obtained from the data is as follows.
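The regression step can be sketched with NumPy; the (distance, perimeter) pairs below are hypothetical stand-ins for Table 2, since the actual measured values are not reproduced here, and the polynomial degree of 2 is also an assumption.

```python
import numpy as np

# Hypothetical (distance, perimeter) pairs standing in for Table 2.
distances = np.array([3.0, 4.0, 5.0, 6.0, 7.0, 8.0])        # meters
perimeters = np.array([900., 640., 500., 410., 350., 310.])  # pixels

# Fit a 2nd-degree polynomial mapping bounding-box perimeter -> distance.
coeffs = np.polyfit(perimeters, distances, deg=2)
predict = np.poly1d(coeffs)

# Estimated distance for a newly measured bounding-box perimeter.
d = predict(520.0)
```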
Based on the regression obtained, the test is then carried out on 10 different objects to analyze the regression values. From the experiments that have been carried out, it can be concluded that:
- When the object is detected at a distance of 3 to 4 meters in real conditions, the experimental results agree with the regression values obtained.
- When the detected object is at a distance of 5 meters, the system still produces a correct average value, with 98% agreement with the regression, so the result can still be considered good.
- When the detected object is at a distance of 6 to 8 meters, the results no longer agree with the regression values; at these distances, different distances sometimes produce the same bounding box value.
Based on these results, the bounding box is used for distance measurement only at horizontal distances of 3–5 meters in the goods-placement process.
Based on the research that has been done, the Histogram of Oriented Gradients algorithm successfully detects the presence of objects horizontally up to a real distance of 8 meters. These results are obtained using the parameter values winStride = (16, 16) and padding = (32, 32), which give an accuracy rate of 92%.
In the distance-calculation process, the bounding box value varies with the distance between the camera and the detected object. Based on the tests that have been carried out, the distance equation implemented in the program code is accurate at distances of 3 to 5 meters. At distances of 6 to 8 meters, the calculated distance is inaccurate because the bounding box size at those distances sometimes takes the same value.