ABSTRACT

Feature selection and anchor optimization are important tasks for achieving accurate object detection at frame rates suitable for real-time deployment. The bounding boxes associated with detected objects play a crucial role in determining the distance of objects in the current image frame because they carry spatial pixel coordinates, so a fast and simple technique for estimating distance from these coordinates is essential. The present work addresses critical limitations associated with these tasks. A novel multi-channel feature representation, termed the “array of processed channels” (AOPC), is proposed to replace standard red-green-blue (RGB) images. A new image corpus is created using the AOPC and used to train an optimized, custom You Only Look Once (YOLO)-based deep learning object detection framework. A further issue, addressed prior to the training phase, is the selection and optimization of anchor boxes based on manually annotated ground truth data. Finally, the pixel coordinates of the bounding boxes of detected objects are transformed into real-world coordinates, enabling estimation of each object’s actual distance from the ego vehicle. The proposed techniques show promising results compared with existing approaches.