ABSTRACT

Object detection is a key capability required by most computer and robot vision systems, and recent research in this field has made significant strides. Computer vision research as a whole has advanced considerably over the last few years. Part of this success can be attributed to the adoption and adaptation of machine learning methods, and part to the invention of novel representations and models for specific computer vision problems or to the development of efficient solutions. Object detection is one of the areas that has improved significantly. Progress also continues steadily as a consequence of the increased computational capacity of current processors, and further advances of this kind will improve the prospects of object detection and related tasks.

Given a collection of object classes, object detection consists of determining the location and size of all object instances, if any, that are present in an image or in a frame of an image sequence (e.g., a video). Thus, the goal of an object detector is to locate all instances of one or more given object classes independent of scale, position, posture, camera view, partial occlusion, or lighting conditions. Object detection is often the first stage in a computer vision system because it allows additional information about the detected object and the scene to be obtained. Once an object instance (e.g., a face or a body) has been detected, it is possible to: (i) recognize the specific instance; (ii) track the object over an image sequence (e.g., in a video); and (iii) extract additional information about the object (e.g., determine the subject’s gender).

It is also possible to: (i) infer the presence or location of other objects in the image (e.g., an obstacle may be near a face and at a comparable scale); and (ii) better estimate other information about the scene (e.g., the type of setting).

In this manuscript, we discuss how detecting a single object class helps reduce the amount of video analysis required to record and detect objects (e.g., animals or birds) that appear in front of the recording device. The single-object detection approach is preferred here because an algorithm that can detect a single object precisely may also be able to detect multiple objects in front of it. Since the graph of the object's duration of stay is plotted with respect to the single detected object, it can be inferred that if the algorithm can detect a single object within a collection of objects, the graph will be plotted for that object accordingly. As a result, less effort is required to analyze very long video recordings.
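The duration-of-stay idea can be illustrated with a minimal sketch. It assumes a detector has already produced one boolean flag per video frame saying whether the object of interest was found, and that the frame rate is known. The names `presence_intervals` and `total_stay` and the sample flags are hypothetical and used only for illustration; they are not taken from the method described here.

```python
from typing import List, Tuple


def presence_intervals(detected: List[bool], fps: float) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) intervals during which the object was detected.

    `detected` holds one flag per frame (assumed output of some detector);
    `fps` is the frame rate of the recording.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    intervals = []
    start = None  # index of the first frame of the current run, if any
    for i, flag in enumerate(detected):
        if flag and start is None:
            start = i  # object has just appeared
        elif not flag and start is not None:
            intervals.append((start / fps, i / fps))  # object has just left
            start = None
    if start is not None:
        # Object was still present when the recording ended.
        intervals.append((start / fps, len(detected) / fps))
    return intervals


def total_stay(intervals: List[Tuple[float, float]]) -> float:
    """Total time in seconds that the object was present."""
    return sum(end - start for start, end in intervals)


if __name__ == "__main__":
    # Hypothetical per-frame flags for an 8-frame clip at 2 frames per second.
    flags = [False, True, True, True, False, False, True, True]
    spans = presence_intervals(flags, fps=2.0)
    print(spans)
    print(total_stay(spans))
```

Only the frames inside the returned intervals need to be reviewed, which is how a long recording could be reduced to a few short segments; the intervals could also be passed to any plotting tool to draw the duration-of-stay graph.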

Object detection has been used in a wide range of applications, each with its own requirements, such as processing time, robustness to occlusion, rotation invariance (e.g., for in-plane rotations), and detection under posture changes. While many applications involve detecting a single object class (e.g., faces) from a single perspective (e.g., frontal faces), others require detecting several object classes (e.g., vehicles and birds) or a single class from several viewpoints (e.g., side and frontal views of vehicles). In general, most systems can detect only a single object class from a limited number of perspectives and positions.