ABSTRACT

This chapter is intended to provide some practical uses-cases of CNNs in robotics using only visual data. The chapter begins with a brief introduction of CNNs and its advantages over the conventional ANNs followed by a detailed description of each of the submodules present in a standard CNN architecture for better understanding of the subsequent applications. While describing each and every module of the CNN architecture, we make sure that some small example codes written in python are provided in this chapter, which can help the reader for practical implementation of a CNN network. Later, in this chapter, we demonstrate two different applications of CNNs towards warehouse automation. Automation of warehouses demand accurate localization and segmentation of an object of interest which is further to be picked and placed by a robotic arm from/in a given location. These applications are involved with various visual challenges which make a conventional vision based technique almost impossible to segment and localize an object of interest. We thus, demonstrate here two such CNN based techniques for accurate localization and segmentation of object present in a cluttered and partially occluded environment. The first application mainly deals with detection of a rectangular bounding box for an object of interest, whereas, the second application is to precisely localize the object boundary for more accurate pick and place of an object. Training a CNN network for multi-class object detection and semantic segmentation demands a large amount of annotated data which is very tedious to obtain manually. To this end, we develop two different automatic annotation techniques and demonstrate their results for different degree of clutters with varying background and lighting conditions.