ABSTRACT

Key terms related to images, video, and vision are defined, including image and video processing, computer vision, machine vision, pattern recognition, and robot vision. Subtle differences in the meanings of seemingly similar terms are pointed out. The image classification algorithms discussed include k-means clustering, the iterative self-organizing data analysis technique algorithm (ISODATA), and the support vector machine (SVM). The image classification procedure is explained. The ImageNet 2012 challenge, the deep learning revolution in image classification, and the ImageNet hierarchical database for vision research are described. Among the deep learning models for image classification, AlexNet and Visual Geometry Group-16 (VGG-16) are considered pioneering convolutional neural network (CNN) architectures in computer vision. Inception v1, another deep CNN architecture, is also covered. The design of artificial neural networks can be automated with neural architecture search (NAS). Progressive Neural Architecture Search (PNAS), a method for learning CNN structures, is discussed.