ABSTRACT

Optimization methods used in training deep neural networks form a research area that has gained tremendous attention. The aspects dealt with most are the mechanisms for navigating and optimizing the learning space of an algorithm. Research studies show that while hyper-parameters are tuned to enhance model performance, specific control mechanisms are incorporated to handle over-fitting. With the exponential growth of data and the corresponding increase in model complexity, identifying the optimal parameter configuration in an n-dimensional space with m^n possible configurations (m candidate values for each of n parameters) sets the base and pace for this rigorous research. Equally important are the loss functions, the error rates, and the loss associated with each candidate configuration. Hence, optimization techniques, or optimizers (convex and non-convex), and their variants form an important constituent of study from a mathematical perspective. Training of deep neural networks using a fuzzy goal programming approach is also discussed. The chapter further deliberates on the mathematical foundations of techniques for handling over-fitting, such as regularization, dropout, and early stopping, and on the optimization of hyper-parameters, chiefly the kernel size, the number of network layers, the activation and loss functions, and several others that affect model performance. A comparative study of computationally intensive, expensive, and adaptive optimizers also forms part of this work. A correlation study relating loss functions to their corresponding error types offers guidance for both the development of optimization methods and machine learning research. The chapter follows an example-driven approach in discussing the mathematical foundations and frameworks that underlie these optimization techniques.