Deep Learning for Multimedia Content Analysis

doi:10.1201/b21638-14

ABSTRACT

Conventional multimedia computing techniques depend mainly on features extracted from multimedia content, including text, audio, image, and video data from different domains. Deep learning has had a great impact on a variety of applications, including image classification, image clustering, speech recognition, and natural language processing, all of which apply to multimedia data. Deep learning architectures are composed of multiple levels of nonlinear operations. Searching the parameter space of deep architectures is a complex task; however, advanced learning algorithms, such as those for deep belief networks, have recently been proposed to tackle this search. This chapter discusses the principles and motivations behind deep learning algorithms such as deep belief networks, restricted Boltzmann machines, and the conventional deep neural network. It deals with the adaptation of deep learning methods to multimedia content analysis, ranging from low-level data such as audio and images to high-level semantic data such as natural language. The chapter also addresses challenges and future directions.