ABSTRACT

Videos and images often contain text that carries useful information for indexing, retrieval, automatic annotation, and structuring of images. Extracting this information from a digital video proceeds in several phases. This chapter explains each phase of text extraction in detail, together with the approaches used in it. The phases are, in order, preprocessing and segmentation, detection, localization, tracking, extraction, and recognition. The chapter also discusses which techniques are suitable for a given video type and phase. Once these techniques have been applied, the text in a video sequence can be extracted to provide useful information about its content. Finally, the chapter addresses the extraction of text information from video (such as news videos) and multimodal mining of the same video content.