Hierarchical Semantic Content Analysis and Its Applications in Multimedia Summarization and Browsing

doi:10.1201/b11723-25

ABSTRACT

Contents

15.1 Introduction
15.2 Review of Multimedia Content Analysis and Applications
15.3 Audiovisual Cues-Based Semantic Importance Analysis
    15.3.1 Audio Scene Segmentation and Classification
    15.3.2 Shot Clustering and Semantic Scene Importance
    15.3.3 Semantic Shot Importance
    15.3.4 Semantic Frame Importance
15.4 Hierarchical Multimedia Content Summarization for Rapid Browsing
15.5 Experiments and Discussions
15.6 Conclusions
References

15.1 Introduction

With the rapid growth of television (TV) content in digital libraries, the information redundancy of that content is increasing at an even higher pace: a large portion of TV material remains redundant even after editing and postprocessing. Consequently, efficient content management (for example, content classification, representation, retrieval, and summarization) is becoming a core task for TV content providers as well as end users. Traditionally, TV content has been managed manually, which is expensive in terms of time and labor, especially given the rapid increase in TV content. It is therefore critical to develop automatic content management solutions. In addition, to enable end users of TV services to deal with large amounts of content, the content has to be presented in a form that facilitates comprehension and allows the relevance of individual segments to be judged quickly.

A practical difficulty in automatic content management, however, is how to bridge the gap between low-level content characteristics, which can be processed and recognized by machines, and the high-level semantics requested by users, which machines cannot easily process because they lack cognitive processing abilities similar to those of human beings. Semantic content analysis methods are therefore required to connect low-level characteristics to the semantic meanings of TV content [1-3].