ABSTRACT

This chapter reviews recent research on video modeling and retrieval, including semantic concept detection, semantic video retrieval, and interactive video retrieval that takes advantage of user interaction. It identifies open research issues based on state-of-the-art research efforts. The chapter explains the basic approach based on supervised learning and then briefly reviews advanced approaches to semantic concept detection, such as semisupervised learning, multilabel learning, and cross-domain learning methods. It describes a few representative systems on the TRECVID video corpus and highlights their distinct characteristics. Despite the success of existing semantic concept detection techniques, handling large-scale semantic concepts with large-scale video data remains difficult. Existing video modeling and retrieval techniques mostly focus on videos in specific domains, such as broadcast news and documentary videos. Increasingly popular video-sharing services such as YouTube have attracted millions of users and are perhaps the largest and most heterogeneous publicly available social video archives.