Discriminative and Generative Model Learning for Video Object Tracking

doi:10.1201/9780429322990-12

ABSTRACT

Artificial intelligence is a dominant area of activity in both industry and academia. In video object tracking, the location and bounding box of an object are estimated as the object moves from one frame of a video sequence to the next. Video object tracking has three major components: an appearance model, a motion model, and localisation. The purpose of learning an appearance model is to obtain a unique representation of the target that distinguishes it from other objects. A generative appearance model is constructed and learned from the tracked visual objects in successive video frames. From a set of initial tracked samples, a subspace spanned by eigenbasis vectors is constructed using singular value decomposition. The generative appearance model achieves high accuracy, but it fails most of the time when the background is complex. When the target itself undergoes variations, a background component can be recognised as the target.
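The subspace construction described above can be illustrated with a minimal sketch. It assumes that each tracked sample is a vectorised image patch of fixed size, that the subspace keeps the top k left singular vectors of the mean-centred samples, and that a candidate is scored by its squared reconstruction error. The function names, the synthetic data, and the choice of k are illustrative assumptions and are not taken from the chapter.

```python
import numpy as np


def build_subspace(samples, k):
    """Build an eigenbasis subspace from initial tracked samples (assumed layout).

    samples: array of shape (n_samples, n_pixels), one vectorised patch per row.
    Returns the sample mean and an (n_pixels, k) orthonormal basis.
    """
    X = np.asarray(samples, dtype=float)
    mean = X.mean(axis=0)
    # Centre the data and place one sample per column before the SVD.
    U, _, _ = np.linalg.svd((X - mean).T, full_matrices=False)
    return mean, U[:, :k]


def reconstruction_error(patch, mean, basis):
    """Squared distance between a candidate patch and the subspace."""
    d = np.asarray(patch, dtype=float) - mean
    coeffs = basis.T @ d               # projection coefficients
    residual = d - basis @ coeffs      # part not explained by the subspace
    return float(residual @ residual)


# Synthetic stand-in for tracked patches: 20 samples of 64 pixels that lie
# close to a hypothetical 3-dimensional appearance subspace.
rng = np.random.default_rng(0)
true_basis = np.linalg.qr(rng.normal(size=(64, 3)))[0]
samples = 0.5 + rng.normal(size=(20, 3)) @ true_basis.T
samples += 0.01 * rng.normal(size=samples.shape)

mean, basis = build_subspace(samples, k=3)

target_like = 0.5 + rng.normal(size=3) @ true_basis.T  # resembles the target
background = rng.normal(size=64)                       # unrelated patch

err_target = reconstruction_error(target_like, mean, basis)
err_background = reconstruction_error(background, mean, basis)
```

In this sketch a candidate with a low reconstruction error is taken to resemble the target. A purely generative score of this kind uses no background information, which is consistent with the failure mode noted above for complex backgrounds.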