ABSTRACT

Very few shape-capture techniques work effectively for rapidly moving scenes. Among the few exceptions are depth from defocus [9] and stereo [5]. Structured-light stereo methods have shown particularly promising results for capturing depth maps of moving faces [6,11]. Using projected light patterns to provide dense surface texture, these techniques compute pixel correspondences and then derive depth maps by triangulation. Products based on these triangulation techniques are commercially available.*

Traditional one-shot triangulation methods [3,12] treat each time instant in isolation, computing spatial correspondences between pixels in a single pair of images captured at one moment in time. Although this allows moving scenes to be reconstructed, such methods typically provide limited shape detail and resolution. Better results may be obtained by considering how each pixel varies over time and using this variation as a cue for correspondence, an approach we call spacetime stereo.
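
To make the distinction concrete, the following is a minimal sketch of window matching over space and time. It is an illustration under stated assumptions, not the method's actual implementation: the function name `spacetime_ssd_disparity`, its parameters, the sum-of-squared-differences cost, the rectified image pair, and the assumption that disparity is roughly constant within the temporal window are all hypothetical choices made here. Setting `half_t=0` reduces the sketch to one-shot matching on a single image pair.

```python
import numpy as np

def spacetime_ssd_disparity(left, right, y, x, t, max_disp, half_xy=1, half_t=1):
    """Return the integer disparity d that best matches left[t, y, x]
    to right[t, y, x - d] using an SSD cost over a spacetime window.

    left, right: rectified image sequences of shape (T, H, W).
    Assumption: disparity is constant inside the window (hypothetical
    simplification for this sketch).
    """
    T, H, W = left.shape
    # Temporal extent of the window, clamped to the sequence bounds.
    t0, t1 = max(t - half_t, 0), min(t + half_t, T - 1) + 1
    # Spatial extent of the window; must lie fully inside the image.
    y0, y1 = y - half_xy, y + half_xy + 1
    x0, x1 = x - half_xy, x + half_xy + 1
    if y0 < 0 or y1 > H or x0 < 0 or x1 > W:
        raise ValueError("spatial window falls outside the image")

    ref = left[t0:t1, y0:y1, x0:x1].astype(np.float64)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        if x0 - d < 0:
            break  # candidate window would leave the right image
        cand = right[t0:t1, y0:y1, x0 - d:x1 - d].astype(np.float64)
        cost = np.sum((ref - cand) ** 2)  # SSD over space and time
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

With `half_t > 0`, each candidate match is scored on the temporal variation of the pixel neighborhood as well as its spatial appearance, which is the additional cue described above.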