ABSTRACT

This chapter provides an introduction to semantic segmentation of remote sensing images. It discusses semantic segmentation using a U-Net-like architecture, a popular model for this task. Unlike the patch-based classification approach, semantic segmentation employs models that preserve the spatial resolution of the output. The chapter helps the reader understand that such models must be trained on densely annotated images: the training step requires image patches that are fully annotated, i.e. each pixel of the image has a corresponding class label. It presents a simple framework to prepare, select and extract those patches from the remote sensing images and from a rasterized OpenStreetMap vector layer of buildings. The chapter also presents a method to properly apply fully convolutional models to very large images, avoiding blocking artifacts.
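The tiled-inference idea mentioned above can be sketched as follows. This is a minimal, hypothetical helper (not code from the chapter; the function name, tile size, and margin are illustrative) that runs a fully convolutional model on overlapping tiles of a large single-channel image and keeps only the central region of each prediction, which suppresses blocking artifacts at tile borders:

```python
import numpy as np

def predict_tiled(image, model, tile=256, margin=32):
    """Predict over a large 2D image tile by tile.

    Tiles overlap by `margin` pixels on each side; only the central
    (tile - 2 * margin) region of each prediction is kept, so tile
    borders never appear in the stitched output.
    Assumes `model` maps a (tile, tile) array to a same-size prediction.
    """
    h, w = image.shape[:2]
    step = tile - 2 * margin
    # Reflect-pad so every central region is fully covered by some tile.
    padded = np.pad(image, ((margin, tile), (margin, tile)), mode="reflect")
    out = np.zeros((h, w), dtype=np.float32)
    for y in range(0, h, step):
        for x in range(0, w, step):
            pred = model(padded[y:y + tile, x:x + tile])
            core = pred[margin:margin + step, margin:margin + step]
            # Crop the last row/column of tiles to the image extent.
            out[y:y + step, x:x + step] = core[:h - y, :w - x]
    return out
```

With an identity `model`, the stitched output reproduces the input exactly, which is a quick sanity check that the tile-to-image coordinate mapping is correct.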