ABSTRACT

In this chapter, the authors compute the exact output of the model by extracting the center of the final convolution output. They remove 256 pixels from each side of the spatial dimensions of the output tensor, named prediction. This cropping is equivalent to using a model that takes virtual input patches of 576 × 576 pixels and produces an output of 64 × 64 pixels, since 576 − 2 × 256 = 64. In practice, the margin should be large enough to avoid blocking artifacts, yet small enough that the shrinking of the model output remains reasonable. In the OTBTF formalism, the network hence has a receptive field of 576 × 576 pixels and an expression field of 64 × 64 pixels.
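
The cropping arithmetic can be illustrated with a minimal sketch. It is not the authors' implementation: it uses NumPy rather than the chapter's model code, assumes a channels-last layout (batch, height, width, channels), and the helper name crop_center and the channel count of 4 are hypothetical choices made for illustration only.

```python
import numpy as np

MARGIN = 256  # pixels removed from each side, as stated in the text


def crop_center(prediction: np.ndarray, margin: int = MARGIN) -> np.ndarray:
    """Keep only the center of a (batch, height, width, channels) array.

    Hypothetical helper: the channels-last layout is an assumption.
    """
    height, width = prediction.shape[1:3]
    if height <= 2 * margin or width <= 2 * margin:
        raise ValueError("margin is too large for the spatial dimensions")
    # Drop `margin` pixels on each side of both spatial axes.
    return prediction[:, margin:height - margin, margin:width - margin, :]


# A dummy output covering a 576 x 576 patch (4 channels is an arbitrary choice).
prediction = np.zeros((1, 576, 576, 4), dtype=np.float32)
exact = crop_center(prediction)
print(exact.shape)  # (1, 64, 64, 4), since 576 - 2 * 256 = 64
```

A larger margin discards more border pixels and so reduces blocking artifacts, but each processed patch then yields a smaller usable output.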