This chapter proposes a method for extracting hierarchical multiscale spatial features (HMSFs) for the classification of very-high-resolution (VHR) remote sensing (RS) images. The study is motivated by the ability of convolutional neural networks (CNNs) to extract high-level spatial features without expert knowledge of the data. A novel CNN architecture, namely hierarchical CNNs (HCNNs), is proposed; it contains several pixel-level CNNs (PCNNs), called PCNNs-α, PCNNs-β, and PCNNs-γ, which extract HMSFs from the pyramid images obtained by the Gaussian pyramid method. To compensate for the fact that CNNs on their own do not take the original spectral information into account, the HMSFs from all scales are upsampled to the size of the raw image, combined with the original spectral features (OSFs), and fed to an SVM classifier for the classification of the VHR RS image.
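The data flow described above (pyramid images, per-scale pixel features, upsampling to the raw size, stacking with the OSFs) can be illustrated with a minimal NumPy sketch. It is not the chapter's implementation: the function names are hypothetical, `pixel_features` is a trivial placeholder standing in for a trained PCNN, and the 5-tap binomial blur, the factor-2 downsampling, and the nearest-neighbour upsampling are all assumptions made only to keep the example self-contained.

```python
import numpy as np


def gaussian_blur(img):
    """Separable 5-tap binomial blur (a common Gaussian approximation), per band."""
    k = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0
    h, w, _ = img.shape
    padded = np.pad(img, ((2, 2), (2, 2), (0, 0)), mode="reflect")
    rows = sum(k[i] * padded[i:i + h, :, :] for i in range(5))  # blur along rows
    return sum(k[j] * rows[:, j:j + w, :] for j in range(5))    # blur along columns


def gaussian_pyramid(img, levels=3):
    """Return [raw, 1/2 scale, 1/4 scale, ...] by repeated blur and 2x decimation."""
    pyramid = [img.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(gaussian_blur(pyramid[-1])[::2, ::2, :])
    return pyramid


def pixel_features(level):
    """PLACEHOLDER for a trained PCNN (assumption): two per-pixel features."""
    return np.stack([level.mean(axis=2), level.std(axis=2)], axis=2)


def upsample_to(feat, shape):
    """Nearest-neighbour upsampling (assumed) of a feature map to (height, width)."""
    fy = -(-shape[0] // feat.shape[0])  # ceiling division
    fx = -(-shape[1] // feat.shape[1])
    up = np.repeat(np.repeat(feat, fy, axis=0), fx, axis=1)
    return up[:shape[0], :shape[1], :]


def combined_features(img, levels=3):
    """Stack upsampled features from every scale with the original spectral bands."""
    h, w, _ = img.shape
    feats = [upsample_to(pixel_features(lvl), (h, w))
             for lvl in gaussian_pyramid(img, levels)]
    stacked = np.concatenate(feats + [img.astype(float)], axis=2)
    return stacked.reshape(h * w, -1)  # one row per pixel
```

Each row of the returned matrix is the combined feature vector of one pixel, with the original spectral bands in the last columns; such rows are what would be passed to an SVM classifier (for example scikit-learn's `SVC`, which is not used here to avoid an extra dependency).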