Geospatial datasets come in diverse forms, including spatiotemporal data, imagery, time series, graphs, vector data, geometries, sensor data, point clouds, elevation models, location and mobility data, and more. Although they are different representations of the same ground reality, and numerous tasks in the geospatial domain can benefit from multimodal pattern recognition, fusing these datasets is inherently challenging. Compared to traditional heuristics-based methods, recent advances in artificial intelligence (AI) combined with large-scale data processing offer superior approaches such as deep learning-based machine perception, self-supervised learning, and multimodal learning for solving these tasks. To apply these techniques reliably and at scale, and to foster deep collaboration between technical and non-technical stakeholders in solving diverse business-critical problems, it is imperative to build a geospatial AI platform that can process and fuse multimodal data and that supports end-to-end machine learning, model building, experimentation, evaluation, and model deployment at scale. In this chapter, we describe the components that such a platform should have, including complex data handling and feature processing pipelines, a feature platform (that stores various types of useful data derivatives, aggregates, embeddings, and other semantically meaningful feature representations), a machine learning kernel for training and predictions, a scalable data processing mechanism, experimentation management, and an intuitive user interface for wider adoption. As a case study, we present our no-code geospatial AI platform called Trinity, which is based on upstream complex spatiotemporal dataset transformations and a deep learning kernel with state-of-the-art computer vision algorithms to adopt a data-centric approach instead of a model-centric one.
The versatility thus achieved lets users represent and fuse disparate data modalities and formulate diverse geospatial problems in a streamlined fashion, paving the way for rapid prototyping and experimentation and enabling quick productionization of trained machine learning models.