Tokenization | 6 | AI and Language in the Urban Context

ABSTRACT

This chapter examines how large language models (LLMs) process language by splitting text into tokens, units that are often smaller than whole words, and draws parallels with how urban environments are represented through data. Just as LLMs rely on tokens to process language, urban maps rely on token-like elements such as labels and coordinates to represent physical spaces and features. The chapter shows how these parallels extend to human perception and automated image recognition, both of which identify and classify patterns and features. It argues that tokenization in LLMs mirrors broader cognitive and analytical processes at work in urban environments, and that this understanding can enhance both language processing and urban analysis.