A pose retrieval system based on skeleton contrastive learning

doi:10.1201/9781003460763-51

ABSTRACT

This study proposes a human pose retrieval system that retrieves images whose poses are similar to a query pose provided by the user. Such a system has a wide range of applications: it can supply illustrators and artists with reference poses to imitate, and it can support video understanding and AI-generated content (AIGC). Existing research on pose retrieval falls into two categories: (1) describing poses with text, or (2) manually extracting pose features, projecting them into a low-dimensional space, and searching for similar poses in that space. Because text describes poses imprecisely and manually extracted features cannot capture the rich semantics of poses, both approaches yield inaccurate search results. In this study, we propose a human pose retrieval method based on graph neural networks (GNNs) and contrastive learning. We model the human pose with a GNN and train the GNN representation with contrastive learning. We explain the design principles of our method and report experiments that verify its effectiveness and efficiency.
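As a rough illustration of the idea summarized above, the following minimal sketch encodes a skeleton with one graph-convolution layer, scores two views of the same pose with an InfoNCE-style contrastive loss, and retrieves by cosine similarity. Everything specific here is an assumption made for illustration and is not taken from this paper: the 5-joint skeleton and its edges, the single untrained layer, the jitter augmentation, the temperature, and the function names.

```python
import numpy as np

# Hypothetical 5-joint skeleton; joint 1 (torso) connects to all others.
NUM_JOINTS = 5
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]


def normalized_adjacency(num_joints, edges):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of the skeleton graph."""
    a = np.eye(num_joints)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def encode(poses, a_hat, w):
    """One graph-convolution layer, mean pooling over joints, L2 normalization.

    poses: (batch, joints, 2) joint coordinates; w: (2, dim) weight matrix.
    """
    h = np.maximum(a_hat @ poses @ w, 0.0)  # (batch, joints, dim), ReLU
    z = h.mean(axis=1)                      # (batch, dim)
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-8)


def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss: row i of z1 should match row i of z2, not the other rows."""
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))


def retrieve(query, gallery, k=3):
    """Indices of the k gallery embeddings most similar to the query."""
    return np.argsort(-(gallery @ query))[:k]


rng = np.random.default_rng(0)
a_hat = normalized_adjacency(NUM_JOINTS, EDGES)
w = rng.normal(size=(2, 8))                 # untrained weights, illustration only
poses = rng.normal(size=(4, NUM_JOINTS, 2))
# Two "views" of each pose via small coordinate jitter (assumed augmentation).
view1 = poses + 0.01 * rng.normal(size=poses.shape)
view2 = poses + 0.01 * rng.normal(size=poses.shape)
loss = info_nce(encode(view1, a_hat, w), encode(view2, a_hat, w))
```

In an actual system the weights would be trained by minimizing this loss, and the gallery embeddings would be computed once and indexed for search; the sketch only shows how the pieces fit together.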