TY - GEN
T1 - Investigating Visual Localization Using Geospatial Meshes
AU - Vultaggio, Francesco
AU - Fanta-Jende, Phillipp
AU - Schörghuber, Matthias
AU - Kern, Alexander
AU - Gerke, Markus
PY - 2024/12/14
Y1 - 2024/12/14
AB - This paper investigates the use of geospatial mesh data for visual localization, focusing on city-scale aerial meshes as map representations for locating ground-level query images captured by smartphones. Visual localization, essential for applications such as robotics and augmented reality, traditionally relies on Structure-from-Motion (SfM) reconstructions or image collections as maps. However, mesh-based approaches offer dense spatial representation, memory efficiency, and real-time rendering capabilities. In this work, we evaluate initialization strategies, image matching techniques, and pose refinement methods for mesh-based localization pipelines, comparing the performance of both traditional and deep-learning-based techniques in image matching between real and synthetic views. We created a dataset from nadir and oblique aerial imagery and accurately georeferenced smartphone images to test cross-modal localization. Our findings demonstrate that combining global feature retrieval with GNSS-based spatial filtering yields significant improvements in accuracy and efficiency, achieving submeter positional and subdegree rotational errors. This study advances scalable visual localization using meshes and highlights the potential of integrating smartphone GNSS data for improved performance in urban environments.
UR - http://dx.doi.org/10.5194/isprs-archives-xlviii-2-w8-2024-447-2024
DO - 10.5194/isprs-archives-xlviii-2-w8-2024-447-2024
M3 - Conference Proceedings with Oral Presentation
T3 - The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
BT - The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
ER -