Abstract
Artificial intelligence (AI) opens new possibilities for processing and analysing large, heterogeneous historical data corpora in a semi-automated way. The Ottoman Nature in Travelogues (ONiT) project applies a fine-tuned Contrastive Language–Image Pre-Training (CLIP) model for retrieving illustrations of nature representations in digitized early book prints. In this paper, we present preliminary results of our work, including a curated and annotated dataset of more than 8,000 images of nature representations, and the CLIP-based text–image exploration tool ONiT Similarity Explorer. A preliminary evaluation confirms the potential of vision-language models for retrieving specific contents from large image collections in the cultural heritage and digital humanities domains. While our tests show mixed results, the model already works reasonably well for exploring large and unlabelled image collections, and for retrieving various nature representations in our dataset.
| Original language | English |
|---|---|
| Article number | fqaf082 |
| Pages (from - to) | 1-18 |
| Number of pages | 19 |
| Journal | Digital Scholarship in the Humanities |
| DOIs | |
| Publication status | Published - 7 Sept. 2025 |
| Event | Digital Humanities Conference 2023 - Messecongress Graz convention centre, Graz, Austria. Duration: 10 July 2023 → 14 July 2023 |
Research Field
- Multimodal Analytics