Dom2Vec - Detecting DGA Domains Through Word Embeddings and AI/ML-Driven Lexicographic Analysis

Lucas Torrealba (Author and Speaker), Pedro Casas-Hernandez, Javier Bustos-Jiménez, Germán Capdehourat, Mislav Findrik

Publication: Contribution to book or conference proceedings; conference talk with paper in proceedings; peer-reviewed

Abstract

The timely identification of DNS queries to Domain Generation Algorithm (DGA) domains plays a critical role in mitigating malware propagation and its potential impact, especially in thwarting coordinated botnet activity. We introduce Dom2Vec, an innovative approach for swiftly detecting DGA-generated domains by leveraging lexicographic features exclusively derived from the observed domain names in DNS queries. Dom2Vec leverages word embeddings to map tokens extracted from domain names into highly expressive representations. These representations are then combined with a reputation-based scoring system for domain names, which utilizes the co-occurrence frequency of n-grams in relation to a list of whitelisted domains. The fusion of domain embeddings, reputation scores, and other meaningful lexicographic features derived from domain names provides robust domain name representations for AI/ML-driven detection of DGAs. Through experimental evaluation on a dataset comprising 25 distinct families of DGA domains, we demonstrate that Dom2Vec significantly outperforms current state-of-the-art approaches for DGA detection and analysis, improving our previous detection system based on reputation scores by at least 30%, for a false-alarm rate below 1%.
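The reputation-score idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the n-gram length, the scoring formula, or the whitelist, so the trigram size, the log-scaled average, the use of the leftmost label only, and the function names `ngrams`, `build_ngram_counts`, and `reputation_score` are all assumptions made for illustration. The word-embedding component of Dom2Vec is not covered here.

```python
from collections import Counter
import math


def ngrams(label: str, n: int = 3) -> list[str]:
    """Return the character n-grams of a domain label (empty if too short)."""
    return [label[i:i + n] for i in range(len(label) - n + 1)]


def build_ngram_counts(whitelist: list[str], n: int = 3) -> Counter:
    """Count how often each n-gram occurs across whitelisted domains.

    Assumption: only the leftmost label of each domain is used,
    which is a simplification chosen for this sketch.
    """
    counts: Counter = Counter()
    for domain in whitelist:
        label = domain.lower().split(".")[0]
        counts.update(ngrams(label, n))
    return counts


def reputation_score(domain: str, counts: Counter, n: int = 3) -> float:
    """Average log-scaled whitelist frequency of the domain's n-grams.

    Hypothetical scoring rule: n-grams never seen in the whitelist
    contribute 0, so random-looking names tend to score low.
    """
    label = domain.lower().split(".")[0]
    grams = ngrams(label, n)
    if not grams:
        return 0.0
    return sum(math.log1p(counts[g]) for g in grams) / len(grams)


# Toy whitelist, for illustration only.
whitelist = ["google.com", "facebook.com", "goodreads.com", "booking.com"]
counts = build_ngram_counts(whitelist)

benign_like = reputation_score("goodbook.com", counts)
dga_like = reputation_score("xqzvkjwp.net", counts)
```

With this toy whitelist, `benign_like` is greater than `dga_like`, because the random-looking name shares no trigrams with the whitelisted labels. In a detector along the lines the abstract describes, such a score would be one feature among several, alongside the domain embeddings and other lexicographic features.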
Original language: English
Title: 2023 19th International Conference on Network and Service Management (CNSM)
Pages: 1-5
Number of pages: 5
ISBN (electronic): 978-3-903176-59-1
Publication status: Published - 28 Nov 2023
Event: 2023 19th International Conference on Network and Service Management (CNSM) - Niagara Falls, ON, Ontario, Canada
Duration: 30 Oct 2023 to 2 Nov 2023

Conference

Conference: 2023 19th International Conference on Network and Service Management (CNSM)
Country/Territory: Canada
City: Niagara Falls, Ontario
Period: 30/10/23 to 2/11/23

Research Field

  • Former Research Field - Data Science
