Abstract
The rapid detection of Domain Generation Algorithm (DGA) domains plays a critical role in mitigating malware propagation and its potential impact, as well as in limiting the coordination of botnet activity through command and control (C&C) servers. We introduce DeepD2V, a deep-learning-driven approach for highly accurate detection of DGA-generated domains that leverages word embeddings learned from domain names observed in DNS queries or browsing URLs. Domain embeddings are constructed with Dom2Vec (D2V), a novel technique that builds on word embedding models (e.g., Word2Vec) to map words and tokens extracted from domain names into highly expressive representations. DeepD2V integrates a deep Convolutional Neural Network (CNN) architecture to make the most of the D2V embeddings, realizing unprecedented detection performance at low false-alarm rates. Through experimental evaluation on a large-scale dataset (almost 400,000 domains) comprising 25 distinct families of DGA domains, we demonstrate the superiority of D2V embeddings over the standard n-gram-based features commonly used in the literature for DGA detection. We show that DeepD2V significantly outperforms current state-of-the-art approaches for DGA detection and analysis based on shallow learning and lexicographic analysis, achieving precision and recall above 97%.
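The contrast the abstract draws between character n-gram features and token-level inputs for word embeddings can be illustrated with a minimal sketch. The function names below (`domain_labels`, `char_ngrams`, `ngram_counts`, `tokenize`) are hypothetical and not taken from the paper, and the tokenizer is a simple stand-in: the abstract does not specify how Dom2Vec extracts words and tokens, so this should not be read as the authors' procedure.

```python
import re
from collections import Counter


def domain_labels(domain: str) -> list[str]:
    """Lowercase a domain name and split it into its dot-separated labels."""
    return [label for label in domain.lower().strip(".").split(".") if label]


def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Return all overlapping character n-grams of `text` (empty if too short)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_counts(domain: str, n: int = 2) -> Counter:
    """Count character n-grams per label, so that no n-gram spans a dot."""
    counts = Counter()
    for label in domain_labels(domain):
        counts.update(char_ngrams(label, n))
    return counts


def tokenize(domain: str) -> list[str]:
    """Illustrative tokenizer (an assumption, not Dom2Vec itself):
    keep runs of ASCII letters and runs of digits as separate tokens."""
    tokens = []
    for label in domain_labels(domain):
        tokens.extend(re.findall(r"[a-z]+|[0-9]+", label))
    return tokens
```

Under these assumptions, `ngram_counts` yields the kind of bag-of-n-grams vector used as a baseline feature set, while the token lists from `tokenize` could serve as the "sentences" on which a word embedding model such as Word2Vec is trained.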
| Original language | English |
|---|---|
| Title | 2024 IEEE International Conference on Machine Learning for Communication and Networking |
| Subtitle | ICMLCN |
| Pages | 164-170 |
| Number of pages | 7 |
| ISBN (electronic) | 979-8-3503-4319-9 |
| DOIs | |
| Publication status | Published - 15 Aug 2024 |
| Event | 2024 IEEE International Conference on Machine Learning for Communication and Networking - Stockholm, Sweden. Duration: 5 May 2024 → 8 May 2024 |
Conference
| Conference | 2024 IEEE International Conference on Machine Learning for Communication and Networking |
|---|---|
| Short title | ICMLCN 2024 |
| Country/Territory | Sweden |
| City | Stockholm |
| Period | 5/05/24 → 8/05/24 |
Research Field
- Multimodal Analytics
Fingerprint
Explore the research topics of "DeepD2V - Deep Learning and Domain Word Embeddings for DGA based Malware Detection". Together they form a unique fingerprint.