Abstract
We introduce PHISHWEB, a novel approach to website phishing detection that detects and categorizes malicious websites through a progressive, multi-layered analysis. PHISHWEB detects forged domains, such as those produced by homoglyph substitution and typosquatting, as well as domains generated automatically by domain generation algorithms (DGAs). PHISHWEB focuses on lexicographic analysis of the domain name itself, which improves the applicability and scalability of the approach. Preliminary results from applying PHISHWEB to multiple open domain-name datasets show precision and recall above 90%. We additionally extend PHISHWEB’s detection of DGA domains with Machine Learning (ML), using a small set of highly specialized lexicographic domain features. For a false alarm rate below 1%, the ML extension of PHISHWEB improves on both the non-ML PHISHWEB DGA detector and the state of the art by at least 60%, achieving precision and recall of 93.1% and 84.8%, respectively. Finally, we present preliminary results from applying PHISHWEB to real, in-the-wild DNS requests collected at large mobile and fixed-line operational networks, and discuss some of the findings.
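To make the idea of lexicographic analysis of a domain name concrete, the following is a minimal Python sketch and not PHISHWEB's implementation. The abstract does not list the features or thresholds PHISHWEB uses, so every function name, feature (length, character entropy, digit and vowel ratios) and the edit-distance threshold below is an assumption chosen for illustration. The sketch also only inspects the first label of the name, which is a simplification.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def first_label(domain: str) -> str:
    """Lower-cased first label of a domain (simplification: ignores subdomains)."""
    return domain.lower().rstrip(".").split(".")[0]


def lexical_features(domain: str) -> dict:
    """Hypothetical lexical features; not the feature set used in the paper."""
    label = first_label(domain)
    length = len(label)
    return {
        "length": length,
        "entropy": shannon_entropy(label),
        "digit_ratio": sum(c.isdigit() for c in label) / length if length else 0.0,
        "vowel_ratio": sum(c in "aeiou" for c in label) / length if length else 0.0,
    }


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def near_miss(domain: str, known_names: list, max_distance: int = 1) -> list:
    """Known names within max_distance edits of the label, excluding exact matches."""
    label = first_label(domain)
    return [n for n in known_names if 0 < edit_distance(label, n) <= max_distance]
```

In a sketch like this, `near_miss("paypa1.com", ["paypal", "google"])` would flag `paypal` as a likely typosquatting target, while the dictionary from `lexical_features` could serve as input to a classifier for DGA-like names. Homoglyph detection would additionally need a mapping of visually confusable characters, which is omitted here.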
Original language | English |
---|---|
Title of host publication | IEEE 9th International Conference on Network Softwarization (NetSoft) |
Place of Publication | 2023 |
Pages | 252 |
Number of pages | 256 |
ISBN (Electronic) | 979-8-3503-9980-6 |
Publication status | Published - 13 Jul 2023 |
Event | 9th IEEE International Conference on Network Softwarization, NetSoft 2023 - Duration: 19 Jun 2023 → 23 Jun 2023 |
Conference
Conference | 9th IEEE International Conference on Network Softwarization, NetSoft 2023 |
---|---|
Period | 19/06/23 → 23/06/23 |
Research Field
- Former Research Field - Data Science
Keywords
- Phishing Websites
- DNS
- Lexicographic Analysis
- Machine Learning