Analysis of statistical properties of variables in log data for advanced anomaly detection in cyber security

Research output: Contribution to journal › Article › peer-review


Log lines consist of static parts, which characterize their structure and enable the assignment of event types, and event parameters, i.e., variable parts that provide specific information on system processes, such as host and user names, IP addresses, and file operations. Many detection approaches focus only on anomalous event type occurrences, i.e., they parse log lines to derive unique event identifiers and subsequently detect anomalies in event sequences or event count vectors, but neglect the variable parts of log lines entirely during analysis. This is especially problematic when monitoring strongly structured log data that contains only a small number of distinct event types, for example, logs that consist of strict key-value pairs, i.e., parameters that occur consistently throughout all log lines, as is the case in access and audit logs. Thus, novel approaches are required that focus on the analysis of log lines' variable parts. In this paper, we propose the variable type detector (VTD), a novel unsupervised approach that autonomously analyzes variable log line parts to enable anomaly detection. It assigns a data type to each variable, including probability distributions for discrete and continuous variables. The VTD raises an alarm if a variable's data type changes. Furthermore, it implements a robust indicator function that reduces false positives by tracking the data type history of each variable and reporting only significant data type changes. Additionally, an event indicator enables event-based anomaly detection by taking into account the data types of all variables of a single event type. The evaluation, conducted on open-source log data, demonstrates the effectiveness of the VTD compared to conventional anomaly detection approaches, such as time series analysis and PCA.
Consequently, the VTD acts as a solution that extends the intrusion detection capabilities of security information and event management (SIEM) and integrates with modern concepts of endpoint detection and response (EDR) and extended detection and response (XDR), while simultaneously serving as an asset for process monitoring that supports user and entity behavior analytics (UEBA).
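
The idea of assigning a data type to each variable and reporting only significant type changes can be illustrated with a minimal sketch. This is not the VTD implementation from the article: the type names, the `infer_type` heuristic, the `VariableTracker` class, and the `history_len` and `threshold` parameters are all assumptions made for illustration, and the sketch omits the fitting of probability distributions entirely.

```python
from collections import deque


def infer_type(values):
    """Assign a coarse data type to one window of string values.

    Hypothetical heuristic for illustration only; it does not fit or
    test probability distributions.
    """
    distinct = set(values)
    if len(distinct) == 1:
        return "static"
    try:
        [float(v) for v in values]
        return "continuous"
    except ValueError:
        pass
    # Few distinct values relative to the window size: treat as discrete.
    if len(distinct) <= len(values) // 2:
        return "discrete"
    return "other"


class VariableTracker:
    """Tracks the type history of one variable and flags sustained changes."""

    def __init__(self, history_len=5, threshold=0.6):
        self.expected = None  # type learned from the first window
        self.mismatches = deque(maxlen=history_len)
        self.threshold = threshold

    def update(self, window):
        """Process one window of values; return True if an alarm is raised."""
        observed = infer_type(window)
        if self.expected is None:
            self.expected = observed
            return False
        self.mismatches.append(observed != self.expected)
        # Indicator: share of the history length that deviates from the
        # expected type, so a single odd window does not raise an alarm.
        indicator = sum(self.mismatches) / self.mismatches.maxlen
        return indicator >= self.threshold


tracker = VariableTracker(history_len=3, threshold=0.5)
alarms = [
    tracker.update(["alice", "bob", "alice", "bob"]),  # learns "discrete"
    tracker.update(["bob", "alice", "bob", "bob"]),    # still "discrete"
    tracker.update(["x1", "x2", "x3", "x4"]),          # first deviation
    tracker.update(["y1", "y2", "y3", "y4"]),          # sustained deviation
]
```

In this sketch a single deviating window is tolerated, and only a sustained change of the inferred type raises an alarm, which mirrors the stated goal of reducing false positives by tracking each variable's type history.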
Original language: English
Article number: 103631
Number of pages: 14
Journal: Computers & Security
Publication status: Published - Feb 2024

Research Field

  • Cyber Security


  • Intrusion detection
  • Log analysis
  • Anomaly detection

