Obfuscation-resistant feature extraction for macro-based office malware detection / Imperiale, Sergio; Morana, Marco; Lo Re, Giuseppe. - 4198:(2026). (ITASEC & SERICS 2026 - Joint National Conference on Cybersecurity, Cagliari, Italy, 09-13/02/2026).
Obfuscation-resistant feature extraction for macro-based office malware detection
Imperiale, Sergio; Morana, Marco; Lo Re, Giuseppe
2026
Abstract
Over the years, the widespread adoption of the Microsoft (MS) Office suite, a productivity tool used by millions of people worldwide, has drawn the attention of attackers seeking to exploit its vulnerabilities. The best-known attack vector is the macro-enabled document, and such macros are being built in increasingly complex ways so that their hidden malicious behavior evades detection. In this context, this paper presents a novel technique that uses Large Language Models (LLMs) to extract a set of linguistic features that can reveal the presence of malicious code embedded within macros, even when that code is obfuscated. The experimental evaluation, conducted on a publicly available dataset of MS Office files, indicates that the proposed system achieves robust detection of obfuscated malicious macros. The performance of a lighter, purely statistical method is also evaluated, giving analysts a choice between a high-precision, resource-intensive model and a more time-efficient alternative.

| File | Size | Format |
|---|---|---|
| Obfuscation_Resistant_Feature_Extraction_for_Macro_based_Office_Malware_Detection.pdf (open access). Description: Obfuscation-Resistant Feature Extraction for Macro-based Office Malware Detection. Type: Publisher's version (PDF). License: Creative Commons | 2.84 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

