An explainable privacy-preserving method for mobile malware detection through federated machine learning

An explainable privacy-preserving method for mobile malware detection through federated machine learning / Ciaramella, G., Martinelli, F., Santone, A., Mercaldo, F. - In: INTERNATIONAL JOURNAL OF INFORMATION SECURITY. - ISSN 1615-5270. - 25:3(2026). [10.1007/s10207-026-01268-4]

Ciaramella, Giovanni; Martinelli, Fabio; Santone, Antonella; Mercaldo, Francesco

2026

Abstract

Over the last few years, cyberattacks have increased drastically in number and have become more resistant to the detection systems introduced by researchers. To enhance efficiency, traditional centralized approaches have been replaced by distributed methods such as Federated Machine Learning, which builds a central model by training on multiple clients locally and aggregating only the model weights rather than sharing raw data. This approach enhances computational efficiency and significantly improves privacy, as sensitive data remains on the clients' devices. This paper proposes a method for detecting malware in the Android environment by leveraging Federated Machine Learning. In detail, we employed two datasets, one of almost 20,000 malicious and trustworthy Android packages and a second of nearly 7,250 applications, which were converted into grayscale images by representing their smali code as pixel intensities so that Convolutional Neural Networks can process them effectively. Once the datasets were assembled, we trained several models using two Convolutional Neural Networks: InceptionV3 and a custom version of MobileNet. After identifying the best-performing model on the binary dataset, we applied the same hyperparameters to train, validate, and test on the second dataset, which is composed of four distinct classes. Additionally, we compared the performance of the federated approach with that of centralized training using the same model architectures. Finally, we took the best model trained on the binary dataset and applied two class activation map algorithms to explain its predictions, and then used the Structural Similarity Index Measure to quantify the consistency and reliability of the generated heatmaps.
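The weight-aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the aggregation rule, so the code below assumes a sample-size-weighted average of client parameters (in the style of FedAvg). The function name `federated_average`, the `Weights` layout, and the toy client values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: layer name -> flat list of parameters.
Weights = Dict[str, List[float]]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Return the sample-size-weighted mean of the clients' parameters.

    Assumption: every client reports the same layer names and shapes.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, first in client_weights[0].items():
        acc = [0.0] * len(first)
        for weights, size in zip(client_weights, client_sizes):
            share = size / total  # this client's contribution to the global model
            for i, value in enumerate(weights[name]):
                acc[i] += share * value
        averaged[name] = acc
    return averaged


# Two hypothetical clients holding 100 and 300 local samples.
clients = [
    {"conv1": [0.0, 4.0], "dense": [8.0]},
    {"conv1": [4.0, 0.0], "dense": [0.0]},
]
global_weights = federated_average(clients, [100, 300])
print(global_weights)  # {'conv1': [3.0, 1.0], 'dense': [2.0]}
```

Only the parameter lists cross the network in this scheme; the local samples that produced them never leave the client, which is the privacy property the abstract relies on.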

Year: 2026
Journal: INTERNATIONAL JOURNAL OF INFORMATION SECURITY
Keywords: Biometrics, Computer Science, Learning algorithms, Machine Learning, Mobile Computing, Mobile and Network Security, Machine Learning Techniques for Android Malware Detection
Appears in types: 1.1 Journal article

Files in this item:

File	Size	Format
s10207-026-01268-4.pdf (open access; type: publisher's version, PDF; license: Creative Commons)	6.91 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11771/41661
