An explainable privacy-preserving method for mobile malware detection through federated machine learning / Ciaramella, G., Martinelli, F., Santone, A., Mercaldo, F. - In: INTERNATIONAL JOURNAL OF INFORMATION SECURITY. - ISSN 1615-5270. - 25:3(2026). [10.1007/s10207-026-01268-4]

An explainable privacy-preserving method for mobile malware detection through federated machine learning

Ciaramella, Giovanni
2026

Abstract

Over the last few years, the number of cyberattacks has increased drastically, and attacks have become more resistant to the detection systems introduced by researchers. Traditional centralized approaches are increasingly being replaced by distributed methods such as Federated Machine Learning to enhance efficiency. The latter builds a central model by training on multiple clients locally and aggregating only the model’s weights rather than sharing raw data. This approach enhances computational efficiency and significantly improves privacy, as sensitive data remains on the clients’ devices. This paper proposes a method for detecting malware in the Android environment using Federated Machine Learning. In detail, we employed two datasets: one of almost 20,000 malicious and trustworthy Android packages and a second of nearly 7,250 applications. The applications were converted into grayscale images by representing their smali code as pixel intensities, so that Convolutional Neural Networks can process them effectively. Once the datasets were assembled, we trained several models using two Convolutional Neural Networks: InceptionV3 and a custom version of MobileNet. After identifying the best-performing model on the binary dataset, we applied the same hyperparameters to train, validate, and test on the second dataset, which comprises four distinct classes. Additionally, we compared the performance of the federated approach with that of centralized training using the same model architectures. Finally, we applied two class activation map algorithms to the best model trained on the binary dataset to explain its predictions, and we used the Structural Similarity Index Measure to quantify the consistency and reliability of the generated heatmaps.
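The two mechanisms the abstract describes, turning code into a grayscale image and aggregating only model weights, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a one-byte-per-pixel mapping for the image and a sample-weighted average (FedAvg-style) for the aggregation, neither of which the abstract specifies. The names `bytes_to_grayscale`, `federated_average`, and the `Weights` layout are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: layer name -> flattened list of parameters.
Weights = Dict[str, List[float]]


def bytes_to_grayscale(data: bytes, width: int) -> List[List[int]]:
    """Map each byte to one pixel intensity (0-255), zero-padding the last row.

    Assumption: one byte per pixel; the paper's exact smali encoding may differ.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padded = data + bytes(-len(data) % width)  # pad up to a full final row
    return [list(padded[r:r + width]) for r in range(0, len(padded), width)]


def federated_average(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its sample count.

    Only weights and sample counts reach the server; raw data stays local.
    Assumption: sample-weighted averaging, one common aggregation rule.
    """
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    reference = updates[0][0]
    averaged: Weights = {}
    for name, ref_values in reference.items():
        acc = [0.0] * len(ref_values)
        for weights, n in updates:
            values = weights[name]
            if len(values) != len(ref_values):
                raise ValueError(f"shape mismatch in layer {name!r}")
            share = n / total  # this client's fraction of all samples
            for i, v in enumerate(values):
                acc[i] += share * v
        averaged[name] = acc
    return averaged
```

For example, averaging `({"conv": [1.0, 2.0]}, 100)` with `({"conv": [3.0, 4.0]}, 300)` gives `{"conv": [2.5, 3.5]}`, because the second client holds three quarters of the samples.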
Biometrics, Computer Science, Learning algorithms, Machine Learning, Mobile Computing, Mobile and Network Security, Machine Learning Techniques for Android Malware Detection
Files in this item:
File: s10207-026-01268-4.pdf
Access: open access
Type: Publisher's version (PDF)
License: Creative Commons
Size: 6.91 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11771/41661