Beyond the Ideal: Multimodal Learning with Imperfect Data Conditions / Liaqat, M.I. - (2026 May 14). [10.13118/liaqat-muhammad-irzam_phd2026-05-14]

Beyond the Ideal: Multimodal Learning with Imperfect Data Conditions

Liaqat, Muhammad Irzam
2026

Abstract

Multimodal learning combines heterogeneous data sources such as vision, audio, and text to improve perception and decision-making. Despite its empirical success, this thesis argues that multimodal learning is not inherently robust: most existing approaches assume the availability and reliability of all modalities, leading to significant performance degradation when modalities are missing or corrupted. Our work investigates how robustness in multimodal learning can be explicitly modeled and enforced under imperfect data conditions. It addresses three interconnected aspects: architectural design for modality integration, learning with missing modalities, and robustness to corrupted modalities. The contributions include unified taxonomies that reveal structural limitations of existing methods, a modality-agnostic learning framework that enables reliable inference under missing modalities, and principled benchmarks and models for evaluating and improving robustness under modality corruption. Overall, our work demonstrates that robustness in multimodal learning must be deliberately designed rather than assumed, and provides foundations for building reliable multimodal systems in real-world environments.
14 May 2026
38
SQ
COSTA, GABRIELE
Files in this item:
File: Phd_Thesis_Liaqat.pdf
Access: open access
Type: Doctoral thesis
License: Creative Commons
Size: 1.12 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11771/41998