Beyond the Ideal: Multimodal Learning with Imperfect Data Conditions / Liaqat, M.I.. - (2026 May 14). [10.13118/liaqat-muhammad-irzam_phd2026-05-14]
Beyond the Ideal: Multimodal Learning with Imperfect Data Conditions
Liaqat, Muhammad Irzam
2026
Abstract
Multimodal learning combines heterogeneous data sources such as vision, audio, and text to improve perception and decision-making. Despite its empirical success, this thesis argues that multimodal learning is not inherently robust: most existing approaches assume the availability and reliability of all modalities, leading to significant performance degradation when modalities are missing or corrupted. Our work investigates how robustness in multimodal learning can be explicitly modeled and enforced under imperfect data conditions. It addresses three interconnected aspects: architectural design for modality integration, learning with missing modalities, and robustness to corrupted modalities. The contributions include unified taxonomies that reveal structural limitations of existing methods, a modality-agnostic learning framework that enables reliable inference under missing modalities, and principled benchmarks and models for evaluating and improving robustness under modality corruption. Overall, our work demonstrates that robustness in multimodal learning must be deliberately designed rather than assumed, and provides foundations for building reliable multimodal systems in real-world environments.

| File | Size | Format |
|---|---|---|
| Phd_Thesis_Liaqat.pdf (open access; type: doctoral thesis; license: Creative Commons) | 1.12 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


