Otmane Amel delivered an insightful presentation at the recent INFORTECH seminar, where he explored his innovative work on Multi-Modal AI. His research demonstrates how fusing data from different modalities—such as text, images, and video—can lead to more accurate and explainable AI models. "By combining multiple modalities, we emulate how humans use different senses to better understand their surroundings," Amel explained, emphasizing the power of multimodal learning in enhancing AI performance. 🧠🔍


One key application Amel presented was in HS Code prediction for customs fraud detection. Working with eOrigin, he developed tools to automate the validation of customs declarations. By integrating various data types—such as product images, descriptions, and marketplace information—he achieved an impressive 10.6% increase in Top-3 accuracy compared to traditional models that rely on a single data source. 📦💡
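For readers unfamiliar with the metric, Top-3 accuracy counts a prediction as correct when the true HS code appears among the model's three highest-scoring candidates. The sketch below illustrates the computation on made-up scores. The function name `top_k_accuracy` and the toy data are assumptions for illustration only, not taken from Amel's experiments.

```python
import numpy as np

def top_k_accuracy(scores, labels, k=3):
    """Fraction of rows whose true label is among the k highest scores."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # argsort is ascending, so the last k columns hold the k best classes
    top_k = np.argsort(scores, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Toy scores for 4 declarations over 5 hypothetical HS code classes
scores = np.array([
    [0.10, 0.50, 0.30, 0.07, 0.03],
    [0.40, 0.05, 0.15, 0.30, 0.10],
    [0.05, 0.10, 0.20, 0.25, 0.40],
    [0.30, 0.25, 0.20, 0.15, 0.10],
])
labels = np.array([2, 1, 4, 2])  # true class index per declaration

print(top_k_accuracy(scores, labels, k=1))  # 0.25
print(top_k_accuracy(scores, labels, k=3))  # 0.75
```

On this toy data the true code is the top guess only once, but it appears in the top three for three of the four declarations, which is why Top-3 is a useful measure when a customs officer reviews a short list of candidates.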




Amel also discussed how his models have been used to improve safety on railway construction sites. In collaboration with Infrabel, he applied multimodal AI to detect dangerous actions in real time. By fusing RGB images with depth maps, the models can recognize hazardous situations and help prevent accidents, making construction sites safer for workers. 🚆👷‍♂️




Central to Amel’s work is the MultConcat Fusion method, which plays a critical role in the accuracy and flexibility of his models. The approach combines representations from different modalities through both concatenation and element-wise multiplication. Concatenation preserves modality-specific features, while the element-wise product extracts cross-modal information by highlighting where the modalities agree and where they differ, which supports more reliable outcomes. Because it is model-agnostic and easy to implement, MultConcat Fusion is a practical way to integrate multimodal data. 🔗🤖
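The fusion idea described above can be sketched in a few lines. This is a minimal illustration, not Amel's implementation: it assumes both modalities have already been encoded into embeddings of the same size, and the function name `mult_concat_fuse` and the example vectors are hypothetical.

```python
import numpy as np

def mult_concat_fuse(text_emb, image_emb):
    """Illustrative fusion: [text, image, text * image] on the last axis.

    Assumes both embeddings share the same shape (for example after a
    projection layer); that requirement is an assumption of this sketch.
    """
    text_emb = np.asarray(text_emb, dtype=float)
    image_emb = np.asarray(image_emb, dtype=float)
    if text_emb.shape != image_emb.shape:
        raise ValueError("embeddings must have the same shape")
    # Element-wise product: large where both modalities activate together
    product = text_emb * image_emb
    # Concatenation keeps each modality's own features next to the product
    return np.concatenate([text_emb, image_emb, product], axis=-1)

# One sample with 3-dimensional embeddings per modality (made-up values)
text = np.array([[1.0, 2.0, 0.0]])
image = np.array([[0.5, -1.0, 3.0]])

fused = mult_concat_fuse(text, image)
print(fused.shape)  # (1, 9)
print(fused)
```

The fused vector is three times the size of a single embedding and can be passed to any downstream classifier, which is what makes this kind of fusion independent of the encoders used for each modality.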




Amel’s presentation highlighted the immense potential of multimodal AI in solving complex, real-world problems. From improving fraud detection in logistics to ensuring worker safety in industrial environments, his research underscores the growing importance of data fusion in AI development. The audience at INFORTECH left with a deeper understanding of the role multimodal learning can play in shaping the future of AI. 💡✨