ABSTRACT :
2D cameras are often used in interactive systems. Other systems, such as gaming consoles, provide more powerful 3D cameras for short-range depth sensing. Overall, these cameras are not reliable in large, complex environments. In this work, we propose a 3D stereo-vision-based pipeline for interactive systems that is able to handle both ordinary and sensitive applications through robust scene understanding. We explore the fusion of multiple 3D cameras for full scene reconstruction, which enables a wide range of tasks, such as event recognition, subject tracking, and notification. Using possible feedback approaches, the system can receive data from the subjects present in the environment in order to learn to make better decisions or to adapt to completely new environments. Throughout the paper, we introduce the pipeline and explain our preliminary experimentation and results. Finally, we draw the roadmap for the next steps that need to be taken in order to get this pipeline into production.
KEYWORDS : Computer Vision, Deep Learning, Machine Learning, Attention Models
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Electronics 12 (9), 2027, 2023
Proceedings of the 2023 ACM International Conference on Interactive Media …, 2023
Journal of Mobile Multimedia, 773-788, 2021