Project information

  • Category: Master's Project
  • Name: Smartphone and Hand Detection in First-Person Video for Virtual Reality
  • School: Memorial University of Newfoundland, NL, Canada
  • Supervisors: Dr. Lourdes Peña Castillo & Dr. Oscar Meruvia-Pastor
  • Project Duration: 3 months (May 2024 to July 2024)
  • Social Media Post: LinkedIn Post

Project Abstract

Our collective research focuses on advancing the integration of smartphones and tablets into the world of Virtual Reality (VR) without the need to remove Head Mounted Displays (HMDs). My goal was to design a model that could accurately detect smartphones and hands in real-time, laying the foundation for seamless interaction within VR environments.

Over the course of three months, we developed a deep learning model built on the YOLO object detection framework. To ensure robustness, we trained on an augmented dataset of approximately 48,000 images capturing a variety of environments, devices, and users. Along the way, we iterated on and tested several model versions to improve performance and reduce false positives, which in turn minimized disruptive flashing in the VR experience. Our final model, based on YOLOv8n, achieved an F1 score of 0.935.
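For context on the reported metric, the F1 score is the harmonic mean of precision and recall. A minimal sketch (the precision and recall values below are hypothetical, since only the final F1 of 0.935 is reported):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical values for illustration: a precision of 0.95 and a
# recall of 0.92 yield roughly the F1 reported for the final model.
print(round(f1_score(0.95, 0.92), 3))  # → 0.935
```

A high harmonic mean requires both precision and recall to be high, which is why F1 is a useful single summary for a detector that must avoid both missed detections and false positives.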

Model training and validation were run on parallel compute clusters provided by the Digital Research Alliance of Canada, using Bash scripts and Slurm job scheduling for efficient resource management.
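A Slurm submission script for such a training job might look like the sketch below. All values are placeholders: the account name, resource requests, module versions, and training script are assumptions for illustration, not the project's actual configuration.

```shell
#!/bin/bash
# Hypothetical Slurm batch script; account, resources, and paths are
# placeholders, not the project's actual values.
#SBATCH --account=def-example      # allocation account (placeholder)
#SBATCH --gres=gpu:1               # request one GPU
#SBATCH --cpus-per-task=4          # CPU cores for data loading
#SBATCH --mem=32G                  # host memory
#SBATCH --time=12:00:00            # wall-clock limit

module load python                 # cluster-provided Python module
source ~/venv/bin/activate         # project virtual environment

# Launch training; train.py and its arguments are illustrative.
python train.py --model yolov8n --data data.yaml --epochs 100
```

Submitted with `sbatch train_job.sh`, Slurm queues the job and allocates the requested GPU, CPU, and memory when resources become available.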