VISOR: Video Analytics Tool

Project Background & Description


Grading video presentations can be time-consuming for examiners and lecturers. VISOR, a web application, aims to solve this problem using a combination of artificial intelligence technologies.

Users upload a video presentation, and VISOR transcribes it with speech-to-text technology and generates a short summary of the key points. It also analyzes the presenter's voice and facial expressions to assess their emotions and engagement, and uses deep learning to check whether the slides contain images and graphs. Based on this analysis, VISOR generates a comprehensive critique of the presentation with the help of OpenAI's language models.
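As a rough illustration of the final step, the sketch below shows one way the outputs of the earlier stages could be combined into a single text prompt for a language model. It is not VISOR's actual implementation: the class `PresentationAnalysis`, its field names, the function `build_critique_prompt`, and the prompt wording are all hypothetical, and the call to the language model itself is left out.

```python
from dataclasses import dataclass


@dataclass
class PresentationAnalysis:
    """Hypothetical container for per-stage outputs; all field names are assumptions."""
    summary: str              # short summary derived from the speech-to-text transcript
    voice_emotion: str        # dominant emotion detected in the presenter's voice
    facial_emotion: str       # dominant emotion detected in facial expressions
    slides_total: int         # number of slides detected in the video
    slides_with_visuals: int  # slides found to contain images or graphs


def build_critique_prompt(analysis: PresentationAnalysis) -> str:
    """Combine the stage outputs into one text prompt for a language model."""
    # Guard against division by zero when no slides were detected.
    if analysis.slides_total > 0:
        visual_share = analysis.slides_with_visuals / analysis.slides_total
    else:
        visual_share = 0.0
    lines = [
        "You are grading a student video presentation.",
        f"Summary of key points: {analysis.summary}",
        f"Dominant voice emotion: {analysis.voice_emotion}",
        f"Dominant facial emotion: {analysis.facial_emotion}",
        "Slides with images or graphs: "
        f"{analysis.slides_with_visuals} of {analysis.slides_total} ({visual_share:.0%})",
        "Write a constructive critique covering content, delivery, and slide design.",
    ]
    return "\n".join(lines)


# Made-up example values, for illustration only.
example = PresentationAnalysis(
    summary="Introduces the problem, describes the method, and presents results.",
    voice_emotion="calm",
    facial_emotion="neutral",
    slides_total=4,
    slides_with_visuals=3,
)
print(build_critique_prompt(example))
```

Keeping the stage outputs in one structured object means each analysis component can be developed and tested separately, and the prompt wording can be adjusted without touching the analysis code.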

Key Benefits


Smart

Convenient

Comprehensive

Perceptive

Key Technologies Used


Speech-to-Text

Eye movement tracking

Deep Learning

Large Language Models (LLMs)

Voice and facial emotion analysis

Students


POH SHENG BAO, ETHAN LOW KIM ERN, MANFRED KHO AN


Supervisor


Liu Yanyan (Dr)