
Audio Classification using CNN and Vision Transformer

The research demonstrates that a Vision Transformer adapted for audio data outperforms both a custom CNN and traditional machine learning models in classifying music genres using mel spectrograms from the GTZAN dataset.
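As a rough illustration of how a mel spectrogram can be fed to a Vision Transformer, the sketch below cuts a spectrogram into fixed-size patches and flattens each into a token vector. The 16x16 patch size, the 128x256 input shape, and the helper name `patchify` are assumptions made for this example. The project's actual adaptation is not described here and may differ.

```python
import numpy as np

def patchify(spec: np.ndarray, patch: int = 16) -> np.ndarray:
    """Split a (n_mels, n_frames) spectrogram into flattened patch x patch tokens."""
    n_mels, n_frames = spec.shape
    # Drop trailing rows/columns that do not fill a whole patch.
    rows, cols = n_mels // patch, n_frames // patch
    spec = spec[: rows * patch, : cols * patch]
    # (rows, patch, cols, patch) -> (rows, cols, patch, patch) -> (rows*cols, patch*patch)
    tokens = spec.reshape(rows, patch, cols, patch).transpose(0, 2, 1, 3)
    return tokens.reshape(rows * cols, patch * patch)

# Hypothetical input: random values standing in for a real 128-band mel spectrogram.
rng = np.random.default_rng(0)
mel = rng.standard_normal((128, 256))
tokens = patchify(mel)
print(tokens.shape)  # (128, 256): 8 x 16 patches, each with 16 * 16 values
```

In a full model, each token would typically be passed through a learned linear projection and combined with a positional embedding before entering the transformer encoder.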


Developed MoodMeter.AI, a tool for remote-work communication that uses real-time emotion detection from facial expressions and voice analysis to improve understanding in online meetings.

SSORT: Semantic Segmentation for Off-Road Traversability

Real-time semantic scene understanding is essential for autonomous vehicles, especially in off-road driving, where the vehicle must cope with uncertainties such as uneven terrain and hidden obstacles. This project aims to improve off-road semantic segmentation for better route visualization and navigation.