I am currently building and testing a version of the Vision Transformer (ViT) that uses a stochastic, Gaussian-based estimate of the self-attention layer to reduce its computational cost and speed up training.
Trained several models on the Facial Expression Recognition dataset from a Kaggle challenge, which covers 7 different expressions. The first models were Gaussian Bayes, Decision Tree, and KNN classifiers, tested with HOG, Gabor, and LBP filters for feature extraction, both with and without PCA and LDA for dimensionality reduction. The final and best-performing model was a CNN trained to extract features from the training set, followed by an SVM classifier that made predictions on the extracted features. This model scored 65% accuracy on the test set.
Implemented an image classifier from scratch (forward pass and backpropagation) using a neural network, with cross-validation during the training phase. The classifier was used on pictures of 16 different symbols taken from the Cozmo robot and reached 55% accuracy.
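As an illustration of the cross-validation step, here is a minimal sketch of a k-fold index split in plain Python. The function name `k_fold_indices`, the fixed seed, and the choice of k-fold (rather than another cross-validation scheme) are assumptions for this example, not details of the original project.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size
```

Each sample lands in exactly one validation fold, so a network would be trained k times on `train_idx` and scored on the matching `val_idx`, with the scores averaged afterwards.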
Implemented a real estate data scraping tool that fetches data from a property rental listing site. The scraper periodically updates the database with property and rental listings from different sources and cities in Canada. I also built a dashboard using Vue.js so that my non-technical friends could generate graphs and spreadsheets without having to write SQL queries.
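A minimal sketch of how a periodic update could refresh listings without creating duplicate rows. It assumes SQLite and a hypothetical `listings` table keyed by listing URL; the table, its columns, and the function name `save_listings` are illustrative and do not describe the actual database used.

```python
import sqlite3


def save_listings(conn, listings):
    """Insert new listings and overwrite ones already stored under the same URL.

    Hypothetical schema: url (primary key), city, monthly_rent.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings ("
        "url TEXT PRIMARY KEY, city TEXT, monthly_rent REAL)"
    )
    # INSERT OR REPLACE swaps out any existing row with the same primary key.
    conn.executemany(
        "INSERT OR REPLACE INTO listings (url, city, monthly_rent) "
        "VALUES (:url, :city, :monthly_rent)",
        listings,
    )
    conn.commit()
```

Calling `save_listings` on every scraping run with a list of dictionaries keeps one row per URL, so a listing whose rent changed is updated in place rather than stored twice.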
For my end-of-studies project, my team and I built a proof of concept of an interactive voice control system that controls a TV app and notifies hospital staff of an emergency. The company that commissioned the project intended the system for patients who are unable to use a physical remote control.