An implementation of Speech Emotion Recognition based on the HuBERT model, trained with PyTorch and the HuggingFace framework and fine-tuned on the RAVDESS dataset.
One way to improve the test accuracy is to train for more epochs.
I set the number of epochs to 10 to avoid excessive overfitting, but it can be increased further.
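To make the trade-off between more epochs and overfitting concrete, here is a minimal sketch of how one might pick an epoch count from per-epoch validation accuracy. It is not part of this repository's training code: the `best_epoch` helper, the `patience` parameter, the use of a held-out validation split, and the accuracy values are all hypothetical and only illustrate the idea.

```python
def best_epoch(val_accuracies, patience=3):
    """Return (epoch, accuracy) of the best epoch seen before stopping.

    val_accuracies: validation accuracy after each epoch (epoch 1 first).
    patience: hypothetical number of epochs to wait without improvement.
    """
    best_acc = float("-inf")
    best_ep = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best_acc:
            best_acc, best_ep = acc, epoch
        elif epoch - best_ep >= patience:
            break  # no improvement for `patience` epochs: stop early
    return best_ep, best_acc


# Made-up accuracies for illustration only, not results from this project.
history = [0.52, 0.61, 0.68, 0.71, 0.70, 0.69, 0.70, 0.66]
epoch, acc = best_epoch(history, patience=3)
print(f"best epoch: {epoch}, validation accuracy: {acc:.2f}")
```

If the best epoch turns out to be the last one trained, that suggests the epoch count could be raised; if accuracy peaks early and then drops, extra epochs are likely only adding overfitting.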
A snapshot of my results (previously 75.14%):