Freelancer:
ashishkumar70785
Winner
video classification entry
The following images show the training, inference, and prediction files. Contact me to get the proper .py files for training and inference, as this site only accepts images. Since the dataset is small, accuracy is around 75 percent overall and 95 percent for speakers that have more audio clips. Training is done using a pretrained ResNet18 (larger models overfit) for 60 epochs, and the best weights are chosen.
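For readers who cannot see the images, the "train for 60 epochs and choose the best weights" step could look roughly like the sketch below. It is not the actual training file from this entry. It assumes "best" means highest validation accuracy, which the post does not state, and the names `train_keep_best`, `train_one_epoch`, and `evaluate` are hypothetical placeholders for whatever the real script uses. It is framework-agnostic; with PyTorch, the state passed around would typically be the model's `state_dict()`.

```python
import copy


def train_keep_best(model_state, train_one_epoch, evaluate, epochs=60):
    """Run a fixed number of epochs and keep the best-scoring weights.

    model_state     -- any copyable object holding the weights (hypothetical)
    train_one_epoch -- callable: state -> updated state (hypothetical)
    evaluate        -- callable: state -> validation accuracy (assumed metric)
    """
    best_acc = float("-inf")
    best_state = None
    best_epoch = None

    for epoch in range(1, epochs + 1):
        model_state = train_one_epoch(model_state)
        acc = evaluate(model_state)

        # Snapshot the weights only when this epoch beats the previous best,
        # so later epochs that overfit do not overwrite the saved copy.
        if acc > best_acc:
            best_acc = acc
            best_state = copy.deepcopy(model_state)
            best_epoch = epoch

    return best_state, best_acc, best_epoch
```

Keeping a deep copy of the best state, rather than the final one, matters on a small dataset, where validation accuracy can peak well before the last epoch.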
