I am attempting to classify the CIFAR-100 dataset using a ResNet model that I implemented myself.
I have tried many different hyperparameter configurations, varying learning rates, batch sizes, dropout rates, data augmentation, and regularisation, but nothing has pushed the validation accuracy above 40%, while training accuracy reaches 99%.
I realise that high training accuracy with low validation accuracy is a sign of overfitting, but even after increasing the regularisation strength and dropout rates I'm still not seeing any improvement. The train/val/test split is 40,000 / 10,000 / 10,000 images respectively (the standard 50,000 CIFAR-100 training images split 80/20 for validation, plus the 10,000 official test images).
Does anyone have ideas on how to get past this plateau, or can anyone see where I might be going wrong?
Here are the hyperparameters I've been experimenting with:
learning_rates = [0.001, 0.0001]
batch_sizes = [16, 32]
dropout_rates = [0.3, 0.5]
decay = [0.001, 0.0001]
optimiser = ['Adam', 'SGD']
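Concretely, the sweep is just the cross product of those lists, 32 runs in total (a sketch of my loop; `train_and_eval` stands in for my actual training routine):

```python
from itertools import product

learning_rates = [0.001, 0.0001]
batch_sizes = [16, 32]
dropout_rates = [0.3, 0.5]
decay = [0.001, 0.0001]
optimisers = ['Adam', 'SGD']

# Every combination: 2 * 2 * 2 * 2 * 2 = 32 runs.
grid = list(product(learning_rates, batch_sizes, dropout_rates, decay, optimisers))

for lr, bs, dr, wd, opt_name in grid:
    config = {"lr": lr, "batch_size": bs, "dropout": dr,
              "weight_decay": wd, "optimiser": opt_name}
    # train_and_eval(config)  # placeholder for the actual training run
```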
And this is how I have been augmenting the data:
import torch
from torchvision import datasets, transforms
from sklearn.model_selection import train_test_split

transform = transforms.Compose([
transforms.RandomCrop(32, padding=4),
transforms.RandomHorizontalFlip(),
transforms.RandomRotation(10),
transforms.ToTensor(),
transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # ImageNet channel mean/std
])
DATA_ROOT_FOLDER = './cifar-100-python'
train_data = datasets.CIFAR100(root=f"{DATA_ROOT_FOLDER}", train=True, download=True, transform=transform)
test_data = datasets.CIFAR100(root=f"{DATA_ROOT_FOLDER}", train=False, download=True, transform=transform)
train_data, val_data = train_test_split(train_data, test_size=0.2, random_state=42)
batch_size = 64
train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_data, batch_size=batch_size, shuffle=False)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, shuffle=False)
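For completeness, my train/eval loop is essentially the standard PyTorch pattern, which I've sketched here with placeholder names (`run_epoch` is illustrative, not my exact code; the real model is my own ResNet):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def run_epoch(model, loader, criterion, optimizer=None, device="cpu"):
    """One pass over loader; trains if an optimizer is given, else evaluates.
    Returns (mean loss, accuracy)."""
    training = optimizer is not None
    model.train(training)
    total_loss, correct, seen = 0.0, 0, 0
    with torch.set_grad_enabled(training):
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, targets)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * inputs.size(0)
            correct += (outputs.argmax(dim=1) == targets).sum().item()
            seen += inputs.size(0)
    return total_loss / seen, correct / seen
```

Each epoch I call this once with the optimizer on `train_loader` and once without it on `val_loader`, and the gap between the two accuracies is where the 99% vs. 40% divergence shows up.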
