Does anyone have guidance on which feature-extraction approach works better for image classification with many classes (about 400), few samples per class (around 20), and relatively large RGB images (around 600x600)?
I know that autoencoders can be used for feature extraction: I can train an autoencoder on the images without labels, and then train a classifier on the lower-dimensional encoded representations instead of on the raw images.
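
To make the first option concrete, here is a minimal sketch of what I have in mind, assuming PyTorch. The class name `ConvAutoencoder`, the helper functions, and the layer sizes are placeholders I picked for illustration, not a tuned architecture.

```python
import torch
from torch import nn


class ConvAutoencoder(nn.Module):
    """Small convolutional autoencoder; layer sizes are illustrative, not tuned."""

    def __init__(self, latent_channels: int = 8):
        super().__init__()
        # Encoder: each stride-2 convolution halves the height and width.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, latent_channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Decoder: mirrors the encoder and doubles the height and width twice.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_channels, 16, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 3, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),  # assumes pixel values are scaled to [0, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


def train_step(model, optimizer, images):
    """One unsupervised step: reconstruct the input, no labels involved."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(images), images)
    loss.backward()
    optimizer.step()
    return loss.item()


def extract_features(model, images):
    """Encode images and flatten the result into one feature vector per image."""
    model.eval()
    with torch.no_grad():
        return model.encoder(images).flatten(start_dim=1)
```

After training, the output of `extract_features` would be the input to a separate classifier (for example a linear layer or an SVM), which is the part that actually uses the labels.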
Similarly, I know that you can take a pretrained network, replace its final layer with a linear layer whose output size matches your own dataset's number of classes, and then train only that final layer, or that layer plus a few layers before it, on your dataset.
I haven't been able to find any resources online that compare these two feature-extraction techniques or explain under which conditions each one works better. Does anyone have any advice?