Why does a colormap such as viridis give better results than greyscale for spectrogram-based audio classification?

Question

I have been trying audio classification on the UrbanSound8k dataset and the MPSSC snore classification dataset. I am using transfer learning: I extract features from AlexNet and VGG19 pre-trained on ImageNet and then feed these features to an SVM. Weirdly, I obtain better performance on both datasets when I use the viridis colormap than when I put the same 2D greyscale spectrogram array in each of the 3 channels. What I don't understand is how a colormap can add any information that wasn't present in the original spectrogram.

I went through answers such as "Do I need 3 RGB channels for a spectrogram CNN?", which say that a CNN trained from scratch performs similarly with different colormaps. Is the same true for pre-trained networks too?
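For concreteness, the two input variants compared in the question can be sketched roughly as follows. This is only an illustration under assumptions: the random `spec_db` array stands in for a real log-scaled spectrogram, the helper names (`to_unit_range`, `gray_to_rgb`, `viridis_to_rgb`) are hypothetical, and the min-max scaling is one plausible normalization, not necessarily what was used in the experiments.

```python
import numpy as np
from matplotlib import cm


def to_unit_range(spec):
    """Min-max scale a 2D spectrogram to [0, 1] (assumed normalization)."""
    spec = np.asarray(spec, dtype=np.float64)
    lo, hi = spec.min(), spec.max()
    if hi == lo:
        # Constant input: avoid division by zero.
        return np.zeros_like(spec)
    return (spec - lo) / (hi - lo)


def gray_to_rgb(spec):
    """Copy the same greyscale values into all 3 channels."""
    norm = to_unit_range(spec)
    rgb = np.stack([norm, norm, norm], axis=-1)
    return (rgb * 255).round().astype(np.uint8)


def viridis_to_rgb(spec):
    """Pseudo-color the spectrogram with matplotlib's viridis colormap."""
    norm = to_unit_range(spec)
    rgba = cm.viridis(norm)  # shape (H, W, 4), floats in [0, 1]
    return (rgba[..., :3] * 255).round().astype(np.uint8)  # drop alpha


# Placeholder data standing in for a log-mel spectrogram in dB (hypothetical).
rng = np.random.default_rng(0)
spec_db = rng.uniform(-80.0, 0.0, size=(128, 173))

gray_img = gray_to_rgb(spec_db)        # R == G == B everywhere
viridis_img = viridis_to_rgb(spec_db)  # channels differ per pixel
```

Both images are deterministic functions of the same 2D array, so neither contains information the other lacks; what changes is how the values are spread across the three channels that the pre-trained network sees.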

shimao · Accepted Answer · 2019-12-11 18:54:29Z

VGG was trained on ImageNet, which is composed primarily of color images, so it's unsurprising that a network which is very good at extracting features from and classifying color images produces better results when you feed it a color image rather than a greyscale one.

Jon Nordby · Accepted Answer · 2019-12-26 19:54:59Z

This technique is called pseudo coloring, and it has been explored a little in the literature, also outside of pretrained networks.

For example, see "Sound Event Recognition in Unstructured Environments using Spectrogram Image Processing", a PhD thesis by Jonathan William Dennis.
