Questions tagged [pretrained-models]

Question 1

I'm fine-tuning two different CNNs for an image classification task: The first CNN uses a ResNet101 backbone, and the second uses a MobileNetV2 backbone. Both are pre-trained on ImageNet. I use the ...

Question 2

I've seen it stated multiple times that LLMs have much worse data efficiency than humans (i.e., they require more data to reach the same or worse performance), e.g., this Tweet by Yann LeCun, or 19:30 in this talk ...

Question 3

I need an AI model for facial detection based on an infrared camera. Is there an existing model for this with pre-trained weights? Does this model work well when the lighting conditions may change ...

Question 4

It is known that multitask objectives in neural networks sometimes have the effect of improving the performance of the neural network for each of the tasks individually (versus training the same ...

Question 5

I am writing a research paper on my own custom CNN model for image classification. I am comparing my model architecture with pre-trained architectures, like DenseNet121 and InceptionV3. I want to ...

Question 6

I have thousands of images similar to this. Using existing metadata, I can sort them into different folders according to the gravel product type loaded on the truck. What would be the optimal way to train a ...

Question 7

As per Section 3.2 in the original paper on Fasttext, the authors state: In order to bound the memory requirements of our model, we use a hashing function that maps n-grams to integers in 1 to K ...
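The hashing step quoted in this excerpt can be illustrated with a minimal sketch. It is not taken from the fastText source: the FNV-1a hash, the `<`/`>` boundary markers, the 3 to 6 n-gram range, and the helper names (`fnv1a_32`, `char_ngrams`, `ngram_bucket`) are assumptions chosen for illustration, and K is set to a tiny value so that collisions are easy to see.

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash of a byte string (assumed hash, for illustration)."""
    h = 2166136261  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 16777619) & 0xFFFFFFFF  # multiply by FNV prime, keep 32 bits
    return h


def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> list[str]:
    """Character n-grams of the word wrapped in boundary markers < and >."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(wrapped) - n + 1)
    ]


def ngram_bucket(ngram: str, num_buckets: int) -> int:
    """Map an n-gram to an integer in 1..num_buckets (K in the quoted passage)."""
    return fnv1a_32(ngram.encode("utf-8")) % num_buckets + 1


K = 10  # hypothetical, deliberately tiny bucket count
ngrams = char_ngrams("where")
buckets = [ngram_bucket(g, K) for g in ngrams]
for gram, bucket in zip(ngrams, buckets):
    print(f"{gram!r} -> bucket {bucket}")
```

In this sketch the word "where" yields 14 n-grams but only 10 buckets exist, so at least two distinct n-grams must receive the same integer. If each integer indexed one row of an embedding table, those n-grams would share that row.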

Question 8

I am working on the Transformer example demonstrated on TensorFlow's website. https://www.tensorflow.org/text/tutorials/transformer In this example, a machine translation model is trained to translate ...

Question 9

I read that prompt tuning and prefix tuning are two effective mechanisms for leveraging frozen language models to perform downstream tasks. What is the difference between the two, and how do they really work? ...

Question 10

I'm trying to set up a pipeline for my ML models to automatically re-train themselves whenever concept drift occurs to recalibrate to the new output distributions. However, I can't get ground-truth ...

Question 11

Context: I am currently working on an encoder-decoder sequence to sequence model that uses a sequence of word embeddings as input and output, and then reduces the dimensionality of the word embeddings....

Question 12

I am trying to understand the concept of fine-tuning and few-shot learning. I understand the need for fine-tuning. It is essentially tuning a pre-trained model to a specific downstream task. However, ...

Question 13

I have downloaded a pre-trained EfficientDet D2 model (TensorFlow 2.0) and trained it on some data (about 20,000 images with 20 classes). I set the number of steps to 25,000 and the batch size to 3 (...

Question 14

Recently, I came across the BERT model. I did some research and tried some implementations. I wanted to tackle a NER task, so I chose the BertForSequenceClassification class provided by HuggingFace. ...

Question 15

I am trying to build a neural network that has an input of $n$ pairs of integer values (where $n$ is random) and a corresponding output of a binary array with length $n$. The input will be a set of ...

Fine-tuning ResNet101 stuck at ~50% accuracy while MobileNetV2 reaches ~90% (same data, head, training setup)

Reference request: data efficiency of LLM pre-training

Is there a model for facial detection based on an infrared camera?

Multi-task objectives sometimes improve single-task performance, but is this true when fine-tuning?

Is size of trained model on disk a good measure of model complexity?

Should I use pretrained model for image classification or not?

Do different ngrams share embedding in Fasttext?

Fine Tuning Transformer Model for Machine Translation

What is the difference between prompt tuning and prefix tuning?

Using a pre-trained model to generate labels for data, then training a model on them

How to Train a Decoder for Pre-trained BERT Transformer-Encoder?

What is the difference between fine tuning and variants of few shot learning? [duplicate]

Is it possible that the fine-tuned pre-trained model performs worse than the original pre-trained model?

Does BERT freeze the entire model body when it does fine-tuning?

How to design a neural network with arbitrary input and output length?