
Questions tagged [pretrained-models]

For questions related to pre-trained models. A pre-trained model is a model that was trained on a large benchmark dataset to solve a problem similar to the one we want to solve. Because training such models is computationally expensive, it is common practice to import and use models from the published literature (e.g. VGG, Inception, MobileNet).

2 votes
0 answers
69 views

I'm fine-tuning two different CNNs for an image classification task: The first CNN uses a ResNet101 backbone, and the second uses a MobileNetV2 backbone. Both are pre-trained on ImageNet. I use the ...
S.E.K.
  • 41
0 votes
1 answer
97 views

I've seen it stated multiple times that LLMs have much worse data efficiency than humans (i.e., they require more data to reach the same or worse performance), e.g., in this Tweet by Yann LeCun, or at 19:30 in this talk ...
Jake Levi
  • 101
0 votes
3 answers
127 views

I need an AI model for facial detection based on an infrared camera. Is there an existing model for this with pre-trained weights? Does this model work well when the lighting conditions may change ...
errolflynn
1 vote
1 answer
112 views

It is known that multitask objectives in neural networks sometimes have the effect of improving the performance of the neural network for each of the tasks individually (versus training the same ...
Alexander Soare
0 votes
1 answer
94 views

I am writing a research paper on my own custom CNN model for image classification. I am comparing my model architecture with pre-trained architectures, like DenseNet121 and InceptionV3. I want to ...
Dawood Ahmad
2 votes
1 answer
535 views

I have thousands of images similar to this. Using existing metadata, I can sort them into different folders according to the gravel product type loaded on the truck. What would be the optimal way to train a ...
Vojtěch Dohnal
0 votes
1 answer
64 views

In Section 3.2 of the original fastText paper, the authors state: In order to bound the memory requirements of our model, we use a hashing function that maps n-grams to integers in 1 to K ...
Fijoy Vadakkumpadan
0 votes
1 answer
513 views

I am working on the Transformer example demonstrated on TensorFlow's website (https://www.tensorflow.org/text/tutorials/transformer). In this example, a machine translation model is trained to translate ...
boyaronur
  • 101
3 votes
1 answer
886 views

I read that prompt tuning and prefix tuning are two effective mechanisms for leveraging frozen language models to perform downstream tasks. What is the difference between the two, and how do they really work? ...
Exploring
  • 381
3 votes
1 answer
681 views

I'm trying to set up a pipeline so that my ML models automatically re-train whenever concept drift occurs, recalibrating to the new output distributions. However, I can't get ground-truth ...
Sanger Steel
1 vote
1 answer
823 views

Context: I am currently working on an encoder-decoder sequence to sequence model that uses a sequence of word embeddings as input and output, and then reduces the dimensionality of the word embeddings....
node_env
4 votes
1 answer
4k views

I am trying to understand the concept of fine-tuning and few-shot learning. I understand the need for fine-tuning. It is essentially tuning a pre-trained model to a specific downstream task. However, ...
Exploring
  • 381
0 votes
1 answer
2k views

I have downloaded a pre-trained EfficientDet D2 model (TensorFlow 2.0) and trained it on some data (about 20,000 images with 20 classes). I set the number of steps to 25,000 and the batch size to 3 (...
Araw
  • 103
2 votes
1 answer
3k views

Recently, I came across the BERT model. I did some research and tried some implementations. I wanted to tackle a NER task, so I chose the BertForSequenceClassification model provided by HuggingFace. ...
Joon
  • 51
0 votes
2 answers
1k views

I am trying to build a neural network that has an input of $n$ pairs of integer values (where $n$ is random) and a corresponding output of a binary array with length $n$. The input will be a set of ...
Kian
  • 1
