Questions tagged [pretraining]
The pretraining tag has no summary.
33 questions
1 vote
0 answers
20 views
Should I train YOLOv8 for pre-trained classes?
Following this tutorial, I trained YOLOv8 on a custom dataset to detect holes in the ground. I obtained acceptable results considering how small my training set is (about 50 pictures). When preparing ...
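A minimal sketch of the setup this question describes, using the ultralytics package; the dataset config name "holes.yaml" and the hyperparameters are placeholders, not values from the question:

```python
from ultralytics import YOLO

# Start from COCO-pretrained weights so the backbone already encodes
# generic visual features; only the new "hole" class must be learned.
model = YOLO("yolov8n.pt")

# Train on the custom dataset described by the (hypothetical) holes.yaml;
# with ~50 images, a small model variant and augmentation usually help.
model.train(data="holes.yaml", epochs=100, imgsz=640)

# Evaluate on the validation split defined in holes.yaml.
metrics = model.val()
```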
2 votes
1 answer
73 views
Is it methodologically correct to use the data intended for fine-tuning in the pre-training phase of a BERT model?
Let us assume we are training a BERT model. Initial pre-training is performed with a large dataset A. Subsequently, fine-tuning is performed with a dataset B, which is a subset of A, but now with labels ...
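A minimal sketch of the two-phase setup the question describes, using Hugging Face transformers; the checkpoint name is illustrative and the training loops are elided:

```python
from transformers import (AutoModelForMaskedLM,
                          AutoModelForSequenceClassification)

# Phase 1: pre-training with the masked-LM objective on the large
# unlabeled dataset A. No labels are involved, only raw text.
mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
# ... run MLM training on dataset A and save the checkpoint ...

# Phase 2: supervised fine-tuning on dataset B (a subset of A), now with
# labels. The encoder weights would come from the phase-1 checkpoint;
# only the classification head is freshly initialized.
clf_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # in practice: the checkpoint saved in phase 1
    num_labels=2)
# ... run supervised training on dataset B ...
```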
2 votes
1 answer
1k views
What is the difference between the window size and the context length of a language model?
Are the window size and the context length of a language model one and the same thing? I am trying to understand how GPT ...
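As the two terms are usually used, the context length is the maximum number of tokens the model attends to in one forward pass, while "window size" often refers to a sliding window used to push a longer text through that fixed context. A toy sketch with made-up token IDs:

```python
context_length = 8        # model's fixed maximum input size per forward pass
stride = 4                # how far the sliding window advances each step

tokens = list(range(20))  # stand-in for a tokenized document

# Slide a context-length window over the document; consecutive windows
# overlap by context_length - stride tokens.
windows = [tokens[i:i + context_length]
           for i in range(0, len(tokens), stride)]
for w in windows:
    print(w)
```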
8 votes
3 answers
6k views
Further training a pre-trained LLM
My goal is to use the general knowledge and language understanding of a pre-trained LLM and to continue training on a smaller domain-specific corpus to improve the model's knowledge of the domain. ...
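A minimal sketch of this "continued pretraining" pattern with Hugging Face transformers; the checkpoint name and the two-sentence corpus are placeholders for a real base model and domain corpus:

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Tiny stand-in for the domain-specific corpus.
texts = ["Domain sentence one.", "Domain sentence two."]
train_dataset = [tokenizer(t) for t in texts]

# mlm=False keeps plain next-token prediction, i.e. the same objective
# the model was originally pretrained with.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           learning_rate=5e-5),
    train_dataset=train_dataset,
    data_collator=collator,
)
trainer.train()
```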
1 vote
1 answer
332 views
Does the Transformer model have memory to store state across different input sequences (segments)?
I've trained a transformer model based on the PyTorch tutorial: https://pytorch.org/tutorials/beginner/transformer_tutorial.html, but I have difficulty understanding this model's input and ...
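A vanilla Transformer has no persistent state between forward passes: whatever "memory" exists across segments must be passed back in explicitly. A sketch using GPT-2's key/value cache in transformers (any small causal LM would do; the text is made up):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

seg1 = tokenizer("The first segment of text", return_tensors="pt")
seg2 = tokenizer(" and the second segment", return_tensors="pt")

# First segment: ask the model to return its key/value cache.
out1 = model(**seg1, use_cache=True)

# Second segment: the model only "remembers" segment 1 because the cache
# is handed back in; omit past_key_values and that state is simply gone.
out2 = model(input_ids=seg2["input_ids"],
             past_key_values=out1.past_key_values)
```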
0 votes
1 answer
32 views
How long is the generator pre-trained in SeqGAN?
I am reading up on SeqGAN and I am trying to understand the pretraining step better. The authors state that they maximize the likelihood (MLE) of the dataset S by pretraining the ...
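Stripped to a toy PyTorch sketch, the pretraining step amounts to this: before any adversarial training, the generator (an LSTM in SeqGAN) is trained with ordinary next-token cross-entropy, i.e. maximum likelihood, on the dataset S for a fixed number of epochs chosen as a hyperparameter. Sizes and data below are made up:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 100, 32, 64

class ToyGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))
        return self.out(h)

gen = ToyGenerator()
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

S = torch.randint(0, vocab_size, (64, 20))  # fake dataset of token sequences

# MLE pretraining: maximizing the likelihood of the data is the same as
# minimizing the cross-entropy of each next token given its prefix.
for epoch in range(5):          # epoch count is a tuned hyperparameter
    logits = gen(S[:, :-1])     # predict token t+1 from tokens up to t
    loss = loss_fn(logits.reshape(-1, vocab_size), S[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```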
1 vote
0 answers
16 views
Is it common practice to train a language-to-code transformer (a multi-modal transformer) using uni-modal pretrained transformers?
Language-to-code transformation/generation requires multiple skills: language and reasoning skills to digest the core problem from the natural language specification, and programming language ...
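One established pattern along these lines: Hugging Face's EncoderDecoderModel can stitch two independently pretrained uni-modal checkpoints into a single seq2seq model; the cross-attention connecting them is newly initialized and still has to be trained. The checkpoint names here are illustrative:

```python
from transformers import EncoderDecoderModel

# BERT supplies the language-understanding half, GPT-2 the generative
# half; a code-pretrained decoder checkpoint could be substituted.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased",  # pretrained uni-modal encoder
    "gpt2",               # pretrained uni-modal decoder
)
```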
1 vote
0 answers
347 views
Is it possible to "fine-tune" a pre-trained logistic regression model?
Fine-tuning is a concept commonly used in deep learning. We may have a pre-trained model and then fine-tune it for our specific task. Does that apply to simple models, such as logistic regression? For ...
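The closest analogue in scikit-learn is warm-starting, sketched here on synthetic data: with warm_start=True, a second call to fit() starts optimization from the previously learned coefficients rather than from scratch. Unlike deep-learning fine-tuning, the second fit still optimizes on the new data alone, so capping max_iter is what keeps the solution near the old one:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_big = rng.normal(size=(1000, 5))    # large "pretraining" dataset
y_big = (X_big[:, 0] > 0).astype(int)
X_small = rng.normal(size=(50, 5))    # small related "target" dataset
y_small = (X_small[:, 0] + 0.3 * X_small[:, 1] > 0).astype(int)

clf = LogisticRegression(warm_start=True, max_iter=200)
clf.fit(X_big, y_big)      # "pretrain" on the large dataset

clf.max_iter = 10          # few steps only, to stay close to the old weights
clf.fit(X_small, y_small)  # "fine-tune": continue from learned coefficients
```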