Newest 'pretraining' Questions

1 vote

0 answers

20 views

Should I train YOLOv8 for pre-trained classes?

Following this tutorial, I trained YOLOv8 on a custom dataset to detect holes in the ground. I obtained acceptable results considering how small my training set is (about 50 pictures). When preparing ...

Sheldon

205

asked May 11 at 15:01

2 votes

1 answer

73 views

Is it methodologically correct to use the fine-tuning data in the pretraining phase of a BERT model?

Consider training a BERT model. Initial pretraining is performed on a large dataset A. Subsequently, fine-tuning is performed on a dataset B, which is part of A, but now with labels ...

Álvaro Loza

123

asked Feb 14, 2024 at 9:43

2 votes

1 answer

1k views

What is the difference between the window size and the context length of a language model?

Are the window size and the context length of a language model one and the same thing? (The following text was added because a question containing only the text above was not allowed.) I am trying to understand how GPT ...

Vinay Sharma

187

asked Oct 5, 2023 at 12:43

8 votes

3 answers

6k views

Further Training a pre-trained LLM

My goal is to use the general knowledge and language understanding of a pre-trained LLM and to continue training on a smaller domain specific corpus to improve the model's knowledge on the domain. ...

Arthuro

111

asked Jun 12, 2023 at 9:57

1 vote

1 answer

332 views

Does the Transformer model have memory to store state across different input sequences (segments)?

I've trained a transformer model based on the PyTorch tutorial: https://pytorch.org/tutorials/beginner/transformer_tutorial.html, but I have difficulty understanding this model's input and ...

Clock ZHONG

133

asked Apr 9, 2023 at 7:16

0 votes

1 answer

32 views

How long is the generator pre-trained in SeqGAN?

I am reading up about SeqGAN and I am trying to understand the pretraining step better. The authors claim they want to maximize the Maximum Likelihood Estimation on the dataset S by pretraining the ...

postnubilaphoebus

73

asked Oct 11, 2022 at 19:12

1 vote

0 answers

16 views

Is there an established practice of training a language-to-code transformer (multi-modal transformer) from uni-modal pretrained transformer models?

Language-to-code transformation/generation requires multiple skills: language and reasoning skills to digest the core problem from the natural language specification, and programming language ...

TomR

141

asked Jun 30, 2022 at 8:46

1 vote

0 answers

347 views

Is it possible to "fine-tune" a pre-trained logistic regression model?

Fine tuning is a concept commonly used in deep learning. We may have a pre-trained model and then fine-tune it to our specific task. Does that apply to simple models, such as logistic regression? For ...

eduardokapp

111

asked May 17, 2022 at 16:57
