Questions tagged [masking]
10 questions
4 votes
0 answers
29 views
Time-efficient parallelization of masks for pre-processing a dataset
I have a large dataset (~10M points) in Python and I want to filter it using a large number of different custom masks, as part of calculations to create a new but related dataset. Because the dataset ...
0 votes
0 answers
50 views
Understanding the Training routine of the Transformer architecture
I have been thinking for a long time about the masking in the decoder's self-attention during training, and it doesn't really make sense to me. I have browsed through a lot of sources ...
1 vote
0 answers
225 views
PyTorch Transformer only generating NaN when using mask
When I generate a src_mask like this ...
1 vote
1 answer
168 views
Dealing with high-frequency tokens during masked language modelling?
Suppose I am working with a masked language model to pre-train on a specific dataset. In that dataset, most sequences have a particular token that occurs with high frequency ...
2 votes
1 answer
768 views
Decoder Transformer feedforward
I have a question about the transformer decoder's feed forward during training. Let's pick an example: input data "i love the sun", translation I want to ...