Questions tagged [seq2seq]
For questions related to sequence-to-sequence (seq2seq) machine learning models/architectures, used e.g. in machine translation.
32 questions
0 votes
0 answers
45 views
Best neural network algorithms/architectures for generating synthetic sequences of tuples of words
I would like to generate sequences of tuples using a neural network algorithm such that the model trains on a dataset of sequences of tuples and generates synthetic sequences of tuples. Each tuple <...
0 votes
1 answer
90 views
Probability interpretation of attention mechanism in Seq2Seq
I have read many explanations of the seq2seq model. In my opinion, however, it is really like a robot that might say something correctly, but doesn't really understand it, just as is true with an LLM ...
1 vote
0 answers
122 views
How to Interpret Cross Attention
I am a bit confused on what cross attention mechanisms are doing. I understand that the currently decoded output is usually the query and the conditioning/input (from an encoder) is the key and value. ...
7 votes
2 answers
3k views
What are the differences between seq2seq and encoder-decoder architectures?
I've read many tutorials online that use both words interchangeably. When I search and find that they are the same, why not just use one word since they have the same definition?
1 vote
0 answers
79 views
Modifying Cross Entropy Loss to work with multiple correct target sequences?
Let's say I'm training a transformer model to perform a seq2seq task, but there are multiple correct answers. For example, the following outputs would all be considered correct: source: A B C -> ...
0 votes
1 answer
114 views
How to train a seq2seq model to rephrase input text following given rules
I want to train (fine-tune) a seq2seq model to perform the task of rephrasing input following these rules: 1- always follow the pattern "Entity Verb Entity" 2- only use simple sentences: ...
0 votes
1 answer
104 views
How do GPT-like decoder-only conversational models distinguish the source of text?
In a conversational setting where two sources of text (the user and the model) follow each other like below User: some text bla bla Model: another text bah bah User: bla bla bla Model: bah bah and so on, ...
0 votes
1 answer
58 views
Seq2Seq model - Confusion about the dimensions of a Seq2Seq model [closed]
I am new to Seq2Seq and hope to find proper guidance and advice. I am doing a project from an online course, so I cannot share the material, but my project notebook is on GitHub. I want to ask ...
1 vote
0 answers
68 views
The model's accuracy suddenly becomes unreasonably good at the beginning of the training process. I need an explanation
I am practicing machine translation using a seq2seq model (more specifically, with GRU/LSTM units). The following is my first model: This model first achieved an accuracy score of about 0.03 and gradually ...
0 votes
1 answer
122 views
Seq2seq with RNNs, how is the training loop performed?
How do we train a seq2seq RNN? We input a sentence that needs to be translated. We encode it sequentially. Then the decoder outputs the first word with probabilities. We do a gradient ...
1 vote
1 answer
96 views
Why is it called a Seq2Seq model if the output is just a number?
Why is it called a Seq2Seq model if the output is just a number? For example, if you are trying to predict a movie's recommendation, and you are inputting a ...
0 votes
1 answer
513 views
Fine Tuning Transformer Model for Machine Translation
I am working on the Transformer example demonstrated on TensorFlow's website: https://www.tensorflow.org/text/tutorials/transformer In this example, a machine translation model is trained to translate ...
1 vote
1 answer
308 views
Is the decoder in a transformer Seq2Seq model non-parallelizable?
From my understanding, seq2seq models work by first computing a representation of the input sequence, and feeding this to the decoder. The decoder then predicts each token in the output sequence in an ...
3 votes
0 answers
2k views
Any models for text to JSON?
There are many sequence-to-sequence (seq2seq) models and end-to-end models, like text-to-SQL. I was wondering, are there any text-to-JSON deep learning models? For example: Text ...
1 vote
2 answers
166 views
How does Seq2Seq with attention actually use the attention (i.e., the context vector)?
For neural machine translation, there's this model "Seq2Seq with attention", also known as the "Bahdanau architecture" (a good image can be found on this page), where instead of ...