
layers.Embedding has a parameter, input_length, that the documentation describes as:

input_length : Length of input sequences, when it is constant. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).

Why is it that the shape of the dense outputs cannot be computed? To me, Flatten seems pretty easy to do. It is just a tf.reshape(input, (-1, 1)) followed by a dense layer with whatever output shape we choose.

Can you help me pinpoint the lapse in my understanding of the logic here?

1 Answer

By specifying the dimension, you're making sure the model receives fixed-length input.

Technically, you can just put None at any input dimension you want. The shape will be inferred at run-time.
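As a sketch of why a None dimension can stay unknown: a layer's weights depend only on the feature dimensions, never on the batch dimension, so the same parameters apply to any batch size at run time. A minimal NumPy illustration of this (my own helper names, not the Keras API):

```python
import numpy as np

# A dense layer's weights depend only on the feature dimension (20 here),
# never on the batch dimension, so the batch size can stay "None".
rng = np.random.default_rng(0)
kernel = rng.normal(size=(20, 5))  # (input_features, units)
bias = np.zeros(5)

def dense(x):
    # Works for ANY batch size: (batch, 20) @ (20, 5) -> (batch, 5)
    return x @ kernel + bias

print(dense(np.ones((1, 20))).shape)   # (1, 5)
print(dense(np.ones((32, 20))).shape)  # (32, 5)
```

The batch dimension is the only one that can vary without changing the weights; the feature dimension (20) is baked into the kernel's shape.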

You only need to make sure you specify the layer parameters: input_dim and output_dim (for embedding layers), kernel_size (for conv layers), and units (for fully connected layers).

The shape can be computed if you use Input and specify what shape of tensor will be passed through the network.

For example, the following model is perfectly valid:

from tensorflow.keras import layers
from tensorflow.keras import models

ip = layers.Input((10,))
emb = layers.Embedding(10, 2)(ip)
flat = layers.Flatten()(emb)
out = layers.Dense(5)(flat)
model = models.Model(ip, out)
model.summary()
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         [(None, 10)]              0
_________________________________________________________________
embedding (Embedding)        (None, 10, 2)             20
_________________________________________________________________
flatten (Flatten)            (None, 20)                0
_________________________________________________________________
dense (Dense)                (None, 5)                 105
=================================================================
Total params: 125
Trainable params: 125
Non-trainable params: 0

Here, I didn't specify the input_length but it was inferred from the Input layer.

The problem is with the Sequential API: if you don't specify the input shape in an Input layer and also don't specify it in the Embedding layer, there's no way the model can be built with the proper set of parameters.

For example,

from tensorflow.keras import layers
from tensorflow.keras import models

model = models.Sequential()
# input_length is required here: without it there is no way to know
# the shape of the next layers, because the flattened length is unknown
model.add(layers.Embedding(10, 2, input_length=10))
model.add(layers.Flatten())
model.add(layers.Dense(5))
model.summary()

In this example, you must specify the input_length, otherwise the model will throw an error.


Comments

Thank you for the help, kind friend! Am I understanding this correctly: "input_length : Length of input sequences, when it is constant. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed)." means that if I am going to attach Flatten then Dense to the output of the Embedding layer, then I have to specify the input_length parameter? I guess it would be best if you provided an example where it was necessary to provide the input_length parameter.
input_length is equivalent to Input(shape=(10,)). I showed a functional API model where, because I pointed out the shape previously in the Input layer, I didn't have to specify the length again in Embedding. I added a Sequential example where, if I don't specify input_length, the model will throw an error, as there is no way to select the weight shapes for the model. At the end of the day, a NN is a bunch of matrices of numbers, and you need to know the dimensions of each matrix before building the model.
That makes sense. But don't we build the layers lazily, in the build method? Wouldn't the weights be computed based on the first input batch?
Each time you instantiate a layer object, it either gets the shape directly (input layer) or infers it from the previous layer; if no relevant shape information is found, it throws an error. tensorflow.org/api_docs/python/tf/keras/layers/Layer Think of it this way: Dense is a fully connected layer, so it needs to know how many inputs there are in the previous layer (because in a dense layer, every input is connected to every output); otherwise it can't initialize the weight matrix.
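To make that last point concrete: Dense(units) needs a kernel of shape (n_inputs, units), and after Flatten, n_inputs = input_length * embedding output_dim, so without input_length the kernel simply cannot be allocated. A rough NumPy sketch of the sizing logic (my own helper name, not Keras internals):

```python
import numpy as np

def build_dense_kernel(input_length, embedding_dim, units):
    # Flatten turns (batch, input_length, embedding_dim) into
    # (batch, input_length * embedding_dim); Dense needs that
    # product to size its kernel.
    if input_length is None:
        raise ValueError("cannot size the Dense kernel: input_length unknown")
    n_inputs = input_length * embedding_dim
    return np.zeros((n_inputs, units))

print(build_dense_kernel(10, 2, 5).shape)  # (20, 5)
```

With input_length=10 and embedding_dim=2, the kernel is (20, 5), matching the 105 parameters in the summary above (20 * 5 weights + 5 biases); with input_length=None, there is no valid shape to allocate.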
Ohhhh yes! I see. It's akin to not specifying the output shape of the previous dense layer and then expecting the next layer to create its weights. However, I still have a wrinkle: the situation where the output shape of the previous dense layer is not specified remains ambiguous even after the first batch is passed through the network, but that situation could easily be handled by creating the weights in the build method. I do not think the model weights are initialized when the model is compiled anyway. Your second network has all weights known deterministically the moment the first batch is passed.
