RuntimeError in `test_step` with multi-GPU: Model input shape mismatch #21800

Say:

The dataloader yields inputs of shape (batch_size, 2 * model_input_shape).

The model is built for inputs of shape (batch_size, model_input_shape).

To run model.evaluate(dataloader) on this data, we override test_step so that it splits each input before calling the model.
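For clarity, here is a minimal NumPy-only sketch of the shape arithmetic the overridden test_step is meant to perform: split the wide input along the last axis, run the model on each half, and average the predictions. It does not use Keras or a distribution strategy; `split_and_average`, `toy_model`, and `weights` are hypothetical stand-ins for illustration, not part of the repro.

```python
import numpy as np


def split_and_average(x, predict_fn, split=2):
    """Split x along the last axis, predict on each chunk, average results."""
    # (batch, split * features) -> `split` arrays of shape (batch, features)
    chunks = np.split(x, split, axis=-1)
    preds = [predict_fn(chunk) for chunk in chunks]
    # Stack to (split, batch, out) and average over the split axis.
    return np.mean(np.stack(preds, axis=0), axis=0)


rng = np.random.default_rng(0)
weights = rng.random((32, 1))  # stand-in for a Dense(1) layer on 32 features


def toy_model(chunk):
    # Like the functional model, this only accepts 32 features per sample.
    assert chunk.shape[-1] == 32
    return chunk @ weights


x = rng.random((10, 64))  # twice as wide as the model input
y_pred = split_and_average(x, toy_model, split=2)
print(y_pred.shape)  # (10, 1)
```

The model itself only ever sees 32-feature inputs here; passing the 64-feature batch to it directly would trip the same kind of shape check that appears in the error.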

This works on a single GPU but fails on multi-GPU (and will possibly fail on TPU as well). I tested the code with keras-nightly.

    import numpy as np
    import tensorflow as tf
    import keras
    from keras import ops

    strategy = tf.distribute.MirroredStrategy()


    class CustomModel(keras.Model):
        def __init__(self, model, split=1, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.model = model
            self.split = split

        def call(self, inputs, training=None):
            # Forward the training flag to the wrapped model.
            return self.model(inputs, training=training)

        def test_step(self, data):
            x, y = data
            # Split the wide input into `split` chunks along the last axis.
            x_list = ops.split(
                x, indices_or_sections=self.split, axis=-1
            )
            results = []
            for x_in_list in x_list:
                y_pred = self(x_in_list, training=False)
                results.append(y_pred)
            # Average the per-chunk predictions.
            y_pred = ops.mean(
                ops.stack(results, axis=0), axis=0
            )
            loss = self.compute_loss(y=y, y_pred=y_pred)
            self.compiled_metrics.update_state(
                y, y_pred
            )
            return {m.name: m.result() for m in self.metrics}


    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    with strategy.scope():
        model = keras.Model(inputs, outputs)
        custom_model = CustomModel(model, split=2)
        custom_model.compile(loss="mse", metrics=["mae"])

    # The input is twice as wide as the model input (64 vs. 32).
    x = np.random.random((10, 64))
    y = np.random.random((10, 1))
    custom_model.evaluate(x, y)

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    /tmp/ipykernel_37/3196480519.py in <cell line: 0>()
         42 x = np.random.random((10, 64))
         43 y = np.random.random((10, 1))
    ---> 44 custom_model.evaluate(x, y)

    /usr/local/lib/python3.11/dist-packages/keras/src/utils/traceback_utils.py in error_handler(*args, **kwargs)
        120             # To get the full stack trace, call:
        121             # `keras.config.disable_traceback_filtering()`
    --> 122             raise e.with_traceback(filtered_tb) from None
        123         finally:
        124             del filtered_tb

    /usr/local/lib/python3.11/dist-packages/keras/src/trainers/trainer.py in _symbolic_build(self, iterator, data_batch)
       1098                 y_pred = backend.compute_output_spec(self, x, training=False)
       1099             except Exception as e:
    -> 1100                 raise RuntimeError(
       1101                     "Unable to automatically build the model. "
       1102                     "Please build it yourself before calling "

    RuntimeError: Unable to automatically build the model. Please build it yourself before calling fit/evaluate/predict. A model is 'built' when its variables have been created and its `self.built` attribute is True. Usually, calling the model on a batch of data is the right way to build it.

    Exception encountered:
    'Exception encountered when calling CustomModel.call().

    Input 0 of layer "functional" is incompatible with the layer: expected shape=(None, 32), found shape=(5, 64)

    Arguments received by CustomModel.call():
      • inputs=tf.Tensor(shape=(5, 64), dtype=float32)
      • training=False'

Again, without the strategy (single GPU), the code works.
