
I’m using GridDB Cloud with the official Python client (griddb_python) to insert large batches of rows into a TimeSeries container.

I’m calling put_multiple_rows() for efficiency, but occasionally I get a GSException when one or more rows in the batch violate a schema constraint (e.g., NULL in a NOT NULL column or a duplicate primary key).

According to the docs, put_multiple_rows() aborts the entire batch on error. However, I want to:

  • Insert all valid rows in the batch that precede the first error row in order.

  • Skip the invalid rows without re-inserting already committed rows.

  • Retry failed rows individually in a separate error-handling routine.

My current pseudo-code:

import griddb_python as griddb

conInfo = griddb.ContainerInfo(
    name="weather_data",
    column_info_list=[
        ("ts", griddb.Type.TIMESTAMP),
        ("temperature", griddb.Type.FLOAT)
    ],
    type=griddb.ContainerType.TIME_SERIES,
    row_key=True
)

# container obtained from the store via put_container(conInfo); connection code omitted
rows = [...]  # large batch of tuples

try:
    container.put_multiple_rows(rows)
except griddb.GSException as e:
    # Need a safe, efficient retry mechanism here
    pass

Question:

What’s the correct and efficient way to detect the failing row index in a put_multiple_rows() call in GridDB Cloud’s Python client so I can commit the valid prefix, skip the invalid rows, and retry them individually, without re-inserting already stored rows or breaking TimeSeries ordering?

I’m looking for a concrete explanation of:

  • How I can get the index of the row that caused the batch failure.

  • What is the recommended minimal-overhead pattern to achieve this in Python while using GridDB Cloud.

What I have tried:

  • Checked the GridDB Python API reference but could not find any documented way to retrieve the specific row index causing a put_multiple_rows() failure.

  • Attempted to catch GSException and parse the exception message for clues about the failing row, but the error message only provides a generic description without row-level detail:

except griddb.GSException as e:
    for i in range(e.get_error_stack_size()):
        print(f"[{i}] {e.get_error_code(i)}: {e.get_message(i)}")

This lists the error code and message, but no batch index.

  • Tried splitting the batch manually and inserting row-by-row to identify the failure, but this negates the performance benefits of put_multiple_rows().

  • Experimented with smaller chunk sizes (e.g., 100 rows per batch) to isolate errors faster, but it’s still inefficient for large datasets in GridDB Cloud (rough sketch below).
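
For reference, the chunked variant I experimented with looked roughly like this; CHUNK_SIZE and handle_failed_chunk are my own scaffolding, not part of the GridDB API:

CHUNK_SIZE = 100  # smaller batches narrow down failures faster, but add round trips

for start in range(0, len(rows), CHUNK_SIZE):
    chunk = rows[start:start + CHUNK_SIZE]
    try:
        container.put_multiple_rows(chunk)
    except griddb.GSException:
        # Still no failing-row index, so the bad row has to be isolated
        # inside this chunk by some other means.
        handle_failed_chunk(chunk)  # hypothetical per-chunk error handler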

2 Answers


As far as I know, the GridDB Python client does not currently provide a direct way to retrieve the index of the row that caused a put_multiple_rows() failure. The approach I can suggest is chunked (bisecting) retries.

The only workaround is to split your data into progressively smaller batches until you isolate the problematic row.

Pseudocode approach:

def safe_batch_insert(container, rows):
    try:
        container.put_multiple_rows(rows)
    except griddb.GSException:
        if len(rows) == 1:
            log_error(rows[0])  # single offending row isolated
            return
        mid = len(rows) // 2
        safe_batch_insert(container, rows[:mid])
        safe_batch_insert(container, rows[mid:])

This minimizes re-inserts while still finding the error row efficiently.
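
A minimal sketch of how this could be wired in, assuming the container and rows objects from your question; log_error is just a placeholder for whatever per-row handling you want, not a GridDB API:

import logging

def log_error(row):
    # Placeholder referenced by safe_batch_insert: record the row that
    # could not be stored so it can be retried or inspected later.
    logging.warning("Row failed schema validation, skipping: %r", row)

try:
    container.put_multiple_rows(rows)  # fast path: single bulk call
except griddb.GSException:
    # Slow path: bisect the same batch. Re-putting rows that were already
    # stored should be harmless, since a put on an existing row key overwrites it.
    safe_batch_insert(container, rows)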


A second option is to use put_row() in a fallback pass after a failed bulk insert (see the sketch after the list below):

  • Insert large batches with put_multiple_rows().

  • If a failure occurs, retry only the failed batch with individual put_row() calls, skipping rows that already exist (by catching duplicate key errors).

  • The overhead is minimal if failures are rare.
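
A rough sketch of that fallback pass, using the same method names as the question (put_multiple_rows / put_row); insert_with_fallback and the returned failed list are just illustrative scaffolding:

def insert_with_fallback(container, rows):
    # Normal fast path: one bulk call per batch.
    try:
        container.put_multiple_rows(rows)
        return []
    except griddb.GSException:
        pass

    # Fallback pass: only runs for the batch that failed.
    failed = []
    for row in rows:
        try:
            container.put_row(row)
        except griddb.GSException:
            # Collect the bad rows for a separate error-handling routine
            # instead of aborting the whole batch.
            failed.append(row)
    return failed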

  • This is what I have tried, and it seems good in cases where failures are rare. Commented Aug 15 at 9:05
