What I am trying to do
I am using PyArrow to parse data from a CSV file (originally from a Postgres database). I am having trouble parsing a timestamp with a time zone that looks like 2017-08-19 14:22:11.802755+00.
Reading the file fails with an error that looks like:
pyarrow.lib.ArrowInvalid: In CSV column #11: CSV conversion error to timestamp[ns]: invalid value '2017-08-19 12:22:11.802755+00'
What I have tried to do
I tried specifying explicit timestamp parsers, so this is how I read the CSV (trimmed for brevity):
arrow_table = arrow_csv.read_csv(
    input_file=input_buffer,
    convert_options=arrow_csv.ConvertOptions(
        timestamp_parsers=[ISO8601, "%Y-%m-%d %H:%M:%S.%6N %z"],  # I have also tried omitting this
        column_types=arrow_schema,
        strings_can_be_null=True,
        true_values=['t'],
        false_values=['f'],
    )
)
Note that in column_types I map the column I want to parse as follows (I am mapping Postgres types to Arrow types, which works for every type except this one):
'timestamp with time zone': pa.timestamp('ns', tz="+00:00")
But none of that seems to work. I'm happy to provide further information if needed.
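For reference, here is a minimal sketch showing that the value itself parses with the standard library once the hours-only offset is padded to +00:00. It assumes the short +00 suffix is the sticking point, which is only a guess and not something I have confirmed for PyArrow. The helper names normalize_pg_offset and parse_pg_timestamptz are made up for this illustration.

```python
import re
from datetime import datetime

# Matches a trailing hours-only offset that follows the seconds field,
# e.g. ":11.802755+00". Anchoring on the seconds avoids touching
# date-only values such as "2017-08-19".
_SHORT_OFFSET = re.compile(r"(:\d{2}(?:\.\d+)?[+-]\d{2})$")


def normalize_pg_offset(value: str) -> str:
    """Pad an hours-only UTC offset ('+00') to hours and minutes ('+00:00')."""
    return _SHORT_OFFSET.sub(r"\1:00", value)


def parse_pg_timestamptz(value: str) -> datetime:
    """Parse a Postgres-style timestamptz string into an aware datetime.

    Assumes fractional seconds are always present, as in the sample value.
    """
    return datetime.strptime(normalize_pg_offset(value), "%Y-%m-%d %H:%M:%S.%f%z")


sample = "2017-08-19 14:22:11.802755+00"
parsed = parse_pg_timestamptz(sample)
print(parsed.isoformat())  # 2017-08-19T14:22:11.802755+00:00
```

Values that already carry a full offset, such as +05:30, pass through normalize_pg_offset unchanged, so the same helper could in principle be applied to every row before the file is handed to a stricter parser.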