Skip to content

Fallback to raw string value, when json field inside geojson isn't decodable#1470

Open
jbylina wants to merge 1 commit intoToblerity:maint-1.10from
jbylina:bugfix/mixed-types-of-geojson-properties
Open

Fallback to raw string value, when json field inside geojson isn't decodable#1470
jbylina wants to merge 1 commit intoToblerity:maint-1.10from
jbylina:bugfix/mixed-types-of-geojson-properties

Conversation

@jbylina
Copy link

@jbylina jbylina commented Dec 3, 2024

Expected behavior and actual behavior.

I expected to successfully read the following geojson file by Fiona and leave the handling of mixed types geojson properties to lib user.

{ "type": "FeatureCollection", "features": [ {"type": "Feature", "properties": { "a": [ ]}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } }, {"type": "Feature", "properties": { "a": ""}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } } ] }

Actual behavior:

Traceback (most recent call last): File "<input>", line 1, in <module> File "fiona/ogrext.pyx", line 1996, in fiona.ogrext.Iterator.__next__ File "fiona/ogrext.pyx", line 668, in fiona.ogrext.FeatureBuilder.build File "fiona/ogrext.pyx", line 390, in fiona.ogrext.JSONField.get File "/Users/jacek/.pyenv/versions/3.12.7-debug/lib/python3.12/json/__init__.py", line 346, in loads return _default_decoder.decode(s) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/jacek/.pyenv/versions/3.12.7-debug/lib/python3.12/json/decoder.py", line 337, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/jacek/.pyenv/versions/3.12.7-debug/lib/python3.12/json/decoder.py", line 355, in raw_decode raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) 

Steps to reproduce the problem.

Run the following script using the latest Fiona.

from pathlib import Path import fiona Path("test.geojson").write_text("""\ { "type": "FeatureCollection", "features": [ {"type": "Feature", "properties": { "a": [ ]}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } }, {"type": "Feature", "properties": { "a": ""}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } } ] } """) list(fiona.open("test.geojson", "r"))

Operating system

macOS 15.1.1

Fiona and GDAL version and provenance

Fiona 1.10.1 from pip
GDAL 3.10.0 installed via Homebrew

Comment

When I flip lines, so that [ ] geojson property isn't first, the code doesn't fail.

{ "type": "FeatureCollection", "features": [ {"type": "Feature", "properties": { "a": ""}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } }, {"type": "Feature", "properties": { "a": [ ]}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } } ] }

Fix

I propose to fall back to raw string value if json field property isn't decodable. When we look at different json values allowed by RFC, and how it's handled by json.loads() the string is the only one that requires different handling.

json.loads("1.1") returns float json.loads("1") returns int json.loads("[]") returns list json.loads("{}") returns dict json.loads("null") returns None json.loads("a") raises exception. Extra quotes required 

When geojson is processed, at the point of calling JSONField.get() we know that geojson is a valid json so the above cases are the only possible ones. json.loads("[") is impossible, so the decoding exception is possible only in case of the string.

@jbylina jbylina changed the base branch from main to maint-1.10 December 3, 2024 16:38
@jbylina jbylina force-pushed the bugfix/mixed-types-of-geojson-properties branch from 382d2ab to bd4e7c8 Compare December 3, 2024 16:48
@jbylina jbylina changed the base branch from maint-1.10 to maint-1.9 December 3, 2024 16:49
@jbylina jbylina changed the base branch from maint-1.9 to maint-1.10 December 3, 2024 16:49
@jbylina jbylina force-pushed the bugfix/mixed-types-of-geojson-properties branch from bd4e7c8 to 33001a3 Compare December 3, 2024 16:52
Copy link
Member

@sgillies sgillies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbylina Thanks for the suggestion!

How about a log message to say that we are defaulting to different behavior? And a warning so that users can raise an exception instead if they want to be very strict.

Comment on lines +417 to +418
except json.JSONDecodeError:
return val
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
except json.JSONDecodeError:
return val
except json.JSONDecodeError as error:
warnings.warn(f"JSON field value not decoded, returning as string: {val=}, {error=}", FeatureWarning)
log.info("JSON field value not decoded, returning as string: val=%r, error=%r", val, error)
return val
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants