- Notifications
You must be signed in to change notification settings - Fork 321
feat: support timestamp_precision in table schema #2333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| - 6 (Default, for TIMESTAMP type with microsecond precision) | ||
| - 12 (For TIMESTAMP type with picosecond precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should there be an enum or something for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an option, I did not choose it because the backend defined it to be an integer, and I think we can let the backend handle value validation. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other repos, this is something I'd define an enum for, and then accept either the enum or a raw int value
But I don't have too much context on this repo, so up to you if that makes sense here. Consistency with the rest of the code is probably more relevant
google/cloud/bigquery/schema.py Outdated
| | ||
| @property | ||
| def timestamp_precision(self): | ||
| """Optional[int]: Precision (maximum number of total digits in base 10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a valid docstring format? I've never return type declared at the top this way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated code and opened #2340.
google/cloud/bigquery/schema.py Outdated
| Possible values include: | ||
| - 6 (Default, for TIMESTAMP type with microsecond precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems confusing to say 6 is the default here, since this is a getter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed to None instead, to be consistent with server behavior.
| - 6 (Default, for TIMESTAMP type with microsecond precision) | ||
| - 12 (For TIMESTAMP type with picosecond precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No mention of None?
(It says 6 is the default. Is the server enforcing that, or the client? Can this even return None?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The server does return None, if we do not set the value. This docstring is copied from the proto files. I just tried to set value to 6, and received the following error:
Invalid value for timestampPrecision: 6 is not a valid value
So here we might need to give up consistency with the proto and make sure the doc is user friendly. WDYT? I will also open a bug with the API team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
opened internal bug 463739109 and updated docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually in protos, None is equivalent to 0. So I'd recommend removing the Optional here, and have it return 0 if unset.
But yeah, we should make sure we understand the expected values here first
google/cloud/bigquery/schema.py Outdated
| policy_tags = key[-2] | ||
| policy_tags_inst = None if policy_tags is None else PolicyTagList(policy_tags) | ||
| adjusted_key = key[:-1] + (policy_tags_inst,) | ||
| adjusted_key = key[:-2] + (policy_tags_inst,) + (key[-1],) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a cleaner way to do this?
Ideally self._key() could return a named tuple, so you can pull out individual entries by name
But you could also do something like:
*initial_tags, policy_tags, timestamp_precision_tag = self._key policy_tags_inst = None if policy_tags is None else PolicyTagList(policy_tags) adjusted_key = (*initial_tags, policy_tags_inst, timestamp_precision_tag) There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is such a better idea! I have updated the code to use your version
| - 6 (Default, for TIMESTAMP type with microsecond precision) | ||
| - 12 (For TIMESTAMP type with picosecond precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other repos, this is something I'd define an enum for, and then accept either the enum or a raw int value
But I don't have too much context on this repo, so up to you if that makes sense here. Consistency with the rest of the code is probably more relevant
| - 6 (Default, for TIMESTAMP type with microsecond precision) | ||
| - 12 (For TIMESTAMP type with picosecond precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually in protos, None is equivalent to 0. So I'd recommend removing the Optional here, and have it return 0 if unset.
But yeah, we should make sure we understand the expected values here first
| | ||
| self.assertTrue(_table_exists(table)) | ||
| self.assertEqual(table.table_id, table_id) | ||
| self.assertEqual(table.schema, SCHEMA_PICOSECOND) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have a test that reads back a timestamp, and makes sure its in the expected range? Or am I misunderstanding?
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕