-
- Notifications
You must be signed in to change notification settings - Fork 19.4k
modified schema field list #13087
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
modified schema field list #13087
Conversation
| you need tests for this and an example of why you think this is broken |
Current coverage is 84.14%@@ master #13087 diff @@ ========================================== Files 138 137 -1 Lines 50724 50289 -435 Methods 0 0 Messages 0 0 Branches 0 0 ========================================== - Hits 42726 42312 -414 + Misses 7998 7977 -21 Partials 0 0
|
| I hope this comment provides a clearer example. |
| no please show a run of this and the error you get with master |
| Thanks for your patience! When I run this... ... I get an InvalidSchema error. Traceback: The generated schema of and the schema of the BigQuery table I want to append to is therefore the generated schema of the dataframe does not equal the schema of the BigQuery table I want to append the dataframe to. |
| how is that a bug? looks like user error to me |
The reason I think it is a bug is because the generated schema for dataframes comes from this function. It only adds
Yes, but currently in the which will yield a schema just like the generated schema from my dataframe but with the extra keys
From an ipython notebook -- generating it using the dictionary I provided above gives the same result. |
| The documentation says
but it does not specify that the dataframe's columns must have the same "description" or "mode" as the BigQuery table's columns. It is possible to generate "name" and "type" keys for each column in a dataframe but not "description" or "mode". That's why I think |
| Thanks for the clear explanation of the issue. I can understand how the schema verification could fail. I see that I don't think the |
| Looks like I forgot to include I agree that the Instead of stripping I see that |
| Based on the BigQuery docs and my understanding of the implementation, pandas currently doesn't support a BigQuery schema with any of the following options:
Based on the above, I would prefer the following:
|
| if some of the other contributors can comment. cc @jacobschaer this would need some tests of where it should suceed (append is ok given various states of the existing table), and fail (where it should raise) |
doc/source/whatsnew/v0.18.2.txt Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
give a much simpler message that a user reading will grok immeditely.
I also added a couple more tests to GBQUnitTests using the bigquery-public-data project id. They're meant to test the changes to |
pandas/io/gbq.py Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove default_mode as a parameter for now since we will only support the 'NULLABLE' option?
| Done reviewing. A few minor comments. |
pandas/io/gbq.py Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a unit test to confirm that 'description' is ignored when comparing the schema of an existing table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! You could argue that test_df_with_matching_schema_should_verify_schema confirms that 'description' is ignored, but I see your point.
So I added test_verify_schema_should_ignore_column_description. The idea was to first prove that 'description' was the one and only key in the BigQuery table's schema that doesn't exist in the schema generated from the DataFrame. Then if the generated schema is verified with verify_schema, that proves description is in fact ignored. Feedback welcome.
There is one test that fails but that's because of this commit in testing.assert_index_equal.
| All gbq tests pass locally |
doc/source/io.rst Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
paragraph break
| @medullaskyline you need to |
| you can squash via |
4a30557 to 31c2cd4 Compare | @jreback Just checking to see if there's anything else I need to do. |
| pls squash to a single commit (an rebase on master) / some codecov things changed. |
doc/source/io.rst Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this a bulleted list (of the things that need to match)
73428fe to dabe449 Compare doc/source/io.rst Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need the * on column order (just on the list items)
| needs a whatsnew entry. pls squash. |
0ef4c7c to 7f81301 Compare Author: medullaskyline <medullaskyline@gmail.com> Closes pandas-dev#13086 from medullaskyline
7f81301 to c60580f Compare | can you rebase and show a test run |
| can you rebase |
| @medullaskyline ping |
| can you rebase / update? we also have new testing for gbq, pls look at the dev docs and see how to. |
| | ||
| - Bug in ``Categorical.remove_unused_categories()`` changes ``.codes`` dtype to platform int (:issue:`13261`) | ||
| | ||
| - Bug in ``DataFrame.to_gbq()`` where dataframes could not be appended to Google BigQuery tables that had been created in any way other than ``DataFrame.to_gbq()``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move to 0.19.2
| @medullaskyline can you rebase / update |
| this was looking good, but needs updating. pls resubmit a rebased version if you can. |

git diff upstream/master | flake8 --diff