Skip to content

Conversation

@tswast
Copy link
Collaborator

@tswast tswast commented Sep 12, 2024

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes internal issue 366248570
🦕

@tswast tswast requested review from a team as code owners September 12, 2024 21:23
@tswast tswast requested a review from Genesis929 September 12, 2024 21:23
@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Sep 12, 2024
@tswast
Copy link
Collaborator Author

tswast commented Sep 13, 2024

Looks like some real test failures. I'll look at these later today.

@tswast
Copy link
Collaborator Author

tswast commented Sep 13, 2024

Looks like some real test failures. I'll look at these later today.

Looks like there was a bug with subset=None. I believe this is now fixed.

<BLANKLINE>
[3 rows x 3 columns]
Define in which columns to look for missing values.
Copy link
Collaborator

@Genesis929 Genesis929 Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From doctest.
@@ -1,5 +1,5 @@
- name toy born
-1 Batman Batmobile 1940-04-25
-2 Catwoman Bullwhip NaT
+ name toy born
+1 Batman Batmobile 1940-04-25
+2 Catwoman Bullwhip <NA>

[2 rows x 3 columns]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed! The diff got a little big because I pulled in #987. It should be easier to review again once that is merged.

With this change I can once again run ``` pytest --doctest-modules third_party/bigframes_vendored/pandas/core/frame.py ``` Note: having multiple `version.py` files should be fine. release-please will update all such files it finds.
…subset" This reverts commit 57e8335, reversing changes made to 197074a.
if subset is None:
subset_ids = None
elif not utils.is_list_like(subset):
subset_ids = [self._block.label_to_col_id[subset]]
Copy link
Collaborator

@Genesis929 Genesis929 Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of this seems to be a list of tuple, and don't work. May need a loop like in else?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is supposed to cover the single column string case. A loop would give one character per iteration. I'll take a look at the error.

Copy link
Collaborator

@Genesis929 Genesis929 Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the defination def label_to_col_id(self) -> typing.Mapping[Label, typing.Sequence[str]], I think self._block.label_to_col_id[subset] will be a sequence/tuple? So I think a loop like for id in self._block.label_to_col_id[label] should be included?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chatted offline. self._block.label_to_col_id[subset] is a tuple. I've corrected.

@tswast tswast merged commit f7c03dc into main Sep 16, 2024
@tswast tswast deleted the b366248570-dropna-subset branch September 16, 2024 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: m Pull request size is medium.

3 participants