Skip to content

Conversation

@shantanu-gontia
Copy link
Contributor

@shantanu-gontia shantanu-gontia commented May 15, 2019


Which section should i add the whatsnew entry in for this particular case?

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure this is the right place. I think putting the reshape into make_block is a bit closer to what we want?

diff --git a/pandas/core/internals/blocks.py b/pandas/core/internals/blocks.py index 0c49ebb55..54d295cf0 100644 --- a/pandas/core/internals/blocks.py +++ b/pandas/core/internals/blocks.py @@ -3035,6 +3035,8 @@ def make_block(values, placement, klass=None, ndim=None, dtype=None, # For now, blocks should be backed by ndarrays when possible. if isinstance(values, ABCPandasArray): values = values.to_numpy() + if ndim and ndim > 1: + values = np.atleast_2d(values) if isinstance(dtype, PandasDtype): dtype = dtype.numpy_dtype
# convert pandas array to numpy array
if isinstance(value, ABCPandasArray):
value = value.to_numpy()
return np.atleast_2d(np.asarray(value))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the np.asarray?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And do you even need this early return? Can you not just do value = value.to_numpy()?

Copy link
Contributor Author

@shantanu-gontia shantanu-gontia May 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the np.asarray?

The reason for np.asarray was because the same was done if a Numpy array was passed to the column.

And do you even need this early return? Can you not just do value = value.to_numpy()?

You're right though, I shouldn't be returning this early, I had a misconception that the next check would return the wrong value

# return internal types directly if is_extension_type(value) or is_extension_array_dtype(value): return value 

Will fix this.

Copy link
Contributor Author

@shantanu-gontia shantanu-gontia May 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure this is the right place. I think putting the reshape into make_block is a bit closer to what we want?

diff --git a/pandas/core/internals/blocks.py b/pandas/core/internals/blocks.py index 0c49ebb55..54d295cf0 100644 --- a/pandas/core/internals/blocks.py +++ b/pandas/core/internals/blocks.py @@ -3035,6 +3035,8 @@ def make_block(values, placement, klass=None, ndim=None, dtype=None, # For now, blocks should be backed by ndarrays when possible. if isinstance(values, ABCPandasArray): values = values.to_numpy() + if ndim and ndim > 1: + values = np.atleast_2d(values) if isinstance(dtype, PandasDtype): dtype = dtype.numpy_dtype

I guess putting the conversion in make_blocks is a better idea, it was the first thing that came to mind, but I thought it to be a sanitary process, just like the conversion of NumpyArray to a 2D Format.

Anyway I'll make the requisite changes

df['c'] = pd.array([1, 2, None, 3])
df2 = pd.DataFrame({'a': [1, 2, 3, 4], 'b': ['a', 'b', 'c', 'd'],
'c': pd.array([1, 2, None, 3])})
assert_frame_equal(df, df2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you assert that df2['c']._data.blocks[0] is an ObjectBlock (not an extension block).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add these checks

@TomAugspurger TomAugspurger added the Internals Related to non-user accessible pandas implementation label May 15, 2019
@TomAugspurger TomAugspurger added this to the 0.25.0 milestone May 15, 2019
@codecov
Copy link

codecov bot commented May 15, 2019

Codecov Report

Merging #26417 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@ Coverage Diff @@ ## master #26417 +/- ## ========================================== - Coverage 91.69% 91.68% -0.01%  ========================================== Files 174 174 Lines 50739 50742 +3 ========================================== - Hits 46524 46523 -1  - Misses 4215 4219 +4
Flag Coverage Δ
#multiple 90.19% <100%> (ø) ⬆️
#single 41.16% <33.33%> (-0.18%) ⬇️
Impacted Files Coverage Δ
pandas/core/internals/blocks.py 94.07% <ø> (ø) ⬆️
pandas/core/frame.py 97.02% <100%> (-0.12%) ⬇️
pandas/io/gbq.py 78.94% <0%> (-10.53%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ff4437e...fcd29ed. Read the comment docs.

@codecov
Copy link

codecov bot commented May 15, 2019

Codecov Report

Merging #26417 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@ Coverage Diff @@ ## master #26417 +/- ## ========================================== - Coverage 91.73% 91.72% -0.01%  ========================================== Files 174 174 Lines 50741 50756 +15 ========================================== + Hits 46548 46558 +10  - Misses 4193 4198 +5
Flag Coverage Δ
#multiple 90.23% <100%> (ø) ⬆️
#single 41.69% <100%> (-0.11%) ⬇️
Impacted Files Coverage Δ
pandas/core/internals/blocks.py 94.08% <100%> (+0.01%) ⬆️
pandas/io/gbq.py 78.94% <0%> (-10.53%) ⬇️
pandas/core/frame.py 97.02% <0%> (-0.12%) ⬇️
pandas/util/testing.py 90.6% <0%> (-0.11%) ⬇️
pandas/core/base.py 97.99% <0%> (-0.01%) ⬇️
pandas/tseries/offsets.py 96.69% <0%> (-0.01%) ⬇️
pandas/core/indexes/datetimes.py 96.85% <0%> (ø) ⬆️
pandas/core/reshape/reshape.py 99.56% <0%> (ø) ⬆️
pandas/io/pytables.py 90.22% <0%> (ø) ⬆️
pandas/core/window.py 96.41% <0%> (ø) ⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1263e1a...997c0a5. Read the comment docs.

df['c'] = pd.array([1, 2, None, 3])
df2 = pd.DataFrame({'a': [1, 2, 3, 4], 'b': ['a', 'b', 'c', 'd'],
'c': pd.array([1, 2, None, 3])})
assert(df2['c']._data.blocks[0].__class__ == ObjectBlock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want assert statements

Suggested change
assert(df2['c']._data.blocks[0].__class__ == ObjectBlock)
assert type(df2['c']._data.blocks[0]) == ObjectBlock

Same for the line below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove these parens?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, will do so in the next commit

@TomAugspurger
Copy link
Contributor

The release note can go in the "Other" section.

assert result.is_integer is True
assert result.is_extension is False


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you actually need this test; rather in the test right above, check the result.values is is the correct type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test was already present, was this valid only prior to the decision of converting PandasArray to numpy array?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that wouuld quite hit the issue. I think you would need to add a new result = make_black(arr, ..., ndim=2) with the right placement. It wouldn't hurt to add that, but I think keep the test below.

@jreback jreback removed this from the 0.25.0 milestone May 15, 2019
assert result.is_integer is True
assert result.is_extension is False


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that wouuld quite hit the issue. I think you would need to add a new result = make_black(arr, ..., ndim=2) with the right placement. It wouldn't hurt to add that, but I think keep the test below.

df['c'] = pd.array([1, 2, None, 3])
df2 = pd.DataFrame({'a': [1, 2, 3, 4], 'b': ['a', 'b', 'c', 'd'],
'c': pd.array([1, 2, None, 3])})
assert(df2['c']._data.blocks[0].__class__ == ObjectBlock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove these parens?

@jreback jreback added this to the 0.25.0 milestone May 19, 2019
@jreback jreback added the Indexing Related to indexing on series/frames, not to indexes themselves label May 19, 2019
@jreback jreback merged commit f44ac06 into pandas-dev:master May 19, 2019
@jreback
Copy link
Contributor

jreback commented May 19, 2019

@shantanu-gontia shantanu-gontia deleted the bug26390 branch May 19, 2019 19:10
@shantanu-gontia shantanu-gontia restored the bug26390 branch May 19, 2019 19:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Indexing Related to indexing on series/frames, not to indexes themselves Internals Related to non-user accessible pandas implementation

3 participants