@ajcr
Contributor

@ajcr ajcr commented Aug 30, 2015

Fixes GH10856.

>>> pd.DataFrame({'a':0.1}, columns=['b'])
ValueError: If using all scalar values, you must pass an index

Trying to raise this error was slightly trickier than I anticipated - this was the only way that didn't break existing tests. If no index is passed to the constructor, extract_index is called to check whether the dictionary contains only scalar values (and raises the ValueError if so).

This check now happens prior to preselecting any columns. If people are happy with this approach I can write tests for the PR.
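
To illustrate why the ordering matters, here is a minimal pure-Python sketch. It is not the pandas implementation: is_scalar, check_not_all_scalars, select_then_check and check_then_select are hypothetical names, and the scalar test is deliberately crude. It only models the idea that checking after column preselection can miss the scalar, while checking before preselection catches it.

```python
SCALAR_ERROR = "If using all scalar values, you must pass an index"


def is_scalar(value):
    # Crude stand-in for a scalar test: anything that is not a
    # list, tuple or dict is treated as a scalar.
    return not isinstance(value, (list, tuple, dict))


def check_not_all_scalars(data):
    # Hypothetical stand-in for the check made via extract_index.
    # An empty dict passes: there is nothing left to complain about.
    if data and all(is_scalar(v) for v in data.values()):
        raise ValueError(SCALAR_ERROR)


def select_then_check(data, columns):
    # Assumed ordering before the patch: preselect columns, then check.
    selected = {k: v for k, v in data.items() if k in columns}
    check_not_all_scalars(selected)
    return selected


def check_then_select(data, columns):
    # Ordering described in this PR: check the full dict, then preselect.
    check_not_all_scalars(data)
    return {k: v for k, v in data.items() if k in columns}


# With select-then-check the scalar is dropped before it can be noticed.
print(select_then_check({'a': 0.1}, ['b']))

# With check-then-select the scalar-only dict is rejected up front.
try:
    check_then_select({'a': 0.1}, ['b'])
except ValueError as exc:
    print("ValueError:", exc)
```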

@jreback
Contributor

jreback commented Aug 31, 2015

seems reasonable. Pls add the test in of course!
and a release note. squash / ping me when green.

@jreback jreback added this to the 0.17.0 milestone Aug 31, 2015
@jreback jreback added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) and Error Reporting (Incorrect or improved errors from pandas) labels Aug 31, 2015
@ajcr ajcr force-pushed the raise-scalar-dict branch 3 times, most recently from 267b354 to 132532f Compare August 31, 2015 12:25
@ajcr
Contributor Author

ajcr commented Aug 31, 2015

@jreback: added the test and note and checks are green.

@jorisvandenbossche
Member

@jreback the discussion we had during the sprint is the following: although this patch fixes the actual reported issue (raising a ValueError when a scalar dict is passed), this also introduces another inconsistency in some way:

In [25]: pd.DataFrame({'a':[0.1]}, columns=['b'])
Out[25]:
Empty DataFrame
Columns: [b]
Index: []

In the above case, the 'reindexing' of the dict on the columns is done first, which leaves an empty dict and so gives an empty dataframe with columns but no index (the index that would have been created from the original data is ignored).
If you apply the same logic to the case in this issue, you also end up with an empty dict. But here the index is not ignored: an error is still raised, because constructing an index from the original data would fail.

But I don't think there is really a clear correct solution (and it is rather a corner case), so I am fine with merging this.
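
The inconsistency described above can be sketched with a toy model. This is an assumption-laden illustration, not pandas code: toy_frame is a hypothetical function that returns a (columns, index) pair, performs the all-scalar check on the original dict, and then builds the index only from the columns that survive selection.

```python
def toy_frame(data, columns):
    """Toy model of the post-patch behaviour; returns (columns, index)."""
    values = list(data.values())

    # Step 1: the all-scalar check looks at the ORIGINAL data.
    if values and not any(isinstance(v, (list, tuple)) for v in values):
        raise ValueError("If using all scalar values, you must pass an index")

    # Step 2: keep only the requested columns ('reindexing' on columns).
    selected = {k: v for k, v in data.items() if k in columns}

    # Step 3: the index is derived from the SURVIVING data only.
    lengths = [len(v) for v in selected.values() if isinstance(v, (list, tuple))]
    index = list(range(max(lengths, default=0)))
    return list(columns), index


# List-valued dict: the original data is ignored once 'a' is dropped,
# so the result has a column but an empty index.
print(toy_frame({'a': [0.1]}, ['b']))

# Scalar-valued dict: 'a' would be dropped too, yet the original data
# still decides the outcome, because the check runs first.
try:
    toy_frame({'a': 0.1}, ['b'])
except ValueError as exc:
    print("ValueError:", exc)
```

In both calls the selected dict is empty; the only difference is whether the original values are consulted before they are discarded.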

…GH10856) Conflicts:	doc/source/whatsnew/v0.17.0.txt
@ajcr ajcr force-pushed the raise-scalar-dict branch from 132532f to d2d9464 Compare September 1, 2015 20:04
jreback added a commit that referenced this pull request Sep 2, 2015
BUG: passing columns and dict with scalar values should raise error
@jreback jreback merged commit b420e84 into pandas-dev:master Sep 2, 2015
@jreback
Contributor

jreback commented Sep 2, 2015

thanks guys. catching all of these corner cases is tricky.

@ajcr ajcr deleted the raise-scalar-dict branch September 2, 2015 20:14