Giftlin (Contributor) commented Sep 12, 2017

Create the directory when saving a CSV if it does not already exist.

sample code

import pandas as pd

df = pd.DataFrame()
df.to_csv("D://asdfg//pi.py")

The changes in this PR fix the following error.
image

With this change, the asdfg directory is created under D:
image

pep8speaks commented Sep 12, 2017

Hello @Giftlin! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on September 13, 2017 at 05:12 Hours UTC
@Giftlin Giftlin changed the title Update format.py Update format.py : to create directory while saving csv, if the directory is not available Sep 12, 2017
@Giftlin Giftlin closed this Sep 13, 2017
@Giftlin Giftlin reopened this Sep 13, 2017
@Giftlin Giftlin changed the title Update format.py : to create directory while saving csv, if the directory is not available BUG: Update format.py : to create directory while saving csv, if the directory is not available Sep 13, 2017
@programminggeek1-0 programminggeek1-0 left a comment

This error occurs for to_excel too.
Create a PR for it if possible.

@gfyoung gfyoung added the "IO CSV" (read_csv, to_csv) and "Output-Formatting" (__repr__ of pandas objects, to_string) labels Sep 13, 2017
@gfyoung gfyoung changed the title BUG: Update format.py : to create directory while saving csv, if the directory is not available API: Create directories when saving to CSV Sep 13, 2017
gfyoung (Member) commented Sep 13, 2017

@Giftlin : Thanks for submitting this!

Unfortunately, I'm not sure I agree with the decision to do this. For starters, the current behavior is consistent with what you would see with Python's builtin open function, which also does not create directories if you try to create a file in a non-existent directory.
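To illustrate the comparison with the builtin open, here is a minimal sketch (Python 3, standard library only). The helper name `try_open_for_write` and the temporary paths are made up for this illustration and are not part of pandas or of this PR.

```python
import os
import tempfile


def try_open_for_write(path):
    """Return True if `path` could be opened for writing, False if its directory is missing."""
    try:
        with open(path, "w") as handle:
            handle.write("a,b\n1,2\n")
        return True
    except FileNotFoundError:
        # open() does not create missing parent directories
        return False


with tempfile.TemporaryDirectory() as base:
    missing = os.path.join(base, "asdfg", "pi.csv")  # "asdfg" does not exist yet
    present = os.path.join(base, "pi.csv")           # parent directory exists
    missing_ok = try_open_for_write(missing)
    present_ok = try_open_for_write(present)
```

Only the second call succeeds; the first fails because the "asdfg" directory is absent.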

Furthermore, it is consistent across our IO functions, as alluded to by @programminggeek1-0, meaning that this change should not occur on its own but rather in a series of PRs / commits to ensure that our IO interface is consistent.

Nevertheless, IMO, the responsibility of the file location rests on the user, and they should make sure that the file path they point to is reachable for pandas.

Finally, in the future, for changes like this, it would be preferable that you open an issue first instead of a PR so that we can discuss this before changes are implemented.

Giftlin (Contributor, Author) commented Sep 13, 2017

@gfyoung Hi, in our org we run into trouble when saving CSVs inside trace folders: every single time, we have to create the folders manually.
I hope this change would help the many others facing the same problem.

gfyoung (Member) commented Sep 13, 2017

@Giftlin : Your change is relatively trivial though. Why does it need to be integrated into pandas when the solution is purely based in the core Python language?
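For reference, the core-Python solution mentioned here might look like the sketch below. It assumes Python 3.5 or later for `Path.mkdir(..., exist_ok=True)`; the helper name `ensure_parent_dir` is hypothetical and is not taken from this PR's diff.

```python
import tempfile
from pathlib import Path


def ensure_parent_dir(path):
    """Create the parent directory of `path` (and any missing ancestors), then return the path."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)  # the "one line" in question
    return target


with tempfile.TemporaryDirectory() as base:
    out = ensure_parent_dir(Path(base) / "asdfg" / "pi.csv")
    parent_created = out.parent.is_dir()
    file_created = out.exists()  # only the directory is created, not the file
```

In user code this could be combined with the save call, for example `df.to_csv(ensure_parent_dir("D:/asdfg/pi.csv"))`.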

Giftlin (Contributor, Author) commented Sep 13, 2017

@gfyoung We use packages mainly to avoid manual work in our code. Pandas helps a lot with that, except here.

gfyoung (Member) commented Sep 13, 2017

We use packages mainly to avoid manual work in our code. Pandas helps a lot with that, except here.

That's great to hear, but I'm still not convinced. It's one line of code that we're talking about here.

Giftlin (Contributor, Author) commented Sep 13, 2017

Having that one line inside the function is much better than repeating the same line everywhere it is needed in a project.

gfyoung (Member) commented Sep 13, 2017

Having that one line inside the function is much better than repeating the same line everywhere it is needed in a project.

Ah, but this is not the only function where this line would be required, as pointed out by @programminggeek1-0
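One way a project could avoid repeating the line at every call site, without changing pandas, is a small wrapper that works with any writer. The sketch below is an assumption for illustration: `write_creating_dirs` and `demo_writer` are hypothetical names, and `demo_writer` merely stands in for a bound method such as `df.to_csv` or `df.to_excel`.

```python
import os
import tempfile


def write_creating_dirs(write, path, *args, **kwargs):
    """Create the parent directory of `path`, then call `write(path, *args, **kwargs)`."""
    parent = os.path.dirname(os.path.abspath(path))
    os.makedirs(parent, exist_ok=True)
    return write(path, *args, **kwargs)


def demo_writer(path, text="a,b\n1,2\n"):
    # Stand-in for a writer that takes the target path as its first argument.
    with open(path, "w") as handle:
        handle.write(text)


with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "traces", "run1", "out.csv")
    write_creating_dirs(demo_writer, target)
    wrote_file = os.path.isfile(target)
```

With a DataFrame, the call would look like `write_creating_dirs(df.to_csv, path, index=False)`, and the same helper covers `df.to_excel`.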

Giftlin (Contributor, Author) commented Sep 13, 2017

Yeah, but it is always better to have this in the package's built-in functions, right? Then users would not have to worry about including extra code when saving as CSV or Excel.

gfyoung (Member) commented Sep 13, 2017

Yeah, but it is always better to have this in the package's built-in functions, right? Then users would not have to worry about including extra code when saving as CSV or Excel.

Not necessarily. There comes a point when the user needs to take responsibility for what they do. The existence of directories / local file-structure is something that is specific to each user and should therefore be handled and taken care of by the user.

None of our I/O methods are about managing those aspects. They are just about moving data out of Python onto disk and vice-versa.

Giftlin (Contributor, Author) commented Sep 13, 2017

Okay

gfyoung (Member) commented Sep 13, 2017

jorisvandenbossche (Member) commented Sep 13, 2017

I agree with @gfyoung. For me, the main reason to have reservations about including this PR is that, with such new behaviour, you would silently create new directories if you make a small typo. IMO that is not useful behaviour as a default, and checking whether the directory exists is something the user can easily do themselves.
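The user-side check mentioned here could be sketched as follows. The helper `require_parent_dir` and the directory names are hypothetical, chosen only to show how a misspelled directory is reported instead of being silently created.

```python
import os
import tempfile


def require_parent_dir(path):
    """Return `path` unchanged, or raise if its parent directory does not exist."""
    parent = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(parent):
        raise FileNotFoundError(f"Directory does not exist: {parent!r}")
    return path


with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "reports"))
    good = os.path.join(base, "reports", "out.csv")
    typo = os.path.join(base, "reprots", "out.csv")  # misspelled directory name
    good_ok = require_parent_dir(good) == good
    try:
        require_parent_dir(typo)
        typo_raised = False
    except FileNotFoundError:
        typo_raised = True
    # The misspelled directory is reported, not created.
    stray_dir_created = os.path.isdir(os.path.join(base, "reprots"))
```

A caller would then write `df.to_csv(require_parent_dir(path))` and get an explicit error when the path contains a typo.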

gfyoung (Member) commented Sep 13, 2017

In light of my and @jorisvandenbossche's comments, closing this PR.

@gfyoung gfyoung closed this Sep 13, 2017
@gfyoung gfyoung added this to the No action milestone Sep 13, 2017
@Giftlin Giftlin deleted the patch-1 branch September 18, 2017 18:00