-
- Notifications
You must be signed in to change notification settings - Fork 19.4k
TST: Refactor test_parsers.py #12964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Not sure if this is all the refactoring that needs to be done, but I think I got through a good portion of it. Further suggestions are welcome! |
9846686 to 0a7f5e6 Compare 0a7f5e6 to ecf4cde Compare | changing things like this:
|
| it is really quite difficult to tell what you did and comment on it. I would split this up into multiple files ( e.g.
|
|
Ah, yes, unfortunately, I did not realize how much I had changed until I saw that GitHub could not show the diff at all. Also, your code block is blank if you were trying to demonstrate something... |
| @gfyoung what I was saying (I edited), was the |
| Hmmm...I'll see what I can do to break up the tests. It's not as straightforward as it is in |
| Also, this PR is pretty much just reorganization and removal of duplicate tests. If you can view it, there are no more tests under |
| Also, one more thing: why do we have |
| @gfyoung my point is this is very hard to see. Let's break up first. Yes no realy need for a separate This IS straightforward to breakup if you just have different test classes rather than a single monolith. The idea is to test different options in different files. Perfect example is compressions which is alredy split out. But you can split out parse_date testing. line terminator testsing etc. Lots of ways to do this. |
| @jreback : No, I perfectly understand (I realized that it might be a problem when I saw GitHub couldn't show it)! I was just trying to get some feedback first on how to break up |
ecf4cde to d439b47 Compare Current coverage is 83.92%@@ master #12964 diff @@ ========================================== Files 136 136 Lines 49994 49918 -76 Methods 0 0 Messages 0 0 Branches 0 0 ========================================== - Hits 42028 41889 -139 - Misses 7966 8029 +63 Partials 0 0
|
d439b47 to b886475 Compare pandas/io/tests/parse_dates_tests.py Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas/io/tests/parser/test_dates.py should be the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then won't that file be run as a test suite? I don't want that (it needs to be imported as a test class).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well you need to fix it then if that's the way you want it. the correct way to do this is to split off the base class into common.py or something which is then imported into each file. Look at how tests/indexes was done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay, I see. I'll take a look into that then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tricky part (compared to tests/indexes) is that there are really "three" base classes (CParserHighMemory, CParserLowMemory, PythonParser). How does one organize that then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idea is then you can easily strip out related things and add as separeat files in a nice way. (and don't crows the namespace)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make the main Base classes mostly empty. Then just sub-class them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH it seems easier to work in the way of making the individual test suites part of a common package and sub-class them into each of the main parser tests as is done now. Otherwise, I fear there will be duplicate code IINM. For example, parse_date_tests.py needs to be run for each of three parsers. But I can't just throw them all in as super-classes because they will clash. So I would then have to make three different versions, one for each parser, which I don't think is preferable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simpliest easiest is best. you can also make them mix-in's might be easier. but still keep in separate files (just don't name then test_*)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. I'll try to first separate them out and then we can see what is best afterwards (still WIP with pulling out test suites from test_parsers)
15f50f2 to 6a26224 Compare pandas/io/tests/compression_tests.py Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think naming this compression.py is better because then its not runable on its own (its only mixins to others). name test_* if they are runnable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
bfbbed2 to 4f9c911 Compare pandas/io/tests/test_network.py Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so some of the files are not in the pandas/io/tests/parser/* namespace e.g.
pandas/io/tests/test_network.py as an example.
[jreback-~/pandas] git diff master --stat pandas/io/tests/parser/__init__.py | 0 pandas/io/tests/parser/compression.py | 155 +++++ pandas/io/tests/parser/parse_dates.py | 468 +++++++++++++++ pandas/io/tests/parser/test_parsers.py | 3342 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ pandas/io/tests/parser/test_textreader.py | 402 +++++++++++++ pandas/io/tests/test_cparser.py | 401 ------------- pandas/io/tests/test_network.py | 195 +++++++ pandas/io/tests/test_parsers.py | 5055 -------------------------------------------------------------------------------------------------------------------------------------------------------------- pandas/io/tests/test_read_fwf.py | 337 +++++++++++ pandas/io/tests/test_unsupported.py | 88 +++ pandas/tests/test_lib.py | 74 ++- There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay, I'll move all of those (excluding test_lib of course) into the parser directory as well
| I think this also needs an edit to e.g. add you can do a
|
4f9c911 to edba8fa Compare pandas/tests/test_lib.py Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put at the top
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
fc49f1a to 818f711 Compare | something wrong. lots of tests not running but if its explict it works I think that |
818f711 to 225637f Compare | @jreback : I think you're right about that (at least that's what Google says). I changed the mode of |
225637f to 20389b1 Compare | I think need to add |
|
|
|
|
| Huh? What do you mean? That's the exact command |
| so why didn't your build fail then? |
| I haven't pushed |
| oh, ok, then pls push. I think this is ready as is. (pls flake check) |
20389b1 to c01b4bd Compare Refactored tests in test_parsers.py to increase coverage of the different types of parsers and remove nearly duplicate testing in some cases.
c01b4bd to 3ed01a0 Compare | @jreback : Travis is giving the green light. Not sure what's going on with Codecov though. Is this good to merge? |
| hmm, yeah the codecov that is looking odd |
| So how to proceed? Can this be merged? Or is it a bug in |
| yeah will take another look. if you have insite into codecov, then ok! |
| Unfortunately, I do not :< . Let me know once you have more info! |
| thank you sir! |
| I had to do this: 879f217 to make the setup work in a clean env. Travis works from a clone so the installation is slightly different (it depends on setup). Side issue, this came up because it was segfaulting on |
| Ah, I see. Thanks! |
Refactored tests in
test_parsers.pyto increase coverage of the different types of parsers and remove nearly duplicate testing in some cases.