unhelpful error message when header is a list of names in read_csv #16338

This is a minor issue about error reporting to the mindless user (me...) who confuses the header and names arguments of read_csv. Basically, when calling read_csv with header=['a', 'b'] (whereas it should be names=['a', 'b']), the error message is cryptic:

TypeError: must be str, not int

(pandas 0.20.1, see details below)

Two issues:

1. The message is unhelpful and quite cryptic, and doesn't point in the right direction. E.g. it doesn't say which argument causes the problem. Of course, in the dummy example below there is just one argument, but in the real case where I got bitten it was messier...
2. It is impossible to debug with the %debug magic, because the error is raised in the compiled code (parsers.pyx).

Here is code to reproduce the error message, taken from an IPython session. (The first line may be a bit Unix-specific, sorry. It's just to create a dummy CSV file.)

In [] !echo '1,2\n3,4' > 1234.csv

In [] pd.read_csv('1234.csv')
   1  2
0  3  4

In [] pd.read_csv('1234.csv', names=['a', 'b']) # proper call
   a  b
0  1  2
1  3  4

In [] pd.read_csv('1234.csv', header=['a', 'b']) # beginner's mistake
TypeError                                 Traceback (most recent call last)
<ipython-input-5-b065bd1f57c6> in <module>()
----> 1 pd.read_csv('1234.csv', header=['a', 'b'])

/home/pierre/Programmes/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
    653                     skip_blank_lines=skip_blank_lines)
    654
--> 655         return _read(filepath_or_buffer, kwds)
    656
    657     parser_f.__name__ = name

/home/pierre/Programmes/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    403
    404     # Create the parser.
--> 405     parser = TextFileReader(filepath_or_buffer, **kwds)
    406
    407     if chunksize or iterator:

/home/pierre/Programmes/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
    760             self.options['has_index_names'] = kwds['has_index_names']
    761
--> 762         self._make_engine(self.engine)
    763
    764     def close(self):

/home/pierre/Programmes/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
    964     def _make_engine(self, engine='c'):
    965         if engine == 'c':
--> 966             self._engine = CParserWrapper(self.f, **self.options)
    967         else:
    968             if engine == 'python':

/home/pierre/Programmes/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
   1580         kwds['allow_leading_cols'] = self.index_col is not False
   1581
-> 1582         self._reader = parsers.TextReader(src, **kwds)
   1583
   1584         # XXX

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__ (pandas/_libs/parsers.c:5996)()

TypeError: must be str, not int
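Since the `echo` line above is shell-dependent, here is a portable way to create the same dummy file from Python. This is only a convenience sketch; `make_dummy_csv` is a hypothetical helper name, not something pandas provides.

```python
from pathlib import Path


def make_dummy_csv(path="1234.csv"):
    """Write the two-row dummy CSV used in the session above and return its path."""
    csv_path = Path(path)
    # Same content as the shell `echo` call: two rows, two columns, no header row.
    csv_path.write_text("1,2\n3,4\n")
    return csv_path
```

After calling `make_dummy_csv()`, the `pd.read_csv('1234.csv', ...)` calls from the session can be run unchanged.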

Expected Output

I'm not expecting a fancy AI-assistant-like error message. However, an early check of the header argument should verify, consistent with the docstring, that header is an int or a list of ints.
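To make the suggestion concrete, here is a minimal sketch of what such an early check could look like. It is not pandas code: `validate_header_arg` is a hypothetical name, and the accepted values (None, the default 'infer', a non-negative int, or a list of ints) and the choice of ValueError are my assumptions based on the docstring.

```python
from numbers import Integral


def _is_int(value):
    # bool is a subclass of int, but True/False make no sense as row numbers.
    return isinstance(value, Integral) and not isinstance(value, bool)


def validate_header_arg(header):
    """Hypothetical early check: raise a readable error for a bad `header`."""
    # Assumed valid non-integer values: None and the default string 'infer'.
    if header is None or (isinstance(header, str) and header == "infer"):
        return
    if _is_int(header):
        if header < 0:
            raise ValueError("header must be a non-negative row number")
        return
    if isinstance(header, (list, tuple)) and all(_is_int(h) for h in header):
        return
    raise ValueError(
        "header must be an int or a list of ints giving row numbers, got "
        f"{header!r}; to provide column names, use the names argument instead"
    )
```

With a check like this run in Python before the parser is created, `validate_header_arg(['a', 'b'])` fails with a message that names the offending argument and points to names, and the failure is raised in Python code where %debug works.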

What do you think? Is it overkill?

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-2-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: fr_FR.utf8
LOCALE: fr_FR.UTF-8

pandas: 0.20.1
pytest: 3.0.5
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 6.0.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.4
s3fs: None
pandas_gbq: None
pandas_datareader: None
