I am parsing an Apache log file and loading it into a pandas DataFrame for further analysis.
The log file contains some malformed lines, so the following error occurs:
ValueError: Expected 11 fields in line 4320, saw 27
To work around this, I passed error_bad_lines=False when reading the file. That does not help, because I then get the following error:
ValueError: The 'error_bad_lines' option is not supported with the 'python' engine
Note: I am explicitly using the python engine because my separator is a regular expression.
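To show what the separator is meant to do, here is a minimal sketch that applies the same regular expression to a single line with re.split. The sample line and the names SEP, sample, and fields are illustrative assumptions, not taken from the real log file.

```python
import re

# Same separator as in the read_csv call: split on whitespace that is
# outside double quotes and outside square brackets.
SEP = r'\s(?=(?:[^"]*"[^"]*")*[^"]*$)(?![^\[]*\])'

# Made-up sample line in common log format (an assumption, not real data).
sample = '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'

# The pattern has no capturing groups, so re.split returns only the fields.
fields = re.split(SEP, sample)
for i, field in enumerate(fields):
    print(i, field)
```

On a well-formed line the timestamp and the quoted request each stay in one field. A line with an unbalanced quote or bracket can split into a different number of fields, which is presumably what the "Expected 11 fields" error is reporting.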
Code snippet:
data = pd.read_csv(
    log_file,
    sep=r'\s(?=(?:[^"]*"[^"]*")*[^"]*$)(?![^\[]*\])',
    engine='python',
    na_values='-',
    header=None,
    usecols=use_cols,
    skiprows=1,
    converters={
        time_taken_index[0]: parse_sec,
        time_index[0]: parse_datetime,
        req_index[0]: parse_str,
        status_index[0]: parse_str,
    },
    error_bad_lines=False,
)

I'd be grateful for any suggestions. Thank you.