
I have a huge 23 GB CSV file I am trying to load, and I've found that instead of converting it to INSERT statements, which takes a while, it's possible to load it directly into the DB. So I tried the following syntax:

LOAD DATA LOCAL INFILE client_report.csv into table client_report fields terminated by ',' optionally enclosed by '"' lines terminated by '\r\n' ignore 1 lines; 

mysql> LOAD DATA LOCAL INFILE client_report.csv into table client_report fields terminated by ',' optionally enclosed by '"' lines terminated by '\r\n' ignore 1 lines;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'client_report.csv into table client_report fields terminated by ',' opti' at line 1

I am at a loss; I seem to be following the documentation to the letter, and I checked SHOW GLOBAL VARIABLES LIKE 'local_infile'; it's ON.

  • Wrap your CSV file name in single quotes, i.e. LOAD DATA LOCAL INFILE 'client_report.csv' into... Commented Oct 29, 2022 at 2:46
  • This was actually the answer! Commented Oct 29, 2022 at 22:49
  • However, the big file failed to insert; there's an error. I will perhaps try it in chunks of 10M rows at a time. Commented Oct 29, 2022 at 22:50
  • Hmm, apparently this command does not support chunking natively. Commented Oct 29, 2022 at 23:00

2 Answers


The final solution that worked best is two steps:

  1. Split the file into chunks of 10M rows each using split -l 10000000 file.csv, which will generate files named xaa, xab, xac, ...
  2. Use the query below (note the filename is in quotes):

LOAD DATA LOCAL INFILE 'xaa' into table table_name fields terminated by ',' optionally enclosed by '"' lines terminated by '\r\n' ignore 1 lines (field1,field2,field3,field4,fieldN) SET ID = NULL;

Note that if you don't explicitly list the fields and set ID = NULL, it will try to insert the first column into the auto-generated ID column, shifting all content by one column and leaving the last one empty.

Also note that starting from file xab you need to remove the ignore 1 lines part, otherwise you'll skip the first row of each file, since only xaa contains the header.
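If you'd rather not paste the query once per chunk by hand, the loop can be scripted. Below is a minimal sketch, assuming mysql-connector-python is installed, the split files (xaa, xab, ...) sit in the current directory, and the connection details, table name, and column list are placeholders to adapt to your own schema:

# Sketch: load every split chunk via LOAD DATA LOCAL INFILE.
# Connection details, table name and column names are placeholders.
import glob

import mysql.connector

conn = mysql.connector.connect(
    host="localhost",
    user="user",
    password="password",
    database="mydb",
    allow_local_infile=True,   # client-side switch needed for LOAD DATA LOCAL
)
cursor = conn.cursor()

for i, chunk in enumerate(sorted(glob.glob("x??"))):
    # Only the first chunk (xaa) still contains the CSV header row.
    ignore = "ignore 1 lines" if i == 0 else ""
    cursor.execute(
        f"LOAD DATA LOCAL INFILE '{chunk}' into table table_name "
        f"fields terminated by ',' optionally enclosed by '\"' "
        f"lines terminated by '\\r\\n' {ignore} "
        f"(field1,field2,field3,field4,fieldN) SET ID = NULL"
    )
    conn.commit()
    print("loaded", chunk)

cursor.close()
conn.close()

The same effect could be had from a shell loop over the x?? files calling the mysql client; the key point is simply that only the first chunk keeps the ignore 1 lines clause.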




The only way I know how to do this is using Python. This example uses SQLite, but the concept is the same.

It will murder your RAM, but with a large enough pagefile you should be fine. It will also take some time (up to a few hours).

import pandas
import sqlite3
import time

start_time = time.process_time()

# SQLite auto-creates the database file and the table (with an index column).
db = sqlite3.connect("-database.db")

def import_csv():
    df = pandas.read_csv('csv_file.csv')
    df.to_sql('test', con=db, if_exists='append', chunksize=100000)

# When the condition is true the CSV is imported; flip it to run only the commented-out SELECT instead.
if 1 == 1:
    import_csv()
    #print(pandas.read_sql_query('select * from test limit 100000000', db))

end_time = time.process_time()
print(end_time - start_time)
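Since the question targets MySQL and a 23 GB CSV will not fit in memory as a single DataFrame, the same pandas idea can be adapted by reading the file in chunks and writing through SQLAlchemy. This is only a sketch; the connection URL, table name, and chunk size are assumptions, and a MySQL driver such as pymysql must be installed:

import pandas
from sqlalchemy import create_engine

# Placeholder connection URL; adjust user, password, host and database.
engine = create_engine("mysql+pymysql://user:password@localhost/mydb")

# Stream the CSV 100,000 rows at a time so the whole 23 GB file is never in RAM at once.
for chunk in pandas.read_csv("csv_file.csv", chunksize=100000):
    chunk.to_sql("client_report", con=engine, if_exists="append", index=False)

Note that LOAD DATA LOCAL INFILE is generally much faster than inserting rows through pandas, so the accepted approach above is still preferable for a file this size.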

Comments

The OP hasn't tagged a language other than MySQL. This won't be very helpful if he's using, say, PHP.
The question has nothing to do with Python, so an answer that uses it is not appropriate. This isn't a proper answer to the question asked.
