I'm querying a table in a SQL Server database and exporting the results to a CSV using pandas:
```python
import pandas as pd

df = pd.read_sql_query(sql, conn)
df.to_csv(csvFile, index=False)
```

Is there a way to remove non-ASCII characters when exporting the CSV?
You can write the file out, read it back in, and use a regular expression to strip out the non-ASCII characters before rewriting it:
```python
import re

df.to_csv(csvFile, index=False)

with open(csvFile) as f:
    new_text = re.sub(r'[^\x00-\x7F]+', '', f.read())

with open(csvFile, 'w') as f:
    f.write(new_text)
```

This was the situation I ran into; here's what worked for me:
```python
import re

# matches one or more consecutive non-ASCII characters
regex = re.compile(r'[^\x00-\x7F]+')

with open(csvFile, 'r') as infile, open('myfile.csv', 'w') as outfile:
    # stream the input line by line until EOF
    for line in infile:
        # write each line with any non-ASCII characters removed
        outfile.write(regex.sub('', line))
```
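As a quick sanity check of the regex both approaches rely on, here it is run against in-memory strings (the sample rows are made up for illustration):

```python
import re

# one or more consecutive non-ASCII characters
regex = re.compile(r'[^\x00-\x7F]+')

# hypothetical CSV rows containing accented characters
samples = [
    "café,münchen,100",
    "plain,ascii,row",
]

cleaned = [regex.sub('', line) for line in samples]
print(cleaned)  # ['caf,mnchen,100', 'plain,ascii,row']
```

Note that the accented letters are dropped entirely (`café` becomes `caf`), not transliterated; if you want `é` to become `e`, look at `unicodedata.normalize` instead.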
df.to_csv(csvFile, index=False, encoding='ascii')? Be aware that on its own this raises a UnicodeEncodeError as soon as it hits a non-ASCII character rather than silently dropping it; on pandas 1.1+ you can also pass errors='ignore' to drop such characters instead.
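For reference, the `errors=` option maps onto Python's standard codec error handlers, so the difference between the default `'strict'` behavior and `'ignore'` can be seen with a plain `str.encode` call, no pandas needed:

```python
text = "café"

# default 'strict' handling raises on the first non-ASCII character
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    print("strict raised:", exc.reason)

# 'ignore' silently drops the offending characters instead
print(text.encode('ascii', errors='ignore'))  # b'caf'
```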