I'm trying to remove accents from data in a CSV file. I use the remove_accents function (see below), but for that I need to open my CSV files as UTF-8. When I do, I get the error: 'encoding' is an invalid keyword argument for this function
I've seen that I may have to use Python 3 and then run python3 ./myscript.py. Is this the right way to do it? Or is there another way to remove accents without having to install Python 3? Any help would be much appreciated.
    #!/usr/bin/env python

    import re
    import string
    import csv
    import unicodedata

    def remove_accents(data):
        return ''.join(x for x in unicodedata.normalize('NFKD', data) if \
        unicodedata.category(x)[0] == 'L').lower()

    reader=csv.reader(open('infile.csv', 'r', encoding='utf-8'), delimiter='\t')
    writer=csv.writer(open('outfile.csv', 'w', encoding='utf-8'), delimiter=',')
    for line in reader:
        if line[0] != '':
            person=re.split(' ',line[0])
            first_name = person[0].strip().upper()
            first_name1=unicode(first_name)
            first_name2=remove_accents(first_name1)
            if len(person) == 2:
                last_name=person[1].strip().upper()
            line[0]=last_name
            line[15]=first_name2
        writer.writerow(line)
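For context, here is a minimal sketch of the two pieces involved, under some assumptions. The error comes from Python 2's built-in open(), which has no encoding argument; io.open (available since Python 2.6, and the same function as open() in Python 3) does accept it. The helper name read_utf8_lines is hypothetical and only for illustration, and the sketch does not cover the csv module, which in Python 2 expects byte strings rather than unicode.

```python
import io
import unicodedata


def remove_accents(text):
    # Same idea as the function in the question: decompose accented
    # characters (NFKD), keep only letters, then lowercase. Note that this
    # also drops spaces, digits and punctuation, not just the accents.
    decomposed = unicodedata.normalize('NFKD', text)
    return ''.join(ch for ch in decomposed
                   if unicodedata.category(ch)[0] == 'L').lower()


def read_utf8_lines(path):
    # Hypothetical helper (not from the question): io.open accepts
    # encoding= on Python 2.6+ as well as on Python 3.
    with io.open(path, 'r', encoding='utf-8') as handle:
        return [line.rstrip('\r\n') for line in handle]


# u'\xc9lodie' is "Élodie"; the accent is removed and the result lowercased.
print(remove_accents(u'\xc9lodie'))  # -> elodie
```

If spaces and digits should survive, one alternative would be to filter out only the combining marks (category 'Mn') instead of keeping only letters; that is a different technique from the one in the question.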