0

I have a pandas dataframe, which I am creating from an excel (.xlsx) file. Using df = pd.read_excel()

One of the column names in the excel document is: IN - Original ex factory date.

I am trying to remove all special characters and white spaces as I will be saving this file in Avro format in Spark.

I am doing the following to clean the column names:

df.columns = df.columns.str.replace('(', '') df.columns = df.columns.str.replace(')', '') df.columns = df.columns.str.replace('=', '') df.columns = df.columns.str.replace('-', '_') df.columns = df.columns.str.replace('/', '_') df.columns = df.columns.str.replace('Â', '__') df.columns = df.columns.str.replace(' ', '_') 

After the above transformations, I am left with the above column name as:

IN___Original__ ex_factory_date

It doesn't remove the whitespace before ex_factory_date. And you can see all the other whitespaces are replaced by _.

I even tried to use unicodedata

def remove_accents(input_str): nfkd_form = unicodedata.normalize('NFKD', input_str) only_ascii = nfkd_form.encode('ASCII', 'ignore') return only_ascii for col in df.columns: print(col) neutral = remove_accents(col) print(normal) 

It is returning the same thing: IN___Original__ ex_factory_date

Is there something I am missing as to why this wihtespace is not removed? I need a clean column name.

2
  • It's may be a double space. try to add double space in df.columns = df.columns.str.replace(' ', '_') Commented Feb 19, 2021 at 13:46
  • @MaheryRanaivoson it is not a double space. I checked thoroughly. It's a single space but not replaced by the regex somehow Commented Feb 19, 2021 at 13:56

1 Answer 1

1

Most likely, your space is not the ASCII space character, but rather one of the many Unicode whitespace characters (see this extensive answer https://stackoverflow.com/a/37903645/5660315). This code should replace your whitespace:

# .str.replace supports regular expressions by default df.columns = df.columns.str.replace('\s', '_') 
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.