5

I have written code in Python using Pandas that adds "VEN_" to the beginning of the column names:

Tablon.columns = "VEN_" + Tablon.columns 

It works fine, but now I'm working with PySpark, where the same approach doesn't work. I've tried:

Vaa_total.columns = ['Vaa_' + col for col in Vaa_total.columns] 

or

for elemento in Vaa_total.columns: elemento = "Vaa_" + elemento 

and other variations like that, but none of them work.

I don't want to replace the column names; I just want to keep them and add a string to the beginning.

4
  • Possible duplicate of How to change dataframe column names in pyspark? Commented Jul 17, 2018 at 8:40
  • I don't think so. That question explains how to replace the names, but I don't know how to add a string to my column names. I get: AttributeError: can't set attribute. Commented Jul 17, 2018 at 8:46
  • Look into option 2 or 3. It's exactly what you need. Commented Jul 17, 2018 at 8:51
  • Yes, you are right! Commented Jul 17, 2018 at 8:58

3 Answers 3

4

Try something like this:

for elemento in Vaa_total.columns:
    Vaa_total = Vaa_total.withColumnRenamed(elemento, "Vaa_" + elemento)
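For reuse, the same loop can be folded into a small helper. This is a minimal sketch rather than part of the original answer: the name `add_prefix` is hypothetical, and `df` is assumed to be a PySpark DataFrame. It relies on `withColumnRenamed` returning a new DataFrame instead of modifying the existing one, which is why the result has to be carried forward on every step.

```python
from functools import reduce


def add_prefix(df, prefix):
    """Return a new DataFrame whose column names all start with `prefix`.

    `add_prefix` is a hypothetical helper name. `df` is assumed to be a
    pyspark.sql.DataFrame; only its `columns` attribute and its
    `withColumnRenamed` method are used.
    """
    # Fold over the original column names, renaming one column per step.
    # Each step returns a new DataFrame that feeds into the next step.
    return reduce(
        lambda acc, name: acc.withColumnRenamed(name, prefix + name),
        df.columns,
        df,
    )
```

It would be called as `Vaa_total = add_prefix(Vaa_total, "Vaa_")`.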

0

I linked a similar topic in a comment. Here's an example adapted from that topic to your task:

from pyspark.sql.functions import col
dataframe = dataframe.select([col(col_name).alias('VAA_' + col_name) for col_name in dataframe.columns])
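A different technique from the `select`/`alias` approach, in case it is useful: `DataFrame.toDF` accepts a complete list of new column names, so all columns can be renamed in one call. The sketch below is an illustration under assumptions, not the answerer's code: `add_prefix_todf` is a hypothetical name and `df` is assumed to be a PySpark DataFrame.

```python
def add_prefix_todf(df, prefix):
    """Return a new DataFrame with `prefix` prepended to every column name.

    `add_prefix_todf` is a hypothetical helper name. `df` is assumed to be
    a pyspark.sql.DataFrame; only `columns` and `toDF` are used.
    """
    # Build the full list of new names first, keeping the original order.
    new_names = [prefix + name for name in df.columns]
    # toDF takes the new names as positional arguments, one per column.
    return df.toDF(*new_names)
```

As with the other approaches, the original DataFrame is left unchanged, so the result has to be assigned: `dataframe = add_prefix_todf(dataframe, "VAA_")`.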

0

A standard way of writing it:

renamed_df = df
for col_name in df.columns:
    renamed_df = renamed_df.withColumnRenamed(col_name, "insert_text" + col_name)

1 Comment

There is a bracket missing somewhere in your solution
