I have written code in Python using Pandas that adds "VEN_" to the beginning of the column names:

Tablon.columns = "VEN_" + Tablon.columns

And it works fine, but now I'm working with PySpark and it doesn't work. I've tried:

Vaa_total.columns = ['Vaa_' + col for col in Vaa_total.columns]

or

for elemento in Vaa_total.columns:
    elemento = "Vaa_" + elemento

And other things like that, but none of them work.

I don't want to replace the column names; I just want to keep them and add a string to the beginning.

  • Possible duplicate of How to change dataframe column names in pyspark? Commented Jul 17, 2018 at 8:40
  • I don't think so. That question explains how to replace the names, but I don't know how to add a string to my column names. I get: AttributeError: can't set attribute. Commented Jul 17, 2018 at 8:46
  • Look into option 2 or 3. It's exactly what you need. Commented Jul 17, 2018 at 8:51
  • Yes, you are right! Commented Jul 17, 2018 at 8:58

3 Answers

4

Try something like this:

for elemento in Vaa_total.columns:
    Vaa_total = Vaa_total.withColumnRenamed(elemento, "Vaa_" + elemento)
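
The reassignment is the important part: `withColumnRenamed` returns a new DataFrame rather than modifying the existing one, and `DataFrame.columns` is a read-only property, which is why assigning to it raises `AttributeError: can't set attribute`. A minimal sketch of the same loop wrapped in a helper follows; the name `add_prefix` is hypothetical, and `df` is assumed to be a PySpark DataFrame:

```python
def add_prefix(df, prefix):
    """Return a new DataFrame whose column names all start with `prefix`.

    Assumes `df` is a PySpark DataFrame; only its `columns` attribute and
    `withColumnRenamed` method are used. `add_prefix` is a hypothetical name.
    """
    renamed = df
    # Iterate over the original names. Each call returns a new DataFrame,
    # so the result has to be reassigned on every step.
    for name in df.columns:
        renamed = renamed.withColumnRenamed(name, prefix + name)
    return renamed
```

With the DataFrame from the question, this would be called as `Vaa_total = add_prefix(Vaa_total, "Vaa_")`.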

0

I linked a similar topic in a comment. Here's an example adapted from that topic to your task:

from pyspark.sql.functions import col
dataframe.select([col(col_name).alias('VAA_' + col_name) for col_name in dataframe.columns])

0

Standard format of writing it:

from functools import reduce
renamed_df = reduce(lambda acc, col_name: acc.withColumnRenamed(col_name, "insert_text" + col_name), df.columns, df)

1 Comment

There is a bracket missing somewhere in your solution
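
For a single call instead of one rename per column, `DataFrame.toDF` accepts the full list of new names and returns a new DataFrame. A minimal sketch, assuming `df` is a PySpark DataFrame; the helper name `prefix_columns` is made up for illustration:

```python
def prefix_columns(df, prefix):
    """Rename every column at once by passing the new names to `toDF`.

    Assumes `df` is a PySpark DataFrame; `prefix_columns` is a hypothetical name.
    """
    new_names = [prefix + name for name in df.columns]
    # toDF takes the names as separate positional arguments, hence the `*`.
    return df.toDF(*new_names)
```

The new names are built in the same order as `df.columns`, so each column keeps its position and data and only the label changes.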
