I want to add a column to a Spark DataFrame that has been registered as a table. The column needs to hold an auto-incrementing long.

df = spark.sql(query)
df.createOrReplaceTempView("user_stories")
df = spark.sql("ALTER TABLE user_stories ADD COLUMN rank int AUTO_INCREMENT")
df.show(5)

This throws the following error:

Py4JJavaError: An error occurred while calling o72.sql.
: org.apache.spark.sql.catalyst.parser.ParseException: 
no viable alternative at input 'ALTER TABLE user_stories ADD COLUMN'(line 1, pos 29)

== SQL ==
ALTER TABLE user_stories ADD COLUMN rank int AUTO_INCREMENT
-----------------------------^^^

What am I missing here?

  • How is this question a duplicate? I need an auto-increment value in the new column, and I don't see that addressed in the question you've cited. Commented Jul 25, 2018 at 19:16
  • There is no auto-increment in Spark. What are you trying to do? Commented Jul 25, 2018 at 19:25

1 Answer

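If you want to add a new incrementing column to the DataFrame, you can use monotonically_increasing_id(). The generated IDs are unique and increasing, but they are not guaranteed to be consecutive.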

df.show()
+-------+
|   name|
+-------+
|gaurnag|
+-------+

from pyspark.sql.functions import monotonically_increasing_id
new_df = df.withColumn("id", monotonically_increasing_id())
new_df.show()
+-------+---+
|   name| id|
+-------+---+
|gaurnag|  0|
+-------+---+
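
To see why the IDs can jump once the DataFrame has more than one partition, here is a rough pure-Python model of the layout described in the Spark documentation for monotonically_increasing_id (partition index in the upper 31 bits, row position within the partition in the lower 33 bits). This is only a sketch and does not call Spark; `model_ids` and the partition sizes are hypothetical names and values made up for illustration.

```python
# Pure-Python model of the ID layout documented for Spark's
# monotonically_increasing_id(): partition index in the upper 31 bits,
# row position within the partition in the lower 33 bits.
# `model_ids` is a hypothetical helper for illustration, not a Spark API.

def model_ids(rows_per_partition):
    """Return the IDs this layout would assign, partition by partition."""
    ids = []
    for partition_index, row_count in enumerate(rows_per_partition):
        base = partition_index << 33  # first ID in this partition's range
        ids.extend(base + row for row in range(row_count))
    return ids

# Assumed example: two partitions holding 2 and 3 rows respectively.
example_ids = model_ids([2, 3])
print(example_ids)
# [0, 1, 8589934592, 8589934593, 8589934594]
```

With a single partition and one row, as in the example above, the only ID is 0. If you need consecutive values such as 1, 2, 3, a common alternative is row_number() over a Window with an explicit ordering, at the cost of a global sort.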