I'm using Azure Databricks and PySpark to process data with DataFrames, and I store the processed data in Azure SQL Database. I created the output tables with ordinary CREATE TABLE scripts in SQL, but I noticed that the DataFrame write overwrites the table definition: for example, all the string columns become nvarchar(max). Is there any way to keep the table schema as specified in the CREATE TABLE script?
Example of my write statement in PySpark:
(df.write
    .mode("overwrite")
    .format("jdbc")
    .option("url", "jdbc:sqlserver://myserver.database.windows.net;databaseName=mydatabase;")
    .option("dbtable", "mytable")
    .option("user", jdbcUsername)
    .option("password", jdbcPassword)
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .save())
Use append instead of overwrite. With overwrite, Spark drops the table and recreates it using its default JDBC type mappings (which is why your string columns come back as nvarchar(max)); with append, the rows are inserted into the existing table and the schema from your CREATE TABLE script is kept. See here: Save modes
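A minimal sketch of the same write with append mode, assuming the target table mytable already exists with the schema from your CREATE TABLE script and that jdbcUsername and jdbcPassword are defined as in your snippet:

(df.write
    .mode("append")  # insert rows into the existing table instead of dropping and recreating it
    .format("jdbc")
    .option("url", "jdbc:sqlserver://myserver.database.windows.net;databaseName=mydatabase;")
    .option("dbtable", "mytable")
    .option("user", jdbcUsername)
    .option("password", jdbcPassword)
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .save())

Note that append only adds rows; if you need a full refresh of the table, you'll have to clear it yourself (for example with a TRUNCATE TABLE statement in SQL) before running the write.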