I have a requirement to load data into Salesforce from Databricks. I am using the simple_salesforce library. Since Salesforce accepts records as a list of dictionaries, I need to convert my PySpark DataFrame to dictionaries, but the conversion fails as shown below.
from pyspark.sql.types import StructType, StructField, StringType

# Phone numbers are quoted as strings so every row matches the StringType schema
data2 = [("Test_Conv1", "[email protected]", "Olivia", "A", "3000000000"),
         ("Test_Conv2", "[email protected]", "Jack", "B", "4000000000"),
         ("Test_Conv3", "[email protected]", "Williams", "C", "5000000000"),
         ("Test_Conv4", "[email protected]", "Jones", "D", "6000000000"),
         ("Test_Conv5", "[email protected]", "Brown", None, "9000000000")]

schema = StructType([
    StructField("LastName", StringType(), True),
    StructField("Email", StringType(), True),
    StructField("FirstName", StringType(), True),
    StructField("MiddleName", StringType(), True),
    StructField("Phone", StringType(), True)])

df = spark.createDataFrame(data=data2, schema=schema)
It fails on the line below:

df_contact = df.rdd.map(lambda row: row.asDict()).collect()

Error message:
py4j.security.Py4JSecurityException: Method public org.apache.spark.rdd.RDD org.apache.spark.api.java.JavaRDD.rdd() is not whitelisted on class class org.apache.spark.api.java.JavaRDD
Loading to the target:
sf.bulk.Contact.insert(df_contact,batch_size=20000,use_serial=True)