
Working on my free-trial Azure account, I am trying to copy CSV files to ADLS Gen2 and save the DataFrame as a table in the ADLS silver layer.

Code:

DForderItems = spark.read.csv("abfss://[email protected]/retailfiles/orderItems.csv", header=False, schema=schema)

I am able to read the CSV file into DForderItems, but the trickiest part is that I am unable to save it as a table at the path below.

DForderItems.write.option("path", "abfss://[email protected]/retailfiles/orderItems").option("mergeSchema", True).mode("append").saveAsTable("retail.orderItems")

Error: [NO_PARENT_EXTERNAL_LOCATION_FOR_PATH] No parent external location was found for path 'abfss://[email protected]/retailfiles/orderItems'. Please create an external location on one of the parent paths and then retry the query or command again.

1. I tried creating a table at an external location using SQL:

%sql
CREATE EXTERNAL LOCATION silv_layer URL 'abfss://[email protected]/retailfiles/' WITH (CREDENTIAL (STORAGE_ACCOUNT_KEY = '<storage-account-key>'));

I still got this error: [PARSE_SYNTAX_ERROR] Syntax error at or near 'LOCATION'. SQLSTATE: 42601
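For reference, the documented Unity Catalog syntax takes the name of an existing storage credential rather than an inline account key, which is likely why the statement above fails to parse. A minimal sketch, assuming a Unity Catalog-enabled workspace and a storage credential named silv_cred (the credential name is a placeholder, not something from the original setup):

# silv_cred is an assumed, pre-existing storage credential
spark.sql("""
CREATE EXTERNAL LOCATION IF NOT EXISTS silv_layer
URL 'abfss://[email protected]/retailfiles/'
WITH (STORAGE CREDENTIAL silv_cred)
""")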

2. I tried creating a database and writing the table into it:

spark.sql("CREATE DATABASE IF NOT EXISTS retail LOCATION 'abfss://[email protected]/retailfilessilver'")

DForderItems.write.option("path", "abfss://[email protected]/retailfiles/orderItems").option("mergeSchema", True).mode("append").saveAsTable("retail.orderItems")

But I got the same error again: [NO_PARENT_EXTERNAL_LOCATION_FOR_PATH] No parent external location was found for path 'abfss://[email protected]/retailfiles/orderItems'. Please create an external location on one of the parent paths and then retry the query or command again.
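If an external path is not strictly required, one way to sidestep the external-location check entirely (a workaround sketch, not necessarily the intended design) is to drop the path option, so the table is created as a managed table in the metastore's default storage:

# Managed table: no explicit path, so no external location is needed
DForderItems.write.option("mergeSchema", True).mode("append").saveAsTable("retail.orderItems")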

1 Answer


I have received the same error; below are the details:

Error: [NO_PARENT_EXTERNAL_LOCATION_FOR_PATH] No parent external location was found for path 'abfss://[email protected]/Customer.csv'. Please create an external location on one of the parent paths and then retry the query or command again.

The above error indicates that Unity Catalog has no external location registered that covers the specified path.
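On a Unity Catalog-enabled workspace, you can check which external locations already exist, and therefore which parent paths are covered, from a Python cell:

# List the external locations visible to this workspace
display(spark.sql("SHOW EXTERNAL LOCATIONS"))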

I tried the following approach:

First, I mounted my ADLS using the script below:

configs = {
    'fs.azure.account.auth.type': 'OAuth',
    'fs.azure.account.oauth.provider.type': 'org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider',
    'fs.azure.account.oauth2.client.id': '<YOUR CLIENT ID>',
    'fs.azure.account.oauth2.client.secret': dbutils.secrets.get(scope='dbxsecretscope', key='kvsecretname'),
    'fs.azure.account.oauth2.client.endpoint': 'https://login.microsoftonline.com/<YOUR TENANT ID>/oauth2/token'
}

dbutils.fs.mount(
    source='abfss://[email protected]/',
    mount_point='/mnt/raw',
    extra_configs=configs
)

Note: when mounting ADLS to Azure Databricks, you need to grant the service principal the Storage Blob Data Contributor role on the ADLS account and the Key Vault Administrator role on the Key Vault.

The following will list your mount points:

display(dbutils.fs.mounts())
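To confirm the files themselves are reachable through the mount, you can also list the mounted directory (the new subfolder comes from the example path used further below):

# List the files under the mount point
display(dbutils.fs.ls('/mnt/raw/new'))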

Learn more about how to mount ADLS to Azure Databricks using an SPN and Azure Key Vault.

"Next, I created an external table from a single CSV file in Azure Databricks as an example."

%sql
CREATE TABLE IF NOT EXISTS hive_metastore.default.cust
USING csv
OPTIONS (path "/mnt/raw/new/Customer.csv", inferSchema=True, header=True)
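Once the table exists, a quick sanity check (run from a Python cell; the LIMIT is only to keep the output small) is:

# Query the new external table
display(spark.sql("SELECT * FROM hive_metastore.default.cust LIMIT 5"))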

Results: the table is created and can be queried. Reading the same file directly from the mount returns the data as well:

df = spark.read.format("csv") \
.option("header", "true") \
.option("inferSchema", "true") \
.load("dbfs:/mnt/raw/new/Customer.csv")
df.show()
+--------------+
|          Col1|
+--------------+
|123,456@1234_1|
+--------------+
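Coming back to the original question: with the storage mounted (or an external location created on a parent path), the same saveAsTable call should no longer hit the Unity Catalog external-location check when it targets the hive_metastore. A sketch under those assumptions, reusing the /mnt/raw mount point from above and assuming the retail schema exists in hive_metastore (both the mount path and the schema here are illustrative):

# Write the DataFrame as an external table over the mounted path
DForderItems.write.option("path", "/mnt/raw/retailfiles/orderItems").option("mergeSchema", True).mode("append").saveAsTable("hive_metastore.retail.orderItems")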
