
I am using C# (.NET) with the SparkSQL ODBC driver to run a query against Databricks. Before putting a query into code, I test it in a SQL notebook in the Databricks portal. There I create a TEMP VIEW and use it in a subsequent SELECT, and it works fine.

In the notebook it looks like this:

CREATE OR REPLACE TEMP VIEW budget AS
SELECT 1 as ID, 2025 as OPYEAR, 1 as OPMONTH, 13.2 as BGQTY
UNION ALL
SELECT 2, 2025, 2, 97.1
UNION ALL
SELECT 3, 2025, 3, 105.8;

SELECT
    SUM(if(date_format(purchdate, "yyyyMMdd")='20250313',budget.BGQTY,0)) as daySum
FROM
    CoreData
JOIN budget on budget.OPYEAR= cast(date_format(purchdate, "yyyy") as int) 
          and budget.OPMONTH= cast(date_format(purchdate, "MM") as int) 
WHERE
         location = 'HDQ';

Having verified the query in a notebook, I added it to my C# code. I use the SparkSQL ODBC driver to fetch the data; I have confirmed that the connection works and that all my other, standard queries run correctly. With this query, however, I get the following error:

Driver={Simba Spark ODBC Driver};Server=xxxxxxxxx;
Exception thrown: 'System.Data.Odbc.OdbcException' in System.Data.dll
ERROR [42601] [Simba][Hardy] (80) Syntax or semantic analysis error thrown in server while executing query. 
Error message from server: org.apache.hive.service.cli.HiveSQLException: Error running query: [PARSE_SYNTAX_ERROR] org.apache.spark.sql.catalyst.parser.ParseException: 
[PARSE_SYNTAX_ERROR] Syntax error at or near 'SELECT': extra input 'SELECT'. SQLSTATE: 42601 (line xx, pos 0)

The line reported in the error message coincides with the line of the 2nd SELECT statement.

So I am unsure why it works in the notebook but not in my .NET code.

  • It looks like your query returns two data sets (two different SELECTs). If that is the expected result, the C# code may need to be adjusted to accept both, using a call/return method that handles two data sets or an array of data sets. Commented Mar 18 at 19:21
  • It does not. The first SELECT creates a view on the fly that is then joined in the second SELECT, i.e. if I put only the first statement in the notebook, nothing gets "returned". You need something like SELECT * FROM budget; to get a result back. Commented Mar 18 at 20:28

1 Answer

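OK, I found a solution to my problem. The key is to connect to Databricks via ODBC under ONE session and send each statement as its own command: ExecuteNonQuery for the DROP, ExecuteNonQuery for the CREATE, and then the final SELECT. Judging by the parse error, a single command can carry only one statement, which is why the combined script failed at the second SELECT. The temp object exists only in the session that created it, so all three commands must run on the same open connection: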

using (OdbcConnection cn = new OdbcConnection(databricksConnStr))
{
    cn.ConnectionTimeout = 15 * 60; //15 minutes

    Debug.Print("Attempting to connect to Databricks...");
    cn.Open();
    Debug.Print("Connection to Databricks was successful.");

    // DATABRICKS STEP 1: drop the temp table if it exists
    OdbcCommand cmdDrop = new OdbcCommand("DROP TABLE IF EXISTS budget;", cn);
    cmdDrop.CommandTimeout = 3 * 60; //3 minutes
    int dropNum = cmdDrop.ExecuteNonQuery();
    if (dropNum != -1) throw new Exception("Databricks temp table was NOT dropped.");

    // DATABRICKS STEP 2: now create the temp table from the data we got from secondary source.
    // This is the first part in my example from CREATE to the first semi-colon.
    OdbcCommand cmdCreate = new OdbcCommand(budgetTableSql, cn);
    cmdCreate.CommandTimeout = 3 * 60; //3 minutes
    int createNum = cmdCreate.ExecuteNonQuery();
    if (createNum != -1) throw new Exception("Databricks temp table was NOT created.");

    // DATABRICKS STEP 3: at this point we have the temp table in the same session. So we can then use
    // our main query which uses the budget temp table to compute the final data. This is the second part
    // in my example from SELECT SUM to the second semi-colon.
    DataTable dtFinal = new DataTable();
    OdbcCommand cmdFinal = new OdbcCommand(sql, cn);
    cmdFinal.CommandTimeout = 60 * 10;
    OdbcDataAdapter daFinal = new OdbcDataAdapter(cmdFinal);
    daFinal.Fill(dtFinal);
    Debug.Print("row count=" + dtFinal.Rows.Count);
}
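
If the script is kept as one string, as in the notebook, the same idea can be generalized. The following is only a minimal sketch, not part of the ODBC driver or of Databricks: SqlScript, SplitStatements and RunScript are hypothetical helper names. It assumes statements are separated by semicolons, and the splitter only skips semicolons inside single- or double-quoted literals (it does not handle SQL comments, backticks, or escaped quotes).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Odbc;
using System.Text;

static class SqlScript
{
    // Hypothetical helper: splits a script on semicolons that sit outside
    // single- or double-quoted literals. Comments, backticks and escaped
    // quotes are NOT handled.
    public static List<string> SplitStatements(string script)
    {
        var statements = new List<string>();
        var current = new StringBuilder();
        char quote = '\0'; // '\0' means "not inside a quoted literal"

        foreach (char c in script)
        {
            if (quote != '\0')
            {
                if (c == quote) quote = '\0'; // closing quote
            }
            else if (c == '\'' || c == '"')
            {
                quote = c; // opening quote
            }
            else if (c == ';')
            {
                AddIfNotBlank(statements, current);
                continue; // the separator itself is dropped
            }
            current.Append(c);
        }
        AddIfNotBlank(statements, current);
        return statements;
    }

    private static void AddIfNotBlank(List<string> statements, StringBuilder current)
    {
        string text = current.ToString().Trim();
        if (text.Length > 0) statements.Add(text);
        current.Clear();
    }

    // Hypothetical helper: runs every statement on the SAME open connection,
    // so session-scoped objects such as temp views stay visible. The last
    // statement is assumed to be the one that returns rows.
    public static DataTable RunScript(OdbcConnection cn, string script)
    {
        List<string> statements = SplitStatements(script);
        if (statements.Count == 0)
            throw new ArgumentException("Script contains no statements.", nameof(script));

        for (int i = 0; i < statements.Count - 1; i++)
        {
            using (var cmd = new OdbcCommand(statements[i], cn))
            {
                cmd.ExecuteNonQuery();
            }
        }

        var result = new DataTable();
        using (var cmd = new OdbcCommand(statements[statements.Count - 1], cn))
        using (var adapter = new OdbcDataAdapter(cmd))
        {
            adapter.Fill(result);
        }
        return result;
    }
}
```

With an open connection cn, SqlScript.RunScript(cn, notebookSql) would then take the place of the three hand-written commands, where notebookSql is a placeholder for the full script text. Command timeouts are left at their defaults here and would need to be set as in the code above.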