
I would like to understand why, when working with Apache Spark, we don't explicitly close JDBC connections.

See: https://learn.microsoft.com/en-us/azure/sql-database/sql-database-spark-connector or https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html

Is this due to the fact that when we do

val collection = sqlContext.read.sqlDB(config)

or

jdbcDF.write
  .format("jdbc")
  (...)
  .save()

we don't really open the connection but merely specify a DAG stage? And then, under the hood, Spark establishes the connection and closes it?
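
To make concrete what I mean, here is a minimal sketch of a plain Spark JDBC read where no java.sql.Connection ever appears in my code (the URL, table name and credentials are just placeholders):

import org.apache.spark.sql.SparkSession

object JdbcLazinessSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("jdbc-laziness-sketch")
      .master("local[*]")
      .getOrCreate()

    // Building the DataFrame only describes the source; no rows are fetched yet
    // (Spark may briefly connect here just to resolve the schema).
    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://dbhost:5432/mydb") // placeholder URL
      .option("dbtable", "public.orders")                  // placeholder table
      .option("user", "spark")
      .option("password", "secret")
      .load()

    // Only when an action runs do the executor tasks open connections,
    // read their partitions, and close the connections again.
    println(df.count())

    spark.stop()
  }
}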

1 Answer


That's correct: Spark takes care of opening and closing JDBC connections to relational data sources during the plan execution phase. This allows it to maintain the level of abstraction required to support a multitude of data source types. You can check the source code of JdbcRelationProvider (for reads) or JdbcUtils (for saves) to review that logic.
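
If you want to see the shape of that logic in user-level code, the sketch below mimics the per-partition pattern: each task opens its own connection, writes its rows as a batch, and closes the connection in a finally block. This is not the actual Spark source, and the table, columns and connection details are hypothetical.

import java.sql.DriverManager
import org.apache.spark.sql.DataFrame

object ManualJdbcWriteSketch {
  // Rough shape of a manual per-partition JDBC write: open, write, close.
  def write(df: DataFrame, url: String, user: String, password: String): Unit = {
    df.rdd.foreachPartition { rows =>
      val conn = DriverManager.getConnection(url, user, password)
      try {
        conn.setAutoCommit(false)
        val stmt = conn.prepareStatement("INSERT INTO orders (id, amount) VALUES (?, ?)")
        try {
          rows.foreach { row =>
            stmt.setLong(1, row.getLong(0))
            stmt.setDouble(2, row.getDouble(1))
            stmt.addBatch()
          }
          stmt.executeBatch()
          conn.commit()
        } finally {
          stmt.close()
        }
      } finally {
        conn.close() // the connection's lifetime is the task, not your driver program
      }
    }
  }
}

That open/write/close-in-finally lifecycle is what jdbcDF.write.format("jdbc").save() hides from you, which is why there is nothing for your own code to close.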
