15

I am trying to find out which Python version I am using in Databricks.

To find out, I ran:

import sys
print(sys.version)

and got 3.7.3 as the output.

However, when I went to Cluster --> SparkUI --> Environment, I saw that the cluster Python version is listed as 2.

Which version does this refer to?

When I tried running

%sh python --version

I still get Python 3.7.3.

Can each worker / driver node have a different Python version?

Note: I am using a setup with 1 worker node and 1 driver node (2 nodes in total, with the same spec), and the Databricks Runtime Version is 6.5 ML.

3 Answers

18

This works in any notebook, whether Google Colab or MS Azure Databricks:

!python --version

2 Comments

Just ran it in a Databricks notebook and it worked fine.
I had to add %sh to the top of the cell.
8
Answer recommended by Microsoft Azure Collective

Update: This issue has been fixed.

For new clusters: a newly created cluster will have the Python environment variable set to Python 3.

For existing clusters: add the following line in the Environment Variables tab under Cluster Configuration > Advanced, which updates the environment variable:

PYSPARK_PYTHON=/databricks/python3/bin/python3


Thanks for bringing this to our attention. This is a product bug; I am currently working with the product team to fix the issue as soon as possible.

The default Python version for clusters created using the UI is Python 3.

To reproduce it, I created a cluster with Databricks Runtime Version 6.5 ML and observed the same behaviour.

Cluster --> SparkUI --> Environment shows an incorrect version.
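
If you would rather not rely on the Spark UI, one way to check which Python the executors actually run is to ask them directly. The following is only a rough sketch: `python_version_here` and `worker_python_versions` are hypothetical helper names, and `sc` is assumed to be the SparkContext that a Databricks notebook predefines. Tasks are not guaranteed to land on every worker, so treat the result as a sample rather than a full inventory.

```python
import platform


def python_version_here(_=None):
    """Return the version of whichever Python process executes this call."""
    return platform.python_version()


def worker_python_versions(sc, num_tasks=8):
    """Run a few small tasks on the executors and collect the distinct versions.

    `sc` is assumed to be the SparkContext predefined in the notebook.
    """
    rdd = sc.parallelize(range(num_tasks), num_tasks)
    return sorted(set(rdd.map(python_version_here).collect()))


# In a notebook cell (left commented out because it needs a live cluster):
# print("driver: ", python_version_here())
# print("workers:", worker_python_versions(sc))
```

As far as I know, PySpark itself refuses to run tasks when the driver and worker Python minor versions differ, so on a working cluster both lines should normally report the same version.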

2

I believe you are running a cluster that is using Databricks Runtime 5.5 or below. What you see when you run

import sys
print(sys.version)

is the Python version referred to by the PYSPARK_PYTHON environment variable. The one shown in Cluster --> SparkUI --> Environment is the Python version of the Ubuntu instance, which is Python 2.
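
To see these pieces side by side from inside a notebook cell, something like the sketch below may help. `describe_interpreter` is a hypothetical helper name; it reports the version and path of the Python process running the cell, plus whatever value PYSPARK_PYTHON has in that process (None if it is not set). Note that this inspects the notebook's Python process, which may not have the same environment as a `%sh` shell.

```python
import os
import platform
import sys


def describe_interpreter():
    """Return details about the Python process running this code."""
    return {
        "version": platform.python_version(),
        "executable": sys.executable,
        # None when the variable is not set for this process.
        "PYSPARK_PYTHON": os.environ.get("PYSPARK_PYTHON"),
    }


for key, value in describe_interpreter().items():
    print("{}: {}".format(key, value))
```

Comparing the "executable" path with the PYSPARK_PYTHON value should show whether the notebook's interpreter is the one the variable points to.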

Source

1 Comment

The Databricks Runtime Version is 6.5 ML, and when I run %sh printenv I do not see PYSPARK_PYTHON among the environment variables.
