
I built Spark 1.4 from the GitHub development master, and the build went through fine. But when I run bin/pyspark I get the Python 2.7.9 version. How can I change this?

    For anyone looking for how to do this: PYSPARK_DRIVER_PYTHON=ipython3 PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark, in which case it runs IPython 3 notebook. Commented May 16, 2015 at 19:49

5 Answers


Just set the environment variable:

export PYSPARK_PYTHON=python3

If you want this to be a permanent change, add this line to the pyspark script.
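As a quick sanity check (a sketch, not part of the answer's original commands), you can confirm the interpreter the variable names is actually Python 3 before launching pyspark:

```shell
# Sketch: export the variable, then ask that interpreter for its major version
export PYSPARK_PYTHON=python3
"$PYSPARK_PYTHON" -c 'import sys; print(sys.version_info.major)'   # prints: 3
```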


5 Comments

The environment variables can be edited under /etc/profile. Do not forget to execute "source /etc/profile" after saving the profile, so the changes take effect immediately.
Obviously, use export PYSPARK_PYTHON=python3.5 for Python 3.5
It's better to add this to $SPARK_HOME/conf/spark-env.sh so spark-submit uses the same interpreter as well.
@flow2k that's a better idea. Thanks
@flow2k - thanks a lot for this suggestion, it worked!
PYSPARK_PYTHON=python3 ./bin/pyspark 

If you want to run it in the IPython Notebook, write:

PYSPARK_PYTHON=python3 PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark 

If python3 is not accessible, you need to pass the path to it instead.

Bear in mind that the current documentation (as of 1.4.1) has outdated instructions. Fortunately, it has since been patched.
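Note that a VAR=value prefix like the ones above applies only to that single invocation; a minimal sketch of the difference:

```shell
# The prefix form exports the variable only into that one child process
PYSPARK_PYTHON=python3 sh -c 'echo "$PYSPARK_PYTHON"'   # prints: python3

# In the current shell the variable is untouched (empty unless exported elsewhere)
echo "${PYSPARK_PYTHON:-unset}"
```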

4 Comments

I think your command for the IPython Notebook is not correct. Should be like this : PYSPARK_PYTHON=python3 PYSPARK_DRIVER_PYTHON=ipython3 PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark
@ChrisNielsen In the terminal.
@ChrisNielsen On Linux or OS X it is a terminal/console. I have no idea how it works under Windows (when in Windows, I used Spark only in a Docker container).
@SpiderRico These don't seem to work on my Mac. For Jupyter Notebook to work for Spark, use the following. PYSPARK_PYTHON=python3 PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark

1. Edit the profile: vim ~/.profile

2. Add this line to the file: export PYSPARK_PYTHON=python3

3. Execute the command: source ~/.profile

4. ./bin/pyspark
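The steps above can be sketched non-interactively; a scratch file stands in for ~/.profile here (an assumption made so the example is safe to run as-is):

```shell
# Steps 1-3 with a temp file in place of ~/.profile
profile="$(mktemp)"
echo 'export PYSPARK_PYTHON=python3' >> "$profile"   # step 2: add the line
. "$profile"                                         # step 3: source it
echo "$PYSPARK_PYTHON"                               # prints: python3
# Step 4: ./bin/pyspark would now start with Python 3
```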



Have a look into the pyspark file. The shebang line probably points to the 'env' binary, which searches the PATH for the first compatible executable.

You can change python to python3 in the shebang, replace env with a hardcoded path to the python3 binary, or execute the script with python3 directly so the shebang line is ignored.
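A minimal sketch of how an env shebang resolves the interpreter (the /tmp path and demo script are illustrative assumptions, not part of Spark):

```shell
# Write a tiny script whose shebang asks env to find python3 on the PATH
cat > /tmp/shebang_demo.py <<'EOF'
#!/usr/bin/env python3
import sys
print(sys.version_info.major)
EOF
chmod +x /tmp/shebang_demo.py

/tmp/shebang_demo.py          # env resolves python3 from PATH; prints: 3
python3 /tmp/shebang_demo.py  # invoking the interpreter directly ignores the shebang; prints: 3
```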

1 Comment

Yeah, looking into the file helped. Needed to set the PYSPARK_PYTHON environment variable.

For Jupyter Notebook, edit the spark-env.sh file from the command line as shown below

$ vi $SPARK_HOME/conf/spark-env.sh 

Go to the bottom of the file and paste in these lines

export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"

Then, simply run the following command to start pyspark in a notebook

$ pyspark 
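The edit above can also be scripted; here is a sketch that appends the three lines non-interactively (a temp file stands in for $SPARK_HOME/conf/spark-env.sh so it runs anywhere):

```shell
# Append the settings; conf is a stand-in for spark-env.sh
conf="$(mktemp)"
cat >> "$conf" <<'EOF'
export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
EOF
grep -c '^export PYSPARK' "$conf"   # prints: 3
```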

