I built Spark 1.4 from the GitHub development master, and the build went through fine. But when I run bin/pyspark I get the Python 2.7.9 version. How can I change this?
5 Answers
Just set the environment variable:
export PYSPARK_PYTHON=python3
In case you want this to be a permanent change, add this line to the pyspark script.
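The same variable can also be set from Python itself, at the top of a driver script, before any SparkContext is created. A minimal sketch, assuming you want the workers pinned to the same interpreter that runs the driver:

```python
import os
import sys

# Point the workers at the same interpreter that runs this driver script.
# This only takes effect if set before a SparkContext is created.
os.environ["PYSPARK_PYTHON"] = sys.executable

print(os.environ["PYSPARK_PYTHON"])
```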
5 Comments
export PYSPARK_PYTHON=python3.5 for Python 3.5.
Add it to $SPARK_HOME/conf/spark-env.sh so spark-submit uses the same interpreter as well.

PYSPARK_PYTHON=python3 ./bin/pyspark

If you want to run it in an IPython Notebook, write:
PYSPARK_PYTHON=python3 PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark

If python3 is not accessible, you need to pass the path to it instead.
Bear in mind that the current documentation (as of 1.4.1) has outdated instructions. Fortunately, it has been patched.
4 Comments
Have a look into the file. The shebang line probably points to the 'env' binary, which searches the PATH for the first compatible executable.

You can change python to python3 in the shebang, change the shebang to hardcode the path to the python3 binary directly, or execute the script with python3 and omit the shebang line.
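The PATH lookup that `/usr/bin/env python3` performs can be reproduced from Python with `shutil.which`, which is a quick way to check which executable the shebang would actually pick up. A sketch for inspection only, not part of the pyspark script:

```python
import shutil

# Mimic what `/usr/bin/env python3` does: walk the directories in PATH
# and return the first matching executable, or None if there is none.
resolved = shutil.which("python3")
print("env would run:", resolved)
```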
1 Comment
Set the PYSPARK_PYTHON environment variable. For Jupyter Notebook, edit the spark-env.sh file from the command line:
$ vi $SPARK_HOME/conf/spark-env.sh

Go to the bottom of the file and copy-paste these lines:
export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"

Then simply run the following command to start pyspark in the notebook:
$ pyspark
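Once the notebook is up, a quick sanity check in the first cell shows which interpreter the driver is using and whether the worker setting from spark-env.sh was picked up. A sketch; it only reads the environment and does not touch Spark:

```python
import os
import sys

# Driver interpreter version (the notebook kernel itself).
print("driver:", sys.version.split()[0])

# Worker interpreter, as exported in spark-env.sh (if it was picked up).
print("workers:", os.environ.get("PYSPARK_PYTHON", "<not set>"))
```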
PYSPARK_DRIVER_PYTHON=ipython3 PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark

in which case it runs the IPython 3 notebook.