I am trying to execute a Python script from inside a Scala program, passing an RDD to the script as its input. The Spark cluster initializes successfully, the conversion of the data to an RDD works, and running the Python script on its own (outside the Scala code) works. However, executing the same Python script from Scala fails with:

java.lang.IllegalStateException: Subprocess exited with status 2. Command ran: /{filePath}/{File}.py 

Looking deeper, the output shows "import: command not found" when the Python file is executed. I believe the built-in Scala executor does not recognize the file as a Python script.

Note: My environment variables are set correctly and I can access and execute Python scripts from everywhere else. In place of filePath and File I have the actual path to that file and the file name.

Environment: Spark 2.2.1, Scala 2.11.11, Python 2.7.10

Code:

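import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

val conf = new SparkConf().setAppName("Test").setMaster("local[*]")
val sparkContext = new SparkContext(conf)

// Placeholders for the real path and file name
val distScript = "/{filePath}/{File}.py"
val distScriptName = "{File}.py"

// Ship the script with the job
sparkContext.addFile(distScript)

val ipData = sparkContext.parallelize(List("asd", "xyz", "zxcz", "sdfsfd", "Ssdfd", "Sdfsf"))

// Pipe each element of the RDD through the script
val pipeRDD = ipData.pipe(SparkFiles.get(distScriptName))
pipeRDD.foreach(println)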

Has anyone tried this before and can help me resolve this issue? Is this the best way to integrate Scala and Python scripts? I am open to other verified recommendations and suggestions.

  • Piping a Python command with Spark should work. Just looking at your code, aren't you missing an 's' before the string? val distScript = s"/{filePath}/{File}.py" val distScriptName = s"{File}.py" Commented Feb 16, 2018 at 18:36
  • @geoalgo and some dollar signs. Commented Feb 16, 2018 at 18:40
  • @erip The file name and path are just placeholders for the post. I have the actual names in my code, there are no compilation errors and it is able to find and add the file to the SparkContext. Commented Feb 16, 2018 at 18:43

1 Answer

I found where the issue came from. I was missing the following line in the Python file:

#!/usr/bin/python 

After adding it, the java.lang.IllegalStateException disappeared.
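For context, "import: command not found" suggests that the file was being run as a shell script: with no interpreter line, the system cannot tell that the file is meant for Python, so the shell tries to run Python's import statement as a command. Below is a minimal sketch of a script that can be used with pipe. It is a hypothetical stand-in for {File}.py, since the original script is not shown, and the transform function is a made-up placeholder. It assumes the file is executable and that an interpreter exists at /usr/bin/python; #!/usr/bin/env python is a common alternative if it lives somewhere else.

```python
#!/usr/bin/python
# Hypothetical stand-in for {File}.py; the real script is not shown in the post.
# The line above tells the operating system which interpreter to start when the
# file is executed directly, which is how pipe launches it in the question.
import sys


def transform(line):
    # Placeholder logic (an assumption): upper-case each record.
    return line.strip().upper()


def main(stdin=sys.stdin, stdout=sys.stdout):
    # pipe writes the elements of a partition to the script's stdin, one per
    # line, and each line written to stdout becomes an element of the
    # resulting RDD.
    for line in stdin:
        stdout.write(transform(line) + "\n")


if __name__ == "__main__":
    main()
```

The script can be checked outside Spark by feeding it a few lines on stdin from a terminal before shipping it with addFile.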
