Revisions to Spark can't find Python module

Post Undeleted by pvy4917

occurred Nov 5, 2018 at 21:25

Post Deleted by pvy4917

occurred Nov 5, 2018 at 21:25

Mod Moved Comments To Chat

occurred Oct 17, 2018 at 23:25

Added new code


edited Oct 17, 2018 at 18:08

pvy4917


from pyspark import SparkContext, SparkConf


def splitComma(line):
    # Utils is bound at module level by the import in the main block below.
    splits = Utils.COMMA_DELIMITER.split(line)
    return "{}, {}".format(splits[1], splits[2])


if __name__ == "__main__":
    conf = SparkConf().setAppName("airports").setMaster("local[2]")
    sc = SparkContext(conf = conf)

    # Ship the zipped commons package with the job, then import from it.
    sc.addPyFile('.../pathto commons.zip')
    from commons import Utils

    airports = sc.textFile("in/airports.text")
    airportsInUSA = airports\
        .filter(lambda line : Utils.COMMA_DELIMITER.split(line)[3] == "\"United States\"")

    airportsNameAndCityNames = airportsInUSA.map(splitComma)
    airportsNameAndCityNames.saveAsTextFile("out/airports_in_usa.text")

Yes, it only accepts the modules that come with Spark. You can zip the required files (Utils, numpy, etc.) and pass the archive to spark-submit with the --py-files option.

spark-submit --py-files rdd/file.zip rdd/AirportsInUsaSolution.py
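
If it helps, here is a rough sketch of how such a zip could be put together and checked locally before handing it to spark-submit. The contents of commons/Utils.py are an assumption on my part (the real module is not shown here), and build_commons_zip is just a hypothetical helper name. The point it illustrates is that the commons/ folder has to sit at the root of the archive, so that the import resolves once the zip is on the Python path.

```python
import os
import sys
import tempfile
import zipfile

# Assumed contents of commons/Utils.py; the real module is not shown in the answer.
UTILS_SOURCE = """import re

# Split on commas that are not inside double quotes.
COMMA_DELIMITER = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')
"""


def build_commons_zip(target_dir):
    """Write a zip whose root contains the commons/ package and return its path."""
    zip_path = os.path.join(target_dir, "commons.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        # Paths inside the archive are relative to its root, so that
        # `from commons import Utils` resolves once the zip is on sys.path.
        archive.writestr("commons/__init__.py", "")
        archive.writestr("commons/Utils.py", UTILS_SOURCE)
    return zip_path


zip_path = build_commons_zip(tempfile.mkdtemp())

# Local check only: put the archive on sys.path and import from it,
# which is roughly what happens to --py-files archives on the Python path.
sys.path.insert(0, zip_path)
from commons import Utils

fields = Utils.COMMA_DELIMITER.split('1,"Goroka","Goroka","Papua New Guinea"')
print(fields[3])
```

If the import works in this local check, the same archive should be usable with --py-files or sc.addPyFile. Keep in mind this only covers pure-Python files.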



answered Oct 16, 2018 at 19:32

pvy4917


Yes, it only accepts the modules that come with Spark. You can zip the required files (Utils, numpy, etc.) and pass the archive to spark-submit with the --py-files option.

spark-submit --py-files rdd/file.zip rdd/AirportsInUsaSolution.py
