I'm trying to run the following Python script locally, using the spark-submit command:
    import sys
    sys.path.insert(0, '.')

    from pyspark import SparkContext, SparkConf
    from commons.Utils import Utils

    def splitComma(line):
        splits = Utils.COMMA_DELIMITER.split(line)
        return "{}, {}".format(splits[1], splits[2])

    if __name__ == "__main__":
        conf = SparkConf().setAppName("airports").setMaster("local[2]")
        sc = SparkContext(conf=conf)

        airports = sc.textFile("in/airports.text")
        airportsInUSA = airports \
            .filter(lambda line: Utils.COMMA_DELIMITER.split(line)[3] == "\"United States\"")

        airportsNameAndCityNames = airportsInUSA.map(splitComma)
        airportsNameAndCityNames.saveAsTextFile("out/airports_in_usa.text")

The command used (while inside the project directory):
    spark-submit rdd/AirportsInUsaSolution.py

I keep getting this error:
    Traceback (most recent call last):
      File "/home/gustavo/Documentos/TCC/python_spark_yt/python-spark-tutorial/rdd/AirportsInUsaSolution.py", line 4, in <module>
        from commons.Utils import Utils
    ImportError: No module named commons.Utils
Even though commons/Utils.py exists and defines a Utils class.
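For reference, commons/Utils.py looks roughly like this (reproduced from memory from the tutorial I'm following, so treat the exact regex as approximate; the Utils class and its COMMA_DELIMITER attribute definitely exist):

    import re

    class Utils():
        # Matches commas that sit outside double quotes, so quoted
        # fields such as "United States" are not split apart.
        COMMA_DELIMITER = re.compile(''',(?=(?:[^"]*"[^"]*")*[^"]*$)''')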
It seems the only imports spark-submit accepts are the ones from Spark itself, because the same error appears whenever I try to import any other class or file from my project.
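In case the layout matters, this is how the relevant files are arranged inside the project directory, which is the same directory I run spark-submit from (the __init__.py is my assumption of what the package needs, but I believe it is present):

    python-spark-tutorial/
    ├── commons/
    │   ├── __init__.py
    │   └── Utils.py
    ├── in/
    │   └── airports.text
    └── rdd/
        └── AirportsInUsaSolution.py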