From this StackOverflow thread, I know how to obtain and use the log4j logger in pyspark like so:
    from pyspark import SparkContext

    sc = SparkContext()
    log4jLogger = sc._jvm.org.apache.log4j
    LOGGER = log4jLogger.LogManager.getLogger('MYLOGGER')
    LOGGER.info("pyspark script logger initialized")

This works fine with the spark-submit script.
My question is: how do I modify the log4j.properties file to set the log level for this particular logger, or how can I configure it dynamically?
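For the dynamic route, what I have in mind is roughly the sketch below. It assumes the log4j 1.x API (`org.apache.log4j`) is what the driver JVM exposes, as in the snippet above. `set_logger_level` is a hypothetical helper name of my own, not part of pyspark, and I have not confirmed this is the recommended approach.

```python
def set_logger_level(jvm, logger_name, level_name):
    """Set the level of a named log4j logger through the py4j JVM view.

    Assumption: `jvm` is `sc._jvm` and the log4j 1.x classes
    (org.apache.log4j.LogManager, org.apache.log4j.Level) are available.
    """
    log4j = jvm.org.apache.log4j
    logger = log4j.LogManager.getLogger(logger_name)
    # Level.toLevel parses a level name such as "DEBUG", "INFO", "WARN", "ERROR"
    level = log4j.Level.toLevel(level_name)
    logger.setLevel(level)
    return logger
```

With the `sc` from the snippet above, this would be called as `set_logger_level(sc._jvm, 'MYLOGGER', 'WARN')`. For the static route, I believe the log4j 1.x properties syntax would be `log4j.logger.MYLOGGER=WARN`, but I am not sure Spark picks that up for a logger obtained this way.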