I am trying to create a PySpark DataFrame using the following code:
```python
#!/usr/bin/env python
# coding: utf-8

import pyspark
from pyspark.sql.session import SparkSession
import pyspark.sql.functions as f
from pyspark.sql.functions import coalesce

spark = SparkSession.builder.appName("Test").enableHiveSupport().getOrCreate()
# spark.sql("use bocconi")
tableName = "dynamic_pricing.final"
inputDF = spark.sql("""SELECT * FROM dynamic_pricing.final WHERE year = '2019' AND mercati_id = '6'""")
```

I get the following error:
```
Py4JJavaError: An error occurred while calling o48.sql.
: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized
results of 9730 tasks (1024.1 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
```

I have already gone through these links: link1 and link2, but the problem is still not resolved. Any ideas about how to solve this? I also tried this:
```python
from pyspark import SparkConf, SparkContext

# Create new config
conf = SparkConf().set("spark.driver.maxResultSize", 0)

# Create new context
sc = SparkContext(conf=conf)
```