
I have two questions on Spark Streaming:

  1. I have a Spark Streaming application running and collecting data in 20-second batch intervals. Out of 4000 batches, 18 failed with the exception:

Could not compute split, block input-0-1464774108087 not found

I assume the data size was bigger than Spark's available memory at that point; the app's StorageLevel is also MEMORY_ONLY.

Please advise how to fix this.

  2. Also, in the command I use below, I set executor memory to 20G (total RAM on the data nodes is 140G). Does that mean all of that memory is reserved in full for this app, and what happens if I have multiple Spark Streaming applications?

Wouldn't I run out of memory after a few applications? Do I need that much memory at all?

/usr/iop/4.1.0.0/spark/bin/spark-submit --master yarn --deploy-mode client \
  --jars /home/blah.jar --num-executors 8 --executor-cores 5 \
  --executor-memory 20G --driver-memory 12G --driver-cores 8 \
  --class com.ccc.nifi.MyProcessor Nifi-Spark-Streaming-20160524.jar
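For the second question, the reservation arithmetic can be sketched as follows. This is a back-of-envelope estimate assuming Spark 1.x's default YARN overhead of max(384 MB, 10% of executor memory); YARN holds these containers for the lifetime of the application, so a second application with the same settings would double the reservation:

```python
# Back-of-envelope memory math for the spark-submit command above.
# Assumption (Spark 1.x on YARN): each executor container is sized as
# executor memory plus an overhead of max(384 MB, 10% of the heap),
# unless spark.yarn.executor.memoryOverhead overrides it.

def yarn_container_gb(heap_gb, overhead_fraction=0.10, min_overhead_gb=0.384):
    """Approximate size of one YARN container for a given heap request."""
    return heap_gb + max(min_overhead_gb, heap_gb * overhead_fraction)

num_executors = 8
executor_gb = 20

per_executor = yarn_container_gb(executor_gb)
executors_total = num_executors * per_executor

# In client mode the 12G driver runs on the submitting host, not in a
# YARN container, so it is not part of the cluster-side reservation.
print(f"per executor: ~{per_executor:.1f} GB, "
      f"cluster reservation: ~{executors_total:.0f} GB")
```

So this one application asks YARN for roughly 176 GB of containers across the cluster, held for as long as the streaming job runs, which is why running several such applications side by side exhausts memory quickly.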

1 Answer


It seems your executor memory might be getting full. Try these optimization techniques:

  1. Use StorageLevel MEMORY_AND_DISK instead of MEMORY_ONLY, so received blocks spill to disk rather than being evicted.
  2. Use Kryo serialization, which is faster and more compact than the default Java serialization, if you go for caching with memory and serialization.
  3. Check for GC pauses; you can see them per task in the Spark UI.
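The first two suggestions can be sketched against the original spark-submit command. This is a hedged sketch: `spark.serializer` is the standard Spark setting for Kryo, while the storage level has to be chosen in the application code itself (the `stream` variable below is a placeholder):

```shell
# Sketch only: enable Kryo via configuration; the storage level is set in code.
/usr/iop/4.1.0.0/spark/bin/spark-submit --master yarn --deploy-mode client \
  --jars /home/blah.jar --num-executors 8 --executor-cores 5 \
  --executor-memory 20G --driver-memory 12G --driver-cores 8 \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --class com.ccc.nifi.MyProcessor Nifi-Spark-Streaming-20160524.jar

# In the application code (Scala), persist with a disk-spilling level, e.g.:
#   stream.persist(StorageLevel.MEMORY_AND_DISK_SER)
```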