I set up a Spark standalone cluster and wanted to find the fastest way to run my app. My machine has 12 GB of RAM. Here are the results of some configurations I tried.
Test A (took 15 min): 1 worker node, spark.executor.memory = 8g, spark.driver.memory = 6g
Test B (took 8 min): 2 worker nodes, spark.executor.memory = 4g, spark.driver.memory = 6g
Test C (took 6 min): 2 worker nodes, spark.executor.memory = 6g, spark.driver.memory = 6g
Test D (took 6 min): 3 worker nodes, spark.executor.memory = 4g, spark.driver.memory = 6g
Test E (took 6 min): 3 worker nodes, spark.executor.memory = 6g, spark.driver.memory = 6g

- Comparing Test A and Test B: Test B just added one more worker while the total executor memory stayed the same (4g × 2 = 8g), yet the app ran almost twice as fast. Why did that happen?
- In Tests C, D, and E I tried to allocate more memory than the machine actually has, but they still worked and were even faster. Is the configured memory size just an upper limit rather than a reservation?
- Performance does not keep improving just by adding more worker nodes. How should I determine the optimal number of workers and the right executor memory size?
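For reference, this is roughly how I pass the settings when submitting; a minimal sketch where the master URL, class name, and jar path are placeholders for my actual app:

```shell
# Hypothetical submit command for one of the tests above
# (e.g. Test B: 2 workers, 4g per executor, 6g driver).
# Master URL, class, and jar path are placeholders.
spark-submit \
  --master spark://master-host:7077 \
  --driver-memory 6g \
  --executor-memory 4g \
  --class com.example.MyApp \
  my-app.jar
```

The same values can equivalently be set via `spark.driver.memory` and `spark.executor.memory` in `spark-defaults.conf` or on `SparkConf`.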