Which of the following statements about executors is correct, assuming that one can consider each of the JVMs working as executors as a pool of task execution slots?
A. Slot is another name for executor.
B. There must be fewer executors than tasks.
C. An executor runs on a single core.
D. There must be more slots than tasks.
E. Tasks run in parallel via slots.
Answer(s): E
Explanation:
Tasks run in parallel via slots.
Correct. Given the assumption, an executor has one or more "slots", where the number of slots is spark.executor.cores / spark.task.cpus. With the executor's resources divided into slots, each task occupies one slot, so multiple tasks can execute in parallel.
Slot is another name for executor.
No, a slot is part of an executor.
An executor runs on a single core.
No, an executor can occupy multiple cores. This is set by the spark.executor.cores option.
There must be more slots than tasks.
No. Slots just process tasks. One could imagine a scenario with a single slot for multiple tasks, processing one task at a time. Granted, this is the opposite of what Spark should be used for, which is distributed data processing over multiple cores and machines, performing many tasks in parallel.
There must be fewer executors than tasks.
No, there is no such requirement.
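To make the slot arithmetic concrete, here is a minimal sketch in plain Python (it does not call Spark). The function names slots_per_executor and rounds_needed are hypothetical helpers invented for this illustration, and the idea that tasks run in evenly filled "rounds" is a simplifying assumption; real scheduling also depends on data locality and task duration. Integer division is assumed for cases where the core count is not a multiple of spark.task.cpus.

```python
import math


def slots_per_executor(executor_cores: int, task_cpus: int = 1) -> int:
    """Slots in one executor: spark.executor.cores / spark.task.cpus.

    Hypothetical helper; integer division is an assumption of this sketch.
    """
    if executor_cores < 1 or task_cpus < 1:
        raise ValueError("core counts must be positive")
    return executor_cores // task_cpus


def rounds_needed(num_tasks: int, num_executors: int,
                  executor_cores: int, task_cpus: int = 1) -> int:
    """Rounds of parallel execution if every slot takes one task per round.

    Simplified model: ignores locality, stragglers, and uneven task lengths.
    """
    total_slots = num_executors * slots_per_executor(executor_cores, task_cpus)
    if total_slots == 0:
        raise ValueError("no slots available: task_cpus exceeds executor_cores")
    return math.ceil(num_tasks / total_slots)


# An executor with 4 cores and 1 CPU per task offers 4 slots.
print(slots_per_executor(4, 1))
# 20 tasks on 2 such executors (8 slots) need 3 rounds in this model.
print(rounds_needed(20, 2, 4, 1))
# A single slot still processes 5 tasks, just one at a time.
print(rounds_needed(5, 1, 1, 1))
```

The last call mirrors the single-slot scenario from the explanation: more tasks than slots is valid, it only reduces parallelism.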
More info: Spark Architecture | Distributed Systems Architecture (https://bit.ly/3x4MZZt)