4

A standard DPU in AWS Glue comes with 4 vCPU and 2 executors. I am confused about the maximum number of concurrent tasks that can be run in parallel with this configuration. Is it 4 or 8 on a single DPU with 4vcpu and 2 executors?

2
  • By default Spark allocates 1 core per task. Commented Jun 27, 2020 at 19:45
  • That I understand. But if you have say 8 tasks, will all 8 run in parralel as there are 2 executors or will only 4 run in parralel Commented Jun 27, 2020 at 19:57

1 Answer 1

7

I had a similar discussion with the AWS Glue support team about this, I'll share with you what they told me about Glue Configuration. Take in example the Standard and the G1.X configuration.

Standard DPU Configuration:

  • 1 DPU reserved for MasterNode
  • 1 executor reserved for Driver/ApplicationMaster
  • Each DPU is configured with 2 executors
  • Each executor is configured with 5.5 GB memory
  • Each executor is configured with 4 cores

G.1X WorkerType Configuration:

  • 1 DPU added for MasterNode
  • 1 DPU reserved for Driver/ApplicationMaster
  • Each worker is configured with 1 executor
  • Each executor is configured with 10 GB memory
  • Each executor is configured with 8 cores

If we have for example a Job with Standard Configuration with 21 DPU means that we have:

  • 1 DPU reserved for Master
  • 20 DPU x 2 = 40 executors
  • 40 executors - 1 Driver/AM = 39 executors

Which we then end up with a total amount of 156 cores. Meaning, your job has 156 slots for execution. If for example you read files from S3 that means that you will be able to accept 156 input files in parallel.

Hope it helps.

Sign up to request clarification or add additional context in comments.

2 Comments

I've enabled Job monitoring and When I check Job Execution: Active Executors, Completed Stages & Maximum Needed Executors Graph, there is no data displayed in Number of Active Executors and Number of Maximum Needed Executors.and at bottom it shows Maximum allocated Executors(1) can you help me with that?
@AchyutVyas if you are running on glue 2.0, it does not support ExecutionAllocationManager metrics: docs.aws.amazon.com/glue/latest/dg/… (Features Not Supported)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.