Amazon AWS Certified Data Engineer - Associate DEA-C01 Exam (page: 6)
Updated on: 15-Feb-2026

A company is planning to use a provisioned Amazon EMR cluster that runs Apache Spark jobs to perform big data analysis. The company requires high reliability. A big data team must follow best practices for running cost-optimized and long-running workloads on Amazon EMR. The team must find a solution that will maintain the company's current level of performance.
Which combination of resources will meet these requirements MOST cost-effectively? (Choose two.)

  A. Use Hadoop Distributed File System (HDFS) as a persistent data store.
  B. Use Amazon S3 as a persistent data store.
  C. Use x86-based instances for core nodes and task nodes.
  D. Use Graviton instances for core nodes and task nodes.
  E. Use Spot Instances for all primary nodes.

Answer(s): B,D
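
The cost savings here come from decoupling storage from compute (Amazon S3 instead of cluster-local HDFS) and from Graviton-based instance types, which offer better price-performance for Spark workloads. Below is a minimal boto3 sketch of such a cluster; the bucket names, roles, release label, instance types, and counts are placeholder assumptions, not values from the question.

    import boto3

    emr = boto3.client("emr", region_name="us-east-1")

    # Launch a long-running EMR cluster on Graviton (m7g) instances.
    # Persistent data and logs live in S3, not in cluster-local HDFS.
    response = emr.run_job_flow(
        Name="spark-analytics",
        ReleaseLabel="emr-7.1.0",
        Applications=[{"Name": "Spark"}],
        LogUri="s3://example-emr-logs/",          # placeholder log bucket
        ServiceRole="EMR_DefaultRole",
        JobFlowRole="EMR_EC2_DefaultRole",
        Instances={
            "KeepJobFlowAliveWhenNoSteps": True,  # long-running cluster
            "InstanceGroups": [
                # On-Demand primary node for reliability (not Spot).
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": "m7g.xlarge", "InstanceCount": 1,
                 "Market": "ON_DEMAND"},
                # Graviton core nodes; with data in S3 they can be sized
                # for compute rather than for HDFS storage capacity.
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m7g.2xlarge", "InstanceCount": 2,
                 "Market": "ON_DEMAND"},
            ],
        },
    )
    print(response["JobFlowId"])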



A company wants to implement real-time analytics capabilities. The company wants to use Amazon Kinesis Data Streams and Amazon Redshift to ingest and process streaming data at the rate of several gigabytes per second. The company wants to derive near real-time insights by using existing business intelligence (BI) and analytics tools.
Which solution will meet these requirements with the LEAST operational overhead?

  A. Use Kinesis Data Streams to stage data in Amazon S3. Use the COPY command to load data from Amazon S3 directly into Amazon Redshift to make the data immediately available for real-time analysis.
  B. Access the data from Kinesis Data Streams by using SQL queries. Create materialized views directly on top of the stream. Refresh the materialized views regularly to query the most recent stream data.
  C. Create an external schema in Amazon Redshift to map the data from Kinesis Data Streams to an Amazon Redshift object. Create a materialized view to read data from the stream. Set the materialized view to auto refresh.
  D. Connect Kinesis Data Streams to Amazon Kinesis Data Firehose. Use Kinesis Data Firehose to stage the data in Amazon S3. Use the COPY command to load the data from Amazon S3 to a table in Amazon Redshift.

Answer(s): C
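
Option C uses Amazon Redshift streaming ingestion: an external schema maps the Kinesis data stream into Redshift, and an auto-refreshing materialized view keeps near real-time data queryable by existing BI tools with no intermediate staging or COPY step. A rough sketch using the Redshift Data API follows; the cluster, database, IAM role, and stream names are placeholders.

    import boto3

    rsd = boto3.client("redshift-data", region_name="us-east-1")

    # Map the Kinesis stream into Redshift and build an auto-refreshing
    # materialized view on top of it (Redshift streaming ingestion).
    external_schema = """
    CREATE EXTERNAL SCHEMA kds
    FROM KINESIS
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftStreamingRole';
    """

    materialized_view = """
    CREATE MATERIALIZED VIEW clickstream_mv AUTO REFRESH YES AS
    SELECT approximate_arrival_timestamp,
           partition_key,
           JSON_PARSE(kinesis_data) AS payload
    FROM kds."clickstream";
    """

    rsd.batch_execute_statement(
        ClusterIdentifier="analytics-cluster",   # placeholder cluster
        Database="dev",
        DbUser="admin",
        Sqls=[external_schema, materialized_view],
    )

BI tools can then query clickstream_mv directly; AUTO REFRESH keeps the view close to the head of the stream without any manual staging.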



A company uses an Amazon QuickSight dashboard to monitor usage of one of the company's applications. The company uses AWS Glue jobs to process data for the dashboard. The company stores the data in a single Amazon S3 bucket. The company adds new data every day.
A data engineer discovers that dashboard queries are becoming slower over time. The data engineer determines that the root cause of the slowing queries is long-running AWS Glue jobs.
Which actions should the data engineer take to improve the performance of the AWS Glue jobs? (Choose two.)

  A. Partition the data that is in the S3 bucket. Organize the data by year, month, and day.
  B. Increase the AWS Glue instance size by scaling up the worker type.
  C. Convert the AWS Glue schema to the DynamicFrame schema class.
  D. Adjust AWS Glue job scheduling frequency so the jobs run half as many times each day.
  E. Modify the IAM role that grants access to AWS Glue to grant access to all S3 features.

Answer(s): A,B
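
Option A speeds up both the Glue jobs and the downstream QuickSight queries because partition pruning limits each run to the new day's data instead of rescanning the whole bucket; option B (moving to a larger worker type, for example from G.1X to G.2X) is a job-configuration change rather than a code change. A minimal Glue PySpark sketch of the partitioned write is below; the bucket paths and column names are assumptions.

    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the raw daily data (placeholder path).
    df = glue_context.spark_session.read.json("s3://example-usage-bucket/raw/")

    # Write it back partitioned by year/month/day so later job runs and
    # dashboard queries scan only the partitions they need.
    (df.write
       .mode("append")
       .partitionBy("year", "month", "day")
       .parquet("s3://example-usage-bucket/curated/"))

    job.commit()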



A data engineer needs to use AWS Step Functions to design an orchestration workflow. The workflow must process a large collection of data files in parallel and apply a specific transformation to each file.
Which Step Functions state should the data engineer use to meet these requirements?

  A. Parallel state
  B. Choice state
  C. Map state
  D. Wait state

Answer(s): C
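
A Map state iterates over an array in the state input and runs the same branch for every item, with configurable concurrency, which matches the "apply one transformation to every file" pattern; a Parallel state instead runs a fixed set of different branches. A hedged sketch of such a state machine registered with boto3 follows; the Lambda function, role ARN, and input shape ($.files) are placeholders.

    import json

    import boto3

    # Map state: iterate over $.files and invoke the same transformation
    # task for each file, up to 10 files at a time.
    definition = {
        "StartAt": "TransformEachFile",
        "States": {
            "TransformEachFile": {
                "Type": "Map",
                "ItemsPath": "$.files",
                "MaxConcurrency": 10,
                "Iterator": {
                    "StartAt": "Transform",
                    "States": {
                        "Transform": {
                            "Type": "Task",
                            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:transform-file",
                            "End": True,
                        }
                    },
                },
                "End": True,
            }
        },
    }

    sfn = boto3.client("stepfunctions", region_name="us-east-1")
    sfn.create_state_machine(
        name="file-transformation",
        definition=json.dumps(definition),
        roleArn="arn:aws:iam::123456789012:role/StepFunctionsExecutionRole",
    )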



A company is migrating a legacy application to an Amazon S3-based data lake. A data engineer reviewed the data associated with the legacy application and found that it contained duplicate information.
The data engineer must identify and remove duplicate information from the legacy application data.
Which solution will meet these requirements with the LEAST operational overhead?

  A. Write a custom extract, transform, and load (ETL) job in Python. Use the DataFrame.drop_duplicates() function from the Pandas library to perform data deduplication.
  B. Write an AWS Glue extract, transform, and load (ETL) job. Use the FindMatches machine learning (ML) transform to perform data deduplication.
  C. Write a custom extract, transform, and load (ETL) job in Python. Import the Python dedupe library. Use the dedupe library to perform data deduplication.
  D. Write an AWS Glue extract, transform, and load (ETL) job. Import the Python dedupe library. Use the dedupe library to perform data deduplication.

Answer(s): B
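
FindMatches is a managed ML transform built into AWS Glue, so it avoids writing and maintaining custom deduplication code (options A, C, and D). The sketch below assumes a FindMatches transform has already been created and trained in Glue; the transform ID, catalog table, output path, and the use of the match_id column for dropping duplicates are illustrative assumptions.

    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from awsglueml.transforms import FindMatches
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Load the legacy data from the Glue Data Catalog (placeholder names).
    legacy = glue_context.create_dynamic_frame.from_catalog(
        database="legacy_app", table_name="records")

    # FindMatches groups records it considers duplicates under a shared
    # match_id; the transform ID below is a placeholder.
    matched = FindMatches.apply(
        frame=legacy,
        transformId="tfm-0123456789abcdef",
        transformation_ctx="matched")

    # Keep one record per match group and write the result to the data lake.
    deduped = matched.toDF().dropDuplicates(["match_id"])
    deduped.write.mode("overwrite").parquet("s3://example-data-lake/clean/records/")

    job.commit()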


