A data engineer needs to schedule a workflow that runs a set of AWS Glue jobs every day. The data engineer does not require the Glue jobs to run or finish at a specific time. Which solution will run the Glue jobs in the MOST cost-effective way?
Answer(s): A
A) Correct. The FLEX execution class is the most cost-effective choice for jobs that have no fixed start or finish time. Flex runs jobs on spare AWS capacity at a lower rate than STANDARD, in exchange for possibly delayed starts and longer run times.
B) Spot Instances are not an option for AWS Glue jobs. Glue is serverless and billed per DPU-hour, so there are no EC2 instances to buy on the Spot market.
C) STANDARD is the default execution class and is billed at the full rate. It suits time-sensitive jobs that need fast startup, which this workflow does not require.
D) GlueVersion selects the runtime and feature set, not a pricing option. A newer version does not by itself reduce cost for a flexible schedule.
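As a minimal sketch of option A, the Flex class can be requested per run through the `ExecutionClass` parameter of Glue's StartJobRun API. The helper names, the job names, and the 120-minute timeout below are hypothetical; `glue_client` is assumed to be a boto3 Glue client (for example `boto3.client("glue")`).

```python
def flex_job_run_args(job_name, timeout_minutes=120):
    """Build keyword arguments for Glue StartJobRun using the Flex class.

    The timeout value is an illustrative assumption, not a recommendation.
    """
    return {
        "JobName": job_name,
        "ExecutionClass": "FLEX",  # the alternative value is "STANDARD"
        "Timeout": timeout_minutes,
    }


def start_daily_jobs(glue_client, job_names):
    """Start each job with the Flex class and collect the run IDs."""
    run_ids = []
    for name in job_names:
        response = glue_client.start_job_run(**flex_job_run_args(name))
        run_ids.append(response["JobRunId"])
    return run_ids
```

A daily trigger (for example a scheduled Glue workflow) would then call `start_daily_jobs` with the list of job names. Because nothing depends on the exact finish time, a delayed start under Flex is acceptable here.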
A data engineer needs to create an AWS Lambda function that converts the format of data from .csv to Apache Parquet. The Lambda function must run only if a user uploads a .csv file to an Amazon S3 bucket. Which solution will meet these requirements with the LEAST operational overhead?
Answer(s): A
A) Correct. An S3 event notification can filter object-creation events on a .csv suffix and invoke the Lambda function directly, which keeps the number of components and the operational overhead to a minimum.
B) Incorrect. Triggering on object tags depends on a tagging policy and does not guarantee that the file is a .csv, which adds complexity.
C) Incorrect. The s3:* event type is broader than needed. It generates events for operations other than uploads and complicates processing.
D) Incorrect. Routing through Amazon SNS adds another service and a subscription to maintain, which increases latency and maintenance compared with invoking Lambda directly from the S3 event.
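The suffix filter described for option A can be sketched as the notification document that S3 expects. The function name and the ARN passed to it are hypothetical; the dictionary follows the shape accepted by the `put_bucket_notification_configuration` call of a boto3 S3 client.

```python
def csv_upload_notification(lambda_arn):
    """Build an S3 notification config that fires only for new .csv objects."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                # Object-creation events only, not every S3 event type.
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [{"Name": "suffix", "Value": ".csv"}]
                    }
                },
            }
        ]
    }
```

With a real client this would be applied as `s3.put_bucket_notification_configuration(Bucket=..., NotificationConfiguration=csv_upload_notification(arn))`. The Lambda function also needs a resource policy that allows S3 to invoke it.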
A data engineer needs Amazon Athena queries to finish faster. The data engineer notices that all the files the Athena queries use are currently stored in uncompressed .csv format. The data engineer also notices that users perform most queries by selecting a specific column. Which solution will MOST speed up the Athena query performance?
Answer(s): C
Athena runs faster on columnar, compressed formats. With Parquet, a query that selects one column reads only that column's data (column pruning), and Snappy compression reduces the bytes scanned further.
A) JSON is a row-oriented text format. Every query still reads whole records, so there is no column pruning, even with Snappy.
B) Snappy compression on CSV reduces file size somewhat, but CSV remains row-oriented, so Athena must still read every column.
C) Correct. Parquet is columnar and Snappy-compressed, so Athena reads only the selected column and scans far less data.
D) gzip on CSV is not columnar, and a gzip-compressed file cannot be split for parallel reads, so the gain is smaller than with Parquet.
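One way to perform the conversion in option C is an Athena CREATE TABLE AS SELECT (CTAS) statement. The sketch below only builds the SQL text; the function name, table names, and S3 location are hypothetical, and the `format` and `write_compression` table properties are the ones Athena documents for CTAS.

```python
def parquet_ctas(source_table, target_table, s3_location):
    """Return a CTAS statement that rewrites a table as Snappy-compressed Parquet."""
    return (
        f"CREATE TABLE {target_table}\n"
        "WITH (\n"
        "  format = 'PARQUET',\n"
        "  write_compression = 'SNAPPY',\n"
        f"  external_location = '{s3_location}'\n"
        ")\n"
        f"AS SELECT * FROM {source_table}"
    )


# Hypothetical names, for illustration only.
print(parquet_ctas("sales_csv", "sales_parquet", "s3://example-bucket/sales_parquet/"))
```

After the new table exists, a query such as `SELECT amount FROM sales_parquet` reads only the `amount` column chunks instead of every full CSV row.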
A manufacturing company collects sensor data from its factory floor to monitor and enhance operational efficiency. The company uses Amazon Kinesis Data Streams to publish the data that the sensors collect to a data stream. Then Amazon Kinesis Data Firehose writes the data to an Amazon S3 bucket. The company needs to display a real-time view of operational efficiency on a large screen in the manufacturing facility. Which solution will meet these requirements with the LOWEST latency?
Answer(s): A
A) Correct. Amazon Managed Service for Apache Flink processes records as they arrive on the Kinesis data stream and writes results to Amazon Timestream, a time-series store that a Grafana dashboard can query for a real-time view. This path has the lowest latency.
B) An S3-triggered Lambda function adds latency because Firehose buffers records before it writes objects to S3, and the function then processes whole objects. Aurora with QuickSight is batch-oriented rather than real-time.
C) Flink is suitable, but sending its output to Timestream through an additional Firehose delivery stream adds a hop and buffering delay compared with option A.
D) AWS Glue job bookmarks support batch processing of data already in S3 and are not suitable for real-time dashboards. Grafana over Timestream would work, but the end-to-end path is slower than A.
A company stores daily records of the financial performance of investment portfolios in .csv format in an Amazon S3 bucket. A data engineer uses AWS Glue crawlers to crawl the S3 data. The data engineer must make the S3 data accessible daily in the AWS Glue Data Catalog. Which solution will meet these requirements?
Answer(s): B
B) Correct. This option uses AWSGlueServiceRole, the appropriate role policy for Glue crawlers, which keeps permissions close to least privilege and integrates cleanly with the Glue service. Specifying the S3 source, a daily crawl schedule, and a database name for the output places the catalog metadata in a known Data Catalog database.
A) Uses AmazonS3FullAccess, which grants more permissions than the crawler requires. An output destination path is also not how a crawler registers tables; it writes to a Data Catalog database.
C) Allocates DPUs, which is unnecessary for catalog registration, and relies on broad S3 access.
D) Has the same DPU and output-path issues, and does not pair the Glue service role with a database name for the output.
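The crawler setup in option B can be sketched as the arguments to Glue's CreateCrawler API. The function name, crawler name, role ARN, database, and bucket path are hypothetical, and the 01:00 UTC schedule is an arbitrary choice for "daily".

```python
def daily_crawler_args(name, role_arn, database, s3_path):
    """Build keyword arguments for Glue CreateCrawler with a daily schedule."""
    return {
        "Name": name,
        # Role assumed to have the AWSGlueServiceRole policy attached,
        # plus read access to the source bucket.
        "Role": role_arn,
        "DatabaseName": database,  # Data Catalog database for the output tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # Glue cron fields: minutes hours day-of-month month day-of-week year.
        "Schedule": "cron(0 1 * * ? *)",
    }
```

With a boto3 Glue client the call would be `glue.create_crawler(**daily_crawler_args(...))`. Note that the output is a database name, not an S3 path.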
A company loads transaction data for each day into Amazon Redshift tables at the end of each day. The company wants to have the ability to track which tables have been loaded and which tables still need to be loaded. A data engineer wants to store the load statuses of Redshift tables in an Amazon DynamoDB table. The data engineer creates an AWS Lambda function to publish the details of the load statuses to DynamoDB. How should the data engineer invoke the Lambda function to write load statuses to the DynamoDB table?
Answer(s): B
A) Requires a separate Lambda function and CloudWatch events. This is more complex and less direct than the EventBridge integration with the Redshift Data API.
B) Correct. The Redshift Data API can publish events to Amazon EventBridge, and an EventBridge rule can invoke the Lambda function that writes load statuses to DynamoDB. The result is a decoupled, serverless, event-driven update that follows Redshift activity.
C) An SQS-to-Lambda path adds unnecessary queueing and is not the idiomatic mechanism for Redshift event notification.
D) CloudTrail events are audit logs and are not intended as real-time workflow triggers between Redshift and Lambda.
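A minimal sketch of the Lambda side of option B follows. The event pattern uses the source and detail type that the Redshift Data API documents for EventBridge; the `detail` field names read in `build_status_item` (`statementName`, `statementId`, `state`) and the item attribute names are assumptions to verify against a real event. `table` is assumed to be a boto3 DynamoDB `Table` resource.

```python
# EventBridge rule pattern for Redshift Data API statement events.
EVENT_PATTERN = {
    "source": ["aws.redshift-data"],
    "detail-type": ["Redshift Data Statement Status Change"],
}


def build_status_item(event):
    """Map an EventBridge event to a DynamoDB item (field names are assumed)."""
    detail = event.get("detail", {})
    return {
        "statement_name": detail.get("statementName", "unknown"),
        "statement_id": detail.get("statementId", "unknown"),
        "state": detail.get("state", "UNKNOWN"),
        "event_time": event.get("time", ""),
    }


def make_handler(table):
    """Return a Lambda handler that records each load status in the table."""
    def handler(event, context):
        item = build_status_item(event)
        table.put_item(Item=item)
        return item
    return handler
```

In a deployed function, `table` would typically be created once at module load, for example `boto3.resource("dynamodb").Table("load_status")`, where the table name is hypothetical.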
A data engineer needs to securely transfer 5 TB of data from an on-premises data center to an Amazon S3 bucket. Approximately 5% of the data changes every day. Updates to the data need to be regularly propagated to the S3 bucket. The data includes files that are in multiple formats. The data engineer needs to automate the transfer process and must schedule the process to run periodically. Which AWS service should the data engineer use to transfer the data in the MOST operationally efficient way?
Answer(s): A
A) Correct. AWS DataSync provides secure, automated, scheduled transfer of large on-premises datasets to S3. It copies only changed data on later runs, works with files of any format, and needs no custom scripting.
B) AWS Glue is built for ETL processing and data cataloging, not for secure, scheduled, incremental bulk transfer from on premises to S3.
C) AWS Direct Connect provides a dedicated network connection. It does not move data or schedule transfers to S3.
D) Amazon S3 Transfer Acceleration speeds up individual uploads over long distances but is not designed for automated, scheduled, incremental sync from on premises.
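To make the scenario concrete, the sketch below computes the daily change volume from the figures in the question and builds arguments for DataSync's CreateTask API with a schedule and changed-data-only transfer. The function names, task name, location ARNs, and the 02:00 UTC schedule are hypothetical.

```python
def daily_change_tb(total_tb, change_rate):
    """Approximate data volume that changes per day, in TB."""
    return total_tb * change_rate


def scheduled_task_args(name, source_arn, destination_arn):
    """Build keyword arguments for DataSync CreateTask with a daily schedule."""
    return {
        "Name": name,
        "SourceLocationArn": source_arn,
        "DestinationLocationArn": destination_arn,
        # Copy only data that differs between source and destination.
        "Options": {"TransferMode": "CHANGED"},
        "Schedule": {"ScheduleExpression": "cron(0 2 * * ? *)"},
    }


# 5 TB with about 5% daily change is roughly 0.25 TB per day.
print(daily_change_tb(5, 0.05))
```

With a boto3 DataSync client the task would be created with `datasync.create_task(**scheduled_task_args(...))`. After the initial 5 TB copy, each scheduled run moves only the changed portion.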
A company uses an on-premises Microsoft SQL Server database to store financial transaction data. The company migrates the transaction data from the on-premises database to AWS at the end of each month. The company has noticed that the cost to migrate data from the on-premises database to an Amazon RDS for SQL Server database has increased recently. The company requires a cost-effective solution to migrate the data to AWS. The solution must cause minimal downtime for the applications that access the database. Which AWS service should the company use to meet these requirements?
Answer(s): B
A) AWS Lambda is event-driven compute. It is unsuitable for large data migrations and has no built-in data replication capability.
B) Correct. AWS DMS is designed for one-time or ongoing database migrations with minimal downtime. It supports continuous replication from on-premises SQL Server to RDS for SQL Server and is cost-effective for monthly migrations.
C) AWS Direct Connect provides dedicated network connectivity but does not handle data transformation or ongoing replication between on premises and AWS.
D) AWS DataSync focuses on high-speed transfer of files and objects, not on relational database replication to RDS.