A data engineer is building a data pipeline on AWS by using AWS Glue extract, transform, and load (ETL) jobs. The data engineer needs to process data from Amazon RDS and MongoDB, perform transformations, and load the transformed data into Amazon Redshift for analytics. The data updates must occur every hour. Which combination of tasks will meet these requirements with the LEAST operational overhead? (Choose two.)
Answer(s): A,D
Scheduled AWS Glue triggers run the ETL jobs every hour without manual intervention, and Glue connections give the jobs managed access to each data store. Both are native Glue features, so no additional orchestration service is needed.
A) Correct. A scheduled Glue trigger runs the ETL jobs on an hourly cadence.
D) Correct. Glue connections provide secure, managed connectivity between RDS, MongoDB, and Redshift inside Glue's managed environment, which simplifies data movement and transformation.
B) Incorrect. AWS Glue DataBrew is primarily a data preparation and cleaning tool, not a full ETL workflow that moves data from multiple stores into Redshift.
C) Incorrect. Scheduling through Lambda adds orchestration and state management overhead.
E) Incorrect. The Redshift Data API issues SQL from applications; it does not orchestrate or load ETL pipelines.
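As a rough illustration of the hourly schedule, the sketch below builds the request for the Glue `CreateTrigger` API. The trigger name and job name are hypothetical placeholders, and the boto3 call is wrapped in a function that only runs when AWS credentials are available.

```python
def hourly_trigger_request(trigger_name: str, job_name: str) -> dict:
    """Build keyword arguments for Glue's CreateTrigger API."""
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",
        # Glue cron fields: minutes hours day-of-month month day-of-week year.
        # This expression fires at minute 0 of every hour.
        "Schedule": "cron(0 * * * ? *)",
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


def create_hourly_trigger(trigger_name: str, job_name: str) -> dict:
    """Send the request to AWS. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the builder above has no dependencies

    glue = boto3.client("glue")
    return glue.create_trigger(**hourly_trigger_request(trigger_name, job_name))


# Hypothetical names, used only for illustration.
request = hourly_trigger_request("hourly-etl-trigger", "rds-mongodb-to-redshift")
```

Keeping the request builder separate from the API call makes the schedule easy to inspect before anything is created in an account.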
A company uses an Amazon Redshift cluster that runs on RA3 nodes. The company wants to scale read and write capacity to meet demand. A data engineer needs to identify a solution that will turn on concurrency scaling. Which solution will meet this requirement?
Answer(s): B
Concurrency scaling in Redshift is turned on at the workload management (WLM) queue level of a provisioned cluster. Once enabled for a queue, Redshift automatically adds capacity for read and write workloads during bursts, with no user intervention.
A) Incorrect. Concurrency scaling applies to cluster-based WLM, not to Serverless workgroups.
B) Correct. Concurrency scaling is configured per WLM queue.
C) Incorrect. Concurrency scaling is not a global toggle set at cluster creation; it is configured per WLM queue.
D) Incorrect. Daily usage quotas are unrelated to turning on concurrency scaling.
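A minimal sketch of the per-queue setting, assuming a manual WLM configuration stored in the cluster parameter group's `wlm_json_configuration` parameter. The query group name, concurrency values, and parameter group name are hypothetical.

```python
import json


def wlm_config_with_concurrency_scaling() -> str:
    """Return a WLM JSON document with concurrency scaling on for one queue."""
    queues = [
        {
            "query_group": ["reporting"],   # hypothetical query group
            "query_concurrency": 5,
            "concurrency_scaling": "auto",  # turns scaling on for this queue
        },
        # Last entry is the default queue; concurrency scaling stays off here.
        {"query_concurrency": 5, "concurrency_scaling": "off"},
    ]
    return json.dumps(queues)


def apply_wlm_config(parameter_group_name: str) -> dict:
    """Write the configuration to AWS. Requires boto3 and valid credentials."""
    import boto3

    redshift = boto3.client("redshift")
    return redshift.modify_cluster_parameter_group(
        ParameterGroupName=parameter_group_name,
        Parameters=[
            {
                "ParameterName": "wlm_json_configuration",
                "ParameterValue": wlm_config_with_concurrency_scaling(),
            }
        ],
    )


wlm_json = wlm_config_with_concurrency_scaling()
```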
A data engineer must orchestrate a series of Amazon Athena queries that will run every day. Each query can run for more than 15 minutes. Which combination of steps will meet these requirements MOST cost-effectively? (Choose two.)
Answer(s): A,B
A Lambda function times out after 15 minutes, so it cannot wait for these queries to finish. The cost-effective pattern is to have Lambda only start each query with start_query_execution and let Step Functions wait and poll for completion, which avoids paying for idle compute.
A) Correct. A Lambda function that calls the Athena Boto3 start_query_execution API is a low-cost way to start each daily query without provisioning servers.
B) Correct. A Step Functions workflow with a Wait state and get_query_execution polls reliably and sequences multiple queries without keeping any compute running between checks.
C) Incorrect. A Glue Python shell job costs more for this task and adds an unnecessary ETL service.
D) Incorrect. A Glue Python shell job that polls with sleep pays for idle time while it waits and adds maintenance overhead.
E) Incorrect. Amazon MWAA with AWS Batch adds managed Airflow overhead and is not cost-optimal for simple sequential tasks.
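The start-then-poll pattern can be sketched as an Amazon States Language definition built in Python. This is one possible shape, not the only one: the Lambda ARN is a hypothetical placeholder, the Lambda function is assumed to return a `QueryExecutionId` field, the 60-second wait is arbitrary, and the status check uses the Step Functions AWS SDK integration for Athena.

```python
import json


def athena_polling_definition(start_query_lambda_arn: str) -> dict:
    """State machine: start a query, wait, check status, loop until done."""
    return {
        "StartAt": "StartQuery",
        "States": {
            "StartQuery": {
                "Type": "Task",
                # Assumed: this Lambda calls start_query_execution and
                # returns {"QueryExecutionId": "..."}.
                "Resource": start_query_lambda_arn,
                "ResultPath": "$.query",
                "Next": "WaitForQuery",
            },
            # No compute runs while the workflow sits in the Wait state.
            "WaitForQuery": {"Type": "Wait", "Seconds": 60, "Next": "CheckStatus"},
            "CheckStatus": {
                "Type": "Task",
                "Resource": "arn:aws:states:::aws-sdk:athena:getQueryExecution",
                "Parameters": {"QueryExecutionId.$": "$.query.QueryExecutionId"},
                "ResultPath": "$.status",
                "Next": "IsDone",
            },
            "IsDone": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.status.QueryExecution.Status.State",
                        "StringEquals": "SUCCEEDED",
                        "Next": "Done",
                    },
                    {
                        "Or": [
                            {"Variable": "$.status.QueryExecution.Status.State",
                             "StringEquals": "FAILED"},
                            {"Variable": "$.status.QueryExecution.Status.State",
                             "StringEquals": "CANCELLED"},
                        ],
                        "Next": "QueryFailed",
                    },
                ],
                "Default": "WaitForQuery",  # still queued or running
            },
            "Done": {"Type": "Succeed"},
            "QueryFailed": {"Type": "Fail", "Error": "AthenaQueryFailed"},
        },
    }


# Hypothetical ARN, used only for illustration.
definition = athena_polling_definition(
    "arn:aws:lambda:us-east-1:123456789012:function:start-athena-query"
)
definition_json = json.dumps(definition)
```

To run several queries in sequence, the same block of states could be repeated or wrapped in a Map state over a list of query strings.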
A company is migrating on-premises workloads to AWS. The company wants to reduce overall operational overhead. The company also wants to explore serverless options. The company's current workloads use Apache Pig, Apache Oozie, Apache Spark, Apache HBase, and Apache Flink. The on-premises workloads process petabytes of data in seconds. The company must maintain similar or better performance after the migration to AWS. Which extract, transform, and load (ETL) service will meet these requirements?
Amazon EMR is the strongest fit. It provides managed clusters for the big data frameworks the company already uses (Hadoop, Spark, HBase, Flink, Pig, Oozie), so the existing workloads can scale to match their on-premises performance. Deployment options such as EMR on EKS, and integration with Step Functions, reduce operational overhead further.
A) AWS Glue is serverless but is aimed at data cataloging and Spark-based or Python-based ETL; it does not natively support Pig, Oozie, HBase, or Flink, so the existing Pig and Oozie workflows would not carry over.
C) AWS Lambda is serverless compute but is not suitable for long-running, heavy ETL workloads or complex big data pipelines at petabyte scale.
D) Amazon Redshift is a data warehouse, not an ETL service, and lacks direct support for Pig and Oozie workflows and for HBase-based or Flink-based processing.
A data engineer must use AWS services to ingest a dataset into an Amazon S3 data lake. The data engineer profiles the dataset and discovers that the dataset contains personally identifiable information (PII). The data engineer must implement a solution to profile the dataset and obfuscate the PII. Which solution will meet this requirement with the LEAST operational effort?
The Detect PII transform in AWS Glue Studio provides built-in PII detection with minimal setup. Combining it with an obfuscation step and orchestrating the ingest pipeline with AWS Step Functions gives a serverless, low-effort way to profile and mask the data before it is stored in S3.
A) Requires a custom Lambda transform and SDK code, which increases operational overhead and maintenance risk.
C) Uses Glue Studio for detection but relies on Glue Data Quality for obfuscation, which adds extra tools and steps.
D) Involves DynamoDB and Lambda for both detection and obfuscation, plus manual data movement to S3, which raises complexity and latency.
A company maintains multiple extract, transform, and load (ETL) workflows that ingest data from the company's operational databases into an Amazon S3-based data lake. The ETL workflows use AWS Glue and Amazon EMR to process data. The company wants to improve the existing architecture to provide automated orchestration and to require minimal manual effort. Which solution will meet these requirements with the LEAST operational overhead?
Automating ETL orchestration with minimal manual effort is best achieved with AWS Step Functions, which coordinates Glue jobs, EMR steps, and other AWS services in serverless workflows with built-in retries, error handling, and visual monitoring.
A) AWS Glue workflows are Glue-native but provide limited cross-service orchestration and less flexibility for complex state machines than Step Functions.
C) AWS Lambda functions require custom orchestration logic and may not handle long-running tasks efficiently, which increases operational effort.
D) Amazon MWAA provides Airflow-based orchestration but introduces more management overhead and is not as lightweight as Step Functions for serverless, event-driven workflows.
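As an illustration of cross-service coordination, the sketch below builds a two-step state machine that runs a Glue job and then submits a Spark step to an existing EMR cluster, using the Step Functions `.sync` service integrations so each state waits for the work to finish. The job name, cluster ID, script path, and retry settings are hypothetical.

```python
import json


def glue_then_emr_definition(glue_job_name: str, cluster_id: str,
                             spark_script: str) -> dict:
    """State machine: run a Glue job, then an EMR step, each synchronously."""
    return {
        "StartAt": "RunGlueJob",
        "States": {
            "RunGlueJob": {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": glue_job_name},
                # Built-in retry: no custom orchestration code required.
                "Retry": [
                    {
                        "ErrorEquals": ["States.ALL"],
                        "IntervalSeconds": 60,
                        "MaxAttempts": 2,
                        "BackoffRate": 2.0,
                    }
                ],
                "Next": "RunEmrStep",
            },
            "RunEmrStep": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
                "Parameters": {
                    "ClusterId": cluster_id,
                    "Step": {
                        "Name": "spark-transform",
                        "ActionOnFailure": "CONTINUE",
                        "HadoopJarStep": {
                            "Jar": "command-runner.jar",
                            "Args": ["spark-submit", spark_script],
                        },
                    },
                },
                "End": True,
            },
        },
    }


# Hypothetical identifiers, used only for illustration.
etl_definition = glue_then_emr_definition(
    "ingest-operational-db", "j-EXAMPLECLUSTER", "s3://example-bucket/transform.py"
)
etl_definition_json = json.dumps(etl_definition)
```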
A company currently stores all of its data in Amazon S3 by using the S3 Standard storage class. A data engineer examined data access patterns to identify trends. During the first 6 months, most data files are accessed several times each day. Between 6 months and 2 years, most data files are accessed once or twice each month. After 2 years, data files are accessed only once or twice each year. The data engineer needs to use an S3 Lifecycle policy to develop new data storage rules. The new storage solution must continue to provide high availability. Which solution will meet these requirements in the MOST cost-effective way?
Transitioning to S3 Standard-IA after 6 months preserves high availability while reducing cost for infrequently accessed data. Moving to S3 Glacier Flexible Retrieval after 2 years then lowers the cost of long-term archival. This matches the tiered access pattern: frequent access early, then infrequent access, then archival.
A) S3 One Zone-IA stores data in a single Availability Zone, which lowers availability and resilience, so it does not meet the high-availability requirement.
C) S3 Glacier Deep Archive after 2 years offers the lowest storage cost but has longer retrieval times, which makes it less suitable for data that is still retrieved once or twice each year.
D) S3 One Zone-IA followed by S3 Glacier Deep Archive combines the single-AZ availability risk with the longer retrieval times, so it compares poorly with Standard-IA followed by Glacier Flexible Retrieval.
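A sketch of the corresponding lifecycle rule, assuming 6 months is approximated as 180 days and 2 years as 730 days. The rule ID and bucket name are hypothetical; `GLACIER` is the storage class value that S3 uses for Glacier Flexible Retrieval.

```python
def tiered_lifecycle_configuration() -> dict:
    """Lifecycle rules: Standard -> Standard-IA -> Glacier Flexible Retrieval."""
    return {
        "Rules": [
            {
                "ID": "tier-by-age",        # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},   # empty prefix applies to all objects
                "Transitions": [
                    {"Days": 180, "StorageClass": "STANDARD_IA"},  # ~6 months
                    {"Days": 730, "StorageClass": "GLACIER"},      # ~2 years
                ],
            }
        ]
    }


def apply_lifecycle(bucket: str) -> None:
    """Attach the rules to a bucket. Requires boto3 and valid credentials."""
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=tiered_lifecycle_configuration(),
    )


lifecycle = tiered_lifecycle_configuration()
```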
A company maintains an Amazon Redshift provisioned cluster that the company uses for extract, transform, and load (ETL) operations to support critical analysis tasks. A sales team within the company maintains a Redshift cluster that the sales team uses for business intelligence (BI) tasks. The sales team recently requested access to the data that is in the ETL Redshift cluster so the team can perform weekly summary analysis tasks. The sales team needs to join data from the ETL cluster with data that is in the sales team's BI cluster. The company needs a solution that will share the ETL cluster data with the sales team without interrupting the critical analysis tasks. The solution must minimize usage of the computing resources of the ETL cluster. Which solution will meet these requirements?
Answer(s): A
Redshift data sharing lets a consumer cluster (the sales BI cluster) query live data from the producer cluster (the ETL cluster) without copying the data. Queries run on the consumer's compute, so the ETL cluster sees minimal load and the sales team can join shared data with its own tables.
A) Correct. Redshift data sharing enables cross-cluster query access with minimal compute on the producer, which avoids disrupting the ETL workloads.
B) Incorrect. Materialized views require data duplication or periodic refresh, and granting direct access to the ETL cluster increases load and risks contention.
C) Incorrect. Database views alone offer no cross-cluster sharing; direct access pushes the sales workload onto the ETL cluster and can affect performance.
D) Incorrect. Unloading to S3 and querying through Redshift Spectrum adds an extra data movement step on the ETL cluster, introduces latency, and does not provide joins against live data across clusters.
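The data sharing setup comes down to a few SQL statements on each cluster. The sketch below generates them in Python. The datashare name, schema, database name, and namespace IDs are hypothetical placeholders; the real namespace IDs would come from each cluster.

```python
def producer_statements(share: str, schema: str, consumer_namespace: str) -> list:
    """SQL to run on the ETL (producer) cluster."""
    return [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
        f"ALTER DATASHARE {share} ADD ALL TABLES IN SCHEMA {schema};",
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';",
    ]


def consumer_statements(share: str, local_db: str, producer_namespace: str) -> list:
    """SQL to run on the sales BI (consumer) cluster."""
    return [
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';",
    ]


# Hypothetical names and namespace IDs, used only for illustration.
producer_sql = producer_statements("etl_share", "public", "consumer-namespace-id")
consumer_sql = consumer_statements("etl_share", "etl_shared_db", "producer-namespace-id")

for statement in producer_sql + consumer_sql:
    print(statement)
```

After the consumer database exists, the sales team can join its own tables with shared tables by using three-part names such as `etl_shared_db.public.some_table` (a hypothetical table name).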