Amazon DAS-C01 Exam (page: 2)
Amazon AWS Certified Data Analytics - Specialty (DAS-C01)
Updated on: 29-Mar-2026

Viewing Page 2 of 22

A manufacturing company has been collecting IoT sensor data from devices on its factory floor for a year and is storing the data in Amazon Redshift for daily analysis. A data analyst has determined that, at an expected ingestion rate of about 2 TB per day, the cluster will be undersized in less than 4 months. A long-term solution is needed. The data analyst has indicated that most queries only reference the most recent 13 months of data, yet there are also quarterly reports that need to query all the data generated from the past 7 years. The chief technology officer (CTO) is concerned about the costs, administrative effort, and performance of a long-term solution.
Which solution should the data analyst use to meet these requirements?

  1. Create a daily job in AWS Glue to UNLOAD records older than 13 months to Amazon S3 and delete those records from Amazon Redshift. Create an external table in Amazon Redshift to point to the S3 location. Use Amazon Redshift Spectrum to join to data that is older than 13 months.
  2. Take a snapshot of the Amazon Redshift cluster. Restore the cluster to a new cluster using dense storage nodes with additional storage capacity.
  3. Execute a CREATE TABLE AS SELECT (CTAS) statement to move records that are older than 13 months to quarterly partitioned data in Amazon Redshift Spectrum backed by Amazon S3.
  4. Unload all the tables in Amazon Redshift to an Amazon S3 bucket using S3 Intelligent-Tiering. Use AWS Glue to crawl the S3 bucket location to create external tables in an AWS Glue Data Catalog. Create an Amazon EMR cluster using Auto Scaling for any daily analytics needs, and use Amazon Athena for the quarterly reports, with both using the same AWS Glue Data Catalog.

Answer(s): A

Explanation:

A) Correct: keeping the most recent 13 months in Amazon Redshift, unloading older data to Amazon S3, and referencing it through a Redshift Spectrum external table provides cost-effective long-term storage, preserves query performance for recent data, and still lets the quarterly reports join across the full 7 years without oversizing the cluster.
B) Incorrect: restoring to a new cluster with larger dense storage nodes only duplicates storage at higher cost and does not address the long-term data lifecycle or the query patterns.
C) Incorrect: a CTAS into quarterly-partitioned Spectrum data is plausible, but the added partitioning complexity does not clearly minimize maintenance or cost for the mixed query workload.
D) Incorrect: introducing EMR and Athena alongside the Glue Data Catalog adds multiple services and orchestration overhead, conflicting with the CTO's concerns about administrative effort.
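The UNLOAD-then-Spectrum pattern behind option A can be sketched as two SQL statements. All table, schema, bucket, and IAM role names below are hypothetical placeholders, not values from the question:

```python
# Sketch of the UNLOAD + Redshift Spectrum pattern from option A.
# Table, bucket, schema, and IAM role names are hypothetical.

def build_unload_sql(cutoff_date: str) -> str:
    """UNLOAD rows older than the 13-month cutoff to S3 as Parquet."""
    return (
        "UNLOAD ('SELECT * FROM sensor_readings "
        f"WHERE reading_date < \\'{cutoff_date}\\'') "
        "TO 's3://example-archive-bucket/sensor_readings/' "
        "IAM_ROLE 'arn:aws:iam::123456789012:role/example-spectrum-role' "
        "FORMAT PARQUET PARTITION BY (reading_date);"
    )

def build_external_table_sql() -> str:
    """External table so Spectrum can query the archived data in place."""
    return (
        "CREATE EXTERNAL TABLE spectrum_schema.sensor_readings_archive ("
        "device_id VARCHAR(64), reading_date DATE, value DOUBLE PRECISION) "
        "STORED AS PARQUET "
        "LOCATION 's3://example-archive-bucket/sensor_readings/';"
    )
```

A daily Glue job would run the UNLOAD, then delete the archived rows from the local table; quarterly reports join the local table with the external one.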



An insurance company has raw data in JSON format that is sent without a predefined schedule through an Amazon Kinesis Data Firehose delivery stream to an
Amazon S3 bucket. An AWS Glue crawler is scheduled to run every 8 hours to update the schema in the data catalog of the tables stored in the S3 bucket. Data analysts analyze the data using Apache Spark SQL on Amazon EMR set up with AWS Glue Data Catalog as the metastore. Data analysts say that, occasionally, the data they receive is stale. A data engineer needs to provide access to the most up-to-date data.
Which solution meets these requirements?

  1. Create an external schema based on the AWS Glue Data Catalog on the existing Amazon Redshift cluster to query new data in Amazon S3 with Amazon Redshift Spectrum.
  2. Use Amazon CloudWatch Events with the rate (1 hour) expression to execute the AWS Glue crawler every hour.
  3. Using the AWS CLI, modify the execution schedule of the AWS Glue crawler from 8 hours to 1 minute.
  4. Run the AWS Glue crawler from an AWS Lambda function triggered by an S3:ObjectCreated:* event notification on the S3 bucket.

Answer(s): D

Explanation:

The correct solution ensures the crawler updates the catalog immediately as new data arrives, reducing staleness.
A) Incorrect: Redshift Spectrum requires a Redshift cluster and an external schema over the Glue Data Catalog; it does not make the crawler run any sooner, so the catalog, and therefore the data analysts see, can still be stale.
B) Incorrect: Scheduling crawler hourly still allows gaps; data arriving between runs may be stale.
C) Incorrect: Modifying the crawler interval to 1 minute is impractical and may cause excessive crawl overhead and costs; not event-driven.
D) Correct: Lambda triggered by S3 ObjectCreated events provides event-driven, near-real-time crawler execution, ensuring the Data Catalog reflects new data promptly.
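The event-driven approach in option D can be sketched as a small Lambda handler. The crawler name is hypothetical, and the Glue client is passed in as a parameter so the logic can be exercised without AWS access:

```python
# Sketch of option D: a Lambda handler that starts the Glue crawler when
# a new object lands in S3. The crawler name is hypothetical; the Glue
# client is injected so the logic can be tested without AWS credentials.

CRAWLER_NAME = "example-firehose-crawler"  # hypothetical

def handler(event, glue_client):
    """Start the crawler for each S3:ObjectCreated notification record."""
    started = []
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        try:
            glue_client.start_crawler(Name=CRAWLER_NAME)
            started.append(key)
        except glue_client.exceptions.CrawlerRunningException:
            # A crawl is already in progress; this object will be
            # picked up by that run or the next one.
            pass
    return started
```

In a real deployment the handler would build the client itself (`boto3.client("glue")`), and catching `CrawlerRunningException` prevents errors when objects arrive faster than the crawler completes.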



A company that produces network devices has millions of users. Data is collected from the devices on an hourly basis and stored in an Amazon S3 data lake.
The company runs analyses on the last 24 hours of data flow logs for abnormality detection and to troubleshoot and resolve user issues. The company also analyzes historical logs dating back 2 years to discover patterns and look for improvement opportunities. The data flow logs contain many metrics, such as date, timestamp, source IP, and target IP. There are about 10 billion events every day.
How should this data be stored for optimal performance?

  1. In Apache ORC partitioned by date and sorted by source IP
  2. In compressed .csv partitioned by date and sorted by source IP
  3. In Apache Parquet partitioned by source IP and sorted by date
  4. In compressed nested JSON partitioned by source IP and sorted by date

Answer(s): A

Explanation:

A) Correct: ORC is a columnar, highly compressed format optimized for analytical queries on large datasets, improving scan performance for both short-term and historical analyses. Partitioning by date enables partition pruning for the 24-hour window queries, while sorting by source IP accelerates range and join-like operations on a frequently queried field.
B) Incorrect: CSV is plain text and lacks columnar compression, leading to large I/O and slow scans at 10 billion events per day.
C) Incorrect: Parquet is a suitable columnar format, but partitioning by source IP rather than by date defeats partition pruning for the time-based queries that dominate this workload.
D) Incorrect: nested JSON adds parsing overhead and lacks efficient columnar storage compared with ORC or Parquet, hurting performance for large-scale analytics.
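The date-partitioned layout implied by option A can be sketched as a Hive-style key prefix; bucket and prefix names below are hypothetical:

```python
# Sketch of the date-partitioned S3 layout implied by option A. Each
# day's ORC files land under a dt=YYYY-MM-DD partition, so a 24-hour
# query prunes every other partition. Bucket/prefix names are
# hypothetical.
from datetime import date

def partition_prefix(day: date, bucket: str = "example-flow-logs") -> str:
    """Return the S3 prefix for one day's partition."""
    return f"s3://{bucket}/flow_logs/dt={day.isoformat()}/"
```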



A banking company is currently using an Amazon Redshift cluster with dense storage (DS) nodes to store sensitive data. An audit found that the cluster is unencrypted. Compliance requirements state that a database with sensitive data must be encrypted through a hardware security module (HSM) with automated key rotation.
Which combination of steps is required to achieve compliance? (Choose two.)

  1. Set up a trusted connection with HSM using a client and server certificate with automatic key rotation.
  2. Modify the cluster with an HSM encryption option and automatic key rotation.
  3. Create a new HSM-encrypted Amazon Redshift cluster and migrate the data to the new cluster.
  4. Enable HSM with key rotation through the AWS CLI.
  5. Enable Elliptic Curve Diffie-Hellman Ephemeral (ECDHE) encryption in the HSM.

Answer(s): A,C

Explanation:

The correct combination ensures Redshift data at rest is encrypted using an HSM with automated key rotation and requires both enabling encryption with HSM and migrating data if needed.
A) Correct: Establishing a trusted connection with an HSM using client/server certs supports hardware-backed key management and automated rotation in a compliant deployment.
B) Incorrect: Redshift DS nodes are not updated in-place with an HSM option; you typically deploy a new HSM-encrypted configuration or cluster, not modify existing DS nodes to add HSM.
C) Correct: Creating an HSM-encrypted Redshift cluster and migrating data achieves compliant, hardware-backed encryption with automated key rotation.
D) Incorrect: AWS CLI enablement alone does not guarantee HSM encryption with automated rotation on an existing cluster.
E) Incorrect: ECDHE is a TLS feature, not a mechanism to enable HSM-based key rotation for Redshift.


Reference:

https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-db-encryption.html
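The migration in option C can be sketched as the parameters for creating the new HSM-encrypted cluster. All identifiers below are hypothetical, and the HSM client certificate and configuration are assumed to already be registered with Redshift via the trusted connection from option A:

```python
# Sketch of option C: parameters for a new HSM-encrypted Redshift
# cluster to migrate into (e.g., passed to boto3's
# redshift.create_cluster). All identifiers are hypothetical.

def hsm_cluster_params() -> dict:
    return {
        "ClusterIdentifier": "example-encrypted-cluster",
        "NodeType": "ds2.xlarge",          # dense storage, as in the question
        "MasterUsername": "admin",
        "MasterUserPassword": "example-Password1",
        "Encrypted": True,
        # Both HSM identifiers must reference the trusted connection
        # established with client/server certificates (option A).
        "HsmClientCertificateIdentifier": "example-hsm-client-cert",
        "HsmConfigurationIdentifier": "example-hsm-config",
    }
```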



A company is planning to do a proof of concept for a machine learning (ML) project using Amazon SageMaker with a subset of existing on-premises data hosted in the company's 3 TB data warehouse. For part of the project, AWS Direct Connect is established and tested. To prepare the data for ML, data analysts are performing data curation. The data analysts want to perform multiple steps, including mapping, dropping null fields, resolving choice, and splitting fields. The company needs the fastest solution to curate the data for this project.
Which solution meets these requirements?

  1. Ingest data into Amazon S3 using AWS DataSync and use Apache Spark scripts to curate the data in an Amazon EMR cluster. Store the curated data in Amazon S3 for ML processing.
  2. Create custom ETL jobs on-premises to curate the data. Use AWS DMS to ingest data into Amazon S3 for ML processing.
  3. Ingest data into Amazon S3 using AWS DMS. Use AWS Glue to perform data curation and store the data in Amazon S3 for ML processing.
  4. Take a full backup of the data store and ship the backup files using AWS Snowball. Upload Snowball data into Amazon S3 and schedule data curation jobs using AWS Batch to prepare the data for ML.

Answer(s): C

Explanation:

AWS Glue provides serverless, scalable data cataloging, transformation, and ETL capabilities tightly integrated with S3, making it the fastest path to perform multiple-step data curation (mapping, dropping nulls, resolving choices, splitting fields) and store the curated data for ML preprocessing.
A) Ingesting via DataSync and curating with Spark on EMR adds manual orchestration and management; not as streamlined as Glue for ETL/curation at scale.
B) On-prem ETL with DMS adds latency and complexity; DMS specializes in ongoing replication, not full-featured data transformation for ML prep.
D) Snowball and Batch introduce unnecessary data transfer overhead and longer cycle times for a PoC; Glue offers serverless, rapid iteration.
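The four curation steps named in the question map onto Glue's built-in transforms (ApplyMapping, DropNullFields, ResolveChoice, SplitFields). A plain-Python sketch of what those steps do to a single record, with hypothetical field names, looks like this:

```python
# Plain-Python sketch of the curation steps from the question: mapping,
# dropping null fields, resolving choice (ambiguous types), and
# splitting fields. In AWS Glue these correspond to the ApplyMapping,
# DropNullFields, ResolveChoice, and SplitFields transforms. Field
# names are hypothetical.

def curate(record: dict) -> dict:
    # Mapping: rename source columns to target names.
    mapped = {"customer_id": record.get("custId"),
              "amount": record.get("amt")}
    # Drop null fields.
    mapped = {k: v for k, v in mapped.items() if v is not None}
    # Resolve choice: coerce an ambiguous column to a single type.
    if "amount" in mapped:
        mapped["amount"] = float(mapped["amount"])
    # Split fields: break a combined field into parts.
    if record.get("fullName"):
        first, _, last = record["fullName"].partition(" ")
        mapped["first_name"], mapped["last_name"] = first, last
    return mapped
```

In an actual Glue job each step is a one-line transform on a DynamicFrame, which is why Glue is the fastest path for this multi-step curation.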



A US-based sneaker retail company launched its global website. All the transaction data is stored in Amazon RDS and curated historic transaction data is stored in Amazon Redshift in the us-east-1 Region. The business intelligence (BI) team wants to enhance the user experience by providing a dashboard for sneaker trends.
The BI team decides to use Amazon QuickSight to render the website dashboards. During development, a team in Japan provisioned Amazon QuickSight in ap-northeast-1. The team is having difficulty connecting Amazon QuickSight from ap-northeast-1 to Amazon Redshift in us-east-1.
Which solution will solve this issue and meet the requirements?

  1. In the Amazon Redshift console, choose to configure cross-Region snapshots and set the destination Region as ap-northeast-1. Restore the Amazon Redshift Cluster from the snapshot and connect to Amazon QuickSight launched in ap-northeast-1.
  2. Create a VPC endpoint from the Amazon QuickSight VPC to the Amazon Redshift VPC so Amazon QuickSight can access data from Amazon Redshift.
  3. Create an Amazon Redshift endpoint connection string with Region information in the string and use this connection string in Amazon QuickSight to connect to Amazon Redshift.
  4. Create a new security group for Amazon Redshift in us-east-1 with an inbound rule authorizing access from the appropriate IP address range for the Amazon QuickSight servers in ap-northeast-1.

Answer(s): D

Explanation:

Cross-Region access requires allowing the QuickSight service in ap-northeast-1 to reach the Redshift cluster in us-east-1. Authorizing the published QuickSight IP address range for ap-northeast-1 in a security group attached to the cluster permits that traffic.
A) Cross-region snapshots do not enable cross-region runtime access or BI querying; restoring from snapshot in ap-northeast-1 does not connect to Redshift in us-east-1 for QuickSight dashboards.
B) VPC endpoints for QuickSight to Redshift VPC are not available; QuickSight connects to Redshift over the public internet or VPN, not via an inter-region VPC endpoint.
C) Connection string region fields do not manage cross-region network reachability; they do not authorize access or modify VPC routing.
D) Correct: creates a security boundary allowing QuickSight in ap-northeast-1 to reach Redshift in us-east-1 by permitting the appropriate inbound traffic.
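The inbound rule from option D can be sketched as the parameters for an `authorize_security_group_ingress` call. The group ID and CIDR below are placeholders; the real QuickSight IP ranges per Region are published in the QuickSight documentation:

```python
# Sketch of option D: inbound rule parameters for the Redshift security
# group in us-east-1 (e.g., passed to boto3's
# ec2.authorize_security_group_ingress). The group ID and the
# QuickSight ap-northeast-1 CIDR are placeholders.

def quicksight_ingress_params(group_id: str, quicksight_cidr: str) -> dict:
    return {
        "GroupId": group_id,
        "IpPermissions": [{
            "IpProtocol": "tcp",
            "FromPort": 5439,  # default Redshift port
            "ToPort": 5439,
            "IpRanges": [{"CidrIp": quicksight_cidr,
                          "Description": "QuickSight ap-northeast-1"}],
        }],
    }
```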



An airline has .csv-formatted data stored in Amazon S3 with an AWS Glue Data Catalog. Data analysts want to join this data with call center data stored in
Amazon Redshift as part of a daily batch process. The Amazon Redshift cluster is already under a heavy load. The solution must be managed, serverless, well-functioning, and minimize the load on the existing Amazon Redshift cluster. The solution should also require minimal effort and development activity.
Which solution meets these requirements?

  1. Unload the call center data from Amazon Redshift to Amazon S3 using an AWS Lambda function. Perform the join with AWS Glue ETL scripts.
  2. Export the call center data from Amazon Redshift using a Python shell in AWS Glue. Perform the join with AWS Glue ETL scripts.
  3. Create an external table using Amazon Redshift Spectrum for the call center data and perform the join with Amazon Redshift.
  4. Export the call center data from Amazon Redshift to Amazon EMR using Apache Sqoop. Perform the join with Apache Hive.

Answer(s): C

Explanation:

Creating an external table with Redshift Spectrum allows querying S3-hosted CSVs and joining with Redshift resident data without moving data or loading Redshift, minimizing cluster load and preserving serverless management. Spectrum offloads compute to Redshift Spectrum fleet and uses the Glue Data Catalog for metadata, aligning with the existing catalog and S3 storage.
A) Unload to S3 and join via Glue ETL adds data movement and ETL overhead, increasing latency and maintenance. B) Glue Python shell can orchestrate but duplicates data movement and adds extra orchestration without reducing Redshift load. D) Sqoop/Hive on EMR introduces heavy infrastructure and maintenance, not serverless, and increases cost.
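Option C amounts to two SQL statements: an external schema over the existing Glue Data Catalog, then an ordinary join. Schema, table, database, and role names below are hypothetical:

```python
# Sketch of option C: an external schema over the Glue Data Catalog and
# a join between the S3-backed CSV table and the call center table in
# Redshift. All schema, table, and role names are hypothetical.

def build_join_sql() -> str:
    return (
        "CREATE EXTERNAL SCHEMA airline_ext "
        "FROM DATA CATALOG DATABASE 'airline_catalog' "
        "IAM_ROLE 'arn:aws:iam::123456789012:role/example-spectrum-role'; "
        "SELECT f.flight_id, c.agent_id "
        "FROM airline_ext.flights f "
        "JOIN call_center c ON f.booking_id = c.booking_id;"
    )
```

Because the CSV scan runs on the Spectrum fleet, only the join against the call center rows touches the already-loaded cluster.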



A data analyst is using Amazon QuickSight for data visualization across multiple datasets generated by applications. Each application stores files within a separate Amazon S3 bucket. AWS Glue Data Catalog is used as a central catalog across all application data in Amazon S3. A new application stores its data within a separate S3 bucket. After updating the catalog to include the new application data source, the data analyst created a new Amazon QuickSight data source from an Amazon Athena table, but the import into SPICE failed.
How should the data analyst resolve the issue?

  1. Edit the permissions for the AWS Glue Data Catalog from within the Amazon QuickSight console.
  2. Edit the permissions for the new S3 bucket from within the Amazon QuickSight console.
  3. Edit the permissions for the AWS Glue Data Catalog from within the AWS Glue console.
  4. Edit the permissions for the new S3 bucket from within the S3 console.

Answer(s): B

Explanation:

The SPICE import failed likely due to QuickSight needing access to the new S3 bucket where the data resides; granting S3 bucket permissions enables QuickSight to read the data for SPICE ingestion.
A) Incorrect — Glue Data Catalog permissions in QuickSight control metadata access, not the actual S3 data required for SPICE ingestion.
C) Incorrect — Glue Console permissions govern catalog operations, not access to the new data in S3 for SPICE.
D) Incorrect — editing the bucket policy in the S3 console is not sufficient, because QuickSight's access to data sources is managed through its own permissions settings; granting access to the new bucket from within the QuickSight console (option B) directly resolves the SPICE import failure.


Reference:

https://aws.amazon.com/blogs/big-data/harmonize-query-and-visualize-data-from-various-providers-using-aws-glue-amazon-athena-and-amazon-quicksight/


