A manufacturing company has been collecting IoT sensor data from devices on its factory floor for a year and is storing the data in Amazon Redshift for daily analysis. A data analyst has determined that, at an expected ingestion rate of about 2 TB per day, the cluster will be undersized in less than 4 months. A long-term solution is needed. The data analyst has indicated that most queries only reference the most recent 13 months of data, yet there are also quarterly reports that need to query all the data generated from the past 7 years. The chief technology officer (CTO) is concerned about the costs, administrative effort, and performance of a long-term solution. Which solution should the data analyst use to meet these requirements?
Answer(s): A
A) Correct: keeping the most recent 13 months in Amazon Redshift, unloading older data to Amazon S3, and querying it through Redshift Spectrum provides cost-effective long-term storage, preserves query performance for recent data, and keeps historical data accessible without maintaining an oversized cluster.
B) Incorrect: duplicating cluster storage is costly and does not fit the long-term data lifecycle or the query patterns.
C) Incorrect: using CTAS to move older data into S3 for Spectrum is plausible, but partitioning by quarter adds unnecessary complexity and does not clearly minimize maintenance or cost for the mixed query workload.
D) Incorrect: it involves EMR/Athena with the Glue Data Catalog, which means managing multiple services and more orchestration overhead for the quarterly reports.
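The lifecycle described for option A can be sketched as a small helper that computes the 13-month cutoff and builds the UNLOAD statement that moves older rows to S3. This is a minimal sketch under stated assumptions, not a reference implementation: the table name, timestamp column, S3 prefix, and IAM role ARN are hypothetical placeholders supplied by the caller.

```python
from datetime import date


def months_ago(today: date, months: int) -> date:
    """Return the first day of the month that lies `months` months before `today`."""
    index = today.year * 12 + (today.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def build_unload_sql(table: str, ts_column: str, cutoff: date,
                     s3_prefix: str, iam_role_arn: str) -> str:
    """Build a Redshift UNLOAD statement for rows older than `cutoff`."""
    # Single quotes inside the UNLOAD query text are escaped by doubling them.
    select = f"SELECT * FROM {table} WHERE {ts_column} < ''{cutoff.isoformat()}''"
    return (
        f"UNLOAD ('{select}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )
```

For example, `build_unload_sql("sensor_readings", "reading_ts", months_ago(date.today(), 13), "s3://example-bucket/archive/", "arn:aws:iam::123456789012:role/ExampleUnloadRole")` returns the statement text; every argument in that call is made up for illustration. Once the unloaded files are registered as an external table, the quarterly reports can query them through Redshift Spectrum while the cluster holds only the recent 13 months.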
An insurance company has raw data in JSON format that is sent without a predefined schedule through an Amazon Kinesis Data Firehose delivery stream to an Amazon S3 bucket. An AWS Glue crawler is scheduled to run every 8 hours to update the schema in the data catalog of the tables stored in the S3 bucket. Data analysts analyze the data using Apache Spark SQL on Amazon EMR set up with AWS Glue Data Catalog as the metastore. Data analysts say that, occasionally, the data they receive is stale. A data engineer needs to provide access to the most up-to-date data. Which solution meets these requirements?
Answer(s): D
The correct solution runs the crawler as soon as new data arrives, so the Data Catalog is updated promptly and staleness is reduced.
A) Incorrect: Redshift Spectrum relies on a Redshift cluster; an external schema on the Glue Data Catalog does not address when the crawler runs, so data freshness issues remain.
B) Incorrect: running the crawler hourly still leaves gaps; data that arrives between runs can be stale.
C) Incorrect: shortening the crawler interval to 1 minute is impractical, may cause excessive crawl overhead and cost, and is still not event-driven.
D) Correct: an AWS Lambda function triggered by S3 ObjectCreated events starts the crawler in an event-driven, near-real-time way, so the Data Catalog reflects new data promptly.
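The event-driven approach in option D can be sketched as a Lambda handler that starts the crawler whenever an S3 ObjectCreated notification arrives. This is a minimal sketch, assuming the function is subscribed to the bucket's ObjectCreated events; the crawler name is a hypothetical placeholder, and the optional `glue_client` parameter is added here only so the handler can be exercised without AWS access.

```python
from urllib.parse import unquote_plus

CRAWLER_NAME = "example-raw-json-crawler"  # hypothetical crawler name


def new_object_keys(event: dict) -> list:
    """Extract URL-decoded object keys from an S3 event notification payload."""
    return [
        unquote_plus(record["s3"]["object"]["key"])
        for record in event.get("Records", [])
        if record.get("eventName", "").startswith("ObjectCreated")
    ]


def handler(event, context, glue_client=None):
    """Start the Glue crawler when at least one new object has arrived."""
    keys = new_object_keys(event)
    if not keys:
        return {"started": False, "keys": []}
    if glue_client is None:
        import boto3  # imported lazily so the parsing helper has no AWS dependency
        glue_client = boto3.client("glue")
    try:
        glue_client.start_crawler(Name=CRAWLER_NAME)
        started = True
    except glue_client.exceptions.CrawlerRunningException:
        # A crawl is already in progress, so this invocation does not start another.
        started = False
    return {"started": started, "keys": keys}
```

Because Firehose delivers objects at irregular times, several events can arrive while one crawl is running; the sketch simply skips those invocations rather than queueing them, which is one possible design choice among several.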
A company that produces network devices has millions of users. Data is collected from the devices on an hourly basis and stored in an Amazon S3 data lake. The company runs analyses on the last 24 hours of data flow logs for abnormality detection and to troubleshoot and resolve user issues. The company also analyzes historical logs dating back 2 years to discover patterns and look for improvement opportunities. The data flow logs contain many metrics, such as date, timestamp, source IP, and target IP. There are about 10 billion events every day. How should this data be stored for optimal performance?
A) In Apache ORC partitioned by date and sorted by source IP
ORC is a columnar, highly compressed format optimized for analytical queries on large datasets, which improves scan performance for both short-term and historical data. Partitioning by date lets queries over the 24-hour window prune all other partitions, while sorting by source IP speeds up range filters and joins on that frequently queried field.
B) Incorrect: CSV is plain text without columnar compression, which leads to heavy I/O and slow scans at 10 billion events per day.
C) Incorrect: Parquet is a suitable format, but organizing by source IP rather than by date reduces partition pruning for time-based queries.
D) Incorrect: nested JSON adds parsing overhead and is stored less efficiently than columnar ORC or Parquet, which hurts performance for large-scale analytics.
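The benefit of date partitioning can be illustrated with a short sketch that lists which partitions a query over the last 24 hours has to touch. The Hive-style `dt=YYYY-MM-DD` prefix layout and the bucket path used in the example are assumptions for illustration, not details given in the question.

```python
from datetime import date, datetime, timedelta


def partition_prefix(base: str, day: date) -> str:
    """Hive-style date partition prefix, for example <base>/dt=2024-05-01/."""
    return f"{base.rstrip('/')}/dt={day.isoformat()}/"


def prefixes_for_window(base: str, end: datetime, hours: int = 24) -> list:
    """List the date partitions that a query over the last `hours` must scan."""
    start = end - timedelta(hours=hours)
    day = start.date()
    prefixes = []
    while day <= end.date():
        prefixes.append(partition_prefix(base, day))
        day += timedelta(days=1)
    return prefixes
```

Calling `prefixes_for_window("s3://example-bucket/flow-logs", datetime(2024, 5, 2, 6, 0))` yields the prefixes for 2024-05-01 and 2024-05-02 only. With two years of history stored as roughly 730 daily partitions, the 24-hour analyses read at most two of them, while the historical analyses can still scan the full range.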
A banking company is currently using an Amazon Redshift cluster with dense storage (DS) nodes to store sensitive data. An audit found that the cluster is unencrypted. Compliance requirements state that a database with sensitive data must be encrypted through a hardware security module (HSM) with automated key rotation. Which combination of steps is required to achieve compliance? (Choose two.)
Answer(s): A,C
The correct combination ensures that Redshift data at rest is encrypted using an HSM with automated key rotation. It requires both setting up encryption with the HSM and migrating the data to an encrypted cluster.
A) Correct: establishing a trusted connection with the HSM using client and server certificates supports hardware-backed key management and automated rotation in a compliant deployment.
B) Incorrect: an existing Redshift cluster with DS nodes cannot be modified in place to add the HSM option; you deploy a new HSM-encrypted cluster instead.
C) Correct: creating an HSM-encrypted Redshift cluster and migrating the data to it achieves compliant, hardware-backed encryption with automated key rotation.
D) Incorrect: enabling encryption through the AWS CLI alone does not guarantee HSM encryption with automated rotation on an existing cluster.
E) Incorrect: ECDHE is a TLS key-exchange feature, not a mechanism for enabling HSM-based key rotation in Redshift.
https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-db-encryption.html
A company is planning to do a proof of concept for a machine learning (ML) project using Amazon SageMaker with a subset of existing on-premises data hosted in the company's 3 TB data warehouse. For part of the project, AWS Direct Connect is established and tested. To prepare the data for ML, data analysts are performing data curation. The data analysts want to perform multiple steps, including mapping, dropping null fields, resolving choice, and splitting fields. The company needs the fastest solution to curate the data for this project. Which solution meets these requirements?
Answer(s): C
AWS Glue provides serverless, scalable data cataloging, transformation, and ETL capabilities that are tightly integrated with S3, making it the fastest path to perform the multiple-step data curation (mapping, dropping null fields, resolving choice, splitting fields) and store the curated data for ML preprocessing.
A) Incorrect: ingesting with DataSync and curating with Spark on EMR adds manual orchestration and cluster management; it is not as streamlined as Glue for ETL and curation at scale.
B) Incorrect: on-premises ETL with DMS adds latency and complexity; DMS specializes in ongoing replication, not full-featured data transformation for ML preparation.
D) Incorrect: Snowball and Batch introduce unnecessary data transfer overhead and longer cycle times for a proof of concept, whereas Glue is serverless and allows rapid iteration.
A US-based sneaker retail company launched its global website. All the transaction data is stored in Amazon RDS and curated historic transaction data is stored in Amazon Redshift in the us-east-1 Region. The business intelligence (BI) team wants to enhance the user experience by providing a dashboard for sneaker trends. The BI team decides to use Amazon QuickSight to render the website dashboards. During development, a team in Japan provisioned Amazon QuickSight in ap-northeast-1. The team is having difficulty connecting Amazon QuickSight from ap-northeast-1 to Amazon Redshift in us-east-1. Which solution will solve this issue and meet the requirements?
Cross-Region access requires that the QuickSight servers in ap-northeast-1 be allowed to reach Redshift in us-east-1. Authorizing the QuickSight IP address range for ap-northeast-1 in a security group attached to the cluster in us-east-1 permits network traffic from the QuickSight Region to the Redshift cluster.
A) Incorrect: cross-Region snapshots do not provide cross-Region runtime access for BI queries; restoring a snapshot in ap-northeast-1 does not connect QuickSight to the Redshift cluster in us-east-1.
B) Incorrect: VPC endpoints from QuickSight to the Redshift VPC are not available; QuickSight connects to Redshift over the public internet or a VPN, not through an inter-Region VPC endpoint.
C) Incorrect: a Region field in the connection string does not manage cross-Region network reachability; it neither authorizes access nor modifies VPC routing.
D) Correct: the security group rule permits the appropriate inbound traffic, so QuickSight in ap-northeast-1 can reach Redshift in us-east-1.
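The security group change described for option D can be sketched as a helper that builds the inbound rule for the Redshift port. This is a hedged sketch: the security group ID and the CIDR in the usage note are placeholders (the CIDR comes from a documentation-only address range), and the real QuickSight range for ap-northeast-1 should be taken from the QuickSight documentation. Port 5439 is the Redshift default and is assumed here.

```python
import ipaddress


def quicksight_ingress_rule(security_group_id: str, quicksight_cidr: str,
                            redshift_port: int = 5439) -> dict:
    """Build keyword arguments for the EC2 authorize_security_group_ingress call."""
    ipaddress.ip_network(quicksight_cidr)  # raises ValueError for a malformed CIDR
    return {
        "GroupId": security_group_id,
        "IpPermissions": [{
            "IpProtocol": "tcp",
            "FromPort": redshift_port,
            "ToPort": redshift_port,
            "IpRanges": [{
                "CidrIp": quicksight_cidr,
                "Description": "Amazon QuickSight in ap-northeast-1",
            }],
        }],
    }


def apply_rule(rule: dict, ec2_client=None) -> None:
    """Apply the rule in us-east-1, where the Redshift cluster lives."""
    if ec2_client is None:
        import boto3  # imported lazily; only needed when the rule is really applied
        ec2_client = boto3.client("ec2", region_name="us-east-1")
    ec2_client.authorize_security_group_ingress(**rule)
```

A call such as `apply_rule(quicksight_ingress_rule("sg-0123456789abcdef0", "203.0.113.0/27"))` would add the rule; both argument values are illustrative only.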
An airline has .csv-formatted data stored in Amazon S3 with an AWS Glue Data Catalog. Data analysts want to join this data with call center data stored in Amazon Redshift as part of a daily batch process. The Amazon Redshift cluster is already under a heavy load. The solution must be managed, serverless, well-functioning, and minimize the load on the existing Amazon Redshift cluster. The solution should also require minimal effort and development activity. Which solution meets these requirements?
Creating an external table with Redshift Spectrum allows querying the CSV files in S3 and joining them with data that resides in Redshift without moving the data or loading it into Redshift, which minimizes cluster load and keeps the solution managed and serverless. Spectrum offloads the S3 scan to the Redshift Spectrum fleet and uses the Glue Data Catalog for metadata, which fits the existing catalog and S3 storage.
A) Incorrect: unloading to S3 and joining with a Glue ETL job adds data movement and ETL overhead, which increases latency and maintenance.
B) Incorrect: a Glue Python shell job can orchestrate the process, but it duplicates data movement and adds orchestration without reducing the Redshift load.
D) Incorrect: Sqoop and Hive on EMR introduce heavy infrastructure and maintenance, are not serverless, and increase cost.
A data analyst is using Amazon QuickSight for data visualization across multiple datasets generated by applications. Each application stores files within a separate Amazon S3 bucket. AWS Glue Data Catalog is used as a central catalog across all application data in Amazon S3. A new application stores its data within a separate S3 bucket. After updating the catalog to include the new application data source, the data analyst created a new Amazon QuickSight data source from an Amazon Athena table, but the import into SPICE failed. How should the data analyst resolve the issue?
Answer(s): B
The SPICE import most likely failed because QuickSight does not yet have access to the new S3 bucket where the data resides; granting QuickSight permission to that bucket lets it read the data for SPICE ingestion.
A) Incorrect: Glue Data Catalog permissions in QuickSight control metadata access, not access to the actual S3 data that SPICE ingestion requires.
C) Incorrect: permissions in the Glue console govern catalog operations, not access to the new data in S3.
D) Incorrect: editing bucket permissions in the S3 console targets the right resource, but QuickSight's access to S3 is typically managed through the QuickSight-level permissions rather than only in the S3 console, and B directly addresses the access QuickSight requires.
https://aws.amazon.com/blogs/big-data/harmonize-query-and-visualize-data-from-various-providers-using-aws-glue-amazon-athena-and-amazon-quicksight/