A media company has a streaming playback application. The company needs to collect and analyze data to provide near-real-time feedback on playback issues within 30 seconds. The company requires a consumer application to identify playback issues, such as decreased quality during a specified time frame. The data will be streamed in JSON format. The schema can change over time. Which solution will meet these requirements?
Answer(s): D
D) Correct: Kinesis Data Streams with a Kinesis Data Analytics for Apache Flink application supports real-time, low-latency processing of streaming JSON with evolving schemas, which suits playback issue detection within 30 seconds.
A) Incorrect: Firehose is designed for near-real-time delivery to destinations such as S3, not for low-latency analytics. Triggering Lambda from S3 events adds latency and provides no streaming windows for continuous analysis.
B) Incorrect: Managed Streaming for Kafka plus Kinesis Data Analytics SQL is not ideal for low-latency, per-event analytics and may complicate schema evolution handling.
C) Incorrect: Firehose to S3 with Lambda is batch-oriented and unsuitable for continuous, low-latency streaming analytics and schema changes.
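The windowed check that such a consumer performs can be sketched in plain Python. This is only an illustration of tumbling-window logic, not Flink code: the field names (`session_id`, `ts`, `bitrate_kbps`), the 10-second window, and the 1500 kbps threshold are all assumptions chosen for the example.

```python
from collections import defaultdict

WINDOW_SECONDS = 10  # assumed window size, well inside the 30-second budget


def detect_quality_drops(events, min_bitrate_kbps=1500, window_seconds=WINDOW_SECONDS):
    """Group events into tumbling windows per session and flag low average bitrate."""
    buckets = defaultdict(list)
    for event in events:
        # .get() tolerates records that lack the field after a schema change;
        # unknown extra fields are simply ignored.
        bitrate = event.get("bitrate_kbps")
        if bitrate is None:
            continue
        window_start = (event["ts"] // window_seconds) * window_seconds
        buckets[(event["session_id"], window_start)].append(bitrate)

    alerts = []
    for (session_id, window_start), bitrates in sorted(buckets.items()):
        average = sum(bitrates) / len(bitrates)
        if average < min_bitrate_kbps:
            alerts.append(
                {
                    "session_id": session_id,
                    "window_start": window_start,
                    "avg_bitrate_kbps": average,
                }
            )
    return alerts


# Hypothetical sample records; timestamps are seconds since stream start.
sample_events = [
    {"session_id": "s1", "ts": 0, "bitrate_kbps": 3000},
    {"session_id": "s1", "ts": 4, "bitrate_kbps": 2800},
    {"session_id": "s1", "ts": 12, "bitrate_kbps": 900},
    {"session_id": "s1", "ts": 15, "bitrate_kbps": 1100, "cdn": "edge-7"},  # extra field
    {"session_id": "s2", "ts": 13},  # bitrate field missing
]
alerts = detect_quality_drops(sample_events)
```

In a real deployment the same grouping would be expressed with the stream processor's own window operators; the sketch only shows why windowed state, rather than per-file batch processing, fits the requirement.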
An ecommerce company stores customer purchase data in Amazon RDS. The company wants a solution to store and analyze historical data. The most recent 6 months of data will be queried frequently for analytics workloads. This data is several terabytes large. Once a month, historical data for the last 5 years must be accessible and will be joined with the more recent data. The company wants to optimize performance and cost. Which storage solution will meet these requirements?
The correct choice combines cost-effective storage for long-term historical data with fast analytics on recent data and the ability to join the two. Redshift Spectrum can query recent data in Redshift and historical data in S3 without moving data, which gives good performance for the frequent 6-month analytics and scalable cost for the multi-year history.
A) Incorrect: It separates historical data into S3 but relies on Athena for the history, so there is no integrated, fast join with the recent 6 months in Redshift.
B) Incorrect: It uses Redshift for recent data but keeps an RDS read replica for history, which adds complexity and potential latency and is not a unified analytical model.
C) Incorrect: It uses only S3 with Athena, so the recent 6 months are not held in a fast-access data warehouse, which hinders performance for frequent queries.
A company leverages Amazon Athena for ad-hoc queries against data stored in Amazon S3. The company wants to implement additional controls to separate query execution and query history among users, teams, or applications running in the same AWS account to comply with internal security policies. Which solution meets these requirements?
Answer(s): B
Athena workgroups isolate query execution and history per use case, allowing per-group settings, quotas, and history separation while sharing the same account and catalog.
A) Incorrect: S3 bucket policies alone separate data access but do not isolate query execution history or control per-user query context in Athena.
B) Correct: Athena workgroups provide distinct execution contexts and query history per use case; tagging and IAM policies scoped to workgroups enforce separation.
C) Incorrect: IAM roles per use case don’t inherently partition Athena query history or isolate execution within a single account unless combined with dedicated workgroups, and they increase complexity.
D) Incorrect: Glue Data Catalog resource policies govern table-level access but do not offer per-use-case query execution isolation or history separation in Athena.
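As an illustration of an IAM policy scoped to a workgroup, the following sketch builds a policy document that allows query actions only on one workgroup ARN. The region, account ID, and workgroup name are placeholder assumptions, and a working policy would also need the relevant S3 and Glue Data Catalog permissions, which are omitted here.

```python
import json


def workgroup_policy(region, account_id, workgroup):
    """Return an IAM policy document limited to a single Athena workgroup."""
    arn = f"arn:aws:athena:{region}:{account_id}:workgroup/{workgroup}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:StopQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:ListQueryExecutions",
                    "athena:GetWorkGroup",
                ],
                # Only this workgroup; queries and history in other workgroups stay out of reach.
                "Resource": arn,
            }
        ],
    }


# Placeholder values for illustration only.
policy = workgroup_policy("us-east-1", "123456789012", "marketing-team")
print(json.dumps(policy, indent=2))
```

Attaching one such policy per team keeps each team's query execution and history inside its own workgroup.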
https://aws.amazon.com/athena/faqs/
A company wants to use an automatic machine learning (ML) Random Cut Forest (RCF) algorithm to visualize complex real-world scenarios, such as detecting seasonality and trends, excluding outliers, and imputing missing values. The team working on this project is non-technical and is looking for an out-of-the-box solution that will require the LEAST amount of management overhead. Which solution will meet these requirements?
QuickSight’s built-in ML-powered forecasting is an out-of-the-box, low-management solution suitable for non-technical users. It visualizes seasonality and trends, excludes outliers, and imputes missing values without extra infrastructure.
A) Incorrect: AWS Glue ML transforms require additional ETL and ML setup and management, and are not as turnkey for visualization and forecasting as QuickSight forecasting.
B) Correct: QuickSight forecasting is integrated, managed, and user-friendly, minimizing overhead while delivering time series insights.
C) Incorrect: Pre-built ML AMIs require provisioning and maintenance, and potentially more technical skill than native QuickSight forecasting.
D) Incorrect: Calculated fields are manual and limited; they do not provide an automated ML-based forecast with seasonality and trend handling.
https://aws.amazon.com/blogs/big-data/query-visualize-and-forecast-trufactor-web-session-intelligence-with-aws-data-exchange/
A retail company's data analytics team recently created multiple product sales analysis dashboards for the average selling price per product using Amazon QuickSight. The dashboards were created from .csv files uploaded to Amazon S3. The team is now planning to share the dashboards with the respective external product owners by creating individual users in Amazon QuickSight. For compliance and governance reasons, restricting access is a key requirement. The product owners should view only their respective product analysis in the dashboard reports. Which approach should the data analytics team take to allow product owners to view only their products in the dashboard?
Use row-level security at the dataset level to filter data per product for each user.
A) Separate the data by product and use S3 bucket policies for authorization. Incorrect: S3 bucket policies control access to storage, not per-user data filtering within QuickSight dashboards.
B) Separate the data by product and use IAM policies for authorization. Incorrect: IAM policies grant access to AWS resources, not fine-grained row-level data visibility inside QuickSight dashboards.
C) Create a manifest file with row-level security. Incorrect: Manifest files describe the source files to ingest, not per-user data access within QuickSight analyses.
D) Create dataset rules with row-level security. Correct: Dataset-level row-level security (RLS) restricts data rows based on user identity or attributes, allowing product owners to see only their products in dashboards.
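A dataset rules file for row-level security is a small table that maps each user to the field values that user may see. The sketch below generates such a file as CSV; the `product` column name and the user names are hypothetical and would have to match the actual dataset field and QuickSight user names.

```python
import csv
import io


def build_rls_rules(owner_to_products, field_name="product"):
    """Return CSV text with one rule row per user: UserName plus permitted field values."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["UserName", field_name])
    for user in sorted(owner_to_products):
        # Several permitted values go into one cell, separated by commas;
        # the csv module quotes that cell automatically.
        writer.writerow([user, ",".join(owner_to_products[user])])
    return buffer.getvalue()


# Hypothetical product owners and their products.
rules_csv = build_rls_rules({"alice": ["widgets"], "bob": ["gadgets", "gizmos"]})
print(rules_csv)
```

The resulting file would be loaded as its own dataset and attached to the sales dataset as its row-level security rules.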
A company has developed an Apache Hive script to batch process data stored in Amazon S3. The script needs to run once every day and store the output in Amazon S3. The company tested the script, and it completes within 30 minutes on a small local three-node cluster. Which solution is the MOST cost-effective for scheduling and executing the script?
Answer(s): A
A) Correct: The most cost-effective approach is to use a Lambda function to spin up a transient EMR cluster for the Hive job and terminate it after completion. This minimizes idle cost because the cluster runs only on demand, aligns with a daily batch window, and lets CloudWatch Events provide simple scheduling without always-on infrastructure.
B) Incorrect: Using Hue, Oozie, and termination protection with Spot Instances adds management overhead and potential reliability concerns; it’s more costly and complex for a daily, self-contained Hive batch. Oozie workflows on EMR are heavier to maintain than a serverless scheduling pattern.
C) Incorrect: Glue supports ETL but is not optimized for pure Hive scripts running on S3 data; it may introduce unnecessary data catalog and job orchestration overhead for a simple Hive batch.
D) Incorrect: Packaging a Hive runtime in Lambda layers is not a typical or reliable deployment model for Hive workloads, and Step Functions adds orchestration cost and complexity without clear cost benefits for a once-daily batch.
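A minimal sketch of the Lambda pattern in option A follows, assuming the scheduled event carries the S3 locations of the script and the logs. The cluster name, release label, instance types, event keys, and the default EMR role names are assumptions for illustration; the key setting is `KeepJobFlowAliveWhenNoSteps=False`, which makes the cluster terminate once the Hive step finishes.

```python
def transient_hive_cluster_request(script_s3_uri, log_s3_uri):
    """Build run_job_flow arguments for a cluster that runs one Hive step, then terminates."""
    return {
        "Name": "daily-hive-batch",        # hypothetical cluster name
        "ReleaseLabel": "emr-6.15.0",      # assumed release; pick one that suits the script
        "Applications": [{"Name": "Hive"}],
        "LogUri": log_s3_uri,
        "Instances": {
            "InstanceGroups": [
                {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            "KeepJobFlowAliveWhenNoSteps": False,  # transient: shut down after the last step
            "TerminationProtected": False,
        },
        "Steps": [
            {
                "Name": "run-hive-script",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["hive-script", "--run-hive-script", "--args", "-f", script_s3_uri],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumes the default EMR roles exist
        "ServiceRole": "EMR_DefaultRole",
    }


def handler(event, context, emr_client=None):
    """Lambda entry point; emr_client can be injected so the logic is testable offline."""
    if emr_client is None:
        import boto3  # available in the Lambda Python runtime

        emr_client = boto3.client("emr")
    request = transient_hive_cluster_request(event["script"], event["logs"])
    return emr_client.run_job_flow(**request)["JobFlowId"]
```

A daily CloudWatch Events schedule would invoke `handler` with a constant JSON input such as `{"script": "s3://example-bucket/job.hql", "logs": "s3://example-bucket/logs/"}`, where the bucket name is a placeholder.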
A company wants to improve the data load time of a sales data dashboard. Data has been collected as .csv files and stored within an Amazon S3 bucket that is partitioned by date. The data is then loaded to an Amazon Redshift data warehouse for frequent analysis. The data volume is up to 500 GB per day. Which solution will improve the data loading performance?
B is correct because a COPY command parallelizes ingestion from S3, efficiently loading split CSV files into Redshift with proper formatting and optional compression, which greatly improves throughput for large daily volumes. A is incorrect because INSERT statements load row by row and are far slower for bulk loads. C is incorrect because Kinesis Data Firehose is designed for streaming data ingestion and is not optimized for bulk loads from S3 into Redshift. D is incorrect because loading unsorted data and running VACUUM after the load adds unnecessary overhead; loading data in sort key order reduces the need to vacuum.
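For reference, a COPY statement of the kind described in option B could be assembled as below. The table name, S3 prefix, and IAM role ARN are placeholders, and the sketch assumes the daily files under the date prefix are gzip-compressed CSV that has been split into multiple files so the load can run in parallel.

```python
def build_copy_statement(table, s3_prefix, iam_role_arn, gzip=True):
    """Return a Redshift COPY statement that loads every file under an S3 prefix."""
    parts = [
        f"COPY {table}",
        f"FROM '{s3_prefix}'",          # a prefix matches all split files for that date
        f"IAM_ROLE '{iam_role_arn}'",
        "FORMAT AS CSV",
    ]
    if gzip:
        parts.append("GZIP")            # only when the files are gzip-compressed
    return "\n".join(parts) + ";"


# Placeholder names for illustration only.
statement = build_copy_statement(
    "sales",
    "s3://example-bucket/sales/2024-01-01/",
    "arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(statement)
```

Pointing COPY at the date prefix rather than at individual files is what lets Redshift spread the files across its slices.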
https://aws.amazon.com/blogs/big-data/using-amazon-redshift-spectrum-amazon-athena-and-aws-glue-with-node-js-in-production/
A company has a data warehouse in Amazon Redshift that is approximately 500 TB in size. New data is imported every few hours and read-only queries are run throughout the day and evening. There is a particularly heavy load with no writes for several hours each morning on business days. During those hours, some queries are queued and take a long time to execute. The company needs to optimize query execution and avoid any downtime. What is the MOST cost-effective solution?
Concurrency scaling in Redshift automatically adds transient capacity to handle bursts of read queries without downtime. It reduces queue wait times while incurring charges only for usage during those bursts, which makes it cost-effective for sporadic heavy workloads.
A) Correct: Concurrency scaling spins up additional cluster capacity to serve high-demand read queries with no downtime and pay-per-use pricing.
B) Incorrect: Scaling by adding nodes with ALL distribution is expensive and can cause data movement; it is not ideal for intermittent bursts and increases baseline cost.
C) Incorrect: Elastic resize adds permanent capacity and involves data redistribution; it is not ideal for short-lived peak periods and can cause downtime or performance variability.
D) Incorrect: Snapshot, restore, and resize is disruptive, slower, and more complex, and is not suitable for rapid, transient workload bursts.