AWS Certified Machine Learning Engineer - Associate MLA-C01 Exam Questions and Answers (Page: 4)

QUESTION: 25

A company uses Amazon Athena to query a dataset in Amazon S3. The dataset has a target variable that the company wants to predict.

The company needs to use the dataset in a solution to determine if a model can predict the target variable.

Which solution will provide this information with the LEAST development effort?

Create a new model by using Amazon SageMaker Autopilot. Report the model's achieved performance.
Implement custom scripts to perform data pre-processing, multiple linear regression, and performance evaluation. Run the scripts on Amazon EC2 instances.
Configure Amazon Macie to analyze the dataset and to create a model. Report the model's achieved performance.
Select a model from Amazon Bedrock. Tune the model with the data. Report the model's achieved performance.

Answer(s): A

Explanation:

Amazon SageMaker Autopilot automates the process of building, training, and tuning machine learning models. It provides insights into whether the target variable can be effectively predicted by evaluating the model's performance metrics. This solution requires minimal development effort as SageMaker Autopilot handles data preprocessing, algorithm selection, and hyperparameter optimization automatically, making it the most efficient choice for this scenario.
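As a rough sketch of how little code this option needs, the snippet below builds the request for the SageMaker `CreateAutoMLJob` API through boto3. The bucket names, job name, target column, and role ARN are placeholders assumed for illustration; they are not from the question. The submit function is defined but not called, because it needs AWS credentials.

```python
def build_autopilot_request(job_name, input_s3_uri, target_column, output_s3_uri, role_arn):
    """Build keyword arguments for the SageMaker CreateAutoMLJob API."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [
            {
                "DataSource": {
                    "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
                },
                # Column that Autopilot should learn to predict.
                "TargetAttributeName": target_column,
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "RoleArn": role_arn,
    }


def start_autopilot_job(request):
    """Submit the job; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the request can be built without AWS access

    client = boto3.client("sagemaker")
    return client.create_auto_ml_job(**request)


# Placeholder values (assumptions); replace them before calling start_autopilot_job.
request = build_autopilot_request(
    job_name="target-feasibility-check",
    input_s3_uri="s3://example-bucket/dataset/",
    target_column="target",
    output_s3_uri="s3://example-bucket/autopilot-output/",
    role_arn="arn:aws:iam::111122223333:role/ExampleSageMakerRole",
)
print(request["InputDataConfig"][0]["TargetAttributeName"])
```

Once the job finishes, the objective metric of the best candidate (available from `describe_auto_ml_job`) is the performance figure the company would report.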

QUESTION: 26

A company wants to predict the success of advertising campaigns by considering the color scheme of each advertisement. An ML engineer is preparing data for a neural network model. The dataset includes color information as categorical data.

Which technique for feature engineering should the ML engineer use for the model?

Apply label encoding to the color categories. Automatically assign each color a unique integer.
Implement padding to ensure that all color feature vectors have the same length.
Perform dimensionality reduction on the color categories.
One-hot encode the color categories to transform the color scheme feature into a binary matrix.

Answer(s): D

Explanation:

One-hot encoding is the appropriate technique for transforming categorical data, such as color information, into a format suitable for input to a neural network. This technique creates a binary vector representation where each unique category (color) is represented as a separate binary column, ensuring that the model does not infer ordinal relationships between categories. This approach preserves the categorical nature of the data and avoids introducing unintended biases.
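The difference between label encoding and one-hot encoding can be sketched in plain Python. The color values and the function names `label_encode` and `one_hot_encode` are made up for illustration; a real pipeline would more likely use a library encoder.

```python
def label_encode(values):
    """Map each distinct category to an integer (this implies an order)."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    return [index[v] for v in values], categories


def one_hot_encode(values):
    """Map each value to a binary row with a single 1 in its category's column."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return rows, categories


# Hypothetical color column for four advertisements.
colors = ["red", "blue", "green", "red"]

labels, label_categories = label_encode(colors)
matrix, columns = one_hot_encode(colors)

print(labels)   # integers suggest blue < green < red, which is meaningless
print(columns)  # column order of the binary matrix
for color, row in zip(colors, matrix):
    print(color, row)
```

With label encoding, the network sees "red" as numerically larger than "blue". With one-hot encoding, every color gets its own column and no order is implied.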

QUESTION: 27

A company uses a hybrid cloud environment. A model that is deployed on premises uses data in Amazon S3 to provide customers with a live conversational engine.

The model is using sensitive data. An ML engineer needs to implement a solution to identify and remove the sensitive data.

Which solution will meet these requirements with the LEAST operational overhead?

Deploy the model on Amazon SageMaker AI. Create a set of AWS Lambda functions to identify and remove the sensitive data.
Deploy the model on an Amazon Elastic Container Service (Amazon ECS) cluster that uses AWS Fargate. Create an AWS Batch job to identify and remove the sensitive data.
Use Amazon Macie to identify the sensitive data. Create a set of AWS Lambda functions to remove the sensitive data.
Use Amazon Comprehend to identify the sensitive data. Launch Amazon EC2 instances to remove the sensitive data.

Answer(s): C

Explanation:

Amazon Macie is a fully managed data security and privacy service that uses machine learning and pattern matching to discover and classify sensitive data in Amazon S3. Because Macie is purpose-built for this task, identifying the sensitive data requires no custom detection code. AWS Lambda functions can then remove or redact the objects that Macie flags, with no servers to manage. This combination meets the requirement with the least operational overhead. The other options require either custom detection logic or compute infrastructure (SageMaker endpoints, ECS clusters, or EC2 instances) that the team must operate.
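A minimal sketch of the Lambda side, assuming the Macie finding reaches the function as an Amazon EventBridge event whose `detail.resourcesAffected` section names the bucket and object. The bucket name, object key, and the returned dictionary are hypothetical, and the actual removal step is deliberately left out.

```python
def affected_object(event):
    """Return (bucket, key) for the S3 object named in a Macie finding event."""
    resources = event["detail"]["resourcesAffected"]
    return resources["s3Bucket"]["name"], resources["s3Object"]["key"]


def lambda_handler(event, context):
    bucket, key = affected_object(event)
    # The removal step (delete, quarantine, or redact and rewrite the object)
    # is not shown; it depends on the company's policy.
    return {"bucket": bucket, "key": key, "action": "pending-removal"}


# Trimmed, hypothetical event for local testing.
sample_event = {
    "source": "aws.macie",
    "detail-type": "Macie Finding",
    "detail": {
        "type": "SensitiveData:S3Object/Personal",
        "resourcesAffected": {
            "s3Bucket": {"name": "example-conversation-data"},
            "s3Object": {"key": "transcripts/2024/chat-001.json"},
        },
    },
}
print(lambda_handler(sample_event, None))
```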

QUESTION: 28

An ML engineer needs to create data ingestion pipelines and ML model deployment pipelines on AWS. All the raw data is stored in Amazon S3 buckets.

Which solution will meet these requirements?

Use Amazon Data Firehose to create the data ingestion pipelines. Use Amazon SageMaker Studio Classic to create the model deployment pipelines.
Use AWS Glue to create the data ingestion pipelines. Use Amazon SageMaker Studio Classic to create the model deployment pipelines.
Use Amazon Redshift ML to create the data ingestion pipelines. Use Amazon SageMaker Studio Classic to create the model deployment pipelines.
Use Amazon Athena to create the data ingestion pipelines. Use an Amazon SageMaker notebook to create the model deployment pipelines.

Answer(s): B

Explanation:

AWS Glue is a serverless data integration service that is well-suited for creating data ingestion pipelines, especially when raw data is stored in Amazon S3. It can clean, transform, and catalog data, making it accessible for downstream ML tasks.
Amazon SageMaker Studio Classic provides a comprehensive environment for building, training, and deploying ML models. It includes built-in tools and capabilities to create efficient model deployment pipelines with minimal setup.
This combination ensures seamless integration of data ingestion and ML model deployment with minimal operational overhead.

QUESTION: 29

A company that has hundreds of data scientists is using Amazon SageMaker to create ML models. The models are in model groups in the SageMaker Model Registry.

The data scientists are grouped into three categories: computer vision, natural language processing (NLP), and speech recognition. An ML engineer needs to implement a solution to organize the existing models into these groups to improve model discoverability at scale. The solution must not affect the integrity of the model artifacts and their existing groupings.

Which solution will meet these requirements?

Create a custom tag for each of the three categories. Add the tags to the model packages in the SageMaker Model Registry.
Create a model group for each category. Move the existing models into these category model groups.
Use SageMaker ML Lineage Tracking to automatically identify and tag which model groups should contain the models.
Create a Model Registry collection for each of the three categories. Move the existing model groups into the collections.

Answer(s): D

QUESTION: 30

A company runs an Amazon SageMaker domain in a public subnet of a newly created VPC. The network is configured properly, and ML engineers can access the SageMaker domain.

Recently, the company discovered suspicious traffic to the domain from a specific IP address. The company needs to block traffic from the specific IP address.

Which update to the network configuration will meet this requirement?

Create a security group inbound rule to deny traffic from the specific IP address. Assign the security group to the domain.
Create a network ACL inbound rule to deny traffic from the specific IP address. Assign the rule to the default network ACL for the subnet where the domain is located.
Create a shadow variant for the domain. Configure SageMaker Inference Recommender to send traffic from the specific IP address to the shadow endpoint.
Create a VPC route table to deny inbound traffic from the specific IP address. Assign the route table to the domain.

Answer(s): B

Explanation:

Network ACLs (access control lists) operate at the subnet level and support rules that explicitly deny traffic from specific IP addresses. By creating an inbound deny rule for the suspicious IP address in the network ACL, the company blocks that address from reaching the Amazon SageMaker domain. Network ACLs are evaluated before traffic reaches the security groups, so the traffic is dropped at the subnet boundary. Security groups support only allow rules, so they cannot deny a specific address, and route tables direct traffic rather than filter it.
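Network ACL rules are evaluated in ascending rule-number order, and the first match decides the outcome. The sketch below models only that ordering. The rule numbers and IP addresses are illustrative (documentation ranges), and protocol and port matching are omitted.

```python
import ipaddress

# Each rule: (rule_number, source CIDR, action). Values are illustrative.
INBOUND_RULES = [
    (90, "203.0.113.25/32", "deny"),   # the suspicious address
    (100, "0.0.0.0/0", "allow"),       # allow everything else
]


def evaluate_inbound(rules, source_ip):
    """Return the action of the lowest-numbered rule that matches source_ip."""
    address = ipaddress.ip_address(source_ip)
    for _number, cidr, action in sorted(rules):
        if address in ipaddress.ip_network(cidr):
            return action
    # A network ACL ends with an implicit deny when no numbered rule matches.
    return "deny"


print(evaluate_inbound(INBOUND_RULES, "203.0.113.25"))
print(evaluate_inbound(INBOUND_RULES, "198.51.100.7"))
```

The deny rule must have a lower number than the broad allow rule. If it were numbered 110, the allow rule at 100 would match first and the suspicious address would still get through.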

QUESTION: 31

A company is gathering audio, video, and text data in various languages. The company needs to use a large language model (LLM) to summarize the gathered data that is in Spanish.

Which solution will meet these requirements in the LEAST amount of time?

Train and deploy a model in Amazon SageMaker to convert the data into English text. Train and deploy an LLM in SageMaker to summarize the text.
Use Amazon Transcribe and Amazon Translate to convert the data into English text. Use Amazon Bedrock with the Jurassic model to summarize the text.
Use Amazon Rekognition and Amazon Translate to convert the data into English text. Use Amazon Bedrock with the Anthropic Claude model to summarize the text.
Use Amazon Comprehend and Amazon Translate to convert the data into English text. Use Amazon Bedrock with the Stable Diffusion model to summarize the text.

Answer(s): B

Explanation:

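Amazon Transcribe converts speech in audio files, and in the audio track of video files, into text, and it supports Spanish.
Amazon Translate translates the Spanish text into English.
Amazon Bedrock provides managed access to foundation models, including the Jurassic LLM, which can summarize the text without any model training. Because every step uses a managed service, this is the fastest option to implement. The other options either require training custom models in SageMaker or pair the task with the wrong service: Amazon Rekognition analyzes images and video rather than transcribing speech, Amazon Comprehend analyzes existing text rather than producing text from audio, and Stable Diffusion generates images rather than summaries.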

QUESTION: 32

A financial company receives a high volume of real-time market data streams from an external provider. The streams consist of thousands of JSON records every second.

The company needs to implement a scalable solution on AWS to identify anomalous data points.

Which solution will meet these requirements with the LEAST operational overhead?

Ingest real-time data into Amazon Kinesis data streams. Use the built-in RANDOM_CUT_FOREST function in Amazon Managed Service for Apache Flink to process the data streams and to detect data anomalies.
Ingest real-time data into Amazon Kinesis data streams. Deploy an Amazon SageMaker endpoint for real-time outlier detection. Create an AWS Lambda function to detect anomalies. Use the data streams to invoke the Lambda function.
Ingest real-time data into Apache Kafka on Amazon EC2 instances. Deploy an Amazon SageMaker endpoint for real-time outlier detection. Create an AWS Lambda function to detect anomalies. Use the data streams to invoke the Lambda function.
Send real-time data to an Amazon Simple Queue Service (Amazon SQS) FIFO queue. Create an AWS Lambda function to consume the queue messages. Program the Lambda function to start an AWS Glue extract, transform, and load (ETL) job for batch processing and anomaly detection.

Answer(s): A

Explanation:

This solution is the most efficient and involves the least operational overhead:
Amazon Kinesis data streams efficiently handle real-time ingestion of high-volume streaming data.
Amazon Managed Service for Apache Flink provides a fully managed environment for stream processing with built-in support for RANDOM_CUT_FOREST, an algorithm designed for anomaly detection in real-time streaming data.
This approach eliminates the need for deploying and managing additional infrastructure like SageMaker endpoints, Lambda functions, or external tools, making it the most scalable and operationally simple solution.
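To illustrate what flagging anomalous points in a stream of JSON records looks like, the sketch below uses a rolling z-score over a sliding window. This is a deliberately simple stand-in, not the Random Cut Forest algorithm and not the managed RANDOM_CUT_FOREST function. The record layout (`symbol`, `price`), window size, and threshold are assumptions for illustration.

```python
import json
import math
from collections import deque


class RollingZScoreDetector:
    """Flag values far from the mean of a sliding window (not Random Cut Forest)."""

    def __init__(self, window_size=50, threshold=3.0, min_samples=10):
        self.window = deque(maxlen=window_size)
        self.threshold = threshold
        self.min_samples = min_samples

    def score(self, value):
        """Return the z-score of value against the current window, then add it."""
        z = 0.0
        if len(self.window) >= self.min_samples:
            mean = sum(self.window) / len(self.window)
            variance = sum((x - mean) ** 2 for x in self.window) / len(self.window)
            std = math.sqrt(variance)
            if std > 0:
                z = abs(value - mean) / std
        self.window.append(value)
        return z

    def is_anomaly(self, value):
        return self.score(value) > self.threshold


def detect(json_lines, field="price"):
    """Parse each JSON record and collect those whose field value is anomalous."""
    detector = RollingZScoreDetector()
    anomalies = []
    for line in json_lines:
        record = json.loads(line)
        if detector.is_anomaly(record[field]):
            anomalies.append(record)
    return anomalies


# Hypothetical feed: prices near 100 with one spike at position 20.
prices = [100 + (i % 5) * 0.1 for i in range(30)]
prices[20] = 150.0
stream = [json.dumps({"symbol": "XYZ", "price": p}) for p in prices]
print(detect(stream))
```

In the managed solution, this per-record scoring runs inside the stream processor, so there is no detector code or endpoint for the team to deploy and operate.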

Amazon AWS Certified Machine Learning Engineer - Associate MLA-C01 Exam (page: 4). Updated on: 15-Feb-2026
