Amazon AWS Certified Machine Learning - Specialty Exam
Amazon AWS Certified Machine Learning - Specialty (MLS-C01)
Updated on: 09-Feb-2026

A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a Machine Learning Specialist would like to build a binary classifier based on two features: age of account and transaction month. The class distribution for these features is illustrated in the figure provided.


Based on this information, which model would have the HIGHEST recall with respect to the fraudulent class?

  1. Decision tree
  2. Linear support vector machine (SVM)
  3. Naive Bayesian classifier
  4. Single Perceptron with sigmoidal activation function

Answer(s): C
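For reference, recall with respect to the fraudulent class is the share of truly fraudulent cases that the model flags, TP / (TP + FN). The sketch below is a minimal illustration; the function name and the toy labels are assumptions made up for this example and are not taken from the figure in the question.

```python
def recall(y_true, y_pred, positive="fraud"):
    """Recall = TP / (TP + FN) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive examples")
    return tp / (tp + fn)


# Toy labels, invented for illustration only.
y_true = ["fraud", "fraud", "normal", "fraud", "normal"]
y_pred = ["fraud", "normal", "normal", "fraud", "fraud"]

fraud_recall = recall(y_true, y_pred)  # 2 of the 3 fraud cases are caught
```

Note that false positives (normal behavior flagged as fraud) do not lower recall; they affect precision instead.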



A Machine Learning Specialist kicks off a hyperparameter tuning job for a tree-based ensemble model using Amazon SageMaker with Area Under the ROC Curve (AUC) as the objective metric. This workflow will eventually be deployed in a pipeline that retrains and tunes hyperparameters each night to model click-through on data that goes stale every 24 hours.

With the goal of decreasing the amount of time it takes to train these models, and ultimately to decrease costs, the Specialist wants to reconfigure the input hyperparameter range(s).

Which visualization will accomplish this?

  1. A histogram showing whether the most important input feature is Gaussian.
  2. A scatter plot with points colored by target variable that uses t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize the large number of input variables in an easier-to-read dimension.
  3. A scatter plot showing the performance of the objective metric over each training iteration.
  4. A scatter plot showing the correlation between maximum tree depth and the objective metric.

Answer(s): D
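The relationship described in option 4, maximum tree depth against the objective metric, can also be examined numerically. The sketch below is one possible approach under stated assumptions: the (max_depth, AUC) pairs are hypothetical tuning results invented for illustration, and the tolerance rule for picking a narrower range is a simple heuristic, not a SageMaker feature.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Hypothetical (max_depth, AUC) results from earlier tuning trials.
trials = [(2, 0.71), (4, 0.78), (6, 0.81), (8, 0.815), (10, 0.816), (12, 0.814)]
depths = [d for d, _ in trials]
aucs = [a for _, a in trials]

r = pearson(depths, aucs)

# Heuristic: the shallowest depth whose AUC is within `tolerance` of the best.
best_auc = max(aucs)
tolerance = 0.005
narrowed_max_depth = min(d for d, a in trials if a >= best_auc - tolerance)
```

If deeper trees stop improving AUC past some depth, as in this made-up data, capping the range near that depth would shorten training without giving up much of the objective.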



A Machine Learning Specialist is creating a new natural language processing application that processes a dataset comprised of 1 million sentences. The aim is to then run Word2Vec to generate embeddings of the sentences and enable different types of predictions.

Here is an example from the dataset:

"The quck BROWN FOX jumps over the lazy dog."

Which of the following are the operations the Specialist needs to perform to correctly sanitize and prepare the data in a repeatable manner? (Choose three.)

  1. Perform part-of-speech tagging and keep the action verb and the nouns only.
  2. Normalize all words by making the sentence lowercase.
  3. Remove stop words using an English stopword dictionary.
  4. Correct the typography on "quck" to "quick."
  5. One-hot encode all words in the sentence.
  6. Tokenize the sentence into words.

Answer(s): B,C,F
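The three operations in the answer (lowercasing, tokenizing, and stop-word removal) can be sketched as a single repeatable function. This is a minimal illustration using only the standard library; the function name and the tiny stop-word set are assumptions, and a real pipeline would likely use a fuller English stop-word list.

```python
import re

# Small illustrative stop-word set (an assumption, not a complete dictionary).
STOP_WORDS = {"the", "a", "an", "over", "of", "and", "to", "in"}


def preprocess(sentence):
    """Lowercase, tokenize into words, then drop stop words."""
    lowered = sentence.lower()
    tokens = re.findall(r"[a-z0-9']+", lowered)  # also strips punctuation
    return [t for t in tokens if t not in STOP_WORDS]


tokens = preprocess("The quck BROWN FOX jumps over the lazy dog.")
# tokens == ['quck', 'brown', 'fox', 'jumps', 'lazy', 'dog']
```

The misspelling "quck" is left untouched on purpose: correcting individual typos by hand is not repeatable across 1 million sentences.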



A company is using Amazon Polly to translate plaintext documents to speech for automated company announcements. However, company acronyms are being mispronounced in the current documents.
How should a Machine Learning Specialist address this issue for future documents?

  1. Convert current documents to SSML with pronunciation tags.
  2. Create an appropriate pronunciation lexicon.
  3. Output speech marks to guide in pronunciation.
  4. Use Amazon Lex to preprocess the text files for pronunciation.

Answer(s): A


Reference:

https://docs.aws.amazon.com/polly/latest/dg/ssml.html
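To make options 1 and 2 concrete, the sketch below builds both an SSML snippet that uses the `<sub alias="...">` tag and a pronunciation lexicon in the W3C PLS format, then checks that each is well-formed XML. The acronym "QBR", its expansion, and the sample sentence are hypothetical. No call to Amazon Polly is made here; in practice the SSML would be sent with `TextType="ssml"`, and a lexicon would be uploaded once and then referenced by name in later synthesis requests.

```python
import xml.etree.ElementTree as ET

ACRONYM = "QBR"                          # hypothetical company acronym
EXPANSION = "quarterly business review"  # how it should be spoken

# Option 1: per-document SSML markup around each occurrence.
ssml = f'<speak>The <sub alias="{EXPANSION}">{ACRONYM}</sub> is on Friday.</speak>'

# Option 2: a reusable pronunciation lexicon (PLS format).
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
lexicon = f"""<lexicon version="1.0" xmlns="{PLS_NS}"
         alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>{ACRONYM}</grapheme>
    <alias>{EXPANSION}</alias>
  </lexeme>
</lexicon>"""

# Parse both to confirm they are well-formed XML.
ssml_root = ET.fromstring(ssml)
lexicon_root = ET.fromstring(lexicon)
grapheme = lexicon_root.find(f".//{{{PLS_NS}}}grapheme").text
alias = lexicon_root.find(f".//{{{PLS_NS}}}alias").text
```

The practical difference is scope: SSML tags must be added to every document, while a lexicon is defined once and applies to any plain-text document synthesized with it.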



An insurance company is developing a new device for vehicles that uses a camera to observe drivers’ behavior and alert them when they appear distracted. The company created approximately 10,000 training images in a controlled environment that a Machine Learning Specialist will use to train and evaluate machine learning models.

During the model evaluation, the Specialist notices that the training error rate diminishes faster as the number of epochs increases and the model is not accurately inferring on the unseen test images.

Which of the following should be used to resolve this issue? (Choose two.)

  1. Add vanishing gradient to the model.
  2. Perform data augmentation on the training data.
  3. Make the neural network architecture complex.
  4. Use gradient checking in the model.
  5. Add L2 regularization to the model.

Answer(s): B,E
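Both remedies in the answer target overfitting: augmentation enlarges the effective training set, and L2 regularization penalizes large weights. The following is a minimal pure-Python sketch under stated assumptions: images are represented as nested lists of pixel values, a horizontal flip stands in for augmentation generally, and the weight values and `lam` coefficient are arbitrary illustrative numbers.

```python
def horizontal_flip(image):
    """Mirror a 2-D image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def l2_penalty(weights, lam):
    """L2 regularization term: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam=0.01):
    """Total loss = data loss + L2 penalty on the weights."""
    return data_loss + l2_penalty(weights, lam)


# Tiny made-up image; the augmented set holds the original plus its mirror.
image = [[1, 2, 3],
         [4, 5, 6]]
augmented = [image, horizontal_flip(image)]

# Arbitrary weights and data loss, for illustration only.
loss = regularized_loss(0.5, [1.0, -2.0, 0.5], lam=0.1)
```

Whether a given transform is appropriate depends on the task; for driver images, a flip is only useful if it does not change what counts as distracted behavior.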



When submitting Amazon SageMaker training jobs using one of the built-in algorithms, which common parameters MUST be specified? (Choose three.)

  1. The training channel identifying the location of training data on an Amazon S3 bucket.
  2. The validation channel identifying the location of validation data on an Amazon S3 bucket.
  3. The IAM role that Amazon SageMaker can assume to perform tasks on behalf of the users.
  4. Hyperparameters in a JSON array as documented for the algorithm used.
  5. The Amazon EC2 instance class specifying whether training will be run using CPU or GPU.
  6. The output path specifying where on an Amazon S3 bucket the trained model will persist.

Answer(s): A,E,F
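To show where each option in this question would appear in practice, the sketch below assembles a request dictionary using field names from the SageMaker `CreateTrainingJob` API. It only builds the dictionary and does not call AWS. The helper name and every value (job name, image URI, role ARN, bucket paths, instance type) are hypothetical placeholders.

```python
def build_training_request(job_name, image_uri, role_arn, train_s3_uri,
                           output_s3_uri, instance_type, hyperparameters=None):
    """Assemble a CreateTrainingJob-style request without sending it."""
    request = {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,                                  # option 3
        "InputDataConfig": [{                                 # option 1
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},  # option 6
        "ResourceConfig": {                                   # option 5
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }
    if hyperparameters:                                       # option 4
        # The API takes hyperparameters as a string-to-string map.
        request["HyperParameters"] = {k: str(v) for k, v in hyperparameters.items()}
    return request


# All values below are placeholders for illustration.
request = build_training_request(
    job_name="example-job",
    image_uri="<algorithm-image-uri>",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
    instance_type="ml.m5.xlarge",
    hyperparameters={"num_round": 100},
)
```

A validation channel (option 2) would be a second entry in `InputDataConfig` with its own `ChannelName`.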



A monitoring service generates 1 TB of scale metrics record data every minute. A Research team performs queries on this data using Amazon Athena. The queries run slowly due to the large volume of data, and the team requires better performance.

How should the records be stored in Amazon S3 to improve query performance?

  1. CSV files
  2. Parquet files
  3. Compressed JSON
  4. RecordIO

Answer(s): B



A Machine Learning Specialist is working with a media company to perform classification on popular articles from the company's website. The company is using random forests to classify how popular an article will be before it is published. A sample of the data being used is below.


Given the dataset, the Specialist wants to convert the Day_Of_Week column to binary values.
What technique should be used to convert this column to binary values?

  1. Binarization
  2. One-hot encoding
  3. Tokenization
  4. Normalization transformation

Answer(s): B
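One-hot encoding replaces the single categorical column with one binary column per category, exactly one of which is 1 in each row. The sketch below is a minimal standard-library illustration; it assumes Day_Of_Week holds full English day names, which the question's data sample may or may not match.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def one_hot_day(day):
    """Return a 7-element binary vector with a single 1 at the day's position."""
    if day not in DAYS:
        raise ValueError(f"unknown day: {day!r}")
    return [1 if d == day else 0 for d in DAYS]


encoded = one_hot_day("Wednesday")
# encoded == [0, 0, 1, 0, 0, 0, 0]
```

Unlike mapping days to the integers 1 through 7, this representation does not imply an ordering or distance between days.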





