Microsoft DP-100 Exam (page: 18)
Microsoft Designing and Implementing a Data Science Solution on Azure
Updated on: 15-Feb-2026

Viewing Page 18 of 102

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You are a data scientist using Azure Machine Learning Studio.
You need to normalize values and group them into bins, producing an output column that is used to predict a target column.
Solution: Apply an Equal Width with Custom Start and Stop binning mode.
Does the solution meet the goal?

  A. Yes
  B. No

Answer(s): B

Explanation:

Use the Entropy MDL binning mode which has a target column.
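To see why the presence of a target column matters, the following minimal sketch contrasts a supervised cut point with equal-width edges. It implements only the entropy-minimizing split step that entropy-based binning builds on, not the MDL stopping criterion and not the Studio module itself; the function names and sample data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_entropy_cut(values, targets):
    """Return the cut point that minimizes the weighted class entropy of the two bins."""
    pairs = sorted(zip(values, targets))
    best_cut, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary can be placed between identical values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_cut


def equal_width_edges(start, stop, n_bins):
    """Equal-width bin edges between a custom start and stop; the target is never consulted."""
    width = (stop - start) / n_bins
    return [start + k * width for k in range(n_bins + 1)]


# Hypothetical sample data: the class changes between 3 and 10.
values = [1, 2, 3, 10, 11, 12]
targets = [0, 0, 0, 1, 1, 1]

supervised_cut = best_entropy_cut(values, targets)   # placed where the classes separate
unsupervised_edges = equal_width_edges(0, 12, 3)     # placed by range alone
print(supervised_cut, unsupervised_edges)
```

The supervised cut lands between the two classes, while the equal-width edges depend only on the chosen start, stop, and bin count.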


Reference:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/group-data-into-bins



Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You are a data scientist using Azure Machine Learning Studio.
You need to normalize values and group them into bins, producing an output column that is used to predict a target column.
Solution: Apply a Quantiles binning mode with a PQuantile normalization.
Does the solution meet the goal?

  A. Yes
  B. No

Answer(s): B

Explanation:

Use the Entropy MDL binning mode which has a target column.
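For comparison, quantile binning places boundaries using only the distribution of the column being binned. The sketch below is a rough illustration in plain Python, not the Studio module: the helper names are hypothetical, ties are not handled specially, and the rescaling of bin indices into (0, 1] is an assumed stand-in for PQuantile-style normalization.

```python
def quantile_bins(values, n_bins):
    """Assign each value a 0-based bin index so that bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def to_unit_range(bin_indices, n_bins):
    """Rescale 0-based bin indices into (0, 1]; assumed PQuantile-style scaling."""
    return [(b + 1) / n_bins for b in bin_indices]


# Hypothetical sample column; note that no target column is passed anywhere.
values = [5, 1, 9, 3, 7, 11]
bins = quantile_bins(values, 3)
scaled = to_unit_range(bins, 3)
print(bins, scaled)
```

Because the target never enters the computation, the bin boundaries cannot adapt to how the target varies across the column.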


Reference:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/group-data-into-bins



HOTSPOT (Drag and Drop is not supported)
You are evaluating a Python NumPy array that contains six data points defined as follows: data = [10, 20, 30, 40, 50, 60]
You must generate the following output by using the k-fold algorithm implementation in the Python Scikit-learn machine learning library:
train: [10 40 50 60], test: [20 30]
train: [20 30 40 60], test: [10 50]
train: [10 20 30 50], test: [40 60]
You need to implement a cross-validation to generate the output.
How should you complete the code segment? To answer, select the appropriate code segment in the dialog box in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

  A. See Explanation section for answer.

Answer(s): A

Explanation:


Box 1: k-fold
Box 2: 3
K-Folds cross-validator provides train/test indices to split data in train/test sets. Split dataset into k consecutive folds (without shuffling by default).
The parameter n_splits (int, default=5 since scikit-learn 0.22; 3 in earlier versions) is the number of folds. Must be at least 2.
Box 3: data
Example:
>>> import numpy as np
>>> from sklearn.model_selection import KFold
>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([1, 2, 3, 4])
>>> kf = KFold(n_splits=2)
>>> kf.get_n_splits(X)
2
>>> print(kf)
KFold(n_splits=2, random_state=None, shuffle=False)
>>> for train_index, test_index in kf.split(X):
...     print("TRAIN:", train_index, "TEST:", test_index)
...     X_train, X_test = X[train_index], X[test_index]
...     y_train, y_test = y[train_index], y[test_index]
TRAIN: [2 3] TEST: [0 1]
TRAIN: [0 1] TEST: [2 3]
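Applied to the six-point array from the question, the three boxes could be combined as in the sketch below. The folds in the question are not consecutive, so shuffle=True is assumed here; random_state=0 is an arbitrary seed chosen for repeatability and is not expected to reproduce the exact folds listed in the question.

```python
import numpy as np
from sklearn.model_selection import KFold

data = np.array([10, 20, 30, 40, 50, 60])

# Box 1: KFold, Box 2: n_splits=3, Box 3: split(data).
# shuffle=True and random_state=0 are assumptions, not part of the question.
kf = KFold(n_splits=3, shuffle=True, random_state=0)

folds = []
for train_index, test_index in kf.split(data):
    train, test = data[train_index], data[test_index]
    folds.append((train, test))
    print("train:", train, "test:", test)
```

With six points and three folds, every fold holds out two points and trains on the remaining four, and each point is held out exactly once.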


Reference:

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html



You are working with a time series dataset in Azure Machine Learning Studio.
You need to split your dataset into training and testing subsets by using the Split Data module.
Which splitting mode should you use?

  A. Recommender Split
  B. Regular Expression Split
  C. Relative Expression Split
  D. Split Rows with the Randomized split parameter set to true

Answer(s): C

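Explanation:

Relative Expression Split: Use this option whenever you want to apply a condition to a number column. The number can be a date/time field, so a time series can be split at a cutoff, with earlier rows used for training and later rows used for testing.
Incorrect Answers:
B: Regular Expression Split: Choose this option when you want to divide your dataset by testing a single column for a value.
D: Split Rows: Use this option if you just want to divide the data into two parts. You can specify the percentage of data to put in each split, but by default, the data is divided 50-50. With the Randomized split parameter set to true, rows are selected at random, which does not preserve the time order of the series.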
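A condition on a date column amounts to a cutoff split. The sketch below shows the idea in plain Python rather than in the Split Data module; the series, the cutoff date, and the variable names are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical daily series of (date, value) rows, already in time order.
start = date(2023, 1, 1)
series = [(start + timedelta(days=i), float(i)) for i in range(6)]

# Hypothetical cutoff: rows before it go to training, the rest to testing.
cutoff = date(2023, 1, 5)

train = [row for row in series if row[0] < cutoff]
test = [row for row in series if row[0] >= cutoff]

print(len(train), "training rows;", len(test), "testing rows")
```

Every training row precedes every testing row, so the model is never evaluated on data older than what it was trained on.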


Reference:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/split-data



HOTSPOT (Drag and Drop is not supported)
You are preparing to build a deep learning convolutional neural network model for image classification. You create a script to train the model using CUDA devices.
You must submit an experiment that runs this script in the Azure Machine Learning workspace.
The following compute resources are available:
- a Microsoft Surface device on which Microsoft Office has been installed. Corporate IT policies prevent the installation of additional software
- a Compute Instance named ds-workstation in the workspace with 2 CPUs and 8 GB of memory
- an Azure Machine Learning compute target named cpu-cluster with eight CPU-based nodes
- an Azure Machine Learning compute target named gpu-cluster with four CPU and GPU-based nodes
You need to specify the compute resources to be used for running the code to submit the experiment, and for running the script in order to minimize model training time.
Which resources should the data scientist use? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

  A. See Explanation section for answer.

Answer(s): A

Explanation:


Box 1: the ds-workstation compute instance
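The compute instance is sufficient for running the code that submits the experiment. The Surface device cannot be used because corporate IT policies prevent the installation of additional software on it.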
Box 2: the gpu-cluster compute target
The training script uses CUDA devices, so it must run on GPU-based nodes; gpu-cluster is the only available compute target that provides them.
Just as GPUs revolutionized deep learning through unprecedented training and inferencing performance, RAPIDS enables traditional machine learning practitioners to unlock game-changing performance with GPUs. With RAPIDS on Azure Machine Learning service, users can accelerate the entire machine learning pipeline, including data processing, training and inferencing, with GPUs from the NC_v3, NC_v2, ND or ND_v2 families. Users can unlock performance gains of more than 20X (with 4 GPUs), slashing training times from hours to minutes and dramatically reducing time-to-insight.


Reference:

https://azure.microsoft.com/sv-se/blog/azure-machine-learning-service-now-supports-nvidia-s-rapids/


