Microsoft DP-203 Exam (page: 12)
Microsoft Data Engineering on Azure
Updated on: 28-Feb-2026

Viewing Page 12 of 75

You have an Azure Synapse Analytics Apache Spark pool named Pool1.
You plan to load JSON files from an Azure Data Lake Storage Gen2 container into the tables in Pool1. The structure and data types vary by file.
You need to load the files into the tables. The solution must maintain the source data types. What should you do?

  1. Use a Conditional Split transformation in an Azure Synapse data flow.
  2. Use a Get Metadata activity in Azure Data Factory.
  3. Load the data by using the OPENROWSET Transact-SQL command in an Azure Synapse Analytics serverless SQL pool.
  4. Load the data by using PySpark.

Answer(s): D



You have an Azure Databricks workspace named workspace1 in the Standard pricing tier. Workspace1 contains an all-purpose cluster named cluster1.
You need to reduce the time it takes for cluster1 to start and scale up. The solution must minimize costs. What should you do first?

  1. Configure a global init script for workspace1.
  2. Create a cluster policy in workspace1.
  3. Upgrade workspace1 to the Premium pricing tier.
  4. Create a pool in workspace1.

Answer(s): D

Explanation:

You can use Databricks Pools to Speed up your Data Pipelines and Scale Clusters Quickly.
Databricks Pools, a managed cache of virtual machine instances that enables clusters to start and scale 4 times faster.


Reference:

https://databricks.com/blog/2019/11/11/databricks-pools-speed-up-data-pipelines.html



HOTSPOT (Drag and Drop is not supported)
You are building an Azure Stream Analytics job that queries reference data from a product catalog file. The file is updated daily.
The reference data input details for the file are shown in the Input exhibit. (Click the Input tab.)


The storage account container view is shown in the Refdata exhibit. (Click the Refdata tab.)


You need to configure the Stream Analytics job to pick up the new reference data.
What should you configure? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:



Box 1: {date}/product.csv
In the 2nd exhibit we see: Location: refdata / 2020-03-20
Note: Path Pattern: This is a required property that is used to locate your blobs within the specified container. Within the path, you may choose to specify one or more instances of the following 2 variables:
{date}, {time}
Example 1: products/{date}/{time}/product-list.csv
Example 2: products/{date}/product-list.csv
Example 3: product-list.csv
Box 2: YYYY-MM-DD
Note: Date Format [optional]: If you have used {date} within the Path Pattern that you specified, then you can select the date format in which your blobs are organized from the drop-down of supported formats.
Example: YYYY/MM/DD, MM/DD/YYYY, etc.


Reference:

https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-use-reference-data



HOTSPOT (Drag and Drop is not supported)
You have the following Azure Stream Analytics query.


For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:



Box 1: No
Note: You can now use a new extension of Azure Stream Analytics SQL to specify the number of partitions of a stream when reshuffling the data.
The outcome is a stream that has the same partition scheme. Please see below for an example: WITH step1 AS (SELECT * FROM [input1] PARTITION BY DeviceID INTO 10),
step2 AS (SELECT * FROM [input2] PARTITION BY DeviceID INTO 10)
SELECT * INTO [output] FROM step1 PARTITION BY DeviceID UNION step2 PARTITION BY DeviceID Note: The new extension of Azure Stream Analytics SQL includes a keyword INTO that allows you to specify the number of partitions for a stream when performing reshuffling using a PARTITION BY statement.
Box 2: Yes
When joining two streams of data explicitly repartitioned, these streams must have the same partition key and partition count.
Box 3: Yes
Streaming Units (SUs) represents the computing resources that are allocated to execute a Stream Analytics job. The higher the number of SUs, the more CPU and memory resources are allocated for your job.
In general, the best practice is to start with 6 SUs for queries that don't use PARTITION BY. Here there are 10 partitions, so 6x10 = 60 SUs is good.
Note: Remember, Streaming Unit (SU) count, which is the unit of scale for Azure Stream Analytics, must be adjusted so the number of physical resources available to the job can fit the partitioned flow. In general, six SUs is a good number to assign to each partition. In case there are insufficient resources assigned to the job, the system will only apply the repartition if it benefits the job.


Reference:

https://azure.microsoft.com/en-in/blog/maximize-throughput-with-repartitioning-in-azure-stream-analytics/ https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-streaming-unit-consumption



HOTSPOT (Drag and Drop is not supported)
You are building a database in an Azure Synapse Analytics serverless SQL pool. You have data stored in Parquet files in an Azure Data Lake Storege Gen2 container. Records are structured as shown in the following sample.
{
"id": 123,
"address_housenumber": "19c",
"address_line": "Memory Lane",
"applicant1_name": "Jane",
"applicant2_name": "Dev"
}

The records contain two applicants at most.
You need to build a table that includes only the address fields.

How should you complete the Transact-SQL statement? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:



Box 1: CREATE EXTERNAL TABLE
An external table points to data located in Hadoop, Azure Storage blob, or Azure Data Lake Storage. External tables are used to read data from files or write data to files in Azure Storage. With Synapse SQL, you can use external tables to read external data using dedicated SQL pool or serverless SQL pool.
Syntax:
CREATE EXTERNAL TABLE { database_name.schema_name.table_name | schema_name.table_name | table_name }
( <column_definition> [ ,...n ] ) WITH (
LOCATION = 'folder_or_filepath', DATA_SOURCE = external_data_source_name, FILE_FORMAT = external_file_format_name

Box 2. OPENROWSET
When using serverless SQL pool, CETAS is used to create an external table and export query results to Azure Storage Blob or Azure Data Lake Storage Gen2.
Example:
AS
SELECT decennialTime, stateName, SUM(population) AS population FROM
OPENROWSET(BULK 'https://azureopendatastorage.blob.core.windows.net/censusdatacontainer/release/ us_population_county/year=*/*.parquet',
FORMAT='PARQUET') AS [r]
GROUP BY decennialTime, stateName GO


Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/develop-tables-external-tables



Viewing Page 12 of 75



Share your comments for Microsoft DP-203 exam with other users:

San 11/14/2023 12:46:00 AM

goood helping
Anonymous


Wang 6/9/2022 10:05:00 PM

pay attention to questions. they are very tricky. i waould say about 80 to 85% of the questions are in this exam dump.
UNITED STATES


Mary 5/16/2023 4:50:00 AM

wish you would allow more free questions
Anonymous


thomas 9/12/2023 4:28:00 AM

great simulation
Anonymous


Sandhya 12/9/2023 12:57:00 AM

very g inood
Anonymous


Agathenta 12/16/2023 1:36:00 PM

q35 should be a
Anonymous


MD. SAIFUL ISLAM 6/22/2023 5:21:00 AM

sap c_ts450_2021
Anonymous


Satya 7/24/2023 3:18:00 AM

nice questions
UNITED STATES


sk 5/13/2023 2:10:00 AM

ecellent materil for unserstanding
INDIA


Gerard 6/29/2023 11:14:00 AM

good so far
Anonymous


Limbo 10/9/2023 3:08:00 AM

this is way too informative
BOTSWANA


Tejasree 8/26/2023 1:46:00 AM

very helpfull
UNITED STATES


Yolostar Again 10/12/2023 3:02:00 PM

q.189 - answers are incorrect.
Anonymous


Shikha Bakra 9/10/2023 5:16:00 PM

awesome job in getting these questions
AUSTRALIA


Kevin 10/20/2023 2:01:00 AM

i cant find aws certified practitioner clf-c01 exam in aws website but i found aws certified practitioner clf-c02 exam. can everyone please verify the difference between the two clf-c01 and clf-c02? thank you
UNITED STATES


D Mario 6/19/2023 10:38:00 PM

grazie mille. i got a satisfactory mark in my exam test today because of this exam dumps. sorry for my english.
ITALY


Bharat Kumar Saraf 10/31/2023 4:36:00 AM

some of the answers are incorrect. need to be reviewed.
HONG KONG


JP 7/13/2023 12:21:00 PM

so far so good
Anonymous


Kiky V 8/8/2023 6:32:00 PM

i am really liking it
Anonymous


trying 7/28/2023 12:37:00 PM

thanks good stuff
UNITED STATES


exampei 10/4/2023 2:40:00 PM

need dump c_tadm_23
Anonymous


Eman Sawalha 6/10/2023 6:18:00 AM

next time i will write a full review
GREECE


johnpaul 11/15/2023 7:55:00 AM

first time using this site
ROMANIA


omiornil@gmail.com 7/25/2023 9:36:00 AM

please sent me oracle 1z0-1105-22 pdf
BANGLADESH


John 8/29/2023 8:59:00 PM

very helpful
Anonymous


Kvana 9/28/2023 12:08:00 PM

good info about oml
UNITED STATES


Checo Lee 7/3/2023 5:45:00 PM

very useful to practice
UNITED STATES


dixitdnoh@gmail.com 8/27/2023 2:58:00 PM

this website is very helpful.
UNITED STATES


Sanjay 8/14/2023 8:07:00 AM

good content
INDIA


Blessious Phiri 8/12/2023 2:19:00 PM

so challenging
Anonymous


PAYAL 10/17/2023 7:14:00 AM

17 should be d ,for morequery its scale out
Anonymous


Karthik 10/12/2023 10:51:00 AM

nice question
Anonymous


Godmode 5/7/2023 10:52:00 AM

yes.
NETHERLANDS


Bhuddhiman 7/30/2023 1:18:00 AM

good mateial
Anonymous