Microsoft DP-203 Exam (page: 10)
Microsoft Data Engineering on Azure
Updated on: 12-Jan-2026

Viewing Page 10 of 75

You are designing a fact table named FactPurchase in an Azure Synapse Analytics dedicated SQL pool. The table contains purchases from suppliers for a retail store. FactPurchase will contain the following columns.


FactPurchase will have 1 million rows of data added daily and will contain three years of data.
Transact-SQL queries similar to the following query will be executed daily.

SELECT
    SupplierKey, StockItemKey, COUNT(*)
FROM FactPurchase
WHERE DateKey >= 20210101
    AND DateKey <= 20210131
GROUP BY SupplierKey, StockItemKey

Which table distribution will minimize query times?

  A. replicated
  B. hash-distributed on PurchaseKey
  C. round-robin
  D. hash-distributed on DateKey

Answer(s): B

Explanation:

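Hash-distributed tables improve query performance on large fact tables, while round-robin tables are mainly useful for improving loading speed. FactPurchase will hold roughly 1.1 billion rows (1 million rows per day for three years), which is too large for a replicated table. PurchaseKey is a good hash column because it has many unique values and is not used in the WHERE clause, so rows and query work spread evenly across the distributions.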

Incorrect:
Not D: Do not use a date column. All data for the same date lands in the same distribution. If several users are all filtering on the same date, then only 1 of the 60 distributions does all the processing work.
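
As a rough illustration of answer B, the table could be declared as in the sketch below. The column list is not shown on this page, so the data types and the omitted columns are assumptions; only the column names that appear in the question and the query are used.

```sql
-- Minimal sketch for a dedicated SQL pool; data types are assumed.
CREATE TABLE dbo.FactPurchase
(
    PurchaseKey   BIGINT NOT NULL,  -- high-cardinality key used for hashing
    DateKey       INT    NOT NULL,  -- filtered in WHERE, so not a good hash column
    SupplierKey   INT    NOT NULL,
    StockItemKey  INT    NOT NULL
    -- remaining columns omitted because the source does not list them
)
WITH
(
    DISTRIBUTION = HASH(PurchaseKey),
    CLUSTERED COLUMNSTORE INDEX
);

-- After loading, check how evenly rows are spread across the 60 distributions.
DBCC PDW_SHOWSPACEUSED('dbo.FactPurchase');
```

If the row counts reported per distribution differ widely, the hash column is skewed and a different column may be worth testing.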


Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute



You are implementing a batch dataset in the Parquet format.
Data files will be produced by using Azure Data Factory and stored in Azure Data Lake Storage Gen2. The files will be consumed by an Azure Synapse Analytics serverless SQL pool.
You need to minimize storage costs for the solution. What should you do?

  A. Use Snappy compression for files.
  B. Use OPENROWSET to query the Parquet files.
  C. Create an external table that contains a subset of columns from the Parquet files.
  D. Store all data as string in the Parquet files.

Answer(s): A



DRAG DROP
You need to build a solution to ensure that users can query specific files in an Azure Data Lake Storage Gen2 account from an Azure Synapse Analytics serverless SQL pool.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.
Select and Place:

  A. See Explanation section for answer.

Answer(s): A

Explanation:




You can create external tables in Synapse SQL pools via the following steps:
Step 1: Create an external data source
Use CREATE EXTERNAL DATA SOURCE to reference external Azure storage and specify the credential that should be used to access the storage.
Step 2: Create an external file format object
Use CREATE EXTERNAL FILE FORMAT to describe the format of the CSV or Parquet files. Creating an external file format is a prerequisite for creating an external table.
Step 3: Create an external table
Use CREATE EXTERNAL TABLE on top of the files placed on the data source, with the same file format.
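
The three steps might look like the following sketch in a serverless SQL pool. All object names, the storage URL placeholders, the folder path, and the column list are hypothetical. The credential is left out for brevity, which assumes the caller's own identity can read the storage account.

```sql
-- Step 1: external data source (hypothetical name and placeholder URL).
CREATE EXTERNAL DATA SOURCE SalesLake
WITH (
    LOCATION = 'https://<storage-account>.dfs.core.windows.net/<container>'
);

-- Step 2: external file format describing the Parquet files.
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (
    FORMAT_TYPE = PARQUET
);

-- Step 3: external table over specific files on the data source.
CREATE EXTERNAL TABLE dbo.SalesFiles
(
    SaleId  INT,
    Amount  DECIMAL(18, 2)
)
WITH (
    LOCATION = 'sales/2021/*.parquet',  -- hypothetical path inside the container
    DATA_SOURCE = SalesLake,
    FILE_FORMAT = ParquetFormat
);

-- Users can then query the files like a regular table.
SELECT TOP 10 * FROM dbo.SalesFiles;
```

The data source and the file format can be created in either order, which is why more than one sequence is accepted, but both must exist before the external table is created.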


Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/develop-tables-external-tables



You are designing a data mart for the human resources (HR) department at your company. The data mart will contain employee information and employee transactions.
From a source system, you have a flat extract that has the following fields:
-EmployeeID
-FirstName
-LastName
-Recipient
-GrossAmount
-TransactionID
-GovernmentID
-NetAmountPaid
-TransactionDate

You need to design a star schema data model in an Azure Synapse Analytics dedicated SQL pool for the data mart.

Which two tables should you create? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

  A. a dimension table for Transaction
  B. a dimension table for EmployeeTransaction
  C. a dimension table for Employee
  D. a fact table for Employee
  E. a fact table for Transaction

Answer(s): C,E

Explanation:

C: Dimension tables contain attribute data that might change but usually changes infrequently. For example, a customer's name and address are stored in a dimension table and updated only when the customer's profile changes. To minimize the size of a large fact table, the customer's name and address don't need to be in every row of a fact table. Instead, the fact table and the dimension table can share a customer ID. A query can join the two tables to associate a customer's profile and transactions.
E: Fact tables contain quantitative data that are commonly generated in a transactional system, and then loaded into the dedicated SQL pool. For example, a retail business generates sales transactions every day, and then loads the data into a dedicated SQL pool fact table for analysis.
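
One possible way to split the flat extract into the two tables is sketched below. The assignment of fields to tables, the EmployeeKey surrogate key, the data types, and the distribution choices are assumptions made for illustration; the question itself only asks which tables to create.

```sql
-- Dimension table: descriptive employee attributes, one row per employee.
CREATE TABLE dbo.DimEmployee
(
    EmployeeKey   INT           NOT NULL,  -- surrogate key (assumed, not in the extract)
    EmployeeID    INT           NOT NULL,  -- business key from the extract
    FirstName     NVARCHAR(50)  NOT NULL,
    LastName      NVARCHAR(50)  NOT NULL,
    GovernmentID  NVARCHAR(20)  NOT NULL
)
WITH (DISTRIBUTION = REPLICATE);

-- Fact table: one row per transaction, holding the measures.
CREATE TABLE dbo.FactTransaction
(
    TransactionID    BIGINT          NOT NULL,
    EmployeeKey      INT             NOT NULL,  -- joins to DimEmployee
    Recipient        NVARCHAR(100)   NOT NULL,
    TransactionDate  DATE            NOT NULL,
    GrossAmount      DECIMAL(18, 2)  NOT NULL,
    NetAmountPaid    DECIMAL(18, 2)  NOT NULL
)
WITH (DISTRIBUTION = HASH(TransactionID), CLUSTERED COLUMNSTORE INDEX);

-- Example query: total net pay per employee.
SELECT e.EmployeeID, e.LastName, SUM(f.NetAmountPaid) AS TotalNetPaid
FROM dbo.FactTransaction AS f
JOIN dbo.DimEmployee AS e
    ON e.EmployeeKey = f.EmployeeKey
GROUP BY e.EmployeeID, e.LastName;
```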


Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview



You are designing a dimension table for a data warehouse. The table will track the value of the dimension attributes over time and preserve the history of the data by adding new rows as the data changes.
Which type of slowly changing dimension (SCD) should you use?

  A. Type 0
  B. Type 1
  C. Type 2
  D. Type 3

Answer(s): C

Explanation:

A Type 2 SCD supports versioning of dimension members. Often the source system doesn't store versions, so the data warehouse load process detects and manages changes in a dimension table. In this case, the dimension table must use a surrogate key to provide a unique reference to a version of the dimension member. It also includes columns that define the date range validity of the version (for example, StartDate and EndDate) and possibly a flag column (for example, IsCurrent) to easily filter by current dimension members.
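
A Type 2 change can be sketched as follows. The DimSupplier table, its columns, and the sample values are hypothetical; the StartDate, EndDate, and IsCurrent columns follow the names mentioned above.

```sql
-- Hypothetical Type 2 dimension with validity columns.
CREATE TABLE dbo.DimSupplier
(
    SupplierKey  INT           NOT NULL,  -- surrogate key, unique per version
    SupplierID   INT           NOT NULL,  -- business key, repeated across versions
    City         NVARCHAR(50)  NOT NULL,
    StartDate    DATE          NOT NULL,
    EndDate      DATE          NULL,      -- NULL while the version is current
    IsCurrent    BIT           NOT NULL
);

-- A supplier moves to a new city: first expire the current version...
UPDATE dbo.DimSupplier
SET EndDate = '2021-06-30',
    IsCurrent = 0
WHERE SupplierID = 42
  AND IsCurrent = 1;

-- ...then add a new row, so the old value is preserved as history.
INSERT INTO dbo.DimSupplier
    (SupplierKey, SupplierID, City, StartDate, EndDate, IsCurrent)
VALUES
    (1002, 42, N'Seattle', '2021-07-01', NULL, 1);

-- Current members only.
SELECT SupplierID, City
FROM dbo.DimSupplier
WHERE IsCurrent = 1;
```

Fact rows keep pointing at the SupplierKey that was current when they were loaded, which is why the surrogate key rather than the business key identifies each version.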

Incorrect Answers:
B: A Type 1 SCD always reflects the latest values, and when changes in source data are detected, the dimension table data is overwritten.
D: A Type 3 SCD supports storing two versions of a dimension member as separate columns. The table includes a column for the current value of a member plus either the original or previous value of the member. So Type 3 uses additional columns to track one key instance of history, rather than storing additional rows to track each change like in a Type 2 SCD.


Reference:

https://docs.microsoft.com/en-us/learn/modules/populate-slowly-changing-dimensions-azure-synapse-analytics-pipelines/3-choose-between-dimension-types





