Microsoft DP-203 Exam (page: 14)
Microsoft Data Engineering on Azure
Updated on: 06-Aug-2025

Viewing Page 14 of 75

You need to implement a Type 3 slowly changing dimension (SCD) for product category data in an Azure Synapse Analytics dedicated SQL pool.

You have a table that was created by using the following Transact-SQL statement.


Which two columns should you add to the table? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.






Answer(s): B,E

Explanation:

A Type 3 SCD supports storing two versions of a dimension member as separate columns. The table includes a column for the current value of a member plus either the original or previous value of the member. So Type 3 uses additional columns to track one key instance of history, rather than storing additional rows to track each change like in a Type 2 SCD.
This type of tracking may be used for one or two columns in a dimension table. It is not common to use it for many members of the same table. It is often used in combination with Type 1 or Type 2 members.


Reference:

https://k21academy.com/microsoft-azure/azure-data-engineer-dp203-q-a-day-2-live-session-review/



DRAG DROP (Drag and Drop is not supported)
You have an Azure subscription.

You plan to build a data warehouse in an Azure Synapse Analytics dedicated SQL pool named pool1 that will contain staging tables and a dimensional model. Pool1 will contain the following tables.


You need to design the table storage for pool1. The solution must meet the following requirements:

-Maximize the performance of data loading operations to Staging.WebSessions.
-Minimize query times for reporting queries against the dimensional model.

Which type of table distribution should you use for each table? To answer, drag the appropriate table distribution types to the correct tables. Each table distribution type may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

  1. See Explanation section for answer.

Answer(s): A

Explanation:



Box 1: Replicated
The best table storage option for a small table is to replicate it across all the Compute nodes.

Box 2: Hash
Hash-distribution improves query performance on large fact tables.

Box 3: Round-robin
Round-robin distribution is useful for improving loading speed.


Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute



HOTSPOT (Drag and Drop is not supported)
You have an Azure Synapse Analytics dedicated SQL pool.

You need to create a table named FactInternetSales that will be a large fact table in a dimensional model. FactInternetSales will contain 100 million rows and two columns named SalesAmount and OrderQuantity. Queries executed on FactInternetSales will aggregate the values in SalesAmount and OrderQuantity from the last year for a specific product. The solution must minimize the data size and query execution time.

How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

  1. See Explanation section for answer.

Answer(s): A

Explanation:



Box 1: (CLUSTERED COLUMNSTORE INDEX
CLUSTERED COLUMNSTORE INDEX
Columnstore indexes are the standard for storing and querying large data warehousing fact tables. This index uses column-based data storage and query processing to achieve gains up to 10 times the query performance in your data warehouse over traditional row-oriented storage. You can also achieve gains up to 10 times the data compression over the uncompressed data size. Beginning with SQL Server 2016 (13.x) SP1, columnstore indexes enable operational analytics: the ability to run performant real-time analytics on a transactional workload.

Note: Clustered columnstore index
A clustered columnstore index is the physical storage for the entire table.


To reduce fragmentation of the column segments and improve performance, the columnstore index might store some data temporarily into a clustered index called a deltastore and a B-tree list of IDs for deleted rows. The deltastore operations are handled behind the scenes. To return the correct query results, the clustered columnstore index combines query results from both the columnstore and the deltastore.

Box 2: HASH([ProductKey])
A hash distributed table distributes rows based on the value in the distribution column. A hash distributed table is designed to achieve high performance for queries on large tables.

Choose a distribution column with data that distributes evenly

Incorrect:
* Not HASH([OrderDateKey]). Is not a date column. All data for the same date lands in the same distribution. If several users are all filtering on the same date, then only 1 of the 60 distributions do all the processing work

* A replicated table has a full copy of the table available on every Compute node. Queries run fast on replicated tables since joins on replicated tables don't require data movement. Replication requires extra storage, though, and isn't practical for large tables.

* A round-robin table distributes table rows evenly across all distributions. The rows are distributed randomly. Loading data into a round-robin table is fast. Keep in mind that queries can require more data movement than the other distribution methods.


Reference:

https://docs.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-overview
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute



You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1. Table1 contains the following:

-One billion rows
-A clustered columnstore index
-A hash-distributed column named Product Key
-A column named Sales Date that is of the date data type and cannot be null

Thirty million rows will be added to Table1 each month.

You need to partition Table1 based on the Sales Date column. The solution must optimize query performance and data loading.

How often should you create a partition?

  1. once per month
  2. once per year
  3. once per day
  4. once per week

Answer(s): B

Explanation:

Need a minimum 1 million rows per distribution. Each table is 60 distributions. 30 millions rows is added each month. Need 2 months to get a minimum of 1 million rows per distribution in a new partition.

Note: When creating partitions on clustered columnstore tables, it is important to consider how many rows belong to each partition. For optimal compression and performance of clustered columnstore tables, a minimum of 1 million rows per distribution and partition is needed. Before partitions are created, dedicated SQL pool already divides each table into 60 distributions.

Any partitioning added to a table is in addition to the distributions created behind the scenes. Using this example, if the sales fact table contained 36 monthly partitions, and given that a dedicated SQL pool has 60 distributions, then the sales fact table should contain 60 million rows per month, or 2.1 billion rows when all months are populated. If a table contains fewer than the recommended minimum number of rows per partition, consider using fewer partitions in order to increase the number of rows per partition.


Reference:

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition



You have an Azure Databricks workspace that contains a Delta Lake dimension table named Table1.

Table1 is a Type 2 slowly changing dimension (SCD) table.
You need to apply updates from a source table to Table1.

Which Apache Spark SQL operation should you use?

  1. CREATE
  2. UPDATE
  3. ALTER
  4. MERGE

Answer(s): D

Explanation:

The Delta provides the ability to infer the schema for data input which further reduces the effort required in managing the schema changes. The Slowly Changing Data(SCD) Type 2 records all the changes made to each key in the dimensional table. These operations require updating the existing rows to mark the previous values of the keys as old and then inserting new rows as the latest values. Also, Given a source table with the updates and the target table with dimensional data, SCD Type 2 can be expressed with the merge.

Example:
// Implementing SCD Type 2 operation using merge function
customersTable
.as("customers")
.merge(
stagedUpdates.as("staged_updates"),
"customers.customerId = mergeKey")
.whenMatched("customers.current = true AND customers.address <> staged_updates.address")
.updateExpr(Map(
"current" -> "false",
"endDate" -> "staged_updates.effectiveDate"))
.whenNotMatched()
.insertExpr(Map(
"customerid" -> "staged_updates.customerId",
"address" -> "staged_updates.address",
"current" -> "true",
"effectiveDate" -> "staged_updates.effectiveDate",
"endDate" -> "null"))
.execute()
}


Reference:

https://www.projectpro.io/recipes/what-is-slowly-changing-data-scd-type-2-operation-delta-table-databricks



Viewing Page 14 of 75



Share your comments for Microsoft DP-203 exam with other users:

srija 8/14/2023 8:53:00 AM

very helpful
EUROPEAN UNION


Thembelani 5/30/2023 2:17:00 AM

i am writing this exam tomorrow and have dumps
Anonymous


Anita 10/1/2023 4:11:00 PM

can i have the icdl excel exam
Anonymous


Ben 9/9/2023 7:35:00 AM

please upload it
Anonymous


anonymous 9/20/2023 11:27:00 PM

hye when will post again the past year question for this h13-311_v3 part since i have to for my test tommorow…thank you very much
Anonymous


Randall 9/28/2023 8:25:00 PM

on question 22, option b-once per session is also valid.
Anonymous


Tshegofatso 8/28/2023 11:51:00 AM

this website is very helpful
SOUTH AFRICA


philly 9/18/2023 2:40:00 PM

its my first time exam
SOUTH AFRICA


Beexam 9/4/2023 9:06:00 PM

correct answers are device configuration-enable the automatic installation of webview2 runtime. & policy management- prevent users from submitting feedback.
NEW ZEALAND


RAWI 7/9/2023 4:54:00 AM

is this dump still valid? today is 9-july-2023
SWEDEN


Annie 6/7/2023 3:46:00 AM

i need this exam.. please upload these are really helpful
PAKISTAN


Shubhra Rathi 8/26/2023 1:08:00 PM

please upload the oracle 1z0-1059-22 dumps
Anonymous


Shiji 10/15/2023 1:34:00 PM

very good questions
INDIA


Rita Rony 11/27/2023 1:36:00 PM

nice, first step to exams
Anonymous


Aloke Paul 9/11/2023 6:53:00 AM

is this valid for chfiv9 as well... as i am reker 3rd time...
CHINA


Calbert Francis 1/15/2024 8:19:00 PM

great exam for people taking 220-1101
UNITED STATES


Ayushi Baria 11/7/2023 7:44:00 AM

this is very helpfull for me
Anonymous


alma 8/25/2023 1:20:00 PM

just started preparing for the exam
UNITED KINGDOM


CW 7/10/2023 6:46:00 PM

these are the type of questions i need.
UNITED STATES


Nobody 8/30/2023 9:54:00 PM

does this actually work? are they the exam questions and answers word for word?
Anonymous


Salah 7/23/2023 9:46:00 AM

thanks for providing these questions
Anonymous


Ritu 9/15/2023 5:55:00 AM

interesting
CANADA


Ron 5/30/2023 8:33:00 AM

these dumps are pretty good.
Anonymous


Sowl 8/10/2023 6:22:00 PM

good questions
UNITED STATES


Blessious Phiri 8/15/2023 2:02:00 PM

dbua is used for upgrading oracle database
Anonymous


Richard 10/24/2023 6:12:00 AM

i am thrilled to say that i passed my amazon web services mls-c01 exam, thanks to study materials. they were comprehensive and well-structured, making my preparation efficient.
Anonymous


Janjua 5/22/2023 3:31:00 PM

please upload latest ibm ace c1000-056 dumps
GERMANY


Matt 12/30/2023 11:18:00 AM

if only explanations were provided...
FRANCE


Rasha 6/29/2023 8:23:00 PM

yes .. i need the dump if you can help me
Anonymous


Anonymous 7/25/2023 8:05:00 AM

good morning, could you please upload this exam again?
SPAIN


AJ 9/24/2023 9:32:00 AM

hi please upload sre foundation and practitioner exam questions
Anonymous


peter parker 8/10/2023 10:59:00 AM

the exam is listed as 80 questions with a pass mark of 70%, how is your 50 questions related?
Anonymous


Berihun 7/13/2023 7:29:00 AM

all questions are so important and covers all ccna modules
Anonymous


nspk 1/19/2024 12:53:00 AM

q 44. ans:- b (goto setup > order settings > select enable optional price books for orders) reference link --> https://resources.docs.salesforce.com/latest/latest/en-us/sfdc/pdf/sfom_impl_b2b_b2b2c.pdf(decide whether you want to enable the optional price books feature. if so, select enable optional price books for orders. you can use orders in salesforce while managing price books in an external platform. if you’re using d2c commerce, you must select enable optional price books for orders.)
Anonymous