Microsoft Data Engineering on Azure (Replaced with DP-700) DP-203 Exam Questions in PDF

Free Microsoft DP-203 Dumps Questions (page: 11)

DRAG DROP (Drag and Drop is not supported)
You have data stored in thousands of CSV files in Azure Data Lake Storage Gen2. Each file has a header row followed by a properly formatted carriage return (/r) and line feed (/n).
You are implementing a pattern that batch loads the files daily into an enterprise data warehouse in Azure Synapse Analytics by using PolyBase.
You need to skip the header row when you import the files into the data warehouse. Before building the loading pattern, you need to prepare the required database objects in Azure Synapse Analytics.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
NOTE: Each correct selection is worth one point
Select and Place:

  1. See Explanation section for answer.

Answer(s): A

Explanation:




Step 1: Create an external data source that uses the abfs location
Create External Data Source to reference Azure Data Lake Store Gen 1 or 2
Step 2: Create an external file format and set the First_Row option.
Create External File Format.
Step 3: Use CREATE EXTERNAL TABLE AS SELECT (CETAS) and configure the reject options to specify reject values or percentages
To use PolyBase, you must create external tables to reference your external data. Use reject options.
Note: REJECT options don't apply at the time this CREATE EXTERNAL TABLE AS SELECT statement is run.
Instead, they're specified here so that the database can use them at a later time when it imports data from the external table. Later, when the CREATE TABLE AS SELECT statement selects data from the external table, the database will use the reject options to determine the number or percentage of rows that can fail to import before it stops the import.


Reference:

https://docs.microsoft.com/en-us/sql/relational-databases/polybase/polybase-t-sql-objects https://docs.microsoft.com/en-us/sql/t-sql/statements/create-external-table-as-select-transact-sql



HOTSPOT (Drag and Drop is not supported)
You are building an Azure Synapse Analytics dedicated SQL pool that will contain a fact table for transactions from the first half of the year 2020.
You need to ensure that the table meets the following requirements:
-Minimizes the processing time to delete data that is older than 10 years
-Minimizes the I/O for queries that use year-to-date values

How should you complete the Transact-SQL statement? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:




Box 1: PARTITION
RANGE RIGHT FOR VALUES is used with PARTITION.
Part 2: [TransactionDateID] Partition on the date column.
Example: Creating a RANGE RIGHT partition function on a datetime column
The following partition function partitions a table or index into 12 partitions, one for each month of a year's worth of values in a datetime column.
CREATE PARTITION FUNCTION [myDateRangePF1] (datetime)
AS RANGE RIGHT FOR VALUES ('20030201', '20030301', '20030401',
'20030501', '20030601', '20030701', '20030801',
'20030901', '20031001', '20031101', '20031201');


Reference:

https://docs.microsoft.com/en-us/sql/t-sql/statements/create-partition-function-transact-sql



You are performing exploratory analysis of the bus fare data in an Azure Data Lake Storage Gen2 account by using an Azure Synapse Analytics serverless SQL pool.
You execute the Transact-SQL query shown in the following exhibit.



What do the query results include?

  1. Only CSV files in the tripdata_2020 subfolder.
  2. All files that have file names that beginning with "tripdata_2020".
  3. All CSV files that have file names that contain "tripdata_2020".
  4. Only CSV that have file names that beginning with "tripdata_2020".

Answer(s): D



DRAG DROP (Drag and Drop is not supported)
You use PySpark in Azure Databricks to parse the following JSON input.


You need to output the data in the following tabular format.



How should you complete the PySpark code? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the spit bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.
Select and Place:

  1. See Explanation section for answer.

Answer(s): A

Explanation:



Box 1: select
Box 2: explode
Bop 3: alias
pyspark.sql.Column.alias returns this column aliased with a new name or names (in the case of expressions that return more than one column, such as explode).


Reference:

https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.Column.alias.html https://docs.microsoft.com/en-us/azure/databricks/sql/language-manual/functions/explode



HOTSPOT (Drag and Drop is not supported)
You are designing an application that will store petabytes of medical imaging data.
When the data is first created, the data will be accessed frequently during the first week. After one month, the data must be accessible within 30 seconds, but files will be accessed infrequently. After one year, the data will be accessed infrequently but must be accessible within five minutes.
You need to select a storage strategy for the data. The solution must minimize costs.

Which storage tier should you use for each time frame? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:



Box 1: Hot
Hot tier - An online tier optimized for storing data that is accessed or modified frequently. The Hot tier has the highest storage costs, but the lowest access costs.
Box 2: Cool
Cool tier - An online tier optimized for storing data that is infrequently accessed or modified. Data in the Cool tier should be stored for a minimum of 30 days. The Cool tier has lower storage costs and higher access costs compared to the Hot tier.
Box 3: Cool
Not Archive tier - An offline tier optimized for storing data that is rarely accessed, and that has flexible latency requirements, on the order of hours. Data in the Archive tier should be stored for a minimum of 180 days.


Reference:

https://docs.microsoft.com/en-us/azure/storage/blobs/access-tiers-overview https://www.altaro.com/hyper-v/azure-archive-storage/



Share your comments for Microsoft DP-203 exam with other users:

M
mouna
9/27/2023 8:53:00 AM

good questions

B
Bhavya
9/12/2023 7:18:00 AM

help to practice csa exam

M
Malik
9/28/2023 1:09:00 PM

nice tip and well documented

R
rodrigo
6/22/2023 7:55:00 AM

i need the exam

D
Dan
6/29/2023 1:53:00 PM

please upload

A
Ale M
11/22/2023 6:38:00 PM

prepping for fsc exam

A
ahmad hassan
9/6/2023 3:26:00 AM

pd1 with great experience

Ž
Žarko
9/5/2023 3:35:00 AM

@t it seems like azure service bus message quesues could be the best solution

S
Shiji
10/15/2023 1:08:00 PM

helpful to check your understanding.

D
Da Costa
8/27/2023 11:43:00 AM

question 128 the answer should be static not auto

B
bot
7/26/2023 6:45:00 PM

more comments here

K
Kaleemullah
12/31/2023 1:35:00 AM

great support to appear for exams

B
Bsmaind
8/20/2023 9:26:00 AM

useful dumps

B
Blessious Phiri
8/13/2023 8:37:00 AM

making progress

N
Nabla
9/17/2023 10:20:00 AM

q31 answer should be d i think

V
vladputin
7/20/2023 5:00:00 AM

is this real?

N
Nick W
9/29/2023 7:32:00 AM

q10: c and f are also true. q11: this is outdated. you no longer need ownership on a pipe to operate it

N
Naveed
8/28/2023 2:48:00 AM

good questions with simple explanation

C
cert
9/24/2023 4:53:00 PM

admin guide (windows) respond to malicious causality chains. when the cortex xdr agent identifies a remote network connection that attempts to perform malicious activity—such as encrypting endpoint files—the agent can automatically block the ip address to close all existing communication and block new connections from this ip address to the endpoint. when cortex xdrblocks an ip address per endpoint, that address remains blocked throughout all agent profiles and policies, including any host-firewall policy rules. you can view the list of all blocked ip addresses per endpoint from the action center, as well as unblock them to re-enable communication as appropriate. this module is supported with cortex xdr agent 7.3.0 and later. select the action mode to take when the cortex xdr agent detects remote malicious causality chains: enabled (default)—terminate connection and block ip address of the remote connection. disabled—do not block remote ip addresses. to allow specific and known s

Y
Yves
8/29/2023 8:46:00 PM

very inciting

M
Miguel
10/16/2023 11:18:00 AM

question 5, it seems a instead of d, because: - care plan = case - patient = person account - product = product2;

B
Byset
9/25/2023 12:49:00 AM

it look like real one

D
Debabrata Das
8/28/2023 8:42:00 AM

i am taking oracle fcc certification test next two days, pls share question dumps

N
nITA KALE
8/22/2023 1:57:00 AM

i need dumps

C
CV
9/9/2023 1:54:00 PM

its time to comptia sec+

S
SkepticReader
8/1/2023 8:51:00 AM

question 35 has an answer for a different question. i believe the answer is "a" because it shut off the firewall. "0" in registry data means that its false (aka off).

N
Nabin
10/16/2023 4:58:00 AM

helpful content

B
Blessious Phiri
8/15/2023 3:19:00 PM

oracle 19c is complex db

S
Sreenivas
10/24/2023 12:59:00 AM

helpful for practice

L
Liz
9/11/2022 11:27:00 PM

support team is fast and deeply knowledgeable. i appreciate that a lot.

N
Namrata
7/15/2023 2:22:00 AM

helpful questions

L
lipsa
11/8/2023 12:54:00 PM

thanks for question

E
Eli
6/18/2023 11:27:00 PM

the software is provided for free so this is a big change. all other sites are charging for that. also that fucking examtopic site that says free is not free at all. you are hit with a pay-wall.

O
open2exam
10/29/2023 1:14:00 PM

i need exam questions nca 6.5 any help please ?

AI Tutor 👋 I’m here to help!