Microsoft DP-203 Exam (page: 3)
Microsoft Data Engineering on Azure
Updated on: 28-Jul-2025

Viewing Page 3 of 75

You have files and folders in Azure Data Lake Storage Gen2 for an Azure Synapse workspace as shown in the following exhibit.


You create an external table named ExtTable that has LOCATION='/topfolder/'.
When you query ExtTable by using an Azure Synapse Analytics serverless SQL pool, which files are returned?

  1. File2.csv and File3.csv only
  2. File1.csv and File4.csv only
  3. File1.csv, File2.csv, File3.csv, and File4.csv
  4. File1.csv only

Answer(s): B



HOTSPOT (Drag and Drop is not supported)
You are planning the deployment of Azure Data Lake Storage Gen2. You have the following two reports that will access the data lake:

-Report1: Reads three columns from a file that contains 50 columns.
-Report2: Queries a single record based on a timestamp.

You need to recommend in which format to store the data in the data lake to support the reports. The solution must minimize read times. What should you recommend for each report? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:




Report1: CSV
CSV: The destination writes records as delimited data.

Report2: AVRO
AVRO supports timestamps.
Not Parquet, TSV: Not options for Azure Data Lake Storage Gen2.


Reference:

https://streamsets.com/documentation/datacollector/latest/help/datacollector/UserGuide/Destinations/ADLS-G2-D.html



You are designing the folder structure for an Azure Data Lake Storage Gen2 container.

Users will query data by using a variety of services including Azure Databricks and Azure Synapse Analytics serverless SQL pools. The data will be secured by subject area. Most queries will include data from the current year or current month.

Which folder structure should you recommend to support fast queries and simplified folder security?

  1. /{SubjectArea}/{DataSource}/{DD}/{MM}/{YYYY}/{FileData}_{YYYY}_{MM}_{DD}.csv
  2. /{DD}/{MM}/{YYYY}/{SubjectArea}/{DataSource}/{FileData}_{YYYY}_{MM}_{DD}.csv
  3. /{YYYY}/{MM}/{DD}/{SubjectArea}/{DataSource}/{FileData}_{YYYY}_{MM}_{DD}.csv
  4. /{SubjectArea}/{DataSource}/{YYYY}/{MM}/{DD}/{FileData}_{YYYY}_{MM}_{DD}.csv

Answer(s): D

Explanation:

There's an important reason to put the date at the end of the directory structure. If you want to lock down certain regions or subject matters to users/ groups, then you can easily do so with the POSIX permissions. Otherwise, if there was a need to restrict a certain security group to viewing just the UK data or certain planes, with the date structure in front a separate permission would be required for numerous directories under every hour directory.
Additionally, having the date structure in front would exponentially increase the number of directories as time went on.

Note: In IoT workloads, there can be a great deal of data being landed in the data store that spans across numerous products, devices, organizations, and customers. It’s important to pre-plan the directory layout for organization, security, and efficient processing of the data for down-stream consumers. A general template to consider might be the following layout:

{Region}/{SubjectMatter(s)}/{yyyy}/{mm}/{dd}/{hh}/



HOTSPOT (Drag and Drop is not supported)
You need to output files from Azure Data Factory.

Which file format should you use for each type of output? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:


Box 1: Parquet
Parquet stores data in columns, while Avro stores data in a row-based format. By their very nature, column-oriented data stores are optimized for read- heavy analytical workloads, while row-based databases are best for write-heavy transactional workloads.

Box 2: Avro
An Avro schema is created using JSON format. AVRO supports timestamps.

Note: Azure Data Factory supports the following file formats (not GZip or TXT). Avro format
Binary format

Delimited text format Excel format
JSON format ORC format Parquet format XML format


Reference:

https://www.datanami.com/2018/05/16/big-data-file-formats-demystified



HOTSPOT (Drag and Drop is not supported)
You use Azure Data Factory to prepare data to be queried by Azure Synapse Analytics serverless SQL pools.

Files are initially ingested into an Azure Data Lake Storage Gen2 account as 10 small JSON files. Each file contains the same data attributes and data from a subsidiary of your company.

You need to move the files to a different folder and transform the data to meet the following requirements:
-Provide the fastest possible query times.
-Automatically infer the schema from the underlying files.

How should you configure the Data Factory copy activity? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
Hot Area:

  1. See Explanation section for answer.

Answer(s): A

Explanation:


Box 1: Preserver herarchy
Compared to the flat namespace on Blob storage, the hierarchical namespace greatly improves the performance of directory management operations, which improves overall job performance.

Box 2: Parquet
Azure Data Factory parquet format is supported for Azure Data Lake Storage Gen2. Parquet supports the schema property.


Reference:

https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction
https://docs.microsoft.com/en-us/azure/data-factory/format-parquet



Viewing Page 3 of 75



Share your comments for Microsoft DP-203 exam with other users:

Franklin Allagoa 7/5/2023 5:16:00 AM

i want hcia exam dumps
Anonymous


SSA 12/24/2023 1:18:00 PM

good training
Anonymous


BK 8/11/2023 12:23:00 PM

very useful
INDIA


Deepika Narayanan 7/13/2023 11:05:00 PM

yes need this exam dumps
Anonymous


Blessious Phiri 8/15/2023 3:31:00 PM

these questions are a great eye opener
Anonymous


Jagdesh 9/8/2023 8:17:00 AM

thank you for providing these questions and answers. they helped me pass my exam. you guys are great.
CANADA


TS 7/18/2023 3:32:00 PM

good knowledge
Anonymous


Asad Khan 11/1/2023 2:44:00 AM

answer 10 should be a because only a new project will be created & the organization is the same.
Anonymous


Raj 9/12/2023 3:49:00 PM

can you please upload the dump again
UNITED STATES


Christian Klein 6/23/2023 1:32:00 PM

is it legit questions from sap certifications ?
UNITED STATES


anonymous 1/12/2024 3:34:00 PM

question 16 should be b (changing the connector settings on the monitor) pc and monitor were powered on. the lights on the pc are on indicating power. the monitor is showing an error text indicating that it is receiving power too. this is a clear sign of having the wrong input selected on the monitor. thus, the "connector setting" needs to be switched from hdmi to display port on the monitor so it receives the signal from the pc, or the other way around (display port to hdmi).
UNITED STATES


NSPK 1/18/2024 10:26:00 AM

q 10. ans is d (in the target org: open deployment settings, click edit next to the source org. select allow inbound changes and save
Anonymous


mohamed abdo 9/1/2023 4:59:00 AM

very useful
Anonymous


Tom 3/18/2022 8:00:00 PM

i purchased this exam dumps from another website with way more questions but they were all invalid and outdate. this exam dumps was right to the point and all from recent exam. it was a hard pass.
UNITED KINGDOM


Edrick GOP 10/24/2023 6:00:00 AM

it was a good experience and i got 90% in the 200-901 exam.
Anonymous


anonymous 8/10/2023 2:28:00 AM

hi please upload this
Anonymous


Bakir 7/6/2023 7:24:00 AM

please upload it
UNITED KINGDOM


Aman 6/18/2023 1:27:00 PM

really need this dump. can you please help.
UNITED KINGDOM


Neela Para 1/8/2024 6:39:00 PM

really good and covers many areas explaining the answer.
NEW ZEALAND


Karan Patel 8/15/2023 12:51:00 AM

yes, can you please upload the exam?
UNITED STATES


NISHAD 11/7/2023 11:28:00 AM

how many questions are there in these dumps?
UNITED STATES


Pankaj 7/3/2023 3:57:00 AM

hi team, please upload this , i need it.
UNITED STATES


DN 9/4/2023 11:19:00 PM

question 14 - run terraform import: this is the recommended best practice for bringing manually created or destroyed resources under terraform management. you use terraform import to associate an existing resource with a terraform resource configuration. this ensures that terraform is aware of the resource, and you can subsequently manage it with terraform.
Anonymous


Zhiguang 8/19/2023 11:37:00 PM

please upload dump. thanks in advance.
Anonymous


deedee 12/23/2023 5:51:00 PM

great great
UNITED STATES


Asad Khan 11/1/2023 3:10:00 AM

answer 16 should be b your organizational policies require you to use virtual machines directly
Anonymous


Sale Danasabe 10/24/2023 5:21:00 PM

the question are kind of tricky of you didnt get the hnag on it.
Anonymous


Luis 11/16/2023 1:39:00 PM

can anyone tell me if this is for rhel8 or rhel9?
UNITED STATES


hik 1/19/2024 1:47:00 PM

good content
UNITED STATES


Blessious Phiri 8/15/2023 2:18:00 PM

pdb and cdb are critical to the database
Anonymous


Zuned 10/22/2023 4:39:00 AM

till 104 questions are free, lets see how it helps me in my exam today.
UNITED STATES


Muhammad Rawish Siddiqui 12/3/2023 12:11:00 PM

question # 56, answer is true not false.
SAUDI ARABIA


Amaresh Vashishtha 8/27/2023 1:33:00 AM

i would be requiring dumps to prepare for certification exam
Anonymous


Asad 9/8/2023 1:01:00 AM

very helpful
PAKISTAN