Databricks Databricks-Certified-Professional-Data-Engineer Exam (page: 7)
Databricks Certified Data Engineer Professional
Updated on: 02-Jan-2026

A junior data engineer is working to implement logic for a Lakehouse table named silver_device_recordings. The source data contains 100 unique fields in a highly nested JSON structure.

The silver_device_recordings table will be used downstream to power several production monitoring dashboards and a production model. At present, 45 of the 100 fields are being used in at least one of these applications.

The data engineer is trying to determine the best approach for dealing with schema declaration given the highly nested structure of the data and the numerous fields.
Which of the following accurately presents information about Delta Lake and Databricks that may impact their decision-making process?

  A. The Tungsten encoding used by Databricks is optimized for storing string data; newly added native support for querying JSON strings means that string types are always most efficient.
  B. Because Delta Lake uses Parquet for data storage, data types can be easily evolved by just modifying file footer information in place.
  C. Human labor in writing code is the largest cost associated with data engineering workloads; as such, automating table declaration logic should be a priority in all migration workloads.
  D. Because Databricks will infer schema using types that allow all observed data to be processed, setting types manually provides greater assurance of data quality enforcement.
  E. Schema inference and evolution on Databricks ensure that inferred types will always accurately match the data types used by downstream systems.

Answer(s): D
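
Why D holds: Spark's schema inference picks types wide enough to accommodate every observed value, so a single malformed record can silently widen a column for all consumers. Below is a minimal sketch of declaring the schema explicitly instead; the field names and path are assumptions, and `spark` is the ambient SparkSession available in Databricks notebooks.

```python
from pyspark.sql.types import (StructType, StructField, StringType,
                               LongType, TimestampType)

# Hypothetical explicit schema for a nested device-recording payload.
# With a declared schema, values that cannot be read as the stated type
# come back null (or fail the read with mode="FAILFAST") instead of
# silently widening the inferred column type for everyone downstream.
device_schema = StructType([
    StructField("device_id", LongType(), nullable=False),
    StructField("recorded_at", TimestampType(), nullable=False),
    StructField("reading", StructType([
        StructField("metric", StringType()),
        StructField("value", StringType()),
    ])),
])

bronze_df = (spark.read
             .schema(device_schema)                # enforce, do not infer
             .json("/mnt/raw/device_recordings"))  # hypothetical path
```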



The data engineering team maintains the following code:
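
The original code listing is not reproduced here; the sketch below is a hypothetical reconstruction consistent with the answer: a batch job that joins three source tables and fully overwrites the target. All table and column names are assumed.

```python
# Hypothetical reconstruction: join the three validated source tables,
# then recompute and overwrite the target table in full on each run.
enriched = (
    spark.table("accounts")
         .join(spark.table("orders"), "account_id")
         .join(spark.table("order_items"), "order_id")
)

(enriched.write
         .format("delta")
         .mode("overwrite")     # full recompute, full overwrite
         .saveAsTable("enriched_itemized_orders_by_account"))
```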


Assuming that this code produces logically correct results and the data in the source tables has been de-duplicated and validated, which statement describes what will occur when this code is executed?

  A. A batch job will update the enriched_itemized_orders_by_account table, replacing only those rows that have different values than the current version of the table, using accountID as the primary key.
  B. The enriched_itemized_orders_by_account table will be overwritten using the current valid version of data in each of the three tables referenced in the join logic.
  C. An incremental job will leverage information in the state store to identify unjoined rows in the source tables and write these rows to the enriched_itemized_orders_by_account table.
  D. An incremental job will detect if new rows have been written to any of the source tables; if new rows are detected, all results will be recalculated and used to overwrite the enriched_itemized_orders_by_account table.
  E. No computation will occur until enriched_itemized_orders_by_account is queried; upon query materialization, results will be calculated using the current valid version of data in each of the three tables referenced in the join logic.

Answer(s): B



The data engineering team is migrating an enterprise system with thousands of tables and views into the Lakehouse. They plan to implement the target architecture using a series of bronze, silver, and gold tables. Bronze tables will almost exclusively be used by production data engineering workloads, while silver tables will be used to support both data engineering and machine learning workloads. Gold tables will largely serve business intelligence and reporting purposes. While personal identifying information (PII) exists in all tiers of data, pseudonymization and anonymization rules are in place for all data at the silver and gold levels.
The organization is interested in reducing security concerns while maximizing the ability to collaborate across diverse teams.
Which statement exemplifies best practices for implementing this system?

  A. Isolating tables in separate databases based on data quality tiers allows for easy permissions management through database ACLs and allows physical separation of default storage locations for managed tables.
  B. Because databases on Databricks are merely a logical construct, choices around database organization do not impact security or discoverability in the Lakehouse.
  C. Storing all production tables in a single database provides a unified view of all data assets available throughout the Lakehouse, simplifying discoverability by granting all users view privileges on this database.
  D. Working in the default Databricks database provides the greatest security when working with managed tables, as these will be created in the DBFS root.
  E. Because all tables must live in the same storage containers used for the database they're created in, organizations should be prepared to create between dozens and thousands of databases depending on their data isolation requirements.

Answer(s): A
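
A sketch of the pattern option A describes: one database per quality tier, each with its own default storage location, plus database-level grants. The database names, storage paths, and group names are assumptions, and the exact GRANT syntax depends on whether the workspace uses legacy table ACLs or Unity Catalog.

```python
# Hypothetical tier-per-database layout: each database gets its own default
# storage location, so managed tables in different tiers are physically
# separated, and permissions are managed at the database level.
for tier, path in [
    ("bronze", "abfss://bronze@lakeacct.dfs.core.windows.net/db"),
    ("silver", "abfss://silver@lakeacct.dfs.core.windows.net/db"),
    ("gold",   "abfss://gold@lakeacct.dfs.core.windows.net/db"),
]:
    spark.sql(f"CREATE DATABASE IF NOT EXISTS {tier} LOCATION '{path}'")

# Coarse-grained access per tier via database ACLs (group names assumed).
spark.sql("GRANT USAGE, SELECT ON DATABASE silver TO `ml_engineers`")
spark.sql("GRANT USAGE, SELECT ON DATABASE gold TO `bi_analysts`")
```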



The data architect has mandated that all tables in the Lakehouse should be configured as external Delta Lake tables.
Which approach will ensure that this requirement is met?

  A. Whenever a database is being created, make sure that the LOCATION keyword is used.
  B. When configuring an external data warehouse for all table storage, leverage Databricks for all ELT.
  C. Whenever a table is being created, make sure that the LOCATION keyword is used.
  D. When tables are created, make sure that the EXTERNAL keyword is used in the CREATE TABLE statement.
  E. When the workspace is being configured, make sure that external cloud object storage has been mounted.

Answer(s): C
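
The behavior option C relies on: specifying a LOCATION in the CREATE TABLE statement is what makes a Delta table external (unmanaged) on Databricks, so dropping the table leaves the underlying files in place. A minimal sketch, with the table name and storage path assumed:

```python
# Supplying LOCATION at creation time makes this an external (unmanaged)
# Delta table: the metastore records the path, but the data files are not
# owned by Databricks and survive a DROP TABLE.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sensor_events (
        event_id  BIGINT,
        payload   STRING
    )
    USING DELTA
    LOCATION 's3://company-lake/sensor_events'
""")
```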



To reduce storage and compute costs, the data engineering team has been tasked with curating a series of aggregate tables leveraged by business intelligence dashboards, customer-facing applications, production machine learning models, and ad hoc analytical queries.

The data engineering team has been made aware of new requirements from a customer-facing application, which is the only downstream workload they manage entirely. As a result, an aggregate table used by numerous teams across the organization will need to have a number of fields renamed, and additional fields will also be added.

Which solution addresses the situation with minimal disruption to other teams in the organization, without increasing the number of tables that need to be managed?

  A. Send all users notice that the schema for the table will be changing; include in the communication the logic necessary to revert the new table schema to match historic queries.
  B. Configure a new table with all the requisite fields and new names and use this as the source for the customer-facing application; create a view that maintains the original data schema and table name by aliasing select fields from the new table.
  C. Create a new table with the required schema and new fields and use Delta Lake's deep clone functionality to sync up changes committed to one table to the corresponding table.
  D. Replace the current table definition with a logical view defined with the query logic currently writing the aggregate table; create a new table to power the customer-facing application.
  E. Add a table comment warning all users that the table schema and field names will be changing on a given date; overwrite the table in place to the specifications of the customer-facing application.

Answer(s): B
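
A sketch of option B's pattern: materialize the new schema in a new table for the customer-facing application, then preserve the old contract as a view under the original table name. A view does not add to the count of tables to manage, and every other team's queries keep working unchanged. All table, column, and field names here are assumptions.

```python
# 1. New table with the renamed fields plus the added one (names assumed);
#    this becomes the source for the customer-facing application.
spark.sql("""
    CREATE OR REPLACE TABLE account_order_rollups_v2 AS
    SELECT account_id,
           order_total AS gross_order_total,      -- renamed field
           item_count,
           CAST(NULL AS STRING) AS loyalty_tier   -- newly added field
    FROM account_order_rollups
""")

# 2. Retire the old table, then recreate its name as a view that aliases
#    the renamed columns back to their historic names.
spark.sql("DROP TABLE account_order_rollups")
spark.sql("""
    CREATE VIEW account_order_rollups AS
    SELECT account_id,
           gross_order_total AS order_total,
           item_count
    FROM account_order_rollups_v2
""")
```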





