A Spark job is taking longer than expected. Using the Spark UI, a data engineer notes that the Min, Median, and Max Durations for tasks in a particular stage show the minimum and median times to complete a task as roughly the same, but the maximum duration as roughly 100 times as long as the minimum. Which situation is causing the increased duration of the overall job?
Answer(s): D
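The symptom in this question, a maximum task duration far above a median that sits close to the minimum, is commonly associated with a few oversized partitions, and a stage cannot finish until its slowest task does. The sketch below is only an illustration of that arithmetic: the function names, the threshold of 10, and the sample durations are all hypothetical and are not taken from the exam or from any Spark API.

```python
from statistics import median


def skew_ratio(task_durations_s):
    """Return the slowest task's duration divided by the median duration."""
    mid = median(task_durations_s)
    if mid <= 0:
        raise ValueError("median duration must be positive")
    return max(task_durations_s) / mid


def looks_skewed(task_durations_s, threshold=10.0):
    """Flag a stage whose slowest task is far above the median.

    The threshold is an arbitrary illustrative value, not a Spark setting.
    """
    return skew_ratio(task_durations_s) >= threshold


# Hypothetical stage: 199 tasks take about 2 s each, one straggler takes 200 s.
durations = [2.0] * 199 + [200.0]

# The stage ends only when the slowest task ends, however fast the rest are.
stage_wall_time_s = max(durations)

print(f"max/median ratio: {skew_ratio(durations):.1f}")
print(f"stage wall time:  {stage_wall_time_s:.1f} s")
print(f"looks skewed:     {looks_skewed(durations)}")
```

With these made-up numbers the 199 fast tasks finish in seconds, yet the stage still takes as long as the single straggler, which is why the Min/Median/Max summary in the Spark UI is a quick first check.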
Each configuration below is identical to the extent that each cluster has 400 GB total of RAM, 160 total cores, and only one Executor per VM. Given a job with at least one wide transformation, which of the following cluster configurations will result in maximum performance?
Answer(s): B
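Because the totals are fixed at 400 GB of RAM and 160 cores with one executor per VM, the configurations can differ only in how many VMs the resources are split across. The sketch below works out the per-executor share for a few VM counts. The candidate counts are hypothetical (the answer options are not reproduced on this page), and the helper name is made up for illustration.

```python
TOTAL_RAM_GB = 400   # from the question
TOTAL_CORES = 160    # from the question


def per_executor_resources(num_vms):
    """Split the fixed cluster totals evenly across num_vms executors (one per VM)."""
    if num_vms <= 0:
        raise ValueError("num_vms must be positive")
    if TOTAL_RAM_GB % num_vms or TOTAL_CORES % num_vms:
        raise ValueError("totals do not divide evenly across this many VMs")
    return {
        "vms": num_vms,
        "ram_gb": TOTAL_RAM_GB // num_vms,
        "cores": TOTAL_CORES // num_vms,
    }


# Hypothetical VM counts, chosen only because they divide the totals evenly.
candidate_vm_counts = [1, 2, 4, 8, 16]
configs = [per_executor_resources(n) for n in candidate_vm_counts]

for cfg in configs:
    print(f"{cfg['vms']:>2} VM(s): {cfg['ram_gb']:>3} GB and {cfg['cores']:>3} cores per executor")
```

Every row has the same RAM-to-core ratio, so the choice comes down to layout: with fewer, larger executors more of a wide transformation's shuffle data typically stays on the same machine, while more, smaller executors limit how much work is lost if one VM fails.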
A junior data engineer has implemented the following code block. The view new_events contains a batch of records with the same schema as the events Delta table. The event_id field serves as a unique key for this table. When this query is executed, what will happen with new records that have the same event_id as an existing record?
A junior data engineer seeks to leverage Delta Lake's Change Data Feed functionality to create a Type 1 table representing all of the values that have ever been valid for all rows in a bronze table created with the property delta.enableChangeDataFeed = true. They plan to execute the following code as a daily job: Which statement describes the execution and results of running the above query multiple times?
A new data engineer notices that a critical field was omitted from an application that writes its Kafka source to Delta Lake. This happened even though the critical field was in the Kafka source. That field was further missing from data written to dependent, long-term storage. The retention threshold on the Kafka service is seven days. The pipeline has been in production for three months. Which describes how Delta Lake can help to avoid data loss of this nature in the future?
Answer(s): E
A nightly job ingests data into a Delta Lake table using the following code: The next step in the pipeline requires a function that returns an object that can be used to manipulate new records that have not yet been processed to the next table in the pipeline. Which code snippet completes this function definition?
def new_records():
Answer(s): A
A junior data engineer is working to implement logic for a Lakehouse table named silver_device_recordings. The source data contains 100 unique fields in a highly nested JSON structure. The silver_device_recordings table will be used downstream to power several production monitoring dashboards and a production model. At present, 45 of the 100 fields are being used in at least one of these applications. The data engineer is trying to determine the best approach for dealing with schema declaration given the highly nested structure of the data and the numerous fields. Which of the following accurately presents information about Delta Lake and Databricks that may impact their decision-making process?
The data engineering team maintains the following code: Assuming that this code produces logically correct results and the data in the source tables has been deduplicated and validated, which statement describes what will occur when this code is executed?
Share your comments for Databricks Certified Data Engineer Professional exam with other users:
nice practice dumps
nokia 4a0-114 dumps
great content and wonderful to have the answers with explanation
for question #118, the answer is option c. the screenshot is showing the drop down, but the answer is marked incorrectly. please update. thanks for sharing such nice questions.
the correct answer for the question 29 is d.
question no 22: correct answers: bc, 1 per session 1 per page 1 per component always
these are pretty useful
awesome
yes please upload
great job whoever put this together, for the greater good! thanks!
just started to view all questions for the exam
helpful material
hope for the best
will post once the exam has finished
really correct and good analysis!
excellent thanks a lot
will post once pass the cka exam
good content
q:32 answer has to be option c
nice questions
i really like the support team in this website. they are fast in communication and very helpful.
a good contemporary exam review
q23, its an array, isnt it? starts with [ and end with ]. its an array of objects, not object.
cool, very helpful
i just passed. this exam dump is the same one from prepaway and examcollection. it has all the real test questions.
is this a valid prince2 practitioner dumps?
all are relatable questions
might help me to prepare for the exam
just paid and downloaded the 2 exams using the 50% sale discount. so far i was able to download the pdf and the test engine. all looks good.
i think it should be a,c. option d goes against the principle of building anything custom unless there are no work arounds available
very legible
is this exam accurate or helpful?
please upload dump, i have exam in 2 days
this is useful