Databricks Certified Associate Developer for Apache Spark 3.0 Exam (page 5)
Databricks Certified Associate Developer for Apache Spark
Updated on: 09-Apr-2026

Which of the following code blocks can be used to save DataFrame transactionsDf to memory only, recalculating partitions that do not fit in memory when they are needed?

  1. from pyspark import StorageLevel
     transactionsDf.cache(StorageLevel.MEMORY_ONLY)
  2. transactionsDf.cache()
  3. transactionsDf.storage_level('MEMORY_ONLY')
  4. transactionsDf.persist()
  5. transactionsDf.clear_persist()
  6. from pyspark import StorageLevel
     transactionsDf.persist(StorageLevel.MEMORY_ONLY)

Answer(s): F

Explanation:

from pyspark import StorageLevel
transactionsDf.persist(StorageLevel.MEMORY_ONLY)
Correct. Note that the storage level MEMORY_ONLY means that all partitions that do not fit into memory will be recomputed when they are needed.
transactionsDf.cache()
This is wrong because the default storage level of DataFrame.cache() is MEMORY_AND_DISK, meaning that partitions that do not fit into memory are stored on disk.
transactionsDf.persist()
This is wrong because the default storage level of DataFrame.persist() is MEMORY_AND_DISK.
transactionsDf.clear_persist()
Incorrect, since clear_persist() is not a method of DataFrame.
transactionsDf.storage_level('MEMORY_ONLY')
Wrong. storage_level is not a method of DataFrame.
More info: RDD Programming Guide - Spark 3.0.0 Documentation, pyspark.sql.DataFrame.persist —
PySpark 3.0.0 documentation (https://bit.ly/3sxHLVC , https://bit.ly/3j2N6B9)



The code block displayed below contains an error. The code block should create DataFrame itemsAttributesDf which has columns itemId and attribute and lists every attribute from the attributes column in DataFrame itemsDf next to the itemId of the respective row in itemsDf. Find the error.
A sample of DataFrame itemsDf is below.

Code block:
itemsAttributesDf = itemsDf.explode("attributes").alias("attribute").select("attribute", "itemId")

  1. Since itemId is the index, it does not need to be an argument to the select() method.
  2. The alias() method needs to be called after the select() method.
  3. The explode() method expects a Column object rather than a string.
  4. explode() is not a method of DataFrame. explode() should be used inside the select() method instead.
  5. The split() method should be used inside the select() method instead of the explode() method.

Answer(s): D

Explanation:

The correct code block looks like this:

from pyspark.sql.functions import explode
itemsAttributesDf = itemsDf.select("itemId", explode("attributes").alias("attribute"))

Then, the first couple of rows of itemsAttributesDf look like this:

explode() is not a method of DataFrame. explode() should be used inside the select() method instead.
This is correct.
The split() method should be used inside the select() method instead of the explode() method.
No, the split() method is used to split strings into parts. However, column attributes is an array of strings, so the explode() method is appropriate here.
Since itemId is the index, it does not need to be an argument to the select() method.
No, itemId still needs to be selected, whether it is used as an index or not.
The explode() method expects a Column object rather than a string.
No, a string works just fine here. That said, there are valid alternatives to passing in a string, namely passing a Column object instead: explode(col("attributes")) or explode(itemsDf.attributes).

The alias() method needs to be called after the select() method.
No.
More info: pyspark.sql.functions.explode — PySpark 3.1.1 documentation (https://bit.ly/2QUZI1J) Static notebook | Dynamic notebook: See test 1, Question: 22 (
Databricks import instructions) (https://flrs.github.io/spark_practice_tests_code/#1/22.html , https://bit.ly/sparkpracticeexams_import_instructions)



Which of the following code blocks reads in parquet file /FileStore/imports.parquet as a DataFrame?

  1. spark.mode("parquet").read("/FileStore/imports.parquet")
  2. spark.read.path("/FileStore/imports.parquet", source="parquet")
  3. spark.read().parquet("/FileStore/imports.parquet")
  4. spark.read.parquet("/FileStore/imports.parquet")
  5. spark.read().format('parquet').open("/FileStore/imports.parquet")

Answer(s): D

Explanation:

Static notebook | Dynamic notebook: See test 1, Question: 23 (
Databricks import instructions) (https://flrs.github.io/spark_practice_tests_code/#1/23.html ,
https://bit.ly/sparkpracticeexams_import_instructions)



The code block shown below should convert up to 5 rows in DataFrame transactionsDf that have the value 25 in column storeId into a Python list. Choose the answer that correctly fills the blanks in the code block to accomplish this. Code block:
transactionsDf. 1 ( 2 ). 3 ( 4 )

  1. 1. filter
    2. "storeId"==25
    3. collect
    4. 5
  2. 1. filter
    2. col("storeId")==25
    3. toLocalIterator
    4. 5
  3. 1. select
    2. storeId==25
    3. head
    4. 5
  4. 1. filter
    2. col("storeId")==25
    3. take
    4. 5
  5. 1. filter
    2. col("storeId")==25
    3. collect
    4. 5

Answer(s): D

Explanation:

The correct code block is: transactionsDf.filter(col("storeId")==25).take(5)
Any of the options with collect will not work because collect does not take any arguments, and in both cases the argument 5 is given.
The option with toLocalIterator will not work because the only argument to toLocalIterator is prefetchPartitions which is a boolean, so passing 5 here does not make sense.
The option using head will not work because the expression passed to select() is not proper syntax. It would work if the expression were col("storeId")==25.
Static notebook | Dynamic notebook: See test 1, Question: 24 (
Databricks import instructions) (https://flrs.github.io/spark_practice_tests_code/#1/24.html ,
https://bit.ly/sparkpracticeexams_import_instructions)



Which of the following code blocks reads JSON file imports.json into a DataFrame?

  1. spark.read().mode("json").path("/FileStore/imports.json")
  2. spark.read.format("json").path("/FileStore/imports.json")
  3. spark.read("json", "/FileStore/imports.json")
  4. spark.read.json("/FileStore/imports.json")
  5. spark.read().json("/FileStore/imports.json")

Answer(s): D

Explanation:

Static notebook | Dynamic notebook: See test 1, Question: 25 (
Databricks import instructions) (https://flrs.github.io/spark_practice_tests_code/#1/25.html ,
https://bit.ly/sparkpracticeexams_import_instructions)





