Databricks Certified Associate Developer for Apache Spark Exam Questions in PDF

Free Databricks Certified Associate Developer for Apache Spark Dumps Questions (page: 5)

Which of the following code blocks can be used to save DataFrame transactionsDf to memory only, recalculating partitions that do not fit in memory when they are needed?

  1. from pyspark import StorageLevel
     transactionsDf.cache(StorageLevel.MEMORY_ONLY)
  2. transactionsDf.cache()
  3. transactionsDf.storage_level('MEMORY_ONLY')
  4. transactionsDf.persist()
  5. transactionsDf.clear_persist()
  6. from pyspark import StorageLevel
     transactionsDf.persist(StorageLevel.MEMORY_ONLY)

Answer(s): F (option 6)

Explanation:

from pyspark import StorageLevel
transactionsDf.persist(StorageLevel.MEMORY_ONLY)
Correct. Note that the storage level MEMORY_ONLY means that all partitions that do not fit into memory will be recomputed when they are needed.
from pyspark import StorageLevel
transactionsDf.cache(StorageLevel.MEMORY_ONLY)
Wrong. DataFrame.cache() does not accept a storage level argument.
transactionsDf.cache()
This is wrong because the default storage level of DataFrame.cache() is MEMORY_AND_DISK, meaning that partitions that do not fit into memory are stored on disk.
transactionsDf.persist()
This is wrong because the default storage level of DataFrame.persist() is MEMORY_AND_DISK.
transactionsDf.clear_persist()
Incorrect, since clear_persist() is not a method of DataFrame.
transactionsDf.storage_level('MEMORY_ONLY')
Wrong. storage_level is not a method of DataFrame. The similarly named storageLevel is a read-only property that reports the current storage level.
More info: RDD Programming Guide - Spark 3.0.0 Documentation, pyspark.sql.DataFrame.persist - PySpark 3.0.0 documentation (https://bit.ly/3sxHLVC , https://bit.ly/3j2N6B9)



The code block displayed below contains an error. The code block should create DataFrame itemsAttributesDf which has columns itemId and attribute and lists every attribute from the attributes column in DataFrame itemsDf next to the itemId of the respective row in itemsDf. Find the error.
A sample of DataFrame itemsDf is below.

Code block:
itemsAttributesDf = itemsDf.explode("attributes").alias("attribute").select("attribute", "itemId")

  1. Since itemId is the index, it does not need to be an argument to the select() method.
  2. The alias() method needs to be called after the select() method.
  3. The explode() method expects a Column object rather than a string.
  4. explode() is not a method of DataFrame. explode() should be used inside the select() method instead.
  5. The split() method should be used inside the select() method instead of the explode() method.

Answer(s): D (option 4)

Explanation:

The correct code block looks like this:
itemsAttributesDf = itemsDf.select("itemId", explode("attributes").alias("attribute"))

Then, the first couple of rows of itemsAttributesDf look like this:

explode() is not a method of DataFrame. explode() should be used inside the select() method instead.
This is correct.
The split() method should be used inside the select() method instead of the explode() method.
No, the split() method is used to split strings into parts. However, column attributes is an array of strings. In this case, the explode() method is appropriate.
Since itemId is the index, it does not need to be an argument to the select() method.
No. Spark DataFrames have no index, and itemId must be selected to appear in the result.
The explode() method expects a Column object rather than a string.
No, a string works just fine here. That said, a Column object such as col("attributes") or itemsDf.attributes is also a valid argument.
The alias() method needs to be called after the select() method.
No. alias() must be applied to the explode() expression inside select() so that the exploded column is named attribute.
More info: pyspark.sql.functions.explode - PySpark 3.1.1 documentation (https://bit.ly/2QUZI1J)
Static notebook | Dynamic notebook: See test 1, Question: 22 (Databricks import instructions) (https://flrs.github.io/spark_practice_tests_code/#1/22.html , https://bit.ly/sparkpracticeexams_import_instructions)



Which of the following code blocks reads in parquet file /FileStore/imports.parquet as a DataFrame?

  1. spark.mode("parquet").read("/FileStore/imports.parquet")
  2. spark.read.path("/FileStore/imports.parquet", source="parquet")
  3. spark.read().parquet("/FileStore/imports.parquet")
  4. spark.read.parquet("/FileStore/imports.parquet")
  5. spark.read().format('parquet').open("/FileStore/imports.parquet")

Answer(s): D (option 4)

Explanation:

Static notebook | Dynamic notebook: See test 1, Question: 23 (Databricks import instructions) (https://flrs.github.io/spark_practice_tests_code/#1/23.html , https://bit.ly/sparkpracticeexams_import_instructions)



The code block shown below should convert up to 5 rows in DataFrame transactionsDf that have the value 25 in column storeId into a Python list. Choose the answer that correctly fills the blanks in the code block to accomplish this.
Code block:
transactionsDf.__1__(__2__).__3__(__4__)

  1. 1. filter
     2. "storeId"==25
     3. collect
     4. 5
  2. 1. filter
     2. col("storeId")==25
     3. toLocalIterator
     4. 5
  3. 1. select
     2. storeId==25
     3. head
     4. 5
  4. 1. filter
     2. col("storeId")==25
     3. take
     4. 5
  5. 1. filter
     2. col("storeId")==25
     3. collect
     4. 5

Answer(s): D (option 4)

Explanation:

The correct code block is: transactionsDf.filter(col("storeId")==25).take(5)
Neither option with collect will work because collect() does not take any arguments, and both pass the argument 5. In addition, "storeId"==25 compares a Python string with an integer and evaluates to False, so it is not a column condition.
The option with toLocalIterator will not work because the only argument to toLocalIterator is prefetchPartitions, which is a boolean, so passing 5 here does not make sense. It also returns an iterator rather than a list.
The option using head will not work because storeId==25 refers to an undefined Python name. Even with col("storeId")==25, select() would not filter rows: it would return a boolean column for every row.
Static notebook | Dynamic notebook: See test 1, Question: 24 (Databricks import instructions) (https://flrs.github.io/spark_practice_tests_code/#1/24.html , https://bit.ly/sparkpracticeexams_import_instructions)



Which of the following code blocks reads JSON file imports.json into a DataFrame?

  1. spark.read().mode("json").path("/FileStore/imports.json")
  2. spark.read.format("json").path("/FileStore/imports.json")
  3. spark.read("json", "/FileStore/imports.json")
  4. spark.read.json("/FileStore/imports.json")
  5. spark.read().json("/FileStore/imports.json")

Answer(s): D (option 4)

Explanation:

Static notebook | Dynamic notebook: See test 1, Question: 25 (Databricks import instructions) (https://flrs.github.io/spark_practice_tests_code/#1/25.html , https://bit.ly/sparkpracticeexams_import_instructions)


