Databricks Certified Associate Developer for Apache Spark 3.0 Exam Questions in PDF

Free Databricks Certified Associate Developer for Apache Spark 3.0 Dumps Questions (page: 6)

Which of the following code blocks returns a DataFrame that has all columns of DataFrame transactionsDf and an additional column predErrorSquared which is the squared value of column predError in DataFrame transactionsDf?

  A. transactionsDf.withColumn("predError", pow(col("predErrorSquared"), 2))
  B. transactionsDf.withColumnRenamed("predErrorSquared", pow(predError, 2))
  C. transactionsDf.withColumn("predErrorSquared", pow(col("predError"), lit(2)))
  D. transactionsDf.withColumn("predErrorSquared", pow(predError, lit(2)))
  E. transactionsDf.withColumn("predErrorSquared", "predError"**2)

Answer(s): C

Explanation:

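While only one of these code blocks works, the DataFrame API is pretty flexible when it comes to accepting columns into the pow() function. The following code blocks would also work:
transactionsDf.withColumn("predErrorSquared", pow("predError", 2))
transactionsDf.withColumn("predErrorSquared", pow("predError", lit(2)))
The other options fail for different reasons: one swaps the two column names, one uses withColumnRenamed(), which only renames a column, one passes predError as a bare Python name instead of a column, and "predError"**2 raises a Python TypeError before Spark is involved.
Static notebook | Dynamic notebook: See test 1, Question 26 (https://flrs.github.io/spark_practice_tests_code/#1/26.html). Databricks import instructions: https://bit.ly/sparkpracticeexams_import_instructions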
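To make the effect of withColumn() concrete, the following is a minimal plain-Python sketch that mimics it on a list of dicts. It does not use Spark. The helper name `with_column` and the sample rows are hypothetical and only illustrate that every existing column is kept and one new column is added. Spark's pow() returns a double, which the sketch imitates with float().

```python
def with_column(rows, name, compute):
    """Mimic df.withColumn(name, expr): keep every column, add or replace `name`."""
    return [{**row, name: compute(row)} for row in rows]


# Hypothetical rows standing in for transactionsDf.
transactions = [
    {"transactionId": 1, "predError": 3},
    {"transactionId": 2, "predError": -6},
]

# Counterpart of withColumn("predErrorSquared", pow(col("predError"), lit(2))).
squared = with_column(
    transactions,
    "predErrorSquared",
    lambda row: float(row["predError"]) ** 2,
)

# The last answer option fails in Python itself: a str cannot be raised to a power.
try:
    "predError" ** 2
    string_power_failed = False
except TypeError:
    string_power_failed = True

for row in squared:
    print(row)
```

The input rows are left untouched, which mirrors the fact that withColumn() returns a new DataFrame instead of modifying transactionsDf.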



The code block displayed below contains an error. The code block should return a new DataFrame that only contains rows from DataFrame transactionsDf in which the value in column predError is at least 5. Find the error.
Code block:
transactionsDf.where("col(predError) >= 5")

  A. The argument to the where method should be "predError >= 5".
  B. Instead of where(), filter() should be used.
  C. The expression returns the original DataFrame transactionsDf and not a new DataFrame. To avoid this, the code block should be transactionsDf.toNewDataFrame().where("col(predError) >= 5").
  D. The argument to the where method cannot be a string.
  E. Instead of >=, the SQL operator GEQ should be used.

Answer(s): A

Explanation:

The argument to the where method should be "predError >= 5".
Correct. A string passed to where() is parsed as a SQL expression, so the column is referenced by its name. col() is a Python function from pyspark.sql.functions and cannot be used inside the string.

The argument to the where method cannot be a string.
It can be a string, no problem here.

Instead of where(), filter() should be used.
No, that does not matter. In PySpark, where() and filter() are equivalent.

Instead of >=, the SQL operator GEQ should be used.
Incorrect.

The expression returns the original DataFrame transactionsDf and not a new DataFrame. To avoid this, the code block should be transactionsDf.toNewDataFrame().where("col(predError) >= 5").
No, Spark returns a new DataFrame.

Static notebook | Dynamic notebook: See test 1, Question 27 (https://flrs.github.io/spark_practice_tests_code/#1/27.html). Databricks import instructions: https://bit.ly/sparkpracticeexams_import_instructions



Which of the following code blocks saves DataFrame transactionsDf in location /FileStore/transactions.csv as a CSV file and throws an error if a file already exists in the location?

  A. transactionsDf.write.save("/FileStore/transactions.csv")
  B. transactionsDf.write.format("csv").mode("error").path("/FileStore/transactions.csv")
  C. transactionsDf.write.format("csv").mode("ignore").path("/FileStore/transactions.csv")
  D. transactionsDf.write("csv").mode("error").save("/FileStore/transactions.csv")
  E. transactionsDf.write.format("csv").mode("error").save("/FileStore/transactions.csv")

Answer(s): E

Explanation:

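Only the last code block names the CSV format, sets the save mode, and calls save() with the path. DataFrameWriter has no path() method, so the two blocks ending in .path(...) fail. transactionsDf.write is a property rather than a method, so write("csv") fails. write.save(...) without format() uses the default data source (Parquet unless configured otherwise) instead of CSV. Mode "error" (also spelled "errorifexists", and the default) raises an error if data already exists at the path, whereas "ignore" silently skips the write.
Static notebook | Dynamic notebook: See test 1, Question 28 (https://flrs.github.io/spark_practice_tests_code/#1/28.html). Databricks import instructions: https://bit.ly/sparkpracticeexams_import_instructions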
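The difference between the save modes can be sketched in plain Python. This is only a model of the mode check, not Spark's implementation: the helper `save_csv` is hypothetical, it writes a single local file (Spark writes a directory of part files), and FileExistsError stands in for the error Spark raises.

```python
import csv
import os
import tempfile

VALID_MODES = {"error", "errorifexists", "ignore", "append", "overwrite"}


def save_csv(rows, path, mode="error"):
    """Model of the save-mode check; returns True if data was written."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown save mode: {mode}")
    if os.path.exists(path):
        if mode in ("error", "errorifexists"):
            # "error" refuses to touch an existing target
            raise FileExistsError(f"path {path} already exists")
        if mode == "ignore":
            # "ignore" silently does nothing when the target exists
            return False
    file_mode = "a" if mode == "append" else "w"
    with open(path, file_mode, newline="") as handle:
        csv.writer(handle).writerows(rows)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "transactions.csv")
    first_write = save_csv([[1, 3.0]], target)               # target does not exist yet
    ignored_write = save_csv([[2, 5.0]], target, "ignore")   # skipped, no error
    try:
        save_csv([[3, 7.0]], target, "error")
        second_write_failed = False
    except FileExistsError:
        second_write_failed = True

print(first_write, ignored_write, second_write_failed)
```

The sketch shows why mode("ignore") does not satisfy the question: the second write is dropped without any error being raised.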



The code block shown below should return a DataFrame with two columns, itemId and col. In this DataFrame, for each element in column attributes of DataFrame itemsDf there should be a separate row in which the column itemId contains the associated itemId from DataFrame itemsDf. The new DataFrame should only contain rows derived from rows in DataFrame itemsDf in which the column attributes contains the element cozy.

A sample of DataFrame itemsDf is below.

Code block:
itemsDf.__1__(__2__).__3__(__4__, __5__(__6__))

  A. 1. filter
     2. array_contains("cozy")
     3. select
     4. "itemId"
     5. explode
     6. "attributes"
  B. 1. where
     2. "array_contains(attributes, 'cozy')"
     3. select
     4. itemId
     5. explode
     6. attributes
  C. 1. filter
     2. "array_contains(attributes, 'cozy')"
     3. select
     4. "itemId"
     5. map
     6. "attributes"
  D. 1. filter
     2. "array_contains(attributes, cozy)"
     3. select
     4. "itemId"
     5. explode
     6. "attributes"
  E. 1. filter
     2. "array_contains(attributes, 'cozy')"
     3. select
     4. "itemId"
     5. explode
     6. "attributes"

Answer(s): E

Explanation:

The correct code block is:
itemsDf.filter("array_contains(attributes, 'cozy')").select("itemId", explode("attributes"))
The key here is understanding how to use array_contains(). You can either use it as an expression in a string, or you can import it from pyspark.sql.functions. In that case, the following would also work:
itemsDf.filter(array_contains("attributes", "cozy")).select("itemId", explode("attributes"))
Static notebook | Dynamic notebook: See test 1, Question 29 (https://flrs.github.io/spark_practice_tests_code/#1/29.html). Databricks import instructions: https://bit.ly/sparkpracticeexams_import_instructions
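The row-level effect of the filter and explode steps can be sketched in plain Python. The sketch does not use Spark. The helper `filter_and_explode` and the sample rows are hypothetical and are not the itemsDf sample from the practice test.

```python
# Hypothetical rows standing in for itemsDf.
items = [
    {"itemId": 1, "attributes": ["blue", "winter", "cozy"]},
    {"itemId": 2, "attributes": ["red", "summer", "fresh"]},
    {"itemId": 3, "attributes": ["green", "cozy"]},
]


def filter_and_explode(rows, element):
    """Mimic filter("array_contains(attributes, element)") followed by
    select("itemId", explode("attributes")) on plain Python rows."""
    result = []
    for row in rows:
        # filter step: keep only rows whose array contains the element
        if element not in row["attributes"]:
            continue
        # explode step: one output row per array element, in a column named "col"
        for attribute in row["attributes"]:
            result.append({"itemId": row["itemId"], "col": attribute})
    return result


exploded = filter_and_explode(items, "cozy")
for row in exploded:
    print(row)
```

Note that the filter selects whole rows: every attribute of a matching item appears in the output, not only cozy, while items without cozy contribute no rows at all.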



The code block displayed below contains an error. The code block should return the average of rows in column value grouped by unique storeId. Find the error.
Code block:
transactionsDf.agg("storeId").avg("value")

  A. Instead of avg("value"), avg(col("value")) should be used.
  B. The avg("value") should be specified as a second argument to agg() instead of being appended to it.
  C. All column names should be wrapped in col() operators.
  D. agg should be replaced by groupBy.
  E. "storeId" and "value" should be swapped.

Answer(s): D

Explanation:

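agg() aggregates over the whole DataFrame without grouping rows, and a DataFrame has no avg() method to chain after it. groupBy("storeId") returns a grouped object whose avg("value") computes one average per distinct storeId. The corrected code block is:
transactionsDf.groupBy("storeId").avg("value")
Static notebook | Dynamic notebook: See test 1, Question 30 (https://flrs.github.io/spark_practice_tests_code/#1/30.html). Databricks import instructions: https://bit.ly/sparkpracticeexams_import_instructions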
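What grouping followed by an average computes can be sketched in plain Python. The sketch does not use Spark and does not model null handling. The helper `group_by_avg` and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows standing in for transactionsDf.
transactions = [
    {"storeId": 25, "value": 4.0},
    {"storeId": 2, "value": 7.0},
    {"storeId": 25, "value": 6.0},
    {"storeId": 3, "value": 1.0},
]


def group_by_avg(rows, key, column):
    """Mimic df.groupBy(key).avg(column): one average per distinct key value."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[key]] += row[column]
        counts[row[key]] += 1
    return {group: sums[group] / counts[group] for group in sums}


averages = group_by_avg(transactions, "storeId", "value")
print(averages)
```

The result has one entry per distinct storeId, which is the grouping step that the erroneous agg("storeId") call never performs.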


