The code block displayed below contains an error. The code block is intended to perform an outer join of DataFrames transactionsDf and itemsDf on columns productId and itemId, respectively.
Find the error.
Code block:
transactionsDf.join(itemsDf, [itemsDf.itemId, transactionsDf.productId], "outer")
- The "outer" argument should be eliminated, since "outer" is the default join type.
- The join type needs to be appended to the join() operator, like join().outer() instead of listing it as the last argument inside the join() call.
- The term [itemsDf.itemId, transactionsDf.productId] should be replaced by itemsDf.itemId == transactionsDf.productId.
- The term [itemsDf.itemId, transactionsDf.productId] should be replaced by itemsDf.col("itemId") == transactionsDf.col("productId").
- The "outer" argument should be eliminated from the call and join should be replaced by joinOuter.
Answer(s): C
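Explanation:
The join condition has to compare the two key columns. The list [itemsDf.itemId, transactionsDf.productId] names both columns but does not state that they must be equal, so it needs to be replaced by the expression itemsDf.itemId == transactionsDf.productId. The other options do not hold: the default join type is "inner", not "outer"; the join type is passed as an argument to join() and there is no outer() or joinOuter method; and PySpark DataFrames have no col() method.
Correct code block:
transactionsDf.join(itemsDf, itemsDf.itemId == transactionsDf.productId, "outer")
Static notebook | Dynamic notebook: See test 1, Question: 33 (https://flrs.github.io/spark_practice_tests_code/#1/33.html)
Databricks import instructions: https://bit.ly/sparkpracticeexams_import_instructions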
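To make the expected result of the corrected outer join concrete, here is a minimal plain-Python sketch of full outer join semantics on differently named key columns. It is not how Spark implements joins; the helper full_outer_join, the extra columns transactionId and itemName, and the sample rows are all assumptions made up for illustration.

```python
# Plain-Python illustration of full outer join semantics.
# NOT Spark's implementation; helper name and sample data are hypothetical.

def full_outer_join(left, right, left_key, right_key, left_cols, right_cols):
    """Join two lists of dicts on left[left_key] == right[right_key].

    Rows without a partner on the other side are kept, with None
    filling the missing side's columns. None keys never match.
    """
    rows = []
    matched_right = set()  # indexes of right rows that found a partner
    for l in left:
        found = False
        for i, r in enumerate(right):
            if l[left_key] is not None and l[left_key] == r[right_key]:
                rows.append({**l, **r})
                matched_right.add(i)
                found = True
        if not found:
            # Left row with no match: pad the right-hand columns with None.
            rows.append({**l, **{c: None for c in right_cols}})
    for i, r in enumerate(right):
        if i not in matched_right:
            # Right row with no match: pad the left-hand columns with None.
            rows.append({**{c: None for c in left_cols}, **r})
    return rows


transactions = [
    {"transactionId": 1, "productId": 10},
    {"transactionId": 2, "productId": 99},  # no matching item
]
items = [
    {"itemId": 10, "itemName": "lamp"},
    {"itemId": 20, "itemName": "desk"},  # never sold
]

joined = full_outer_join(
    transactions, items, "productId", "itemId",
    left_cols=["transactionId", "productId"],
    right_cols=["itemId", "itemName"],
)
for row in joined:
    print(row)
```

The sketch yields one matched row (productId 10 with itemId 10) plus one row for each unmatched transaction and item, padded with None. This is the row set the "outer" join type is meant to produce once the condition compares productId with itemId.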