Cloudera CCA175 Exam (page: 3)
Cloudera CCA Spark and Hadoop Developer Exam
Updated on: 12-Feb-2026

Viewing Page 3 of 21

Problem Scenario 71:
Write a Spark script using Python that reads a file "Content.txt" (on HDFS) with the following content.
Split each row into a (key, value) pair, where the key is the first word in the line and the value is the entire line.
Filter out the empty lines.
Save these key-value pairs in "problem86" as a sequence file (on HDFS).
Part 2: Save as a sequence file where the key is null and the entire line is the value. Read back the stored sequence files.
Content.txt
Hello this is ABCTECH.com
This is XYZTECH.com
Apache Spark Training
This is Spark Learning Session

Spark is faster than MapReduce

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
Step 1:
# Import SparkContext and SparkConf
from pyspark import SparkContext, SparkConf

Step 2:
# Load data from HDFS
contentRDD = sc.textFile("Content.txt")

Step 3:
# Keep only non-empty lines (filter out the empty ones)
nonempty_lines = contentRDD.filter(lambda x: len(x) > 0)

Step 4:
# Split each line on the first space (note: the result must be converted to a tuple)
words = nonempty_lines.map(lambda x: tuple(x.split(' ', 1)))
words.saveAsSequenceFile("problem86")

Step 5: Check the contents of directory problem86.
hdfs dfs -cat problem86/part*
Step 6: Create key-value pairs (where the key is null).
nonempty_lines.map(lambda line: (None, line)).saveAsSequenceFile("problem86_1")
Step 7: Read back the sequence file data using Spark.
seqRDD = sc.sequenceFile("problem86_1")
Step 8: Print the content to validate the same.
for line in seqRDD.collect():
print(line)
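
For reference, the steps above can be combined into one self-contained script. This is a minimal sketch, assuming it is run via spark-submit (in the pyspark shell, sc already exists and the SparkContext setup lines should be skipped); the paths and names are the ones used above.

from pyspark import SparkContext, SparkConf

conf = SparkConf().setAppName("Problem71")
sc = SparkContext(conf=conf)

# Load Content.txt from HDFS and drop the empty lines
nonempty_lines = sc.textFile("Content.txt").filter(lambda x: len(x) > 0)

# (first word, rest of line) pairs, saved as a sequence file
nonempty_lines.map(lambda x: tuple(x.split(' ', 1))).saveAsSequenceFile("problem86")

# Part 2: (null, entire line) pairs
nonempty_lines.map(lambda line: (None, line)).saveAsSequenceFile("problem86_1")

# Read back and print to validate
for pair in sc.sequenceFile("problem86_1").collect():
    print(pair)

sc.stop()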



Problem Scenario 12: You have been given the following MySQL database details as well as other info.
user=retail_dba
password=cloudera
database=retail_db
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the following.

1. Create a table in retail_db with the following definition.
CREATE table departments_new (department_id int(11), department_name varchar(45),
created_date TIMESTAMP DEFAULT NOW());
2. Now insert records from the departments table into departments_new.
3. Now import data from the departments_new table to HDFS.
4. Insert the following 5 records into the departments_new table.
Insert into departments_new values(110, "Civil", null);
Insert into departments_new values(111, "Mechanical", null);
Insert into departments_new values(112, "Automobile", null);
Insert into departments_new values(113, "Pharma", null);
Insert into departments_new values(114, "Social Engineering", null);
5. Now do the incremental import based on the created_date column.

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
Step 1: Login to the MySQL db.
mysql --user=retail_dba --password=cloudera
show databases;
use retail_db;
show tables;
Step 2: Create a table as given in the problem statement.
CREATE table departments_new (department_id int(11), department_name varchar(45), created_date TIMESTAMP DEFAULT NOW());
show tables;
Step 3: Insert records from the departments table into departments_new.
insert into departments_new select a.*, null from departments a;
Step 4: Import data from the departments_new table to HDFS.
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--table departments_new \
--target-dir /user/cloudera/departments_new \
--split-by department_id
Step 5: Check the imported data.
hdfs dfs -cat /user/cloudera/departments_new/part*
Step 6: Insert the following 5 records into the departments_new table.
Insert into departments_new values(110, "Civil", null);
Insert into departments_new values(111, "Mechanical", null);
Insert into departments_new values(112, "Automobile", null);
Insert into departments_new values(113, "Pharma", null);
Insert into departments_new values(114, "Social Engineering", null);
commit;
Step 7: Import incremental data based on the created_date column.
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--table departments_new \
--target-dir /user/cloudera/departments_new \
--append \
--check-column created_date \
--incremental lastmodified \
--split-by department_id \
--last-value "2016-01-30 12:07:37.0"
Step 8: Check the imported value.
hdfs dfs -cat /user/cloudera/departments_new/part*
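
As an alternative check, the imported rows can be read back from Spark. A minimal PySpark sketch, assuming the pyspark shell (so sc is already defined) and Sqoop's default comma field delimiter:

# Read all part files produced by the imports above; the columns are
# department_id, department_name, created_date (comma-delimited by default).
rows = sc.textFile("/user/cloudera/departments_new")
for fields in rows.map(lambda line: line.split(",")).collect():
    print(fields)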



Problem Scenario 29: Please accomplish the following exercises using HDFS command line options.

1. Create a directory in hdfs named hdfs_commands.
2. Create a file in hdfs named data.txt in hdfs_commands.
3. Now copy this data.txt file to the local filesystem; however, while copying the file please
make sure the file properties are not changed, e.g. file permissions.
4. Now create a file in local directory named data_local.txt and move this file to hdfs in
hdfs_commands directory.
5. Create a file data_hdfs.txt in hdfs_commands directory and copy it to local file system.
6. Create a file in local filesystem named file1.txt and put it to hdfs

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
Step 1: Create directory
hdfs dfs -mkdir hdfs_commands
Step 2: Create a file in hdfs named data.txt in hdfs_commands.
hdfs dfs -touchz hdfs_commands/data.txt
Step 3: Now copy this data.txt file to the local filesystem; while copying, make sure the file properties (e.g. permissions) are not changed.
hdfs dfs -copyToLocal -p hdfs_commands/data.txt /home/cloudera/Desktop/HadoopExam
Step 4: Now create a file in local directory named data_local.txt and move this file to hdfs in hdfs_commands directory.
touch data_local.txt
hdfs dfs -moveFromLocal /home/cloudera/Desktop/HadoopExam/data_local.txt hdfs_commands/
Step 5: Create a file data_hdfs.txt in hdfs_commands directory and copy it to local file system.
hdfs dfs -touchz hdfs_commands/data_hdfs.txt
hdfs dfs -get hdfs_commands/data_hdfs.txt /home/cloudera/Desktop/HadoopExam/
Step 6: Create a file in the local filesystem named file1.txt and put it to hdfs.
touch file1.txt
hdfs dfs -put /home/cloudera/Desktop/HadoopExam/file1.txt hdfs_commands/
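
If these checks need to be repeated, the commands above can also be scripted. A minimal Python sketch using subprocess, for illustration only; the command strings are exactly the ones listed in Steps 1-6, and the working directory is assumed to be /home/cloudera/Desktop/HadoopExam.

import subprocess

# Each entry mirrors one hdfs/local command from Steps 1-6 above.
commands = [
    "hdfs dfs -mkdir hdfs_commands",
    "hdfs dfs -touchz hdfs_commands/data.txt",
    "hdfs dfs -copyToLocal -p hdfs_commands/data.txt /home/cloudera/Desktop/HadoopExam",
    "touch data_local.txt",
    "hdfs dfs -moveFromLocal /home/cloudera/Desktop/HadoopExam/data_local.txt hdfs_commands/",
    "hdfs dfs -touchz hdfs_commands/data_hdfs.txt",
    "hdfs dfs -get hdfs_commands/data_hdfs.txt /home/cloudera/Desktop/HadoopExam/",
    "touch file1.txt",
    "hdfs dfs -put /home/cloudera/Desktop/HadoopExam/file1.txt hdfs_commands/",
]
for cmd in commands:
    subprocess.run(cmd.split(), check=True)  # stop on the first failing command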



Problem Scenario 86: In continuation of the previous question, please accomplish the following activities.

1. Select the maximum, minimum, average, standard deviation, and total quantity.
2. Select the minimum and maximum price for each product code.
3. Select the maximum, minimum, average, standard deviation, and total quantity for each
product code; however, make sure average and standard deviation have at most two
decimal places.
4. Select all the product codes and average price only where the product count is more than or
equal to 3.
5. Select the maximum, minimum, average, and total of all the products for each code. Also
produce the same across all the products.

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution :
Step 1: Select the maximum, minimum, average, standard deviation, and total quantity.
val results = sqlContext.sql("""SELECT MAX(price) AS MAX, MIN(price) AS MIN, AVG(price) AS Average, STD(price) AS STD, SUM(quantity) AS total_products FROM products""")
results.show()
Step 2: Select the minimum and maximum price for each product code.
val results = sqlContext.sql("""SELECT code, MAX(price) AS `Highest Price`, MIN(price) AS `Lowest Price`
FROM products GROUP BY code""")
results.show()
Step 3: Select the maximum, minimum, average, standard deviation, and total quantity for each product code; however, make sure average and standard deviation have at most two decimal places.
val results = sqlContext.sql("""SELECT code, MAX(price), MIN(price), CAST(AVG(price) AS DECIMAL(7, 2)) AS `Average`, CAST(STD(price) AS DECIMAL(7, 2)) AS `Std Dev`, SUM(quantity) FROM products
GROUP BY code""")
results.show()
Step 4: Select all the product codes and average price only where the product count is more than or equal to 3.
val results = sqlContext.sql("""SELECT code AS `Product Code`, COUNT(*) AS `Count`,
CAST(AVG(price) AS DECIMAL(7, 2)) AS `Average` FROM products GROUP BY code HAVING `Count` >= 3""")
results.show()
Step 5: Select the maximum, minimum, average, and total of all the products for each code.
Also produce the same across all the products.
val results = sqlContext.sql("""SELECT
code,
MAX(price),
MIN(price),
CAST(AVG(price) AS DECIMAL(7, 2)) AS `Average`,
SUM(quantity)
FROM products
GROUP BY code
WITH ROLLUP""")
results.show()
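
For comparison, the WITH ROLLUP aggregation in Step 5 can also be written with the DataFrame API instead of SQL. A minimal PySpark sketch, assuming a DataFrame named products_df (hypothetical name) holding the same code, price, and quantity columns:

from pyspark.sql import functions as F

# rollup("code") produces the per-code groups plus a grand-total row,
# mirroring GROUP BY code WITH ROLLUP in the SQL above.
results = (products_df
           .rollup("code")
           .agg(F.max("price"),
                F.min("price"),
                F.round(F.avg("price"), 2).alias("Average"),
                F.sum("quantity")))
results.show()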



Problem Scenario 9: You have been given the following MySQL database details as well as other info.
user=retail_dba
password=cloudera
database=retail_db
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the following.

1. Import the departments table into a directory.
2. Again import the departments table into the same directory (however, the directory already
exists, hence it should not override but append the results).
3. Also make sure your result fields are terminated by '|' and lines terminated by '\n'.

  1. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution:
Step 1: Clean the HDFS filesystem; if these directories exist, remove them.
hadoop fs -rm -R departments
hadoop fs -rm -R categories
hadoop fs -rm -R products
hadoop fs -rm -R orders
hadoop fs -rm -R order_items
hadoop fs -rm -R customers
Step 2: Now import the departments table as per the requirement.
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--table departments \
--target-dir=departments \
--fields-terminated-by '|' \
--lines-terminated-by '\n' \
-m 1
Step 3: Check the imported data.
hdfs dfs -ls departments
hdfs dfs -cat departments/part-m-00000
Step 4: Now import the data again; it needs to be appended.
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--table departments \
--target-dir departments \
--append \
--fields-terminated-by '|' \
--lines-terminated-by '\n' \
-m 1
Step 5: Check the results again.
hdfs dfs -ls departments
hdfs dfs -cat departments/part-m-00001
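
Because the fields are '|'-terminated, the appended output can also be sanity-checked from Spark. A minimal PySpark sketch, assuming the pyspark shell (sc already defined) and the departments directory imported above:

# Read every part file under departments and split on the '|' field
# terminator used by the sqoop imports above.
rows = sc.textFile("departments").map(lambda line: line.split("|"))
for fields in rows.collect():
    print(fields)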


