Cloudera CCA175 Exam (page: 1)
Cloudera CCA Spark and Hadoop Developer Exam
Updated on: 11-Aug-2025

Viewing Page 1 of 21

Problem Scenario 28: You need to implement a near-real-time solution for collecting information as it is submitted in files, with the below data:
echo "IBM, 100, 20160104" >> /tmp/spooldir2/.bb.txt
echo "IBM, 103, 20160105" >> /tmp/spooldir2/.bb.txt
mv /tmp/spooldir2/.bb.txt /tmp/spooldir2/bb.txt
After a few minutes:
echo "IBM, 100.2, 20160104" >> /tmp/spooldir2/.dr.txt
echo "IBM, 103.1, 20160105" >> /tmp/spooldir2/.dr.txt
mv /tmp/spooldir2/.dr.txt /tmp/spooldir2/dr.txt
You have been given the below directory location (if not available, then create it): /tmp/spooldir2.
As soon as a file is committed in this directory, it needs to be available in HDFS in the /tmp/flume/primary as well as the /tmp/flume/secondary location.
However, note that /tmp/flume/secondary is optional; if a transaction writing to this directory fails, it need not be rolled back.
Write a Flume configuration file named flume8.conf and use it to load the data into HDFS, with the following additional properties:

1. Spool the /tmp/spooldir2 directory
2. The file prefix in HDFS should be events
3. The file suffix should be .log
4. If a file is not yet committed and still in use, it should have _ as a prefix
5. Data should be written to HDFS as text

A. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution:
Step 1: Create the directory:
mkdir /tmp/spooldir2
Step 2: Create the Flume configuration file with the below source, sink and channel configuration, and save it as flume8.conf.
agent1.sources = source1
agent1.sinks = sink1a sink1b
agent1.channels = channel1a channel1b
agent1.sources.source1.channels = channel1a channel1b
agent1.sources.source1.selector.type = replicating
agent1.sources.source1.selector.optional = channel1b
agent1.sinks.sink1a.channel = channel1a
agent1.sinks.sink1b.channel = channel1b
agent1.sources.source1.type = spooldir
agent1.sources.source1.spoolDir = /tmp/spooldir2
agent1.sinks.sink1a.type = hdfs
agent1.sinks.sink1a.hdfs.path = /tmp/flume/primary
agent1.sinks.sink1a.hdfs.filePrefix = events
agent1.sinks.sink1a.hdfs.fileSuffix = .log
agent1.sinks.sink1a.hdfs.inUsePrefix = _
agent1.sinks.sink1a.hdfs.fileType = DataStream
agent1.sinks.sink1b.type = hdfs
agent1.sinks.sink1b.hdfs.path = /tmp/flume/secondary
agent1.sinks.sink1b.hdfs.filePrefix = events
agent1.sinks.sink1b.hdfs.fileSuffix = .log
agent1.sinks.sink1b.hdfs.inUsePrefix = _
agent1.sinks.sink1b.hdfs.fileType = DataStream
agent1.channels.channel1a.type = file
agent1.channels.channel1b.type = memory
Note: selector.optional = channel1b marks the secondary path as optional, and channel1b is a memory channel, so a failed write to /tmp/flume/secondary is not rolled back, while the file channel keeps the primary path durable; hdfs.inUsePrefix = _ covers requirement 4, and fileType = DataStream writes the events as plain text.
Step 3: Run the below command, which uses this configuration file to append data into HDFS.
Start the Flume agent:
flume-ng agent --conf /home/cloudera/flumeconf --conf-file /home/cloudera/flumeconf/flume8.conf --name agent1
Step 4: Open another terminal and create files in /tmp/spooldir2/:
echo "IBM, 100, 20160104" >> /tmp/spooldir2/.bb.txt
echo "IBM, 103, 20160105" >> /tmp/spooldir2/.bb.txt
mv /tmp/spooldir2/.bb.txt /tmp/spooldir2/bb.txt
After a few minutes:
echo "IBM, 100.2, 20160104" >> /tmp/spooldir2/.dr.txt
echo "IBM, 103.1, 20160105" >> /tmp/spooldir2/.dr.txt
mv /tmp/spooldir2/.dr.txt /tmp/spooldir2/dr.txt



Problem Scenario 83: In continuation of the previous question, please accomplish the following activities.

1. Select all the records with quantity >= 5000 and name starting with 'Pen'
2. Select all the records with quantity >= 5000, price less than 1.24, and name starting with 'Pen'
3. Select all the records which do not have quantity >= 5000 and name starting with 'Pen'
4. Select all the products whose name is 'Pen Red' or 'Pen Black'
5. Select all the products with price BETWEEN 1.0 AND 2.0 AND quantity BETWEEN 1000 AND 2000.

A. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution:
Step 1: Select all the records with quantity >= 5000 and name starting with 'Pen'
val results = sqlContext.sql("""SELECT * FROM products WHERE quantity >= 5000 AND name LIKE 'Pen %'""")
results.show()
Step 2: Select all the records with quantity >= 5000, price less than 1.24, and name starting with 'Pen'
val results = sqlContext.sql("""SELECT * FROM products WHERE quantity >= 5000 AND price < 1.24 AND name LIKE 'Pen %'""")
results.show()
Step 3: Select all the records which do not have quantity >= 5000 and name starting with 'Pen'
val results = sqlContext.sql("""SELECT * FROM products WHERE NOT (quantity >= 5000 AND name LIKE 'Pen %')""")
results.show()
Step 4: Select all the products whose name is 'Pen Red' or 'Pen Black'
val results = sqlContext.sql("""SELECT * FROM products WHERE name IN ('Pen Red', 'Pen Black')""")
results.show()
Step 5: Select all the products with price BETWEEN 1.0 AND 2.0 AND quantity BETWEEN 1000 AND 2000
val results = sqlContext.sql("""SELECT * FROM products WHERE (price BETWEEN 1.0 AND 2.0) AND (quantity BETWEEN 1000 AND 2000)""")
results.show()
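The same filters can also be written with the DataFrame API instead of raw SQL. A minimal spark-shell sketch of Steps 1 and 3 under that approach, assuming the products table is registered as in the previous exercise:

val products = sqlContext.table("products")
// Step 1 equivalent: quantity >= 5000 and name starting with 'Pen '
products.filter("quantity >= 5000 AND name LIKE 'Pen %'").show()
// Step 3 equivalent: the negation of the same predicate
products.filter("NOT (quantity >= 5000 AND name LIKE 'Pen %')").show()

Either form is acceptable here, since filter() accepts the same SQL expression syntax used in the queries above.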



Problem Scenario 82: You have been given a table in Hive with the following structure (which you created in the previous exercise):
productid int, code string, name string, quantity int, price float
Using Spark SQL, accomplish the following activities.

1. Select the name and quantity of all the products having quantity <= 2000
2. Select the name and price of the product having code 'PEN'
3. Select all the products whose name starts with PENCIL
4. Select all the products whose name begins with 'P', followed by any two characters, followed by a space, followed by zero or more characters

A. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution:
Step 1: Copy the following file (a mandatory step on the Cloudera QuickStart VM) if you have not already done so:
sudo su root
cp /usr/lib/hive/conf/hive-site.xml /usr/lib/spark/conf/
Step 2: Now start the spark-shell:
spark-shell
Step 3: Select the name and quantity of all the products having quantity <= 2000
val results = sqlContext.sql("""SELECT name, quantity FROM products WHERE quantity <= 2000""")
results.show()
Step 4: Select the name and price of the product having code 'PEN'
val results = sqlContext.sql("""SELECT name, price FROM products WHERE code = 'PEN'""")
results.show()
Step 5: Select all the products whose name starts with PENCIL
val results = sqlContext.sql("""SELECT name, price FROM products WHERE upper(name) LIKE 'PENCIL%'""")
results.show()
Step 6: Select all the products whose name begins with 'P', followed by any two characters, followed by a space, followed by zero or more characters (LIKE pattern 'P__ %', with two underscores)
val results = sqlContext.sql("""SELECT name, price FROM products WHERE name LIKE 'P__ %'""")
results.show()
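As a quick check of the LIKE wildcards used in Step 6 ('_' matches exactly one character, '%' matches zero or more), the pattern 'P__ %' can be tested with the equivalent regular expression in the spark-shell; the sample names below are made up for illustration:

Seq("Pen Red", "Pencil 3B", "Pe Red").foreach { n =>
  println(n + " -> " + n.matches("P.. .*"))   // regex P.. .* mirrors LIKE 'P__ %'
}
// Pen Red -> true, Pencil 3B -> false, Pe Red -> false

Only "Pen Red" has 'P', exactly two characters, then a space, so only it satisfies the pattern.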



Problem Scenario 20: You have been given a MySQL DB with the following details.
user=retail_dba
password=cloudera
database=retail_db
table=retail_db.categories
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the following activities.

1. Write a Sqoop job which will import the "retail_db.categories" table to HDFS, into a directory named "categories_targetJob".

A. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution:
Step 1: Connect to the existing MySQL database:
mysql --user=retail_dba --password=cloudera retail_db
Step 2: Show all the available tables:
show tables;
Step 3: Below is the command to create the Sqoop job (please note that the space between -- and import is mandatory):
sqoop job --create sqoopjob \
-- import \
--connect "jdbc:mysql://quickstart:3306/retail_db" \
--username=retail_dba \
--password=cloudera \
--table categories \
--target-dir categories_targetJob \
--fields-terminated-by '|' \
--lines-terminated-by '\n'
Step 4: List all the Sqoop jobs:
sqoop job --list
Step 5: Show the details of the Sqoop job:
sqoop job --show sqoopjob
Step 6: Execute the Sqoop job:
sqoop job --exec sqoopjob
Step 7: Check the output of the import job:
hdfs dfs -ls categories_targetJob
hdfs dfs -cat categories_targetJob/part*



Problem Scenario 59: You have been given the below code snippet.
val x = sc.parallelize(1 to 20)
val y = sc.parallelize(10 to 30)
operation1
z.collect
Write a correct code snippet for operation1 which will produce the desired output shown below.
Array[Int] = Array(16, 12, 20, 13, 17, 14, 18, 10, 19, 15, 11)

A. See the explanation for Step by Step Solution and configuration.

Answer(s): A

Explanation:

Solution:
val z = x.intersection(y)
intersection: returns the distinct elements that appear in both RDDs.
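Worth noting: intersection also removes duplicates (each common element appears once in the result) and, like most RDD transformations, gives no ordering guarantee, which is why the expected Array above is unordered. A minimal spark-shell sketch of both behaviours:

val a = sc.parallelize(Seq(1, 2, 2, 3))
val b = sc.parallelize(Seq(2, 2, 3, 4))
a.intersection(b).collect()   // Array(2, 3) in some order; duplicates removed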





