Cloudera CCA175 Exam (page: 1)
Cloudera CCA Spark and Hadoop Developer Exam
Updated on: 25-Sep-2025


Problem Scenario 28: You need to implement a near real-time solution for collecting information as it is submitted in files, with the following data:
echo "IBM, 100, 20160104" >> /tmp/spooldir2/.bb.txt
echo "IBM, 103, 20160105" >> /tmp/spooldir2/.bb.txt
mv /tmp/spooldir2/.bb.txt /tmp/spooldir2/bb.txt
After a few minutes:
echo "IBM, 100.2, 20160104" >> /tmp/spooldir2/.dr.txt
echo "IBM, 103.1, 20160105" >> /tmp/spooldir2/.dr.txt
mv /tmp/spooldir2/.dr.txt /tmp/spooldir2/dr.txt
You have been given the directory location /tmp/spooldir2 (if it does not exist, create it).
As soon as a file is committed to this directory, it must be made available in HDFS under both /tmp/flume/primary and /tmp/flume/secondary.
However, note that /tmp/flume/secondary is optional: if the transaction that writes to this directory fails, it need not be rolled back.
Write a Flume configuration file named flume8.conf and use it to load the data into HDFS with the following additional properties.

1. Spool the /tmp/spooldir2 directory.
2. The file prefix in HDFS should be events.
3. The file suffix should be .log.
4. While a file is still in use (not yet committed), it should have _ as a prefix.
5. Data should be written to HDFS as text.

A. See the explanation for the step-by-step solution and configuration.

Answer(s): A

Explanation:

Solution:
Step 1: Create the directory: mkdir /tmp/spooldir2
Step 2: Create the Flume configuration file with the source, sink, and channel configuration below and save it as flume8.conf.
agent1.sources = source1
agent1.sinks = sink1a sink1b
agent1.channels = channel1a channel1b
agent1.sources.source1.channels = channel1a channel1b
agent1.sources.source1.selector.type = replicating
agent1.sources.source1.selector.optional = channel1b
agent1.sinks.sink1a.channel = channel1a
agent1.sinks.sink1b.channel = channel1b
agent1.sources.source1.type = spooldir
agent1.sources.source1.spoolDir = /tmp/spooldir2
agent1.sinks.sink1a.type = hdfs
agent1.sinks.sink1a.hdfs.path = /tmp/flume/primary
agent1.sinks.sink1a.hdfs.filePrefix = events
agent1.sinks.sink1a.hdfs.fileSuffix = .log
# requirement 4: files that are still being written keep an underscore prefix
agent1.sinks.sink1a.hdfs.inUsePrefix = _
agent1.sinks.sink1a.hdfs.fileType = DataStream
agent1.sinks.sink1b.type = hdfs
agent1.sinks.sink1b.hdfs.path = /tmp/flume/secondary
agent1.sinks.sink1b.hdfs.filePrefix = events
agent1.sinks.sink1b.hdfs.fileSuffix = .log
agent1.sinks.sink1b.hdfs.inUsePrefix = _
agent1.sinks.sink1b.hdfs.fileType = DataStream
agent1.channels.channel1a.type = file
agent1.channels.channel1b.type = memory
Step 3: Run the command below, which uses this configuration file and appends data into HDFS.
Start the Flume agent:
flume-ng agent --conf /home/cloudera/flumeconf --conf-file /home/cloudera/flumeconf/flume8.conf --name agent1
Step 4: Open another terminal and create files in /tmp/spooldir2/:
echo "IBM, 100, 20160104" >> /tmp/spooldir2/.bb.txt
echo "IBM, 103, 20160105" >> /tmp/spooldir2/.bb.txt
mv /tmp/spooldir2/.bb.txt /tmp/spooldir2/bb.txt
After a few minutes:
echo "IBM, 100.2, 20160104" >> /tmp/spooldir2/.dr.txt
echo "IBM, 103.1, 20160105" >> /tmp/spooldir2/.dr.txt
mv /tmp/spooldir2/.dr.txt /tmp/spooldir2/dr.txt



Problem Scenario 83: In continuation of the previous question, please accomplish the following activities.

1. Select all the records with quantity >= 5000 and name starting with 'Pen'
2. Select all the records with quantity >= 5000, price less than 1.24, and name starting with 'Pen'
3. Select all the records which do not have both quantity >= 5000 and a name starting with 'Pen'
4. Select all the products whose name is 'Pen Red' or 'Pen Black'
5. Select all the products which have price BETWEEN 1.0 AND 2.0 AND quantity BETWEEN 1000 AND 2000.

A. See the explanation for the step-by-step solution and configuration.

Answer(s): A

Explanation:

Solution:
Step 1: Select all the records with quantity >= 5000 and name starting with 'Pen'
val results = sqlContext.sql("""SELECT * FROM products WHERE quantity >= 5000 AND name LIKE 'Pen %'""")
results.show()
Step 2: Select all the records with quantity >= 5000, price less than 1.24, and name starting with 'Pen'
val results = sqlContext.sql("""SELECT * FROM products WHERE quantity >= 5000 AND price < 1.24 AND name LIKE 'Pen %'""")
results.show()
Step 3: Select all the records which do not have both quantity >= 5000 and a name starting with 'Pen'
val results = sqlContext.sql("""SELECT * FROM products WHERE NOT (quantity >= 5000 AND name LIKE 'Pen %')""")
results.show()
Step 4: Select all the products whose name is 'Pen Red' or 'Pen Black'
val results = sqlContext.sql("""SELECT * FROM products WHERE name IN ('Pen Red', 'Pen Black')""")
results.show()
Step 5: Select all the products which have price BETWEEN 1.0 AND 2.0 AND quantity BETWEEN 1000 AND 2000.
val results = sqlContext.sql("""SELECT * FROM products WHERE (price BETWEEN 1.0 AND 2.0) AND (quantity BETWEEN 1000 AND 2000)""")
results.show()
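For reference, the same filters can also be written with the DataFrame API instead of SQL strings. A minimal sketch in Spark 1.6 style (as on the Cloudera QuickStart VM), assuming the products table is already visible to the same sqlContext used above:

// Equivalent of Step 1 with the DataFrame API
val products = sqlContext.table("products")
val penLarge = products.filter(products("quantity") >= 5000 && products("name").startsWith("Pen "))
penLarge.show()
// Equivalent of Step 4: name is one of a fixed set of values
val redOrBlack = products.filter(products("name").isin("Pen Red", "Pen Black"))
redOrBlack.show()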



Problem Scenario 82: You have been given a table in Hive with the following structure (which you created in a previous exercise):
productid int, code string, name string, quantity int, price float
Using SparkSQL, accomplish the following activities.

1. Select the name and quantity of all products having quantity <= 2000
2. Select the name and price of the product having code 'PEN'
3. Select all the products whose name starts with PENCIL
4. Select all products whose "name" begins with 'P', followed by any two characters, followed by a space, followed by zero or more characters

A. See the explanation for the step-by-step solution and configuration.

Answer(s): A

Explanation:

Solution:
Step 1: Copy the following file (a mandatory step on the Cloudera QuickStart VM) if you have not already done so:
sudo su root
cp /usr/lib/hive/conf/hive-site.xml /usr/lib/spark/conf/
Step 2: Now start spark-shell
Step 3: Select the name and quantity of all products having quantity <= 2000
val results = sqlContext.sql("""SELECT name, quantity FROM products WHERE quantity <= 2000""")
results.show()
Step 4: Select the name and price of the product having code 'PEN'
val results = sqlContext.sql("""SELECT name, price FROM products WHERE code = 'PEN'""")
results.show()
Step 5: Select all the products whose name starts with PENCIL
val results = sqlContext.sql("""SELECT name, price FROM products WHERE upper(name) LIKE 'PENCIL%'""")
results.show()
Step 6: Select all products whose "name" begins with 'P', followed by any two characters, followed by a space, followed by zero or more characters
val results = sqlContext.sql("""SELECT name, price FROM products WHERE name LIKE 'P__ %'""")
results.show()
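As a quick illustration of the LIKE wildcards used in Step 6 (_ matches exactly one character, % matches any run of zero or more characters), the pattern can be tried on a small hypothetical DataFrame; the sample rows below are made up purely for illustration:

// Hypothetical sample data, only to show how 'P__ %' behaves
val sample = sqlContext.createDataFrame(Seq(
  ("Pen Red", 1.2), ("Pencil 2B", 0.5), ("Paper A4", 3.0), ("Pin", 0.1)
)).toDF("name", "price")
sample.registerTempTable("sample_products")
sqlContext.sql("""SELECT name FROM sample_products WHERE name LIKE 'P__ %'""").show()
// Only "Pen Red" matches: 'P' + two characters ("en") + a space + anything ("Red")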



Problem Scenario 20: You have been given a MySQL database with the following details.
user=retail_dba
password=cloudera
database=retail_db
table=retail_db.categories
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the following activities.

1. Write a Sqoop job which will import the "retail_db.categories" table into HDFS, in a directory named "categories_targetJob".

A. See the explanation for the step-by-step solution and configuration.

Answer(s): A

Explanation:

Solution:
Step 1: Connect to the existing MySQL database:
mysql --user=retail_dba --password=cloudera retail_db
Step 2: Show all the available tables:
show tables;
Step 3: Below is the command to create the Sqoop job (please note that the space between -- and import is mandatory):
sqoop job --create sqoopjob \
-- import \
--connect "jdbc:mysql://quickstart:3306/retail_db" \
--username=retail_dba \
--password=cloudera \
--table categories \
--target-dir categories_targetJob \
--fields-terminated-by '|' \
--lines-terminated-by '\n'
Step 4: List all the Sqoop jobs:
sqoop job --list
Step 5: Show details of the Sqoop job:
sqoop job --show sqoopjob
Step 6: Execute the Sqoop job:
sqoop job --exec sqoopjob
Step 7: Check the output of the import job:
hdfs dfs -ls categories_targetJob
hdfs dfs -cat categories_targetJob/part*
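Optionally, the imported pipe-delimited files can also be inspected from spark-shell. A minimal sketch, assuming the relative path categories_targetJob resolves under the current user's HDFS home directory, exactly as in the hdfs dfs commands above:

// Read the Sqoop output back as text and split each record on the '|' delimiter
val categories = sc.textFile("categories_targetJob")
val fields = categories.map(_.split('|'))
fields.take(5).foreach(rec => println(rec.mkString(", ")))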



Problem Scenario 59: You have been given the below code snippet.
val x = sc.parallelize(1 to 20)
val y = sc.parallelize(10 to 30)
operation1
z.collect
Write a correct code snippet for operation1 which will produce the desired output, shown below:
Array[Int] = Array(16, 12, 20, 13, 17, 14, 18, 10, 19, 15, 11)

A. See the explanation for the step-by-step solution and configuration.

Answer(s): A

Explanation:

Solution:
val z = x.intersection(y)
intersection: returns a new RDD containing only the elements that appear in both RDDs, with duplicates removed.
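Putting it together, a minimal runnable snippet for the spark-shell (note that collect() gives no ordering guarantee, which is why the elements appear shuffled in the expected output):

val x = sc.parallelize(1 to 20)
val y = sc.parallelize(10 to 30)
val z = x.intersection(y)   // keeps only the values present in both RDDs, duplicates removed
z.collect()                 // the values 10 through 20, in partition-dependent order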





