Databricks Certified Generative AI Engineer Associate Databricks-Generative-AI-Engineer-Associate Exam Questions in PDF

Free Databricks Databricks-Generative-AI-Engineer-Associate Dumps Questions (page: 1)

A Generative AI Engineer has created a RAG application to look up answers to questions about a series of fantasy novels that are being asked on the author's web forum. The fantasy novel texts are chunked and embedded into a vector store with metadata (page number, chapter number, book title), retrieved with the user's query, and provided to an LLM for response generation. The Generative AI Engineer used their intuition to pick the chunking strategy and associated configurations but now wants to choose the best values more methodically.

Which TWO strategies should the Generative AI Engineer take to optimize their chunking strategy and parameters? (Choose two.)

  A. Change embedding models and compare performance.
  B. Add a classifier for user queries that predicts which book will best contain the answer. Use this to filter retrieval.
  C. Choose an appropriate evaluation metric (such as recall or NDCG) and experiment with changes in the chunking strategy, such as splitting chunks by paragraphs or chapters. Choose the strategy that gives the best performance metric.
  D. Pass known questions and best answers to an LLM and instruct the LLM to provide the best token count. Use a summary statistic (mean, median, etc.) of the best token counts to choose chunk size.
  E. Create an LLM-as-a-judge metric to evaluate how well previous questions are answered by the most appropriate chunk. Optimize the chunking parameters based upon the values of the metric.

Answer(s): C,E

Explanation:

To optimize a chunking strategy for a Retrieval-Augmented Generation (RAG) application, the Generative AI Engineer needs a structured approach to evaluating the chunking strategy, ensuring that the chosen configuration retrieves the most relevant information and leads to accurate and coherent LLM responses. Here's why C and E are the correct strategies:

Strategy C: Evaluation Metrics (Recall, NDCG)

Define an evaluation metric: Common evaluation metrics such as recall, precision, or NDCG (Normalized Discounted Cumulative Gain) measure how well the retrieved chunks match the user's query and the expected response.

Recall measures the proportion of relevant information retrieved.

NDCG is often used when you want to account for both the relevance of retrieved chunks and the ranking or order in which they are retrieved.
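The NDCG definition above can be made concrete with a short sketch; the relevance grades below are illustrative (e.g., 3 = highly relevant chunk, 0 = irrelevant):

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: each chunk's relevance is discounted by the
    # log of its rank position (rank 1 -> log2(2), rank 2 -> log2(3), ...)
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize by the DCG of the ideal ordering (most relevant chunk first),
    # so a perfect ranking scores exactly 1.0
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

print(ndcg([3, 2, 1, 0]))  # ideal order -> 1.0
print(ndcg([0, 1, 2, 3]))  # relevant chunks ranked last -> well below 1.0
```

This is why NDCG is preferred over plain recall when the order of the retrieved chunks matters, not just their presence.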

Experiment with chunking strategies: Adjusting chunking strategies based on text structure (e.g., splitting by paragraph, chapter, or a fixed number of tokens) allows the engineer to experiment with various ways of slicing the text. Some chunks may better align with the user's query than others.

Evaluate performance: By using recall or NDCG, the engineer can methodically test various chunking strategies to identify which one yields the highest performance. This ensures that the chunking method provides the most relevant information when embedding and retrieving data from the vector store.
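The experiment loop described above can be sketched as follows. The labeled (query, relevant chunk ids) pairs and the stand-in retriever are hypothetical; in practice the retriever would re-chunk the novels under each candidate strategy and query the vector store:

```python
def recall_at_k(retrieved_ids, relevant_ids, k=5):
    # Fraction of the labeled relevant chunks that appear in the top-k results
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids) if relevant_ids else 0.0

def evaluate_strategy(labeled_queries, search, k=5):
    # Average recall@k over the labeled query set for one chunking config
    scores = [recall_at_k(search(q), relevant, k)
              for q, relevant in labeled_queries]
    return sum(scores) / len(scores)

# Toy labeled data: each query maps to the chunk ids a human marked relevant
labeled = [("who forged the sword?", ["ch3-p2"]),
           ("where is the capital?", ["ch1-p7", "ch1-p8"])]
fake_search = lambda q: ["ch3-p2", "ch1-p7"]  # stand-in retriever
avg_recall = evaluate_strategy(labeled, fake_search)
```

Running `evaluate_strategy` once per chunking configuration (by paragraph, by chapter, fixed token counts) and keeping the highest-scoring one is exactly the methodical selection option C describes.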

Strategy E: LLM-as-a-Judge Metric

Use the LLM as an evaluator: After retrieving chunks, the LLM can be used to evaluate the quality of answers based on the chunks provided. This could be framed as a "judge" function, where the LLM compares how well a given chunk answers previous user queries.

Optimize based on the LLM's judgment: By having the LLM assess previous answers and rate their relevance and accuracy, the engineer can collect feedback on how well different chunking configurations perform in real-world scenarios.

This metric could be a qualitative judgment on how closely the retrieved information matches the user's intent.

Tune chunking parameters: Based on the LLM's judgment, the engineer can adjust the chunk size or structure to better align with the LLM's responses, optimizing retrieval for future queries.

By combining these two approaches, the engineer ensures that the chunking strategy is systematically evaluated using both quantitative (recall/NDCG) and qualitative (LLM judgment) methods. This balanced optimization process results in improved retrieval relevance and, consequently, better response generation by the LLM.
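The LLM-as-a-judge loop can be sketched like this. `call_llm` is a stub standing in for a real call to a served model, and the prompt wording is illustrative, not a prescribed template:

```python
JUDGE_PROMPT = (
    "Rate from 1 to 5 how well the CONTEXT answers the QUESTION.\n"
    "QUESTION: {q}\nCONTEXT: {ctx}\n"
    "Reply with a single integer."
)

def call_llm(prompt: str) -> str:
    # Stub: swap in a real model-serving / chat-completions call here
    return "4"

def judge_chunking(questions, retrieve):
    # Score each question by how well its top retrieved chunk answers it,
    # then average; rerun with different chunking parameters and compare
    scores = [int(call_llm(JUDGE_PROMPT.format(q=q, ctx=retrieve(q))))
              for q in questions]
    return sum(scores) / len(scores)

avg = judge_chunking(["Who forged the sword?"],
                     lambda q: "In chapter 3, the smith forges the blade.")
```

The same question set is replayed against each chunking configuration, and the configuration with the highest average judge score wins.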



A Generative AI Engineer is designing a RAG application for answering user questions on technical regulations as they learn a new sport.

What are the steps needed to build this RAG application and deploy it?

  A. Ingest documents from a source -> Index the documents and save to Vector Search -> User submits queries against an LLM -> LLM retrieves relevant documents -> Evaluate model -> LLM generates a response -> Deploy it using Model Serving
  B. Ingest documents from a source -> Index the documents and save to Vector Search -> User submits queries against an LLM -> LLM retrieves relevant documents -> LLM generates a response -> Evaluate model -> Deploy it using Model Serving
  C. Ingest documents from a source -> Index the documents and save to Vector Search -> Evaluate model -> Deploy it using Model Serving
  D. User submits queries against an LLM -> Ingest documents from a source -> Index the documents and save to Vector Search -> LLM retrieves relevant documents -> LLM generates a response -> Evaluate model -> Deploy it using Model Serving

Answer(s): B

Explanation:

The Generative AI Engineer needs to follow a methodical pipeline to build and deploy a Retrieval-Augmented Generation (RAG) application. The steps outlined in option B accurately reflect this process:

Ingest documents from a source: This is the first step, where the engineer collects documents (e.g., technical regulations) that will be used for retrieval when the application answers user questions.

Index the documents and save to Vector Search: Once the documents are ingested, they are converted into vector embeddings with a pre-trained embedding model (e.g., a BERT-style sentence encoder) and stored in a vector database such as Databricks Vector Search. This enables fast similarity-based retrieval against user queries.

User submits queries against an LLM: Users interact with the application by submitting their queries.
These queries will be passed to the LLM.

LLM retrieves relevant documents: The retrieval step queries the vector store for the documents most similar to the user's query, based on their vector representations, and passes them to the LLM as context.

LLM generates a response: Using the retrieved documents, the LLM generates a response that is tailored to the user's question.

Evaluate model: After generating responses, the system must be evaluated to ensure the retrieved documents are relevant and the generated response is accurate. Metrics such as accuracy, relevance, and user satisfaction can be used for evaluation.

Deploy it using Model Serving: Once the RAG pipeline is ready and evaluated, it is deployed using a model-serving platform such as Databricks Model Serving. This enables real-time inference and response generation for users.

By following these steps, the Generative AI Engineer ensures that the RAG application is both efficient and effective for the task of answering technical regulation questions.
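The pipeline in option B can be sketched end to end. The bag-of-words "embedding" and the final generation step below are toy stand-ins for a real embedding model, Databricks Vector Search, and a served LLM:

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words embedding standing in for a real embedding model
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1) Ingest documents, 2) index them ("save to Vector Search")
docs = ["offside occurs when a player is beyond the last defender",
        "a match lasts two halves of forty five minutes"]
index = [(d, embed(d)) for d in docs]

# 3) user query -> 4) retrieve the most similar documents
def retrieve(query, k=1):
    qv = embed(query)
    return [d for d, _ in sorted(index, key=lambda p: -cosine(qv, p[1]))[:k]]

# 5) generate: the retrieved context is packed into the LLM prompt
context = retrieve("how long does a match last?")[0]
prompt = f"Answer using only this context: {context}"
# `prompt` would now be sent to the served LLM; evaluation and Model Serving
# deployment (steps 6 and 7) follow once responses look good.
```

The ordering matters: evaluation happens after the full retrieve-and-generate loop works, and deployment comes last.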



A Generative AI Engineer just deployed an LLM application at a digital marketing company that assists with answering customer service inquiries.

Which metric should they monitor for their customer service LLM application in production?

  A. Number of customer inquiries processed per unit of time
  B. Energy usage per query
  C. Final perplexity scores for the training of the model
  D. HuggingFace Leaderboard values for the base LLM

Answer(s): A

Explanation:

When deploying an LLM application for customer service inquiries, the primary focus is on measuring the operational efficiency and quality of the responses. Here's why A is the correct metric:

Number of customer inquiries processed per unit of time: This metric tracks the throughput of the customer service system, reflecting how many customer inquiries the LLM application can handle in a given time period (e.g., per minute or hour). High throughput is crucial in customer service applications where quick response times are essential to user satisfaction and business efficiency.

Real-time performance monitoring: Monitoring the number of queries processed is an important part of ensuring that the model is performing well under load, especially during peak traffic times. It also helps ensure the system scales properly to meet demand.
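In its simplest form, this monitoring reduces to counting completed inquiries per time bucket; a minimal sketch over completion timestamps (the sample data is illustrative):

```python
from datetime import datetime

def inquiries_per_minute(timestamps):
    # Bucket completion timestamps by minute and count each bucket;
    # a real deployment would read these from serving logs or metrics
    buckets = {}
    for t in timestamps:
        key = t.replace(second=0, microsecond=0)
        buckets[key] = buckets.get(key, 0) + 1
    return buckets

ts = [datetime(2024, 1, 1, 9, 0, 5), datetime(2024, 1, 1, 9, 0, 40),
      datetime(2024, 1, 1, 9, 1, 10)]
counts = inquiries_per_minute(ts)  # two inquiries at 9:00, one at 9:01
```

Tracking this series over time surfaces both throughput trends and load spikes that the system must scale to meet.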

Why other options are not ideal:

B. Energy usage per query: While energy efficiency is a consideration, it is not the primary concern for a customer-facing application where user experience (i.e., fast and accurate responses) is critical.

C. Final perplexity scores for the training of the model: Perplexity is a metric for model training, but it doesn't reflect the real-time operational performance of an LLM in production.

D. HuggingFace Leaderboard values for the base LLM: The HuggingFace Leaderboard is more relevant during model selection and benchmarking. However, it is not a direct measure of the model's performance in a specific customer service application in production.

Focusing on throughput (inquiries processed per unit time) ensures that the LLM application is meeting business needs for fast and efficient customer service responses.



A Generative AI Engineer is building a Generative AI system that suggests the best matched employee team member to newly scoped projects. The team member is selected from a very large team. The match should be based upon project date availability and how well their employee profile matches the project scope. Both the employee profile and project scope are unstructured text.

How should the Generative AI Engineer architect their system?

  A. Create a tool for finding available team members given project dates. Embed all project scopes into a vector store, perform a retrieval using team member profiles to find the best team member.
  B. Create a tool for finding team member availability given project dates, and another tool that uses an LLM to extract keywords from project scopes. Iterate through available team members' profiles and perform keyword matching to find the best available team member.
  C. Create a tool to find available team members given project dates. Create a second tool that can calculate a similarity score for a combination of team member profile and the project scope. Iterate through the team members and rank by best score to select a team member.
  D. Create a tool for finding available team members given project dates. Embed team profiles into a vector store and use the project scope and filtering to perform retrieval to find the available best matched team members.

Answer(s): D

Explanation:

Problem Context: The problem involves matching team members to new projects based on two main factors:

Availability: Ensure the team members are available during the project dates.

Profile-Project Match: Use the employee profiles (unstructured text) to find the best match for a project's scope (also unstructured text).

The two main inputs are the employee profiles and project scopes, both of which are unstructured. This means traditional rule-based systems (e.g., simple keyword matching) would be inefficient, especially when working with large datasets.

Explanation of Options: Let's break down the provided options to understand why D is the best answer.

Option A suggests embedding project scopes into a vector store and then performing retrieval using team member profiles.
While embedding project scopes into a vector store is a valid technique, it skips an important detail: the focus should primarily be on embedding employee profiles because we're matching the profiles to a new project, not the other way around.

Option B involves using a large language model (LLM) to extract keywords from the project scope and perform keyword matching on employee profiles.
While LLMs can help with keyword extraction, this approach is too simplistic and doesn't leverage advanced retrieval techniques like vector embeddings, which can handle the nuanced and rich semantics of unstructured data. This approach may miss out on subtle but important similarities.

Option C suggests calculating a similarity score between each team member's profile and project scope.
While this is a good idea, it doesn't specify how to handle the unstructured nature of data efficiently. Iterating through each member's profile individually could be computationally expensive in large teams. It also lacks the mention of using a vector store or an efficient retrieval mechanism.

Option D is the correct approach. Here's why:

Embedding team profiles into a vector store: Using a vector store allows for efficient similarity searches on unstructured data. Embedding the team member profiles into vectors captures their semantics in a way that is far more flexible than keyword-based matching.

Using project scope for retrieval: Instead of matching keywords, this approach suggests using vector embeddings and similarity search algorithms (e.g., cosine similarity) to find the team members whose profiles most closely align with the project scope.

Filtering based on availability: Once the best-matched candidates are retrieved based on profile similarity, filtering them by availability ensures that the system provides a practically useful result.

This method efficiently handles large-scale datasets by leveraging vector embeddings and similarity search techniques, both of which are fundamental tools in Generative AI engineering for handling unstructured text.

Technical Reference:
Vector embeddings: In this approach, the unstructured text (employee profiles and project scopes) is converted into high-dimensional vectors using pretrained models (e.g., BERT, Sentence-BERT, or custom embeddings). These embeddings capture the semantic meaning of the text, making it easier to perform similarity-based retrieval.

Vector stores: Solutions like FAISS or Milvus allow storing and retrieving large numbers of vector embeddings quickly. This is critical when working with large teams where querying through individual profiles sequentially would be inefficient.

LLM Integration: Large language models can assist in generating embeddings for both employee profiles and project scopes. They can also assist in fine-tuning similarity measures, ensuring that the retrieval system captures the nuances of the text data.

Filtering: After retrieving the most similar profiles based on the project scope, filtering based on availability ensures that only team members who are free for the project are considered.

This system is scalable, efficient, and makes use of the latest techniques in Generative AI, such as vector embeddings and semantic search.
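Option D can be sketched as follows. The Jaccard word-overlap score is a toy stand-in for embedding cosine similarity, and the team data and names are hypothetical; in practice the ranking would be a Vector Search query with a metadata filter on availability:

```python
def score(profile, scope):
    # Toy similarity stand-in for embedding cosine similarity:
    # word overlap between profile and project scope (Jaccard index)
    p, s = set(profile.lower().split()), set(scope.lower().split())
    return len(p & s) / len(p | s) if p | s else 0.0

team = [
    {"name": "Asha", "profile": "python spark mlops pipelines", "available": True},
    {"name": "Ben",  "profile": "frontend react design",        "available": True},
    {"name": "Cleo", "profile": "python spark streaming",       "available": False},
]

def best_match(scope, team):
    # Filter on project-date availability first, then rank the remaining
    # candidates by how well their profile matches the project scope
    candidates = [m for m in team if m["available"]]
    return max(candidates, key=lambda m: score(m["profile"], scope))

pick = best_match("python spark data pipelines", team)
```

Note that the availability filter runs before ranking, so an unavailable but well-matched member (Cleo here) is never returned.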



A Generative AI Engineer is designing an LLM-powered live sports commentary platform. The platform provides real-time updates and LLM-generated analyses for any users who would like to have live summaries, rather than reading a series of potentially outdated news articles.

Which tool below will give the platform access to real-time data for generating game analyses based on the latest game scores?

  A. DatabricksIQ
  B. Foundation Model APIs
  C. Feature Serving
  D. AutoML

Answer(s): C

Explanation:

Problem Context: The engineer is developing an LLM-powered live sports commentary platform that needs to provide real-time updates and analyses based on the latest game scores. The critical requirement here is the capability to access and integrate real-time data efficiently with the platform for immediate analysis and reporting.

Explanation of Options:

Option A: DatabricksIQ: While DatabricksIQ offers integration and data processing capabilities, it is more aligned with data analytics rather than real-time feature serving, which is crucial for immediate updates necessary in a live sports commentary context.

Option B: Foundation Model APIs: These APIs facilitate interactions with pre-trained models and could be part of the solution, but on their own, they do not provide mechanisms to access real-time game scores.

Option C: Feature Serving: This is the correct answer as feature serving specifically refers to the real-time provision of data (features) to models for prediction. This would be essential for an LLM that generates analyses based on live game data, ensuring that the commentary is current and based on the latest events in the sport.

Option D: AutoML: This tool automates the process of applying machine learning models to real-world problems, but it does not directly provide real-time data access, which is a critical requirement for the platform.

Thus, Option C (Feature Serving) is the most suitable tool for the platform as it directly supports the real-time data needs of an LLM-powered sports commentary system, ensuring that the analyses and updates are based on the latest available information.



Viewing page 1 of 10
