Databricks-Generative-AI-Engineer-Associate Databricks Certified Generative AI Engineer Associate Questions and Answers

Questions 4

What is the most suitable library for building a multi-step LLM-based workflow?

Options:

Pandas

TensorFlow

PySpark

LangChain

Buy Now

Questions 5

A Generative AI Engineer has been asked to design an LLM-based application that accomplishes the following business objective: answer employee HR questions using HR PDF documentation.

Which set of high level tasks should the Generative AI Engineer's system perform?

Options:

Calculate averaged embeddings for each HR document, compare embeddings to user query to find the best document. Pass the best document with the user query into an LLM with a large context window to generate a response to the employee.

Use an LLM to summarize HR documentation. Provide summaries of documentation and user query into an LLM with a large context window to generate a response to the user.

Create an interaction matrix of historical employee questions and HR documentation. Use ALS to factorize the matrix and create embeddings. Calculate the embeddings of new queries and use them to find the best HR documentation. Use an LLM to generate a response to the employee question based upon the documentation retrieved.

Split HR documentation into chunks and embed into a vector store. Use the employee question to retrieve best matched chunks of documentation, and use the LLM to generate a response to the employee based upon the documentation retrieved.

Buy Now

Questions 6

A small and cost-conscious startup in the cancer research field wants to build a RAG application using Foundation Model APIs.

Which strategy would allow the startup to build a good-quality RAG application while being cost-conscious and able to cater to customer needs?

Options:

Limit the number of relevant documents available for the RAG application to retrieve from

Pick a smaller LLM that is domain-specific

Limit the number of queries a customer can send per day

Use the largest LLM possible because that gives the best performance for any general queries

Buy Now

Questions 7

Which TWO chain components are required for building a basic LLM-enabled chat application that includes conversational capabilities, knowledge retrieval, and contextual memory?

Options:

(Q)

Vector Stores

Conversation Buffer Memory

External tools

Chat loaders

React Components

Buy Now

Answer:

B, C

Explanation:

Building a basic LLM-enabled chat application with conversational capabilities, knowledge retrieval, and contextual memory requires specific components that work together to process queries, maintain context, and retrieve relevant information. Databricks’ Generative AI Engineer documentation outlines key components for such systems, particularly in the context of frameworks like LangChain or Databricks’ MosaicML integrations. Let’s evaluate the required components:

Understanding the Requirements:

Conversational capabilities: The app must generate natural, coherent responses.

Knowledge retrieval: It must access external or domain-specific knowledge.

Contextual memory: It must remember prior interactions in the conversation.

Databricks Reference:"A typical LLM chat application includes a memory component to track conversation history and a retrieval mechanism to incorporate external knowledge"("Databricks Generative AI Cookbook," 2023).

Evaluating the Options:

A. (Q): This appears incomplete or unclear (possibly a typo). Without further context, it’s not a valid component.

B. Vector Stores: These store embeddings of documents or knowledge bases, enabling semantic search and retrieval of relevant information for the LLM. This is critical for knowledge retrieval in a chat application.

Databricks Reference:"Vector stores, such as those integrated with Databricks’ Lakehouse, enable efficient retrieval of contextual data for LLMs"("Building LLM Applications with Databricks").

C. Conversation Buffer Memory: This component stores the conversation history, allowing the LLM to maintain context across multiple turns. It’s essential for contextual memory.

Databricks Reference:"Conversation Buffer Memory tracks prior user inputs and LLM outputs, ensuring context-aware responses"("Generative AI Engineer Guide").

D. External tools: These (e.g., APIs or calculators) enhance functionality but aren’t required for abasicchat app with the specified capabilities.

E. Chat loaders: These might refer to data loaders for chat logs, but they’re not a core chain component for conversational functionality or memory.

F. React Components: These relate to front-end UI development, not the LLM chain’s backend functionality.

Selecting the Two Required Components:

Forknowledge retrieval, Vector Stores (B) are necessary to fetch relevant external data, a cornerstone of Databricks’ RAG-based chat systems.

Forcontextual memory, Conversation Buffer Memory (C) is required to maintain conversation history, ensuring coherent and context-aware responses.

While an LLM itself is implied as the core generator, the question asks for chain components beyond the model, making B and C the minimal yet sufficient pair for a basic application.

Conclusion: The two required chain components areB. Vector StoresandC. Conversation Buffer Memory, as they directly address knowledge retrieval and contextual memory, respectively, aligning with Databricks’ documented best practices for LLM-enabled chat applications.

Questions 8

A Generative Al Engineer is tasked with developing an application that is based on an open source large language model (LLM). They need a foundation LLM with a large context window.

Which model fits this need?

Options:

DistilBERT

MPT-30B

Llama2-70B

DBRX

Buy Now

Questions 9

After changing the response generating LLM in a RAG pipeline from GPT-4 to a model with a shorter context length that the company self-hosts, the Generative AI Engineer is getting the following error:

Databricks-Generative-AI-Engineer-Associate Question 9

What TWO solutions should the Generative AI Engineer implement without changing the response generating model? (Choose two.)

Options:

Use a smaller embedding model to generate

Reduce the maximum output tokens of the new model

Decrease the chunk size of embedded documents

Reduce the number of records retrieved from the vector database

Retrain the response generating model using ALiBi

Buy Now

Questions 10

A Generative AI Engineer is designing a RAG application for answering user questions on technical regulations as they learn a new sport.

What are the steps needed to build this RAG application and deploy it?

Options:

Ingest documents from a source –> Index the documents and saves to Vector Search –> User submits queries against an LLM –> LLM retrieves relevant documents –> Evaluate model –> LLM generates a response –> Deploy it using Model Serving

Ingest documents from a source –> Index the documents and save to Vector Search –> User submits queries against an LLM –> LLM retrieves relevant documents –> LLM generates a response -> Evaluate model –> Deploy it using Model Serving

Ingest documents from a source –> Index the documents and save to Vector Search –> Evaluate model –> Deploy it using Model Serving

User submits queries against an LLM –> Ingest documents from a source –> Index the documents and save to Vector Search –> LLM retrieves relevant documents –> LLM generates a response –> Evaluate model –> Deploy it using Model Serving

Buy Now

Questions 11

A Generative Al Engineer is building a system which will answer questions on latest stock news articles.

Which will NOT help with ensuring the outputs are relevant to financial news?

Options:

Implement a comprehensive guardrail framework that includes policies for content filters tailored to the finance sector.

Increase the compute to improve processing speed of questions to allow greater relevancy analysis

C Implement a profanity filter to screen out offensive language

Incorporate manual reviews to correct any problematic outputs prior to sending to the users

Buy Now

Questions 12

A Generative AI Engineer just deployed an LLM application at a digital marketing company that assists with answering customer service inquiries.

Which metric should they monitor for their customer service LLM application in production?

Options:

Number of customer inquiries processed per unit of time

Energy usage per query

Final perplexity scores for the training of the model

HuggingFace Leaderboard values for the base LLM

Buy Now

Questions 13

A Generative AI Engineer is building an LLM to generate article summaries in the form of a type of poem, such as a haiku, given the article content. However, the initial output from the LLM does not match the desired tone or style.

Which approach will NOT improve the LLM’s response to achieve the desired response?

Options:

Provide the LLM with a prompt that explicitly instructs it to generate text in the desired tone and style

Use a neutralizer to normalize the tone and style of the underlying documents

Include few-shot examples in the prompt to the LLM

Fine-tune the LLM on a dataset of desired tone and style

Buy Now

Questions 14

A Generative Al Engineer interfaces with an LLM with prompt/response behavior that has been trained on customer calls inquiring about product availability. The LLM is designed to output “In Stock” if the product is available or only the term “Out of Stock” if not.

Which prompt will work to allow the engineer to respond to call classification labels correctly?

Options:

Respond with “In Stock” if the customer asks for a product.

You will be given a customer call transcript where the customer asks about product availability. The outputs are either “In Stock” or “Out of Stock”. Format the output in JSON, for example: {“call_id”: “123”, “label”: “In Stock”}.

Respond with “Out of Stock” if the customer asks for a product.

You will be given a customer call transcript where the customer inquires about product availability. Respond with “In Stock” if the product is available or “Out of Stock” if not.

Buy Now

Questions 15

A Generative Al Engineer is building an LLM-based application that has an

important transcription (speech-to-text) task. Speed is essential for the success of the application

Which open Generative Al models should be used?

Options:

L!ama-2-70b-chat-hf

MPT-30B-lnstruct

DBRX

whisper-large-v3 (1.6B)

Buy Now

Answer:

Explanation:

The task requires an open generative AI model for a transcription (speech-to-text) task where speed is essential. Let’s assess the options based on their suitability for transcription and performance characteristics, referencing Databricks’ approach to model selection.

Option A: Llama-2-70b-chat-hf

Llama-2 is a text-based LLM optimized for chat and text generation, not speech-to-text. It lacks transcription capabilities.

Databricks Reference:"Llama models are designed for natural language generation, not audio processing"("Databricks Model Catalog").

Option B: MPT-30B-Instruct

MPT-30B is another text-based LLM focused on instruction-following and text generation, not transcription. It’s irrelevant for speech-to-text tasks.

Databricks Reference: No specific mention, but MPT is categorized under text LLMs in Databricks’ ecosystem, not audio models.

Option C: DBRX

DBRX, developed by Databricks, is a powerful text-based LLM for general-purpose generation. It doesn’t natively support speech-to-text and isn’t optimized for transcription.

Databricks Reference:"DBRX excels at text generation and reasoning tasks"("Introducing DBRX," 2023)—no mention of audio capabilities.

Option D: whisper-large-v3 (1.6B)

Whisper, developed by OpenAI, is an open-source model specifically designed for speech-to-text transcription. The “large-v3” variant (1.6 billion parameters) balances accuracy and efficiency, with optimizations for speed via quantization or deployment on GPUs—key for the application’s requirements.

Databricks Reference:"For audio transcription, models like Whisper are recommended for their speed and accuracy"("Generative AI Cookbook," 2023). Databricks supports Whisper integration in its MLflow or Lakehouse workflows.

Conclusion: OnlyD. whisper-large-v3is a speech-to-text model, making it the sole suitable choice. Its design prioritizes transcription, and its efficiency (e.g., via optimized inference) meets the speed requirement, aligning with Databricks’ model deployment best practices.

Questions 16

A Generative AI Engineer is creating an agent-based LLM system for their favorite monster truck team. The system can answer text based questions about the monster truck team, lookup event dates via an API call, or query tables on the team’s latest standings.

How could the Generative AI Engineer best design these capabilities into their system?

Options:

Ingest PDF documents about the monster truck team into a vector store and query it in a RAG architecture.

Write a system prompt for the agent listing available tools and bundle it into an agent system that runs a number of calls to solve a query.

Instruct the LLM to respond with “RAG”, “API”, or “TABLE” depending on the query, then use text parsing and conditional statements to resolve the query.

Build a system prompt with all possible event dates and table information in the system prompt. Use a RAG architecture to lookup generic text questions and otherwise leverage the information in the system prompt.

Buy Now

Answer:

Explanation:

In this scenario, the Generative AI Engineer needs to design a system that can handle different types of queries about the monster truck team. The queries may involve text-based information, API lookups for event dates, or table queries for standings. The best solution is to implement atool-based agent system.

Here’s how option B works, and why it’s the most appropriate answer:

System Design Using Agent-Based Model:In modern agent-based LLM systems, you can design a system where the LLM (Large Language Model) acts as a central orchestrator. The model can "decide" which tools to use based on the query. These tools can include API calls, table lookups, or natural language searches. The system should contain asystem promptthat informs the LLM about the available tools.

System Prompt Listing Tools:By creating a well-craftedsystem prompt, the LLM knows which tools are at its disposal. For instance, one tool may query an external API for event dates, another might look up standings in a database, and a third may involve searching a vector database for general text-based information. Theagentwill be responsible for calling the appropriate tool depending on the query.

Agent Orchestration of Calls:The agent system is designed to execute a series of steps based on the incoming query. If a user asks for the next event date, the system will recognize this as a task that requires an API call. If the user asks about standings, the agent might query the appropriate table in the database. For text-based questions, it may call a search function over ingested data. The agent orchestrates this entire process, ensuring the LLM makes calls to the right resources dynamically.

Generative AI Tools and Context:This is a standard architecture for integrating multiple functionalities into a system where each query requires different actions. The core design in option B is efficient because it keeps the system modular and dynamic by leveraging tools rather than overloading the LLM with static information in a system prompt (like option D).

Why Other Options Are Less Suitable:

A (RAG Architecture): While relevant, simply ingesting PDFs into a vector store only helps with text-based retrieval. It wouldn’t help with API lookups or table queries.

C (Conditional Logic with RAG/API/TABLE): Although this approach works, it relies heavily on manual text parsing and might introduce complexity when scaling the system.

D (System Prompt with Event Dates and Standings): Hardcoding dates and table information into a system prompt isn’t scalable. As the standings or events change, the system would need constant updating, making it inefficient.

By bundling multiple tools into a single agent-based system (as in option B), the Generative AI Engineer can best handle the diverse requirements of this system.

Questions 17

A Generative Al Engineer has already trained an LLM on Databricks and it is now ready to be deployed.

Which of the following steps correctly outlines the easiest process for deploying a model on Databricks?

Options:

Log the model as a pickle object, upload the object to Unity Catalog Volume, register it to Unity Catalog using MLflow, and start a serving endpoint

Log the model using MLflow during training, directly register the model to Unity Catalog using the MLflow API, and start a serving endpoint

Save the model along with its dependencies in a local directory, build the Docker image, and run the Docker container

Wrap the LLM’s prediction function into a Flask application and serve using Gunicorn

Buy Now

Questions 18

A Generative Al Engineer is building a production-ready LLM system which replies directly to customers. The solution makes use of the Foundation Model API via provisioned throughput. They are concerned that the LLM could potentially respond in a toxic or otherwise unsafe way. They also wish to perform this with the least amount of effort.

Which approach will do this?

Options:

Host Llama Guard on Foundation Model API and use it to detect unsafe responses

Add some LLM calls to their chain to detect unsafe content before returning text

Add a regex expression on inputs and outputs to detect unsafe responses.

Ask users to report unsafe responses

Buy Now

Answer:

Explanation:

The task is to prevent toxic or unsafe responses in an LLM system using the Foundation Model API with minimal effort. Let’s assess the options.

Option A: Host Llama Guard on Foundation Model API and use it to detect unsafe responses

Llama Guard is a safety-focused model designed to detect toxic or unsafe content. Hosting it via the Foundation Model API (a Databricks service) integrates seamlessly with the existing system, requiring minimal setup (just deployment and a check step), and leverages provisioned throughput for performance.

Databricks Reference:"Foundation Model API supports hosting safety models like Llama Guard to filter outputs efficiently"("Foundation Model API Documentation," 2023).

Option B: Add some LLM calls to their chain to detect unsafe content before returning text

Using additional LLM calls (e.g., prompting an LLM to classify toxicity) increases latency, complexity, and effort (crafting prompts, chaining logic), and lacks the specificity of a dedicated safety model.

Databricks Reference:"Ad-hoc LLM checks are less efficient than purpose-built safety solutions"("Building LLM Applications with Databricks").

Option C: Add a regex expression on inputs and outputs to detect unsafe responses

Regex can catch simple patterns (e.g., profanity) but fails for nuanced toxicity (e.g., sarcasm, context-dependent harm), requiring significant manual effort to maintain and update rules.

Databricks Reference:"Regex-based filtering is limited for complex safety needs"("Generative AI Cookbook").

Option D: Ask users to report unsafe responses

User reporting is reactive, not preventive, and places burden on users rather than the system. It doesn’t limit unsafe outputs proactively and requires additional effort for feedback handling.

Databricks Reference:"Proactive guardrails are preferred over user-driven monitoring"("Databricks Generative AI Engineer Guide").

Conclusion: Option A (Llama Guard on Foundation Model API) is the least-effort, most effective approach, leveraging Databricks’ infrastructure for seamless safety integration.

Exam Code: Databricks-Generative-AI-Engineer-Associate

Exam Name: Databricks Certified Generative AI Engineer Associate

Last Update: Dec 5, 2025

Questions: 61

PDF + Testing Engine

$134.99

Testing Engine

$99.99

PDF (Q&A)

$84.99

buy now Databricks-Generative-AI-Engineer-Associate pdf

Big Cyber Monday Sale 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: best70

Dumpsbuddy logo

Databricks-Generative-AI-Engineer-Associate Databricks Certified Generative AI Engineer Associate Questions and Answers

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

PDF + Testing Engine

Testing Engine

PDF (Q&A)

Quick Links

Why Us

Unlimited Packages

Site Secure

We Accept

DumpsBuddy. All Rights Reserved