
This is a Servoy Tutorial on getting started with the Servoy AI Runtime Plugin, which ships with Servoy 2025.12 and later. If you have been watching the AI space from the sidelines, waiting for a native Servoy way to build AI features without wiring up REST calls by hand, your wait is over. The plugins.ai plugin is built on Langchain4j, is MIT licensed, and gives you a clean, builder-pattern API for chat completions, embeddings, vector stores, and tool calling right out of the box.
I am going to walk you through your first chat completion, streaming responses, conversation memory, vector embeddings, and a realistic scenario where semantic search replaces a LIKE query that nobody loved anyway. By the end of this tutorial, you will have working code you can adapt to your own projects.
What Is the AI Runtime Plugin and Why Should You Care?
The AI Runtime Plugin gives you a clean builder-pattern API for chat completions, embeddings, vector stores, and tool calling. It supports OpenAI and Google Gemini out of the box, and the two providers have identical API surfaces, so switching between them is a one-line change. The plugin operates on a Bring-Your-Own-Key model, so you provision your own API keys from OpenAI or Google and manage them yourself.
Bottom line: if you are building AI features in Servoy, start with plugins.ai. You can always drop down to the HTTP plugin if you need a provider it does not cover yet (Claude support is on the roadmap), but for OpenAI and Gemini, this is the path of least resistance.
Before we write any code, there is one thing you need to do: put your API key in servoy.properties. Never hardcode API keys. The official recommendation from Servoy’s AI guide is:
```javascript
/**@type {String}*/
const sApiKey = application.getServoyProperty('openai_api_key');
```

Add a line like `openai_api_key=sk-your-key-here` to your servoy.properties file and you are ready. For Gemini, use `gemini_api_key`. Every code example in this article uses this pattern. If you see `application.getServoyProperty()` and wonder what it is, now you know.
A Quick Note on Promises
The AI Runtime Plugin’s chat() method returns a JavaScript Promise. If you have not worked with Promises in Servoy before, the short version is that client.chat(prompt) does not block. It immediately returns a Promise object, and you attach handlers with .then() for success and .catch() for errors. The runtime fires the handlers when the response comes back from the model.
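To make that contract concrete, here is a minimal plain-JavaScript sketch with no Servoy APIs. The `fakeChat` function is a made-up stand-in for `client.chat`; the point is only that the call returns immediately and the handlers fire later:

```javascript
// Stand-in for client.chat(): returns a Promise instead of blocking.
function fakeChat(prompt) {
    return new Promise(function (resolve, reject) {
        if (!prompt) {
            reject(new Error('empty prompt'));
            return;
        }
        resolve('echo: ' + prompt);
    });
}

fakeChat('hello')
    .then(function (answer) {
        console.log(answer); // runs later, when the "response" arrives
    })
    .catch(function (err) {
        console.log('failed: ' + err.message);
    });
```

The same shape applies to the plugin: attach `.then()` for the response and `.catch()` for failures, and do not expect the result to be available on the line after the call.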
Okay, let’s get started.
Your First Chat Completion
Let’s build the simplest possible thing: a function that sends a prompt to GPT-4o and logs the response. Here is the complete function:
```javascript
/**
 * Sends a prompt to OpenAI and logs the response.
 * @author Gary Dotzlaw
 * @since 2026-04-12
 * @public
 *
 * @param {String} sPrompt the prompt to send
 */
function sendFirstPrompt(sPrompt) {
    try {
        /**@type {String}*/
        const sApiKey = application.getServoyProperty('openai_api_key');

        /**@type {plugins.ai.ChatClient}*/
        const oClient = plugins.ai.createOpenAiChatBuilder()
            .apiKey(sApiKey)
            .modelName('gpt-4o')
            .addSystemMessage('You are a helpful assistant for our ERP system.')
            .build();

        oClient.chat(sPrompt).then(function(oResponse) {
            /**@type {String}*/
            const sAnswer = oResponse.getResponse();
            /**@type {Number}*/
            const iTokens = oResponse.getTokenUsage().totalTokenCount();
            application.output('AI response: ' + sAnswer);
            application.output('Tokens used: ' + iTokens);
        }).catch(function(oError) {
            application.output('AI call failed: ' + oError.message, LOGGINGLEVEL.ERROR);
        });
    } catch (e) {
        application.output('Error in sendFirstPrompt: ' + e.message, LOGGINGLEVEL.ERROR);
        plugins.dialogs.showErrorDialog('Error', 'AI call failed: ' + e.message, 'OK');
    }
}
```

Let’s walk through what is happening here:
- **API key from servoy.properties**: `application.getServoyProperty('openai_api_key')` reads from your servoy.properties file, which keeps the key out of your source code and out of version control.
- **createOpenAiChatBuilder()**: This is the entry point. The builder pattern lets you chain configuration calls and then `.build()` to get your client. Note the casing: `createOpenAiChatBuilder` has a lowercase `i`. The plugin also has a `createOpenAIClient` shortcut (capital `AI`) for one-off prompts where you do not need the builder, but the builder is what you will use in practice.
- **modelName('gpt-4o')**: You are telling the plugin which model to use. `gpt-4o` is a good general-purpose choice. For cheaper calls on simpler tasks, `gpt-3.5-turbo` works well. For tasks that need deep reasoning, the `o1` or `o3-mini` models are available and you can access their reasoning output via `oResponse.getThinking()`.
- **addSystemMessage**: The system message sets the role, tone, and boundaries for the model. Think of it like a scope-level set of defaults that configure how the model behaves for every message in the session.
- **build()**: Returns the chat client, ready to send prompts.
- **client.chat(prompt)**: Returns a Promise that resolves to a `ChatResponse` object.
- **getResponse()**: Pulls out the response text.
- **getTokenUsage().totalTokenCount()**: Returns the total token count for the call. This is important because you are paying per token. Get into the habit of logging this. You cannot optimize what you are not measuring.
The .catch() handler catches any errors from the model call itself, such as network failures or authentication problems. The outer try/catch handles synchronous failures during the client setup.
Expected Output (calling sendFirstPrompt("In two sentences, what is a foundset in Servoy?")):
```
AI response: In Servoy, a foundset is a collection of records from a database table that
you can work with in your application, similar to a record set or data set. It allows you
to display, navigate, and manipulate data within forms, and you can apply searches and
sorts to tailor the set of records you're working with.
Tokens used: 121
```

Make sense? You just called GPT-4o from Servoy in about twenty lines of code.
Switching to Gemini
The Gemini builder has the exact same API surface. Every method name, every parameter, every return type is identical. Switching providers is literally swapping the builder:
```javascript
/**@type {plugins.ai.ChatClient}*/
const oClient = plugins.ai.createGeminiChatBuilder()
    .apiKey(application.getServoyProperty('gemini_api_key'))
    .modelName('gemini-3-pro')
    .addSystemMessage('You are a helpful assistant for our ERP system.')
    .build();
```

Everything else stays the same. Same `.chat()` call, same `.then()` handler, same `ChatResponse` methods. This is pretty sweet if you are evaluating providers or if your organization wants to avoid vendor lock-in.
Streaming Responses
The Promise form of chat() waits until the entire response is ready and delivers it in one shot. That works, but if the model is generating a long answer, your user is staring at a loading spinner for five to ten seconds wondering if something broke. Streaming fixes this by delivering partial text chunks as the model generates them, so the user sees words appearing in real time.
The streaming form uses three callbacks instead of a Promise:
```javascript
/**
 * Sends a prompt with streaming output.
 * @author Gary Dotzlaw
 * @since 2026-04-17
 * @public
 *
 * @param {String} sPrompt the prompt to send
 */
function sendStreamingPrompt(sPrompt) {
    try {
        /**@type {String}*/
        const sApiKey = application.getServoyProperty('openai_api_key');

        /**@type {plugins.ai.ChatClient}*/
        const oClient = plugins.ai.createOpenAiChatBuilder()
            .apiKey(sApiKey)
            .modelName('gpt-4o')
            .addSystemMessage('You are a helpful assistant.')
            .build();

        /**@type {Number}*/
        const iStartTime = new Date().getTime();

        oClient.chat(sPrompt,
            // partial response callback: fired for each text chunk
            function(sPartial) {
                elements.chat_output.text += sPartial;
            },
            // completion callback: fired once when the full response is done
            function(oResponse) {
                /**@type {Number}*/
                const iElapsed = new Date().getTime() - iStartTime;
                /**@type {Number}*/
                const iTokens = oResponse.getTokenUsage().totalTokenCount();
                application.output('Done in ' + iElapsed + 'ms, ' + iTokens + ' tokens');
            },
            // error callback
            function(oError) {
                application.output('Streaming error: ' + oError.message, LOGGINGLEVEL.ERROR);
            }
        );
    } catch (e) {
        application.output('Error in sendStreamingPrompt: ' + e.message, LOGGINGLEVEL.ERROR);
    }
}
```

A few things to notice:
- The streaming form does not return a Promise. It is void. You do not chain `.then()` on it. All the work happens inside the three callbacks.
- The partial callback receives raw string chunks. Each chunk is a fragment of the response as the model generates it. You append them to a UI element so the user sees the text building up character by character.
- The completion callback receives the full `ChatResponse`. At this point, `oResponse.getResponse()` has the complete text, and `getTokenUsage()` has the final token count. This is where you log, save, or do any post-processing.
- The error callback works like `.catch()`. If the model call fails, this fires instead of the completion callback.
For any feature where the user sees the AI response directly (chat interfaces, document summarization, help assistants), use streaming. The difference in perceived responsiveness is dramatic. For background tasks where no user is waiting, the Promise form is cleaner.
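If the three-callback shape feels unfamiliar, this plain-JavaScript sketch shows the same contract without any Servoy or plugin APIs. The `fakeStreamChat` function is made up for illustration; it delivers chunks to the partial callback, then hands the full response to the completion callback:

```javascript
// Made-up streamer mirroring the plugin's (prompt, onPartial, onComplete, onError) shape.
function fakeStreamChat(sPrompt, onPartial, onComplete, onError) {
    try {
        var aChunks = ['Foundsets ', 'are ', 'record ', 'collections.'];
        var sFull = '';
        for (var i = 0; i < aChunks.length; i++) {
            sFull += aChunks[i];
            onPartial(aChunks[i]); // the UI appends each fragment as it arrives
        }
        onComplete({ getResponse: function () { return sFull; } });
    } catch (e) {
        onError(e);
    }
}

var sShown = '';
fakeStreamChat('what is a foundset?',
    function (sPartial) { sShown += sPartial; },
    function (oResp) { console.log(oResp.getResponse() === sShown); }, // true
    function (oErr) { console.log('error: ' + oErr.message); }
);
```

The accumulated partial chunks always add up to the same text the completion callback receives, which is why appending chunks to a UI element is safe.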
Adding Conversation Memory
One prompt in isolation is not very useful. Real conversations need context. Chat models do not actually preserve any state between calls; the illusion of a continuous conversation is created by sending the chat history with every call. The plugin handles this for you when you enable memory on the builder.
```javascript
/**
 * Cached chat client with conversation memory.
 * @type {plugins.ai.ChatClient}
 */
var _oChatClient = null;

/**
 * Gets or creates the chat client, preserving memory across calls.
 * @author Gary Dotzlaw
 * @since 2026-04-17
 * @private
 *
 * @return {plugins.ai.ChatClient} the cached chat client
 */
function _getChatClient() {
    if (_oChatClient) {
        return _oChatClient;
    }

    /**@type {String}*/
    const sApiKey = application.getServoyProperty('openai_api_key');

    _oChatClient = plugins.ai.createOpenAiChatBuilder()
        .apiKey(sApiKey)
        .modelName('gpt-4o')
        .maxMemoryTokens(5000)
        .addSystemMessage('You are a sales assistant. Help users find information about customers and orders.')
        .build();

    return _oChatClient;
}

/**
 * Clears the conversation memory by resetting the client.
 * @author Gary Dotzlaw
 * @since 2026-04-17
 * @public
 */
function clearConversation() {
    _oChatClient = null;
}
```

With `maxMemoryTokens(5000)`, the client will remember up to 5000 tokens of prior conversation. A token is roughly a word or word fragment, so 5000 tokens is enough for a back-and-forth conversation of maybe 20-25 exchanges before older context starts getting dropped.
The critical pattern here is caching the client in a form variable. If you build a new client on every call, you get a new empty memory every time, which defeats the point entirely. Build it once, reuse it, and null it out when you want to start a fresh conversation. This is the exact pattern the official Example Solution uses, and it is the correct one.
Keep in mind your costs per token and your feature requirements when setting the maxMemoryTokens value. More memory means richer conversation context but also more tokens sent with every call, which means higher cost per interaction.
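Before you start logging real token counts, a common back-of-the-envelope heuristic for English text is roughly four characters per token. This is an approximation for budgeting, not the model's actual tokenizer, but it gives you a feel for how much history fits in a given `maxMemoryTokens` value:

```javascript
// Rough heuristic: ~4 characters per token for English text.
// An approximation for budgeting only -- real tokenizers differ.
function estimateTokens(sText) {
    return Math.ceil(sText.length / 4);
}

var sHistory = 'User: What are the top customers?\nAI: Here are the top ten by revenue...';
console.log(estimateTokens(sHistory) + ' tokens (approx) for this exchange');
```

Once real calls are flowing, prefer the exact numbers from `getTokenUsage()` over any estimate.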
Multimodal Input: Images, Audio, and Documents
Before moving on to embeddings, it is worth knowing that the chat client can accept more than just text. The addFile() and addBytes() methods let you attach images, audio, video, PDFs, and text files to a chat prompt. Here is a full working example that attaches a PDF from the file system and asks a question about it:
```javascript
/**
 * Multimodal input: attach a file and ask a question about it.
 * @author Gary Dotzlaw
 * @since 2026-04-17
 * @public
 */
function askAboutFile() {
    try {
        /**@type {String}*/
        const sApiKey = application.getServoyProperty('openai_api_key');

        /**@type {plugins.ai.ChatClient}*/
        const oClient = plugins.ai.createOpenAiChatBuilder()
            .apiKey(sApiKey)
            .modelName('gpt-4o')
            .addSystemMessage('You are a document assistant.')
            .build();

        // Option A: from media
        // var aBytes = solutionModel.getMedia('test_doc.pdf').bytes;
        // oClient.addBytes(aBytes, 'application/pdf');

        // Option B: from file system
        /**@type {plugins.file.JSFile}*/
        const oFile = plugins.file.convertToJSFile('/path/to/a/small/test.pdf');

        oClient.addFile(oFile).chat('What is this document about?').then(function(oResponse) {
            application.output('addFile response: ' + oResponse.getResponse().substring(0, 100));
        }).catch(function(oError) {
            application.output('addFile failed: ' + oError.message, LOGGINGLEVEL.ERROR);
        });
    } catch (e) {
        application.output('Error in askAboutFile: ' + e.message, LOGGINGLEVEL.ERROR);
    }
}
```

Expected Output:

```
addFile response: This document serves as a quick reference guide for a team using "Copilot Agent Workflows" within a...
```

The `addFile()` method accepts a JSFile or a String path on the server, auto-detects the content type, and chains back to the client so you can call `.chat()` right after. If you need to specify the content type explicitly, use `addFile(oFile, 'application/pdf')`. The `addBytes()` variant accepts raw byte arrays for content that is already in memory.
Supported content types include image/*, video/*, audio/*, application/pdf, and text/*. This opens up use cases like extracting structured data from scanned invoices, transcribing audio notes, or asking questions about uploaded documents.
Your First Vector Embedding
Chat completions are fun, but embeddings are where things get really interesting for Servoy developers. An embedding is a numeric representation of meaning. Two pieces of text that mean similar things produce similar embeddings, even if they share no words in common. Think of it like an index on a column, but instead of indexing exact values, it indexes meaning.
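Under the hood, "similar embeddings" typically means a high cosine similarity between the vectors. This toy sketch uses made-up 3-dimensional vectors just to show the math; real embeddings from a model like `text-embedding-3-small` have 1536 dimensions, but the comparison works the same way:

```javascript
// Cosine similarity: dot product of the vectors divided by the product of their lengths.
// Returns a value near 1.0 for similar meanings, near 0 for unrelated ones.
function cosineSimilarity(a, b) {
    var dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors (illustrative only, not real model output):
var vUpset = [0.9, 0.1, 0.2]; // "customer is upset"
var vAngry = [0.8, 0.2, 0.1]; // "client is angry"
var vHappy = [0.1, 0.9, 0.3]; // "customer is happy"

console.log(cosineSimilarity(vUpset, vAngry) > cosineSimilarity(vUpset, vHappy)); // true
```

"Upset" and "angry" land close together even though they share no words, which is exactly the property the embedding store's similarity search exploits.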
Here is the simplest possible embedding example:
```javascript
/**
 * Embeds a few strings and searches them by meaning.
 * @author Gary Dotzlaw
 * @since 2026-04-12
 * @public
 */
function firstEmbeddingTest() {
    /**@type {String}*/
    const sApiKey = application.getServoyProperty('openai_api_key');

    /**@type {plugins.ai.EmbeddingModel}*/
    const oEmbeddingModel = plugins.ai.createOpenAiEmbeddingModelBuilder()
        .apiKey(sApiKey)
        .modelName('text-embedding-3-small')
        .build();

    /**@type {plugins.ai.EmbeddingStore}*/
    const oStore = oEmbeddingModel.createInMemoryStore();

    /**@type {Array<String>}*/
    const aTexts = [
        "Customer called about a late shipment, very upset",
        "Product returned due to defective packaging",
        "Annual review went well, customer is happy with service",
        "Urgent complaint about billing errors on the last invoice"
    ];

    oStore.embed(aTexts, []).then(function() {
        /**@type {Array<plugins.ai.SearchResult>}*/
        const aResults = oStore.search('unhappy customers', 3);
        application.output('Found ' + aResults.length + ' similar notes');
        aResults.forEach(function(oResult) {
            application.output('  Score ' + oResult.getScore().toFixed(3) + ': ' + oResult.getText());
        });
    }).catch(function(oError) {
        application.output('Embedding failed: ' + oError.message, LOGGINGLEVEL.ERROR);
    });
}
```

Let’s walk through this:
- **createOpenAiEmbeddingModelBuilder()**: Similar to the chat builder. You configure it with an API key and a model name. `text-embedding-3-small` is cheap and fast, which is what you want for most embedding work. If you need the vector dimension for database column sizing, call `oEmbeddingModel.getDimension()` after building. For `text-embedding-3-small`, that is 1536.
- **createInMemoryStore()**: The vector store is created from the embedding model itself, not from `plugins.ai`. This is a common source of confusion. Call `oEmbeddingModel.createInMemoryStore()`. The store lives in memory, which is perfect for development and testing. For production, you will want PgVector, which I cover in Article 3.
- **store.embed(texts, metadata)**: Takes an array of strings and a parallel array of metadata objects. Both arguments are required. If you do not have per-text metadata, pass an empty array `[]` as the second argument. Omitting it causes a runtime error because the Java signature requires both parameters. This returns a Promise, so you do not run the search until the embeddings are actually in the store.
- **store.search(query, maxResults)**: Runs a similarity search and returns results synchronously. Always pass `maxResults` explicitly; the default is only 3. Each result has `.getText()`, `.getScore()`, and `.getMetadata()` methods.
Notice that the search query is “unhappy customers” but none of the stored notes contain those exact words. The notes about the upset customer and the billing complaint still rank highly, because they are semantically close to “unhappy customers.” A loosely related note can also sneak into the top results — the happy annual review is still about customer sentiment — which is why you should look at the similarity scores rather than blindly trusting the top matches.
Expected Output:
```
Found 3 similar notes
  Score 0.739: Customer called about a late shipment, very upset
  Score 0.703: Annual review went well, customer is happy with service
  Score 0.686: Urgent complaint about billing errors on the last invoice
```

A LIKE query would never find these results. A full-text index would need the exact words or aggressive stemming. Semantic search does not care about word overlap. It cares about meaning.
A Realistic Scenario: Searching Products by Meaning
Let’s take this to a real scenario. The example_data server that ships with Servoy has a products table with 77 rows, each with a productname and a product_desc column. Imagine a user wants to find all products related to “spicy hot condiments.” The word “spicy” may not appear in every product’s name or description, and a LIKE query against “spicy” will miss Louisiana Hot Sauce, Cajun seasoning, pepper sauces, and anything else that is clearly spicy but uses different words.
Here is how you solve this with plugins.ai:
```javascript
/**
 * Indexes product names and descriptions into an in-memory embedding store.
 * @author Gary Dotzlaw
 * @since 2026-04-17
 * @public
 *
 * @return {Promise<plugins.ai.EmbeddingStore>} resolves with the populated store
 */
function buildProductIndex() {
    /**@type {String}*/
    const sApiKey = application.getServoyProperty('openai_api_key');

    /**@type {plugins.ai.EmbeddingModel}*/
    const oEmbeddingModel = plugins.ai.createOpenAiEmbeddingModelBuilder()
        .apiKey(sApiKey)
        .modelName('text-embedding-3-small')
        .build();

    /**@type {plugins.ai.EmbeddingStore}*/
    const oStore = oEmbeddingModel.createInMemoryStore();

    /**@type {QBSelect}*/
    const query = datasources.db.example_data.products.createSelect();
    query.result.add(query.columns.productname);
    query.result.add(query.columns.product_desc);
    /**@type {JSDataSet}*/
    const dsProducts = databaseManager.getDataSetByQuery(query, 5000);

    /**@type {Array<String>}*/
    const aTexts = [];
    for (let i = 1; i <= dsProducts.getMaxRowIndex(); i++) {
        /**@type {String}*/
        const sName = dsProducts.getValue(i, 1) || '';
        /**@type {String}*/
        const sDesc = dsProducts.getValue(i, 2) || '';
        aTexts.push(sName + ' ' + sDesc);
    }

    return oStore.embed(aTexts, []).then(function() {
        application.output('Indexed ' + aTexts.length + ' products');
        return oStore;
    });
}

/**
 * Searches products by meaning instead of keywords.
 * @author Gary Dotzlaw
 * @since 2026-04-17
 * @public
 *
 * @param {plugins.ai.EmbeddingStore} oStore the embedding store
 * @param {String} sQuery the natural-language search query
 */
function searchProducts(oStore, sQuery) {
    try {
        /**@type {Array<plugins.ai.SearchResult>}*/
        const aResults = oStore.search(sQuery, 5);
        application.output('Query: ' + sQuery);
        application.output('Found ' + aResults.length + ' semantically similar products');
        aResults.forEach(function(oResult) {
            application.output('  Score ' + oResult.getScore().toFixed(3) + ': ' + oResult.getText());
        });
    } catch (e) {
        application.output('Error in searchProducts: ' + e.message, LOGGINGLEVEL.ERROR);
    }
}
```

To use this, call `buildProductIndex()` once to build the store, then call `searchProducts()` inside the `.then()` callback:
```javascript
/**
 * Demo: build the index and run a search.
 * @author Gary Dotzlaw
 * @since 2026-04-17
 * @public
 */
function runSearchDemo() {
    buildProductIndex().then(function(oStore) {
        searchProducts(oStore, 'spicy hot condiments');
    }).catch(function(oError) {
        application.output('Demo failed: ' + oError.message, LOGGINGLEVEL.ERROR);
    });
}
```

Call `runSearchDemo()` and watch it return products like Louisiana Fiery Hot Pepper Sauce, Cajun Seasoning, and Frankfurter grüne Soße. The word “spicy” does not appear in most of those product names. A LIKE query against “spicy” would have missed them all.
Expected Output:
```
Indexed 77 products
Query: spicy hot condiments
Found 5 semantically similar products
  Score 0.792: Louisiana Fiery Hot Pepper Sauce
  Score 0.733: Louisiana Hot Spiced Okra
  Score 0.680: Chef Anton's Cajun Seasoning
  Score 0.673: Original Frankfurter grüne Soße
  Score 0.665: Chef Anton's Gumbo Mix
```

Five results, ranked by semantic similarity. The top match scores 0.79, and even the weakest match at 0.67 is still clearly a spicy or hot product. None of the results contain the word “spicy” in their product name, yet every one of them is exactly what the user meant.
The advantages of this approach are clear:
- Users find what they mean, not what they typed. No more teaching users to guess the exact keywords.
- No stemming or full-text index maintenance. The embedding model handles linguistic variation.
- Works across synonyms and paraphrases. “Angry customer” matches “furious client” matches “irate buyer.”
- Ranked by relevance. Results come back with similarity scores so you can show the best matches first.
Keep in mind that the in-memory store is fine for development and small datasets. The embeddings live only for the lifetime of the current runtime, so if the server restarts you re-embed everything. For production, use the persistent PgVector-backed store so your embeddings survive across sessions and you are not paying OpenAI every time your server restarts. I cover PgVector in Article 3 of this series.
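One practical pattern until then: cache the Promise returned by the index build, just as the chat client was cached earlier, so repeated searches within a session share one store instead of re-embedding on every call. Here is a plain-JavaScript sketch of the idea; the `buildFn` and `fakeBuild` names are illustrative, not plugin APIs:

```javascript
// Cache the build Promise itself so concurrent callers also share one build.
var _oIndexPromise = null;

function getProductIndex(buildFn) {
    if (!_oIndexPromise) {
        _oIndexPromise = buildFn(); // buildFn returns a Promise for the store
    }
    return _oIndexPromise;
}

// Demo: count how many times the (expensive) build actually runs.
var iBuilds = 0;
function fakeBuild() {
    iBuilds++;
    return Promise.resolve({ name: 'store' });
}

getProductIndex(fakeBuild);
getProductIndex(fakeBuild);
console.log(iBuilds); // 1 -- the second call reused the cached Promise
```

Caching the Promise rather than the resolved store means even calls that arrive while the first build is still in flight attach to the same build instead of starting another one.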
Advanced: Chat With Your Data
Still with me? Okay, let’s kick it up a notch. This next pattern is one of the more creative uses of the chat API, and it does not involve embeddings at all. The idea is simple: send your database schema to the LLM and let it write SQL for you based on a natural-language question. Then execute the SQL, and optionally send the results back through a second LLM call for a human-readable summary.
This is the pattern demonstrated in the Example Solution’s chatWithYourData form, and I can tell you it makes an impression the first time you see it work.
The first function gets the database schema and sends it to the model:
```javascript
/**
 * Generates a SQL query from a natural-language question.
 * @author Gary Dotzlaw
 * @since 2026-04-17
 * @public
 *
 * @param {String} sQuestion the user's natural-language question
 * @param {String} sServerName the Servoy server name to query against
 * @return {Promise<String>} resolves with the generated SQL
 */
function generateSQLFromQuestion(sQuestion, sServerName) {
    /**@type {String}*/
    const sApiKey = application.getServoyProperty('openai_api_key');

    /**@type {String}*/
    const sSystemMessage = 'You are a SQL expert. You will be given a database schema in JSON ' +
        'format and a user question. Respond with a valid SQL query that answers the question. ' +
        'Only provide the SQL query as the response, nothing else.';

    /**@type {plugins.ai.ChatClient}*/
    const oClient = plugins.ai.createOpenAiChatBuilder()
        .apiKey(sApiKey)
        .modelName('gpt-4o')
        .addSystemMessage(sSystemMessage)
        .build();

    /**@type {Object}*/
    const oSchema = _getSchema(sServerName);

    /**@type {String}*/
    const sPrompt = 'Database Schema:\n' + JSON.stringify(oSchema) +
        '\n\nUser Question:\n' + sQuestion;

    return oClient.chat(sPrompt).then(function(oResponse) {
        return oResponse.getResponse();
    });
}
```

The `_getSchema` helper function introspects the database and returns it as JSON:
```javascript
/**
 * Tables exposed to the chat-with-your-data model. Trimming the schema to just
 * the tables the question can touch keeps the prompt small and the call fast.
 * @type {Array<String>}
 */
const ALLOWED_TABLES = ['products', 'orders', 'order_details', 'customers'];

/**
 * Fetches the database schema for a Servoy server in JSON format.
 * Only includes tables listed in ALLOWED_TABLES.
 * @author Gary Dotzlaw
 * @since 2026-04-17
 * @private
 *
 * @param {String} sServerName the Servoy server name
 * @return {Object} the schema as a JSON-friendly object
 */
function _getSchema(sServerName) {
    /**@type {Object}*/
    const oSchema = {
        databaseType: databaseManager.getDatabaseProductName(sServerName),
        tables: []
    };

    /**@type {Array<String>}*/
    const aTableNames = databaseManager.getTableNames(sServerName);
    for (let i = 0; i < aTableNames.length; i++) {
        /**@type {String}*/
        const sTableName = aTableNames[i];
        if (ALLOWED_TABLES.indexOf(sTableName) === -1) {
            continue;
        }
        /**@type {Object}*/
        const oTable = databaseManager.getTable(sServerName, sTableName);
        /**@type {Array<String>}*/
        const aPKColumns = oTable.getRowIdentifierColumnNames();
        /**@type {Array<String>}*/
        const aColumns = oTable.getColumnNames();

        /**@type {Object}*/
        const oTableInfo = {
            table: sTableName,
            columns: []
        };
        for (let j = 0; j < aColumns.length; j++) {
            /**@type {String}*/
            const sColName = aColumns[j];
            /**@type {Object}*/
            const oColumn = oTable.getColumn(sColName);
            oTableInfo.columns.push({
                name: sColName,
                type: oColumn.getTypeAsString(),
                length: oColumn.getLength(),
                description: oColumn.getDescription(),
                title: oColumn.getTitle(),
                fkForTableName: oColumn.getForeignType(),
                isPK: aPKColumns.indexOf(sColName) !== -1
            });
        }
        oSchema.tables.push(oTableInfo);
    }
    return oSchema;
}
```

And then you bring it all together, executing the generated SQL and summarizing the results:
````javascript
/**
 * Asks a question about your data and gets a natural-language answer.
 * @author Gary Dotzlaw
 * @since 2026-04-17
 * @public
 *
 * @param {String} sQuestion the user's natural-language question
 * @param {String} sServerName the Servoy server name
 */
function askYourData(sQuestion, sServerName) {
    try {
        plugins.svyBlockUI.show('Thinking...');
        generateSQLFromQuestion(sQuestion, sServerName).then(function(sRawSQL) {
            // Strip markdown code fences that the model sometimes adds around the SQL
            /**@type {String}*/
            const sSQL = sRawSQL.replace(/```sql\s*|```/g, '').trim();
            application.output('Generated SQL: ' + sSQL);

            /**@type {JSDataSet}*/
            const dsResult = databaseManager.getDataSetByQuery(sServerName, sSQL, null, 1000);
            /**@type {String}*/
            const sCSV = dsResult.getAsText(',', '\n', '"', true);

            // Second LLM call: summarize the results in plain language
            /**@type {plugins.ai.ChatClient}*/
            const oSummaryClient = plugins.ai.createOpenAiChatBuilder()
                .apiKey(application.getServoyProperty('openai_api_key'))
                .modelName('gpt-4o')
                .addSystemMessage('You are a data expert. Given a user question and a CSV dataset, ' +
                    'provide a concise, accurate answer. Include relevant data and observations.')
                .build();

            /**@type {String}*/
            const sSummaryPrompt = 'User Question:\n' + sQuestion +
                '\n\nDataset in CSV format:\n' + sCSV;

            return oSummaryClient.chat(sSummaryPrompt);
        }).then(function(oResponse) {
            application.output('Answer: ' + oResponse.getResponse());
        }).catch(function(oError) {
            application.output('askYourData failed: ' + oError.message, LOGGINGLEVEL.ERROR);
        }).finally(function() {
            plugins.svyBlockUI.stop();
        });
    } catch (e) {
        application.output('Error in askYourData: ' + e.message, LOGGINGLEVEL.ERROR);
        plugins.svyBlockUI.stop();
    }
}
````

Call `askYourData("What are the top 5 best selling products?", "example_data")` and watch the model write the SQL, execute it against your database, and come back with a plain-language answer drawn from your actual data.
Expected Output:
```
Generated SQL: SELECT p.productname, SUM(od.quantity) AS total_quantity_sold
FROM order_details od
JOIN products p ON od.productid = p.productid
GROUP BY p.productname
ORDER BY total_quantity_sold DESC
LIMIT 5;

Answer: Based on the dataset provided, the top 5 best selling products are:

1. Camembert Pierrot with 1,577 units sold.
2. Raclette Courdavault with 1,496 units sold.
3. Gorgonzola Telino with 1,397 units sold.
4. Gnocchi di nonna Alice with 1,263 units sold.
5. Pavlova with 1,158 units sold.

These products have the highest total quantities sold as per the dataset.
```

Notice two things. First, the LLM picked the right join — order_details.productid to products.productid — even though the question did not mention tables or columns by name. The model inferred the join path from the schema JSON alone. Second, the answer summarizes the CSV it received in plain English, adding natural-language framing around the raw numbers. One pass generated the SQL, a second pass narrated the result. The user asked a question, the agent answered it.
A couple of important caveats. This pattern gives the LLM indirect read access to your database. Make sure you are comfortable with that from a security perspective before exposing it to users. The LLM-generated SQL should be treated as untrusted input; consider running it with a read-only database connection or restricting the tables visible in the schema. And obviously, never let the model generate INSERT, UPDATE, or DELETE statements through this pattern.
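As a crude extra guard on top of a read-only connection, you could refuse to execute anything that does not look like a single SELECT before it reaches the database. This illustrative sketch (`isReadOnlySQL` is a hypothetical helper, not a plugin API) is deliberately simple string matching, not real SQL parsing, and is no substitute for proper database-level permissions:

```javascript
// Crude denylist guard for LLM-generated SQL. Illustrative only:
// substring checks can false-positive on column names like "last_update",
// so always pair this with a read-only database user.
function isReadOnlySQL(sSQL) {
    var s = sSQL.trim().toLowerCase();
    if (s.indexOf('select') !== 0) {
        return false; // must start with SELECT
    }
    var aForbidden = ['insert ', 'update ', 'delete ', 'drop ', 'alter ', 'truncate '];
    for (var i = 0; i < aForbidden.length; i++) {
        if (s.indexOf(aForbidden[i]) !== -1) {
            return false;
        }
    }
    return true;
}

console.log(isReadOnlySQL('SELECT * FROM products'));          // true
console.log(isReadOnlySQL('DELETE FROM products'));            // false
console.log(isReadOnlySQL('select 1; drop table products'));   // false
```

Run the check on the cleaned SQL before handing it to `databaseManager.getDataSetByQuery()`, and log anything it rejects so you can see what the model attempted.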
The ChatResponse: Everything It Can Tell You
Before we close, here is the full list of methods on the ChatResponse object. You have already seen getResponse() and getTokenUsage(), but the others are useful for logging, debugging, and working with newer models:
- `getResponse()` returns the full response text.
- `getPrompt()` returns the original user prompt that produced this response. Useful for logging.
- `getId()` returns a unique identifier. Correlate this with your provider’s dashboard when debugging.
- `getFinishReason()` returns why the model stopped. Normal completion, token limit, tool call, etc.
- `getTokenUsage()` returns a TokenUsage object. Call `.inputTokenCount()`, `.outputTokenCount()`, and `.totalTokenCount()` on it. These are method calls, not properties; do not forget the parentheses.
- `getThinking()` returns the reasoning text from models like OpenAI’s o1 series or Gemini 2.5 Pro. Returns null for models that do not support reasoning (like gpt-4o). Useful for debugging agentic workflows where you need to see why the model made a decision.
What Comes Next
This tutorial covered the foundation: chat completions, streaming, conversation memory, provider switching, embeddings, similarity search, and the chat-with-your-data pattern. The AI Runtime Plugin has several more capabilities that deserve their own articles:
- Tool calling lets you register Servoy functions as tools the LLM can invoke during reasoning. The agent decides when to call them. This is how you build agentic workflows where the AI actually does things in your Servoy solution instead of just answering questions. That is Article 2.
- FoundSet embedding and PgVector stores let you embed database records directly with primary keys preserved as metadata, and persist those embeddings in PostgreSQL so they survive server restarts. That is Article 3.
- QBVectorColumn and hybrid queries let you combine semantic similarity with traditional SQL WHERE clauses in a single query. Products semantically similar to “lightweight laptop for travel” filtered to under $2,000 and in stock, in one database round-trip. That is Article 4.
I will cover each of these in the following tutorials. For now, get comfortable with the basics. Set up an API key, put it in your servoy.properties, try the chat builder, embed a few strings, and run a similarity search. Once the basics click, the advanced features will make a lot more sense.
That concludes this Servoy tutorial on getting started with the AI Runtime Plugin. I hope you enjoyed it, and I look forward to bringing you more Servoy tutorials on AI integration in the future.
The Series
This is Part 1 of a four-part series on the Servoy AI Runtime Plugin:
- Getting Started with the Servoy AI Runtime Plugin (this article). Chat completions, streaming, conversation memory, embeddings, and your first semantic search.
- Tool Calling with the AI Runtime Plugin: Agentic Servoy. Register Servoy methods as tools and let the LLM decide when to call them.
- Embedding Your Servoy Data for Semantic Search. PgVector production stores, FoundSet `embedAll()`, and PDF document chunking.
- Hybrid Queries with QBVectorColumn: Semantic Meets SQL. Combine semantic similarity with traditional WHERE clauses in a single database round-trip.