#Foundry RAG Agent

open kraken

Has anyone tried building a RAG agent?

The agent handles the orchestration: you choose a model and connect it to a tool or a knowledge base.

The problem is that if you connect to the tool, you get control over parameters (i.e. top k and semantic search settings) at the agent level. This is helpful because you can control top k and therefore token usage, but it uses its own semantic config, which is annoying.

If you connect to a knowledge base instead, you can use your custom semantic config from the Azure portal, but you get no control over parameters, specifically top k: it is automatically set to 10, which burns through tokens and hits request limits faster.

How should I go about handling this?

void raven

I’ve run into this, @open kraken. The best move is to bypass both limitations by handling retrieval outside the agent. Use the agent + tool setup, but have the tool call your own retrieval layer, where you control the semantic config and set top-k dynamically, then return only the context that is needed. That way you keep token usage low and avoid Azure’s defaults fighting you.
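
To make "dynamically set top-k" a bit more concrete, here is a rough Python sketch of one way to do it: keep the best-scoring chunks that fit a token budget. The `Chunk` type, the `select_context` helper, and the 4-characters-per-token estimate are all assumptions for illustration, not anything Foundry or Azure provides.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float  # relevance score from your own search layer (hypothetical)


def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token.
    return max(1, len(text) // 4)


def select_context(chunks: list[Chunk], token_budget: int, max_k: int = 10) -> list[Chunk]:
    """Keep the highest-scoring chunks that fit in token_budget, at most max_k."""
    selected: list[Chunk] = []
    used = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        if len(selected) >= max_k:
            break
        cost = estimate_tokens(chunk.text)
        if used + cost > token_budget:
            continue  # skip chunks that would overflow the budget
        selected.append(chunk)
        used += cost
    return selected
```

With this shape, the effective top-k falls out of the budget instead of being a fixed number, so short chunks can still be included after a long one has been skipped.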

open kraken

@void raven First of all, thanks for taking the time to answer. A couple of quick follow-up questions for you. Let’s say I want to build a retrieval layer in C#: do I containerize it? How does the agent call the retrieval layer? Sorry, I’m still new to this and figuring things out.

void raven

No worries at all, you’re on the right track.

You don’t have to containerize it at first; you can just build it as a simple C# Web API (ASP.NET Core). Containerizing (Docker) is more for when you want easy deployment or scaling later.

The agent calls it like any external tool: you expose an HTTP endpoint (e.g. /retrieve) that accepts the query plus params (like top_k), runs your semantic search, and returns the filtered context. Then you register that endpoint as a tool/function in your agent, so whenever retrieval is needed, the agent hits your API and uses the response.
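
As a rough illustration of that request/response shape, here is a minimal sketch in Python rather than C#, just to keep it short. The in-memory `DOCUMENTS` list, the word-overlap ranking, the `handle_retrieve` function, and the `RETRIEVE_TOOL` schema are all hypothetical stand-ins; the exact way you register a tool depends on your agent platform.

```python
import json

# Tiny in-memory corpus standing in for a real search index (hypothetical data).
DOCUMENTS = [
    {"id": "1", "text": "Semantic configuration controls how results are reranked."},
    {"id": "2", "text": "Top k limits how many chunks are returned to the model."},
    {"id": "3", "text": "Containers make deployment and scaling easier later on."},
]


def search(query: str, top_k: int) -> list[dict]:
    """Placeholder ranking by word overlap; swap in your real semantic search."""
    terms = set(query.lower().split())
    scored = []
    for doc in DOCUMENTS:
        words = set(doc["text"].lower().rstrip(".").split())
        overlap = len(terms & words)
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def handle_retrieve(body: str) -> str:
    """Handle the JSON body of a POST /retrieve request and return JSON."""
    payload = json.loads(body)
    query = payload["query"]
    # Clamp top_k so a caller cannot request an unbounded amount of context.
    top_k = max(1, min(int(payload.get("top_k", 3)), 10))
    return json.dumps({"results": search(query, top_k)})


# Tool description in a JSON-schema "function" style (assumed shape).
RETRIEVE_TOOL = {
    "name": "retrieve",
    "description": "Fetch the most relevant context chunks for a query.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "top_k": {"type": "integer", "minimum": 1, "maximum": 10},
        },
        "required": ["query"],
    },
}
```

In a real service, `handle_retrieve` would sit behind the HTTP route and `search` would call your index with your own semantic config.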

If you don't mind, I can provide you a full guide to do all this and the rest, but there's a fee attached. Does that work for you?
@open kraken

open kraken

@void raven Turns out you just need to set your semantic config as the default in the index JSON editor by adding this block of code:

"semantic": {
  This part below is key***
  "defaultConfiguration": "default",
  "configurations": [
    {
      "name": "default",
      "prioritizedFields": { ... }
    }
  ]
}
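
If it helps, here is a small Python sketch for sanity-checking an exported index definition before relying on it: it confirms that `defaultConfiguration` names one of the entries in `configurations`. The `check_semantic_default` helper is hypothetical, and the empty `prioritizedFields` below is only a placeholder for the part elided above.

```python
def check_semantic_default(index_definition: dict) -> str:
    """Return the default semantic configuration name, or raise if it is unusable."""
    semantic = index_definition.get("semantic") or {}
    default_name = semantic.get("defaultConfiguration")
    names = [cfg.get("name") for cfg in semantic.get("configurations", [])]
    if not default_name:
        raise ValueError("semantic.defaultConfiguration is not set")
    if default_name not in names:
        raise ValueError(
            f"defaultConfiguration {default_name!r} does not match any of {names}"
        )
    return default_name


index_definition = {
    "semantic": {
        "defaultConfiguration": "default",
        "configurations": [
            # prioritizedFields left empty here; fill in your own fields.
            {"name": "default", "prioritizedFields": {}},
        ],
    }
}

print(check_semantic_default(index_definition))
```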

Your idea sounds way more sophisticated tho. I will not need the guide for now, but I may in the future as I try to scale. If you don’t mind me asking, how much do you charge?