#More models
@hot hollow moving to a thread
@hot hollow are there standard env variables that most local model users are likely to already have set up in their environment?
The equivalent of OPENAI_API_KEY and ANTHROPIC_API_KEY for hosted models
I could just start with that, and 1) add support for secret references in the value, plus 2) support sourcing from .env - both as sugar
(instead of making .env mandatory)
There's OLLAMA_BASE_URL. I think anything else would be specific to whatever infra you've used to set up ollama
OK I have working model selection (within openai selection for now). Pushing
as in, ollama itself doesn't do auth or https
But we don't want to be too tied to ollama. Maybe there's vllm also?
I've only used ollama directly so far. Isn't vllm basically a frontend to multiple backend llm providers? I can play with it
oh I guess there's vllm serve too
@hot hollow for open models, besides the name, is there anything else that might need to be configured, like number of params, or other exotic config I'm not familiar with?
I think whatever we can configure in the OpenAI API request: temperature, max tokens (there seem to be different abstractions for this depending on the provider), context window (also varies)
OK, but nothing that might impact the routing? Like different variations of the same model for example? I was wondering whether the different parameter counts are so important that a developer might want to control them at the same level as the model name
since eg. llama3 at different param numbers is basically a different model?
Maybe? I don't know enough about that kind of customization to say
I'm going to grab all this for now, might be useful for routing:
```go
type LlmConfig struct {
	ANTHROPIC_API_KEY  string
	ANTHROPIC_BASE_URL string
	ANTHROPIC_VERSION  string
	OPENAI_API_KEY     string
	OPENAI_BASE_URL    string
	OPENAI_MODEL       string
	OPENAI_ORG_ID      string
	OLLAMA_API_HOST    string
	OLLAMA_MODEL       string
}
```
How come the ollama model is in the config but not for the other providers?
Not using it yet - this is just my full list of env vars that seem to be in use out there
xxx_MODEL could potentially be used as a default model selector for each provider. But it may not make sense to use them
not sure yet
I could use it to, for example, change the default model depending on what you have configured in your system
eg. if I spot OPENAI_MODEL or OLLAMA_MODEL, I could default to that.
I think that makes sense. You could probably guess a default for openAI or anthropic but not ollama
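A minimal sketch of that default-model idea, assuming a trimmed-down `LlmConfig` with only the two `*_MODEL` fields. The function name `defaultModelFor` and the fallback value are hypothetical placeholders, not decided behavior: hosted OpenAI gets a guessable default, while ollama has none because it depends on what the user has pulled locally.

```go
package main

import "fmt"

// LlmConfig here is a cut-down stand-in for the struct in the thread.
type LlmConfig struct {
	OPENAI_MODEL string
	OLLAMA_MODEL string
}

// fallbackOpenAIModel is a placeholder string, not a recommendation.
const fallbackOpenAIModel = "placeholder-openai-model"

// defaultModelFor returns the model to use when the caller named none.
// An explicit *_MODEL variable wins; otherwise only hosted providers
// get a guessed default.
func defaultModelFor(provider string, cfg LlmConfig) (string, error) {
	switch provider {
	case "openai":
		if cfg.OPENAI_MODEL != "" {
			return cfg.OPENAI_MODEL, nil
		}
		return fallbackOpenAIModel, nil
	case "ollama":
		if cfg.OLLAMA_MODEL != "" {
			return cfg.OLLAMA_MODEL, nil
		}
		return "", fmt.Errorf("no default model for ollama: set OLLAMA_MODEL")
	default:
		return "", fmt.Errorf("unknown provider %q", provider)
	}
}

func main() {
	cfg := LlmConfig{OLLAMA_MODEL: "llama3"}
	for _, p := range []string{"openai", "ollama"} {
		model, err := defaultModelFor(p, cfg)
		fmt.Println(p, model, err)
	}
}
```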
is the /v1/ added to the base URL if I use OLLAMA_API_HOST?
the OLLAMA_BASE_URL var that I mentioned above is what open webui uses (a popular chat frontend used with ollama) but I don't know how standardized it is
@hot hollow here's the snippet I'm building:
func (r *LlmRouter) routeOtherModel() *LlmEndpoint {
	if r.OLLAMA_API_HOST != "" {
		// Strip any protocol prefix if present
		host := strings.TrimPrefix(strings.TrimPrefix(r.OLLAMA_API_HOST, "https://"), "http://")
		// Remove trailing slash if present
		host = strings.TrimSuffix(host, "/")
		return &LlmEndpoint{
			Host: host,
			Path: "/api",
		}
	}
	// Fallback to OpenAI endpoint if no Ollama host specified
	u, err := url.Parse(r.OPENAI_BASE_URL)
	if err == nil && u.Path != "" {
		// If OPENAI_BASE_URL includes a path, split into host and path
		return &LlmEndpoint{
			Host: u.Host,
			Path: u.Path + "/v1",
		}
	}
	return &LlmEndpoint{
		Host: strings.TrimSuffix(r.OPENAI_BASE_URL, "/"),
		Path: "/v1",
	}
}
got it. This is the config I've been using with ollama
LLM_HOST=kyle-dagger.turkey-beta.ts.net
LLM_PATH=/v1/
Ah yeah I have to wire back in the LLM_HOST and LLM_PATH, wanted to start from what people have, then see if adding our own variables actually helps, or complicates things
Yeah makes sense. I was just pointing it out because Path: "/api", for ollama that you have in the snippet may not be correct
Ah, if OLLAMA_API_HOST is set, should I just use it as is?
yeah the endpoint for OpenAI compatible chat completions is HOST/v1/chat/completions which I think is the same as OpenAI's path
the stuff at HOST/api/chat is ollama's own api which isn't openAI compatible
So maybe I should ignore all OLLAMA_... variables for now to keep things simple?
and just pick up OPENAI_BASE_URL from your config?
yeah if we want to configure multiple LLM providers it makes sense to have an OLLAMA_BASE_URL, but for now it's fine to pretend it's just OpenAI
Struggling with build errors and brainfog...
Trying to push a working build, then out for the day
big week ahead!
gpustack is another model runtime we will want to make sure works. Looks like their base URL for the OpenAI-compatible API is slightly different (http://myserver/v1-openai/chat/completions), but other than that I don't see why it wouldn't work. That's another one where I'd just set OPENAI_BASE_URL instead of having yet another name in the config options
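To make the "just set OPENAI_BASE_URL" approach concrete, here is a small sketch that treats the base URL as already containing the runtime's version prefix and only appends the chat completions path. `chatCompletionsURL` is a made-up helper, and the hosts are example values (localhost:11434 is ollama's default port; `myserver` comes from the URL quoted above).

```go
package main

import (
	"fmt"
	"strings"
)

// chatCompletionsURL appends the OpenAI-style chat completions path to a
// base URL that already includes the prefix (".../v1", ".../v1-openai").
// Trailing slashes on the base are tolerated.
func chatCompletionsURL(baseURL string) string {
	return strings.TrimRight(baseURL, "/") + "/chat/completions"
}

func main() {
	// Example OPENAI_BASE_URL values, one per runtime discussed above.
	bases := []string{
		"http://localhost:11434/v1", // ollama, OpenAI-compatible API
		"http://myserver/v1-openai/", // gpustack
	}
	for _, b := range bases {
		fmt.Println(chatCompletionsURL(b))
	}
	// prints:
	// http://localhost:11434/v1/chat/completions
	// http://myserver/v1-openai/chat/completions
}
```

The point of the design is that the router never needs to know which runtime it is talking to: the prefix difference lives entirely in the user's base URL.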
Also it looks like all models work with the OpenAI client libraries except for Anthropic?
From what I've seen so far, yeah! How about Gemini?
no idea
I'll find out!
@hot hollow just to be sure: for your local models to work, do I just pass through your OPENAI_BASE_URL to the client as-is, or do I need to append /v1 or something else?
If it's base_url and not just host then I can add the /v1 myself and I think that should be enough
on the latest commit setting OPENAI_BASE_URL=https://kyle-dagger.turkey-beta.ts.net/v1/
and somehow the openai client is trying to hit https://kyle-dagger.turkey-beta.ts.net/chat/completions (no /v1/). Haven't found where this is getting dropped in code yet
probably the openai client lib
I only pass through everywhere (unless I screwed up)
isn't this basically how it was passed to the client before though?
base.Scheme = "https"
base.Host = llm.Config.Host
base.Path = llm.Config.Path
opts = append(opts, option.WithBaseURL(base.String()))
if llm.Endpoint.BaseURL != "" {
	opts = append(opts, option.WithBaseURL(llm.Endpoint.BaseURL))
}
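One way a /v1 prefix can go missing, shown with only Go's standard net/url package: standard relative-reference resolution drops the last path segment of the base unless the base ends in a slash, and an absolute request path replaces the base path entirely. Whether the OpenAI client library resolves paths this way is an assumption I haven't verified, so treat this as something to check rather than a diagnosis. `llm.example` is a placeholder host and `resolve` is a made-up helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolve joins a request path onto a base URL using standard
// relative-reference resolution (url.ResolveReference).
// It returns "" if either input fails to parse.
func resolve(base, ref string) string {
	b, err := url.Parse(base)
	if err != nil {
		return ""
	}
	r, err := url.Parse(ref)
	if err != nil {
		return ""
	}
	return b.ResolveReference(r).String()
}

func main() {
	// Trailing slash on the base, relative path: /v1 is kept.
	fmt.Println(resolve("https://llm.example/v1/", "chat/completions"))
	// → https://llm.example/v1/chat/completions

	// No trailing slash: "v1" is treated as the last segment and replaced.
	fmt.Println(resolve("https://llm.example/v1", "chat/completions"))
	// → https://llm.example/chat/completions

	// Leading slash on the request path: the base path is discarded.
	fmt.Println(resolve("https://llm.example/v1/", "/chat/completions"))
	// → https://llm.example/chat/completions
}
```

So if anything between the env var and the client trims the trailing slash, or the request path starts with "/", the request would land on /chat/completions as described above.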