# Deciding which LLM to use based on the input text
The first part is hardware requirements and whether a model needs to be fine-tuned. You can eliminate a lot of models with just those two questions.
thats not what im asking about
i need to make something that can help decide which model to use for my api
@stoic cedar, I assume you are trying to build a hybrid application where you decide upfront which model to invoke.
I'd like to know what your criteria for the selection are. Is it token length, specific entities in the text, or some other rule you have in mind?
you pretty much want to do the same thing as OpenAI's GPT-4. their GPT-4 is reportedly made of a lot of models, each with deep expertise in a field, depending on the input prompt ...
OpenAI kept the GPT-4 architecture secret to avoid competition, now the secret is out!
thats not a simple function, its gonna be a groundbreaking tool thatll help you pivot between llms
and have each llms best at something specific
Basically, the function will take in a prompt and decide whether the task is complicated or easy, whether or not it needs internet access (I have a function to do that already), and whether or not it should use an uncensored model.
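For anyone following along, here is a minimal sketch of what a function like that could look like. Everything in it is an assumption made for illustration: the model names, the keyword list, the 150-word threshold, and the `needs_internet` / `needs_uncensored` callables (stand-ins for checks like the one mentioned above). It is a crude heuristic, not the poster's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical model identifiers; replace with whatever your API exposes.
CHEAP_MODEL = "llama-3-8b"
STRONG_MODEL = "claude-3-opus"
UNCENSORED_MODEL = "some-uncensored-model"

# Assumed hints that a prompt needs a stronger model (purely illustrative).
COMPLEX_HINTS = ("debug", "prove", "refactor", "step by step", "analyze")


@dataclass
class Route:
    model: str
    needs_internet: bool


def looks_complex(prompt: str, max_easy_words: int = 150) -> bool:
    """Treat long prompts, or prompts containing a hint keyword, as complex."""
    lowered = prompt.lower()
    too_long = len(lowered.split()) > max_easy_words
    return too_long or any(hint in lowered for hint in COMPLEX_HINTS)


def route(
    prompt: str,
    needs_internet: Callable[[str], bool],
    needs_uncensored: Callable[[str], bool],
) -> Route:
    """Pick a model for the prompt; the two callables are supplied by the caller."""
    if needs_uncensored(prompt):
        model = UNCENSORED_MODEL
    elif looks_complex(prompt):
        model = STRONG_MODEL
    else:
        model = CHEAP_MODEL
    return Route(model=model, needs_internet=needs_internet(prompt))


if __name__ == "__main__":
    never = lambda _prompt: False  # placeholder checks for the demo
    print(route("hey, how was your day?", never, never))
    print(route("debug this stack trace step by step", never, never))
```

Keeping the internet and uncensored checks as plain callables means the existing internet-access function can be passed in unchanged, and the keyword heuristic can later be swapped for a trained classifier without touching the callers.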
but why are you selecting models based on prompt? It takes a long time to load and warm up models
i assume you didnt read or look at anything i said or sent
this should say enough on why
I did, it just doesn't make sense
no need to be rude.
toggling between functions/actions, ok I get that. But swapping between entire models is a costly thing, especially as an API. Unless you're attempting what the other person said, creating your own chatGPT "app" library
well u both are right, using the better model is always best, for example gpt4
this would be useful when you have smaller open source models that are finetuned to do something
or if you decide that an input doesnt need gpt 4, so you settle for a lower-tier model
they cookin smth alright ?
XD
for that itll depend entirely on how complex of a task it is for you to use gpt4
then setup like a metric to measure
it
im not self hosting the models
Why not?
It would reduce latency and offer you more control than just having an API wrapper, like the people behind the Humane AI Pin or Rabbit R1
they cooking, but is the heat on too high?
I don't have the resources
also you can't self host claude 3 opus or gpt 4
Proprietary models notwithstanding, self-hosting models is a good idea because you can exercise control. This sounds better as a hobby project than as something to invest resources in as a full business. Lots of competition and open alternatives.
Seems like it's relevant to ask what this router capability is actually intended to achieve. At the frontier, all the proprietary options are going to give you effectively the same performance and you could probably just choose at random. On the other hand, if you have a set of sample data that matches what would be the ultimate use case, you could build up a dataset where the optimal model choice is labeled and then maybe do something like a fine-tune of flan t5 small to make the decision.
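A rough sketch of the dataset side of that idea: turning labeled prompts into text-to-text pairs, which is the shape a seq2seq model like flan t5 is trained on. The label names, the instruction wording, and the `input` / `target` field names are all assumptions for illustration, and the fine-tuning step itself is left out.

```python
import json

# Assumed label set: which tier of model should handle the prompt.
LABELS = {"cheap", "strong"}

# Assumed instruction prefix; any consistent wording would do.
INSTRUCTION = "Which model should answer this prompt?"


def to_text2text(prompt: str, label: str) -> dict:
    """Convert one labeled prompt into an input/target pair for seq2seq training."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return {"input": f"{INSTRUCTION} Prompt: {prompt}", "target": label}


def dump_jsonl(rows) -> str:
    """Serialize (prompt, label) pairs as JSON Lines, one example per line."""
    return "\n".join(json.dumps(to_text2text(p, l)) for p, l in rows)


if __name__ == "__main__":
    samples = [
        ("what's a good name for a cat?", "cheap"),
        ("find the race condition in this scheduler code", "strong"),
    ]
    print(dump_jsonl(samples))
```

The model then only has to generate a single short label, which is a small enough task that a small model is a plausible fit.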
well 1. it is way cheaper to use llama 3 8b than claude 3 opus
2. it's cheaper for my users that like to chat and use the ai for normal things
3. I can create a dataset
Something I've done as a simple way to approach this type of thing is to present the choice to the user. It may not make sense in your case but, as an example, on the UI I have a selector the user can click if they want to toggle between chat, zero-shot task completion, or continued generation. Doing it that way, the user is going to build out the dataset on your behalf, and if you wanted to automate the task switching later you'd have the ground truth based on their behavior. As to the question of model switching itself, this is technically what OpenAI is currently doing in their chat interface today. They let the user switch models at any point in the conversation. And you can bet that they're tracking every time a user does that so they can identify the models' weak points.
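A minimal sketch of that logging idea, assuming the three modes named above and an in-memory list as the log. The function names and the row format are made up for illustration; a real app would write to a database or file instead.

```python
import json
from collections import Counter

# The three selector options described above (identifiers are assumed).
MODES = ("chat", "zero-shot", "continue")


def record_choice(log: list, prompt: str, mode: str) -> dict:
    """Append one (prompt, user-selected mode) pair to the log and return it."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    row = {"prompt": prompt, "mode": mode}
    log.append(row)
    return row


def label_counts(log: list) -> Counter:
    """Count how often each mode was chosen, to check the dataset's balance."""
    return Counter(row["mode"] for row in log)


def to_jsonl(log: list) -> str:
    """Serialize the log as JSON Lines so it can be used as training data later."""
    return "\n".join(json.dumps(row) for row in log)


if __name__ == "__main__":
    log = []
    record_choice(log, "tell me a joke", "chat")
    record_choice(log, "Summarize this paragraph in one line.", "zero-shot")
    print(label_counts(log))
    print(to_jsonl(log))
```

Each logged row is already a labeled example, so the same log can later train whatever automatic switcher replaces the manual selector.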