#Serve over internet with / without gui ?
12 messages · Page 1 of 1 (latest)
Whoops sorry missed this!
GUI inferencing yes! Use GPT4All for eg
save the model to GGUF with Unsloth, then use GPT4All
You can also try Ooba via Colab: https://colab.research.google.com/github/oobabooga/text-generation-webui/blob/main/Colab-TextGen-GPU.ipynb
For an API - try vLLM
you can setup SGlang on runpod to provide an API that supports batched inference.
Do you have an examplary code?
Thank you
Can you provide code for how to do that?
the sglang github repo has example code
@gritty cedar hopefully we solved your issue?