#MPS device

1 messages · Page 1 of 1 (latest)

paper wigeon Apr 24, 2024, 8:10 AM

maybe we should bundle ollama with the dagger CLI and add a core function for running it as a service 🙂

LLMs themselves are pretty easy to sandbox, so an evil module wouldn't be able to compromise the client. As long as the ollama config and version are locked does

lost gust Apr 24, 2024, 8:17 AM

Ollama only gives you LLMs and some limited support for vision LLMs by implementing a subset of the OpenAI REST api. You can’t use it as a MPS backend in general. More exotic models that CoreNet supports would require a proper GPU. Additionally, olllama needs to be updated as new model architectures become used. For example it took a few weeks for Gemma to be supported because it used some new layer types

This space evolves super fast. Every few weeks you’d need to update this. It also needs to be build differently for each platform and micro architecture. For example I use a custom build with both Cuda and AVX512 support which only runs on skylake and newer CPUs. The difference is that it takes seconds instead of minutes to run, and I don’t run out of memory with my paltry 4GB of GPU ram