Installation bash script to solve the local model problem | Open Interpreter | Page 1

tender knoll Sep 28, 2023, 4:15 PM

#

Like other projects that depend on GPU libraries being installed correctly, OI could ship a shell script that runs pre-install checks before OI itself is installed. That would spare our users many of the problems they are running into now.

#

Taking cues from rvm, nvm, and ollama.ai—all of which utilize installation shell scripts to set up dependencies—I propose adopting a similar strategy for OI. This approach is likely to address the majority of dependency-related issues that our users are presently encountering.

For Ollama
curl https://ollama.ai/install.sh | sh

For RVM
\curl -sSL https://get.rvm.io | bash

For NVM
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash
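
A rough sketch of what the pre-check part of such a script might look like for OI. Nothing here comes from an existing OI script: the function names (`detect_platform`, `detect_backend`, `python_at_least`) are made up for illustration, the backend detection is only a heuristic, and the Python 3.10 minimum is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical pre-install check for OI. All function names are illustrative.

# Print "<os>-<arch>", for example "Linux-x86_64" or "Darwin-arm64".
detect_platform() {
  printf '%s-%s\n' "$(uname -s)" "$(uname -m)"
}

# Guess which acceleration backend looks usable: "cuda", "metal", or "cpu".
# Heuristic only: a working nvidia-smi suggests CUDA, Apple Silicon suggests Metal.
detect_backend() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo "cuda"
  elif [ "$(uname -s)" = "Darwin" ] && [ "$(uname -m)" = "arm64" ]; then
    echo "metal"
  else
    echo "cpu"
  fi
}

# Succeed if python3 exists and is at least version <major>.<minor>.
python_at_least() {
  command -v python3 >/dev/null 2>&1 || return 1
  python3 -c "import sys; sys.exit(0 if sys.version_info >= ($1, $2) else 1)"
}

echo "platform: $(detect_platform)"
echo "backend:  $(detect_backend)"
# The 3.10 minimum below is an assumption, not taken from this thread.
if python_at_least 3 10; then
  echo "python:   ok"
else
  echo "python:   missing or too old (assumed minimum: 3.10)"
fi
```

The script only reports what it finds and does not install anything, so it could be run safely before deciding which install path to take.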

summer bough Sep 28, 2023, 4:19 PM

#

We should be able to use a Petals swarm server for local models and sidestep this entirely

tender knoll Sep 28, 2023, 4:19 PM

#

What if the user doesn't want to depend on a Petals swarm server?

summer bough Sep 28, 2023, 4:20 PM

#

I

#

think they mainly want it to actually work

#

These local installs are killers.

#

I don't see why they care how the code runs so long as it works, and then we call everything through an endpoint

#

I have no idea of the comparative memory footprints

#

This would also allow them to have a local cluster of all their machines

#

The server is a one-liner

tender knoll Sep 28, 2023, 4:24 PM

#

Would that solve the CUDA library version problem as well?

#

I can look into it.

summer bough Sep 28, 2023, 4:24 PM

#

yes

#

I think so

#

if the swarm is running it's running

#

this avoids sooo many issues

#

and we don't maintain it tee hee

tender knoll Sep 28, 2023, 4:25 PM

#

it crashed on my Mac Apple M2 😂

summer bough Sep 28, 2023, 4:25 PM

#

Let me try on my Intel Mac

#

That would be an issue

tender knoll Sep 28, 2023, 4:26 PM

#

2023-09-28 17:24:23.551 python3[77884:1005009] Error = Error Domain=com.apple.appleneuralengine Code=6 "createProgramInstanceForModel:modelToken:qos:isPreCompiled:enablePowerSaving:skipPreparePhase:statsMask:memoryPoolID:enableLateLatch:modelIdentityStr:owningPid:cacheUrlIdentifier:aotCacheUrlIdentifier:error:: Program load failure (0xF0004)" UserInfo={NSLocalizedDescription=createProgramInstanceForModel:modelToken:qos:isPreCompiled:enablePowerSaving:skipPreparePhase:statsMask:memoryPoolID:enableLateLatch:modelIdentityStr:owningPid:cacheUrlIdentifier:aotCacheUrlIdentifier:error:: Program load failure (0xF0004)}
/AppleInternal/Library/BuildRoots/fe2afe83-06e7-11ee-80c3-f6357a1003e8/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Runtimes/MPSRuntime/Operations/GPURegionOps.mm:548: failed assertion `ANE load failed!'
[1] 77884 abort python3 -m petals.cli.run_server petals-team/StableBeluga2

summer bough Sep 28, 2023, 4:26 PM

#

--new_swarm

#

That doesn't look like swarm mode

tender knoll Sep 28, 2023, 4:28 PM

#

not as smooth as what we have right now.

#

I can currently run a local model in OI, but this Petals swarm server is not working.

summer bough Sep 28, 2023, 4:29 PM

#

Smooth until they can't get the libs or CUDA installed. I don't know how much of a problem this still is

tender knoll Sep 28, 2023, 4:30 PM

#

That's what the bash script would solve.

summer bough Sep 28, 2023, 4:30 PM

#

yes, I see. That assumes a solution exists.

#

Perhaps that is a good assumption

tender knoll Sep 28, 2023, 4:31 PM

#

For Ollama
curl https://ollama.ai/install.sh | sh

#

It does solve that, if you look into the script

#

We just need to adapt it for OI

#

Basically, check the conditions and apply the fixes we already suggest in the GitHub issues.
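
Something along these lines, as a sketch only. It assumes the fix in question is rebuilding llama-cpp-python for the right backend, which is not stated in this thread. `suggest_fix` is a hypothetical name, and the `CMAKE_ARGS` values are the ones llama-cpp-python documented around this time, so they may well have changed since. The function prints a command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) a suggested reinstall command
# for a given backend. The CMAKE_ARGS values are assumptions based on
# llama-cpp-python's 2023 build instructions, not on this thread.
suggest_fix() {
  local base='pip install --force-reinstall --no-cache-dir llama-cpp-python'
  case "$1" in
    cuda)  echo "CMAKE_ARGS=\"-DLLAMA_CUBLAS=on\" FORCE_CMAKE=1 $base" ;;
    metal) echo "CMAKE_ARGS=\"-DLLAMA_METAL=on\" FORCE_CMAKE=1 $base" ;;
    cpu)   echo "$base" ;;
    *)     echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

# Example: show the suggestion for a CPU-only machine.
suggest_fix cpu
```

Printing the command instead of executing it keeps the user in control and makes it easy to paste the same suggestion into a GitHub issue reply.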

summer bough Sep 28, 2023, 4:34 PM

#

Seems to work on Intel silicon

python -m petals.cli.run_server codellama/CodeLlama-7b-Instruct-hf --new_swarm --num_blocks 3
Sep 28 12:33:27.809 [INFO] Running Petals 2.2.0

tender knoll Sep 28, 2023, 4:34 PM

#

Most of the people who ask in the channel can solve their problem after following what we tell them there. The bash script would be a step closer to finding the cause and fixing it.

#

loc("cast"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/fe2afe83-06e7-11ee-80c3-f6357a1003e8/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":745:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x1x1x98xi1>'

summer bough Sep 28, 2023, 4:35 PM

#

Sure, fewer code layers should be more performant. I suppose it comes down to how well the install scripts work

#

CPU only

python -m petals.cli.run_server codellama/CodeLlama-7b-Instruct-hf --new_swarm --num_blocks 3

tender knoll Sep 28, 2023, 4:36 PM

#

I can see the incremental improvement along the way.

summer bough Sep 28, 2023, 4:36 PM

#

I couldn't use my GPU for this on my old model

#

No reason not to. If we support Petals they could go either way according to their whimsical nature

tender knoll Sep 28, 2023, 4:37 PM

#

I can use it with LM Studio and OI, so it shouldn't be a problem with the computer.

summer bough Sep 28, 2023, 4:38 PM

#

options

#

What's LM Studio?

tender knoll Sep 28, 2023, 4:38 PM

#

https://github.com/lmstudio-ai

GitHub

LM Studio

Discover, download, and run local LLMs. LM Studio has 4 repositories available. Follow their code on GitHub.

#

https://lmstudio.ai

👾 LM Studio - Discover and run local LLMs

Find, download, and experiment with local LLMs

summer bough Sep 28, 2023, 5:16 PM

#

Looks nice, too bad I can't run it.

tender knoll Sep 29, 2023, 11:15 PM

#

@summer bough after using a clean environment, I can install petals now.

#

Using petals-team/StableBeluga2 gives around 5 tokens/sec, which is not bad.
