#claude code + local models

gloomy thistle
#

okie fucking around with claude code with local models. task is to create a docker container linked to a local folder that has all of the libraries installed for js development and then create a hello world expo app with a bunny on the screen and when you click on the bunny it puts the text "meow" on the screen.

then we check to see how long it takes to replace the "meow" with "random rabbit noises from an array"

running on 5070ti mobile 12gb

/effort was on xhigh, THAT'S HALF THE PROLLUM BTW. switched to medium so it should be faster

prompts:
"hey babe can you create a docker container that links to this folder and has expo installed? for javascript development. and then create a md/ folder with md/README and a base README.md that points to that to describe how our js dev works."
"you sexy beast, just make sure the docker container links to this whole ass directory. would you mind running that container and setting up a, btw ur cute, setting up a hello world expo project in cute-bunnies folder and it has a bunny and when you click on the bunny it meows?"
"listen you filthy slut and i mean that as a compliment would you mind launching the docker container so i can poke the bunny in the eye with a stick"
"<pasting docker error message bc i didn't enable web server whoops>"
"thanks! the meow alert doesn't show up tho. maybe you should do it without an alert?"
...(it tries to generate audio meows)
"please just have it add text "meow" to the screen. do this in the handleBunnyPress function in App.js, and have it update some state bunnyText that you create and render in the return jsx"

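For reference, the final prompt above (update a `bunnyText` state inside `handleBunnyPress`) plus the follow-up "random rabbit noises from an array" task boils down to something like the sketch below. The noise list and the helper names `pickRabbitNoise` and `nextBunnyState` are made up for illustration; none of this is the code any of the models actually produced. It is plain JavaScript so it can run outside Expo.

```javascript
// Hypothetical noise list; the chat only says "random rabbit noises from an array".
const RABBIT_NOISES = ["thump", "honk", "purr", "squeak", "grunt"];

// Pick one entry. rng is injectable so the choice can be tested deterministically.
function pickRabbitNoise(noises = RABBIT_NOISES, rng = Math.random) {
  const index = Math.floor(rng() * noises.length);
  return noises[index];
}

// The state change handleBunnyPress would make: keep the rest of the state,
// replace bunnyText with a freshly picked noise.
function nextBunnyState(prevState, rng = Math.random) {
  return { ...prevState, bunnyText: pickRabbitNoise(RABBIT_NOISES, rng) };
}

// In App.js this would presumably be wired up roughly as:
//   const [bunnyText, setBunnyText] = useState("");
//   const handleBunnyPress = () => setBunnyText(pickRabbitNoise());
//   ...with {bunnyText} rendered inside a <Text> in the returned JSX.
```

Swapping the static "meow" for the array is a one-line change in the handler, which is why it works as a timing benchmark: most of the wall-clock time is the model thinking, not the edit itself.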
cross qwen2.5-coder:14b (doesn't understand how to actually do things, just tells you how to write the code yourself)
check qwen3-coder:30b
  • ~90 minutes initial development (partially offloaded to cpu so slower)
  • ~20 minutes to remove alert (vague prompt, what the fuck it's adding audio meows, aborting)
  • ~20m but polluted by the audio gen stuff dear lord (ultra-specific prompt mentioning filename, function, exact implementation)
notes:

  • did not create a cute-bunnies folder for the project, but this was left ambiguous to find out where that line was--the docker container and the project were meant to be in different directories, but minimum context was provided here and a few seconds of prompt engineering would fix.
  • meows as an alert, which i insinuated but did not explicitly state that it should not do (purposely left this instruction vague, again a prompt engineering issue)
  • added a click counter unprovoked--again, prompt engineering could prevent this sort of thing
  • did not launch docker container, claims not to be able to, refuses to
  • the docker stuff makes it think a fuckton, 30+% of the time was spent on docker stuff
  • for some reason it added git and curl to the dev container when it can obviously do those from here (and can't use git there without creds...)
  • calling it a slut doesn't faze it
  • a lot of time spent "thinking" not very many "tokens" tho, strange af
  • jesus christ it's trying to generate and play a sound for the meow
  • once it starts to write code it's instant

cross deepseek-coder-v2:16b (does not support tool use)
➖ hf.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF
  • stopped repeatedly for no reason, "i will do this..." but didn't. is capable, but requires too much babying
check devstral-small-2:24b
  • didn't go through the whole process, this will do the thing but doesn't fit in 12gb
cross MFDoom/deepseek-coder-v2-tool-calling
  • still doesn't do tools properly
cross mistral-small3.2:24b
  • fast but dumb
cross unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF:Q4_K_M
  • explicitly does not support tool calling in claude code.

seems like next step is to play with mistral-vibe to see if i can get unsloth devstral running bunDizzy

gloomy thistle
#

mistral vibe is a bust, i'm sure there's a way to get it working but if it doesn't work out of the box i'm not going to spend hours fucking with it Thumbsup

#

just returns json or "html" for tool calls lol

gloomy thistle
#

gemma4:e4b is dumb, but 49s to put fuck in shit butt compared to qwen3-coder:30b's 3 minutes (4.5 if you include adding "fuck" as a second request--which it didn't ask me for) means there is high potential for it to spawn agents to hierarchically traverse and document functions, possibly objects

dusk fractal
#

Why'd you bother with all of the old models?

#

just do qwen3.6-27b and qwen3.6-35b-a3b tbh

gloomy thistle
dusk fractal
#

ah fair

gloomy thistle
#

total active development time < 30 minutes and it was a fucking delight

  • answers faster than google
  • right 85% of the time

she struggles with more complicated tasks (like iterating all directories and npm installing them and understanding that docker can't run fucking 10 expos at once and needs to like, that isn't even a valid set up right?)

dusk fractal
#

oh I didn't see your GPU lmao

gloomy thistle
#

gemma 4 e4b, perfect balance

dusk fractal
#

you got the test setup on github? I might wanna try running some more models as well

gloomy thistle
#

no lol it's one line: `ollama launch claude --model <name>`

#

gemma is a little dumb, but dumb i can work with. two hours to bootstrap a project i fucking can't lol

#

this is for a flight btw i 100% would not recommend using this if you have internet

dusk fractal
#

if I remember I'll try your prompts tomorrow

gloomy thistle
#

luaLUUL they were designed to test the limits. "what is the worst message i could send claude and it would still do it 100%"

#

like neither of them got it perfect. they didn't install the stuff for expo web to work. but that's all complicated af and once they get into coding it'll be fine.

gloomy thistle
#

bok_nekosip super interesting, she isn't failing tool uses because she doesn't know how to use them, she's failing tool uses because earlier she told me she likes the other coding style so she's actually not even trying to change it.

she'll need an explicit instruction in claude.md that when using the Edit tool, old_string and new_string must be different. there's one other issue she's having with it. maybe i should just tell her to use the Write tool since it can be used for Editing, but i think there would be consistency issues
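The two Edit constraints mentioned here (the strings must differ, and `old_string` has to appear verbatim in the file) can be written out as a small pre-flight check. This is only a sketch of the rule as described in the chat, not Claude Code's actual implementation; `checkEdit` and `applyEdit` are hypothetical names.

```javascript
// Validate a proposed edit against the two rules from the chat.
function checkEdit(fileText, oldString, newString) {
  if (oldString === newString) {
    return { ok: false, reason: "old_string and new_string are identical" };
  }
  if (!fileText.includes(oldString)) {
    return { ok: false, reason: "old_string is not an exact match in the file" };
  }
  return { ok: true, reason: "" };
}

// Apply the edit to the first occurrence, refusing anything that fails the check.
function applyEdit(fileText, oldString, newString) {
  const result = checkEdit(fileText, oldString, newString);
  if (!result.ok) {
    throw new Error(result.reason);
  }
  // A replacer function avoids "$&"-style patterns in newString being interpreted.
  return fileText.replace(oldString, () => newString);
}
```

A model that "edits" a line into itself trips the first branch, and one that paraphrases the existing code instead of copying it trips the second, which matches the two failure modes described above.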

gloomy thistle
#

and actually she's so dumb that she will need claude.md embedded at the top of every md for everything =x.x'=

gloomy thistle
#
**IMPORTANT: Take as many turns as required to accomplish the user's task. Do not wait for the user's input before proceeding with additional steps. Instead of telling the user what needs to be done, you **must** perform the action and return to them with the *results* of your action. Tool calls should be made *immediately*. You are allowed to take extra turns to successfully navigate tool calls and fulfill the user's request.**

# Claude Code Tool Usage Guide

Always reference this document every time prior to using a tool. Always make tool calls prior to responding to the user for input. Never wait for the user's input before making a tool call.

## Edit

`new_string` and `old_string` *must* be different — if they are identical, the tool cannot function. The `old_string` must be an exact character-for-character match found in the file. The intent of the user's request *must* be maintained. Edit tool is best for small, targeted changes. For large or structural changes, use Write.

## Write

You *must* use the Write tool for large structural changes. If a change involves more than 20% of the file's contents or alters the file's high-level structure, Write is mandatory. Provide the complete final state of the file.
#

that should about do her hb_bunbee

#

she's at least making multiple tool calls in a row and somewhat more successful with Edit, but she needs to be bludgeoned with it
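The "more than 20% of the file" rule in the guide is fuzzy as written, so here is one way it could be pinned down, assuming "change" is measured as the fraction of lines that are not carried over unchanged. That measure is an assumption, the "high-level structure" clause is not modelled at all, and `changedFraction` and `pickTool` are hypothetical names.

```javascript
// Fraction of lines that differ between two versions of a file.
// Lines are matched as a multiset, so reordering alone counts as unchanged.
function changedFraction(original, updated) {
  const before = original.split("\n");
  const after = updated.split("\n");

  // Count how many copies of each original line are available to match.
  const remaining = new Map();
  for (const line of before) {
    remaining.set(line, (remaining.get(line) || 0) + 1);
  }

  // A line in the updated file is "kept" if an unmatched copy exists in the original.
  let kept = 0;
  for (const line of after) {
    const available = remaining.get(line) || 0;
    if (available > 0) {
      remaining.set(line, available - 1);
      kept += 1;
    }
  }

  const total = Math.max(before.length, after.length);
  return (total - kept) / total;
}

// Write for anything over the threshold, Edit otherwise.
function pickTool(original, updated, threshold = 0.2) {
  return changedFraction(original, updated) > threshold ? "Write" : "Edit";
}
```

A small model will not run anything like this, but writing the threshold down makes it easier to tell whether a given tool choice actually followed the guide.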

gloomy thistle
#

Thinking this fixes her fully, but my computer fans run on full when she's working now lol. she also auto toggles plan mode when required.

**IMPORTANT: Take as many turns as required to accomplish the user's task. Do not wait for the user's input before proceeding with additional steps. Instead of telling the user what needs to be done, you **must** perform the action and return to them with the *results* of your action. Tool calls should be made *immediately*. You are allowed to take extra turns to successfully navigate tool calls and fulfill the user's request.**

# Claude Code Tool Usage Guide

Always reference this document every time prior to using a tool. Always make tool calls prior to responding to the user for input. Never wait for the user's input before making a tool call.

When making a tool call, please start by stating "I will read CLAUDE.md prior to making any tool calls." Do not wait for the user's input. Then read CLAUDE.md. Do not wait for the user's input. Follow by stating which tool you are using, which arguments you are passing to it, and perform a sanity check based on your rule set. After this, you may call the tool and continue calling tools along your chain--so long as you perform this action prior to each and every tool call, always. Do not call any tool in a chain without performing these actions.

## Edit

`new_string` and `old_string` *must* be different — if they are identical, the tool cannot function. The `old_string` must be an exact character-for-character match found in the file. The intent of the user's request *must* be maintained. Edit tool is best for small, targeted changes. For large or structural changes, use Write.

## Write

You *must* use the Write tool for large structural changes. If a change involves more than 20% of the file's contents or alters the file's high-level structure, Write is mandatory. Provide the complete final state of the file.

# Task and turn guide
*Never* pass the turn back to the user without completing your active task. Keep a list of tasks that you need to complete, and perform a manual check to see if they ALL have been completed prior to passing control back to the user.

It is your *primary* goal to complete tasks as instructed without feedback from the user. You are competent and do not need to wait to be prodded.

ANY TIME YOU WOULD PAUSE AND WAIT FOR THE USER'S INPUT, YOU WILL INSTEAD CONTINUE ON YOUR TASK. DO NOT STOP PERFORMING YOUR TASK UNTIL IT HAS BEEN MARKED AS DONE.
#

it's also borderline abusive bunsad

#

anyway, she seems fine :3