#MCP for Minecraft Java, using RAG indexing


velvet jewel
sour harbor
#

Wow

velvet jewel
#

ollama support ofc and i would prefer to do it locally but my single gpu said it would take 12 hours and openrouter was 30mins

frozen topaz
#

ew

icy parcel
wind topaz
#

Would it support lm studio too?

velvet jewel
velvet jewel
wind topaz
#

I see

main elk
velvet jewel
#

I've rewritten everything

#

now im down to just `Resolving 372 unresolved tasks with 372 total prompts (~1413388 input tokens)...` that many LLM prompts, all local now. but now i have something like 150k chunks to embed, letting it cook now

#

more i want to do, like add a reranker model, but ollama doesn't support rerankers

velvet jewel
#

works really well

wind current
#

thats so cool, when will it be released

velvet jewel
#

it's not good at writing mixins yet

#

and has no reranker

#

and i need to be explicit to tell it to use tool calls

#

lots of things to fix

grand crescent
#

This is an interesting approach

#

I am curious why you opted to go for an MCP server + RAG rather than just giving it a bash tool to use grep etc. on the game's java files

icy parcel
wind current
wind topaz
#

i wish there was documentation for llm usage in modding

velvet jewel
#

last thing i need to do is prompt engineering and set up a reranker model, but neither ollama nor openrouter supports rerankers

#

with minecraft being such a large codebase as well, semantic search is incredibly helpful, even to humans

velvet jewel
#

it also does not yet support tiny remapper so is limited to unobfuscated versions

#

and i’d love to make the server itself public, and make a web frontend. i think this would be allowed if i only return the method/class signatures rather than method bodies, but this may limit usefulness

#

its also byo model, and even when using paid openrouter models, like the qwen family, an index of a whole version takes about 5-10 minutes, and costs like 20 cents. and retrieval is basically free

wind topaz
#

i see

velvet jewel
#

the embedding and reranking models can be easily local for retrieval

#

doing indexing locally is pretty inefficient because it uses LLMs and because of the sheer amount of embedding required. it's around 14 million tokens worth. but for retrieval, an embedding will take ~1 second even locally and reranking will take a couple of seconds

wind topaz
#

@heady onyx saw your reaction and to make it clear
I don't want a LLM to write code for me, but to assist me in development

#

Which is the actual point of coding LLMs, i don't want a clanker to do my "job"

icy parcel
wind topaz
#

no

fathom sun
#

tbh yeah sometimes having an AI as second read when you don't got a human is so helpful

grand crescent
# velvet jewel RAG gives way better output than bash ever could

Hmm, my limited experience so far is that the model is pretty accurate with bash commands (at least for the way I prompt models when programming), but I am also using Claude for now as my GPU is too weak to run a good enough local model, so it's possible they trained it really well compared to the open weight models

#

I hope within a few years we get a decent enough local model that fits in 8GB of VRAM

#

^ is probably delusional

cyan heath
#

are we gonna get voldethonk into new use?

wind topaz
#

No, it's not Minecraft Coder Pack, it's Model Context Protocol

velvet jewel
cyan heath
wind topaz
#

Lol

icy parcel
upper portal
#

@velvet jewel like one week ago I’ve made my own RAG mcp server on the minecraft source code too lol, what embedding model did you use? I used nomic code embed 7b, took about 26 hours to Index 1.21.11

upper portal
velvet jewel
#

couldn’t find one on openrouter. and my chunks are mixed with non-code

#

and search itself obviously isn’t code

upper portal
velvet jewel
#

very long methods are fed to a normal coding LLM and told to return an array of strings, where each string is a single logical component of the method. then the method is embedded with that string

#

at >100LoC this is like 327 methods

#

i did want to do this with every method but that would’ve taken far too long. the quality would’ve been great though
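A rough sketch of the long-method chunking described above, for anyone following along. This is not the actual implementation: `build_chunks`, `ask_llm`, and the record layout are hypothetical names, and it assumes one reading of the description, namely that each string returned by the LLM becomes its own embedding record pointing back to the method. The 100-line threshold is the one mentioned in the thread.

```python
import json
from typing import Callable

LONG_METHOD_LINES = 100  # threshold from the thread (>100 LoC)


def build_chunks(method_name: str, source: str,
                 ask_llm: Callable[[str], str]) -> list[dict]:
    """Return the records to embed for one method.

    `ask_llm` is a hypothetical callable: it takes a prompt and returns
    the model's raw text, expected to be a JSON array of strings.
    """
    # Short methods are embedded as a single chunk.
    if len(source.splitlines()) <= LONG_METHOD_LINES:
        return [{"method": method_name, "text": source}]

    prompt = (
        "Split this Java method into its logical components. "
        "Reply with a JSON array of strings, one per component.\n\n"
        + source
    )
    components = json.loads(ask_llm(prompt))
    if not isinstance(components, list) or not all(
        isinstance(c, str) for c in components
    ):
        raise ValueError("expected a JSON array of strings from the LLM")

    # Assumption: one record per component, each pointing back to the method.
    return [{"method": method_name, "text": c} for c in components]
```

Passing the model in as a plain callable keeps the sketch independent of whichever backend (local or hosted) produces the text.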

#

i’m also doing tons for retrieval. hypothetical document embedding and alternative queries and reranking

upper portal
velvet jewel
velvet jewel
upper portal
#

but sure im not a RAG expert

velvet jewel
#

neither am i lol

#

look into a reranker though. you should get like top 100 from retrieval and then reranker should give top 10
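The two-stage idea above (wide retrieval, then a reranker narrowing it down) can be sketched roughly like this. Everything here is illustrative: `retrieve_then_rerank` and `rerank_score` are hypothetical names, the vectors are plain lists, and a real setup would use a vector index and an actual reranker model instead of a callable.

```python
import math
from typing import Callable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_then_rerank(
    query_text: str,
    query_vec: Sequence[float],
    docs: Sequence[tuple[str, Sequence[float]]],
    rerank_score: Callable[[str, str], float],
    retrieve_k: int = 100,
    final_k: int = 10,
) -> list[str]:
    """Stage 1: top `retrieve_k` by embedding similarity.
    Stage 2: reorder those with the (hypothetical) reranker, keep `final_k`.
    """
    candidates = sorted(
        docs, key=lambda d: cosine(query_vec, d[1]), reverse=True
    )[:retrieve_k]
    reranked = sorted(
        candidates, key=lambda d: rerank_score(query_text, d[0]), reverse=True
    )
    return [text for text, _ in reranked[:final_k]]
```

The reranker only ever sees the `retrieve_k` candidates, which is why it can afford to be slower per document than the embedding lookup.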

#

and if you’re using pure code embedding, also hypothetical document embedding

#

this is where you give an llm your query and you tell it to generate a hypothetical code snippet of what the result may look like

upper portal
#

ive used a reranker for a note search, it was a simple embedding of my notes and then a reranker, that worked really well, i dont know if there are any code rerankers tho

velvet jewel
#

and search for that too
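A minimal sketch of hypothetical document embedding as described above: ask an LLM for a made-up snippet, then search with both the original query and the snippet. `hyde_search`, `generate_snippet`, `embed`, and `search` are all hypothetical callables standing in for whatever LLM, embedding model, and index are in use; the prompt wording is an assumption too.

```python
from typing import Any, Callable


def hyde_search(
    query: str,
    generate_snippet: Callable[[str], str],
    embed: Callable[[str], Any],
    search: Callable[[Any, int], list],
    k: int = 10,
) -> list:
    """Search with the query and with an LLM-written hypothetical snippet."""
    # Ask the LLM what a matching result might look like.
    snippet = generate_snippet(
        "Write a short Java snippet that could plausibly answer: " + query
    )

    merged: list = []
    # Search for the raw query first, then for the hypothetical snippet,
    # keeping first-seen order and dropping duplicate hits.
    for text in (query, snippet):
        for hit in search(embed(text), k):
            if hit not in merged:
                merged.append(hit)
    return merged
```

The merged list can then be handed to a reranker, since the two searches may return more than `k` hits between them.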

velvet jewel
upper portal
#

oh that makes sense, but expensive, which hard queries did you use to test yours?

velvet jewel
#

i haven’t yet as i can’t find a good piece of local software to run a reranker lol

#

that has rocm support

upper portal
#

i use lmstudio but yeah they cant run rerankers, i dont even know anymore what i used to run the reranker lol

velvet jewel
#

there’s one by hugging face but it’s cuda only

#

i’m on amd gpu

upper portal
velvet jewel
#

it does reranking as well

upper portal
#

Oh cool tysm, ill install it and compare it to LMStudio

wind topaz
#

ok so is it better to use opencode?

#

i'm really new on this llm stuff

wind current
#

yeah its so good

wind topaz
#

well i am working on something else

#

doing it for fun but yeah

#

probably will remove the stonecutter part

fathom sun
#

mmmmmmm

wind topaz
#

definitely not running out of copilot from testing

fathom sun
#

lmao

#

what did you do WTF

wind topaz
#

lot of stuff

#

but let's say i can make a new entity with custom texture from one iteration

fathom sun
#

daaaam

wind topaz
#

lava axolotl texture generated with a python script

#

not the scope i want but trying out what the agents are capable of

fathom sun
#

dam

wind topaz
#

Now i will tweak it so it doesn't make the code for me but actually assist me in a way i have to write the code etc

fathom sun
#

Is there a source or a guide now ? I'd be interested to try stuff with it

velvet jewel
#

its not even public yet

#

because i couldn't figure out any way to get a local reranker model

#

for AMD gpu