#Looking for ideas for editing long HTML, for example making changes to 10 KB of HTML on an 8k-token model.


latent fractal
#

I've seen MemGPT and continuedev, which look promising, and I will delve into them.

My ideas include directing the LLM to use React components to build up and break down the HTML, and then making a separate call for each component.
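A minimal sketch of the "separate call per component" idea without React, assuming the page can be cut at its top-level elements. It uses only Python's standard `html.parser`; the names `TopLevelChunker`, `split_top_level`, and `pack_chunks` are made up for illustration, and the budget is counted in characters as a crude stand-in for tokens.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not increase depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TopLevelChunker(HTMLParser):
    """Re-serializes the input and cuts it after each top-level element."""

    def __init__(self):
        super().__init__(convert_charrefs=False)  # keep &amp; etc. untouched
        self.depth = 0
        self.buf = []
        self.chunks = []

    def _flush_if_top(self):
        if self.depth == 0 and self.buf:
            self.chunks.append("".join(self.buf))
            self.buf = []

    def handle_starttag(self, tag, attrs):
        self.buf.append(self.get_starttag_text())  # raw text of the tag
        if tag in VOID_TAGS:
            self._flush_if_top()
        else:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):  # e.g. <br/>
        self.buf.append(self.get_starttag_text())
        self._flush_if_top()

    def handle_endtag(self, tag):
        self.buf.append(f"</{tag}>")  # end tags are re-emitted normalized
        if tag not in VOID_TAGS:
            self.depth = max(0, self.depth - 1)
        self._flush_if_top()

    def handle_data(self, data):
        self.buf.append(data)  # top-level text rides along with the next chunk

    def handle_entityref(self, name):
        self.buf.append(f"&{name};")

    def handle_charref(self, name):
        self.buf.append(f"&#{name};")

    def handle_comment(self, data):
        self.buf.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.buf.append(f"<!{decl}>")


def split_top_level(html):
    """Return a list of strings, one per top-level element of `html`."""
    parser = TopLevelChunker()
    parser.feed(html)
    parser.close()
    if parser.buf:  # trailing text or unclosed markup becomes a final chunk
        parser.chunks.append("".join(parser.buf))
    return parser.chunks


def pack_chunks(chunks, max_chars):
    """Greedily group consecutive chunks so each batch stays under max_chars.

    A single chunk larger than max_chars still becomes its own batch.
    """
    batches, current = [], ""
    for chunk in chunks:
        if current and len(current) + len(chunk) > max_chars:
            batches.append(current)
            current = ""
        current += chunk
    if current:
        batches.append(current)
    return batches
```

Each batch could then go to the model in its own call and the edited batches be concatenated back in order. This does nothing about chunks that depend on markup outside themselves, and an oversized top-level element would need the same split applied one level deeper.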

Ideally open source solutions!

Can use any LLM but currently Llama 70B is what I am using.

Any ideas are much appreciated. Even just name-dropping a git repo.

cosmic fox
#

This is gonna depend heavily on whether the objects/chunks you are having an LLM modify are coherent enough to contain everything it needs within a single chunk. 10 KB is gonna be too much to edit at once, hence your post, but you are also going to have issues where you cannot properly chunk out the HTML file: there will be necessary code ahead of and behind a chunk that matters to the entire structure of the page.

even using react and such, you're gonna hit walls where there won't be a good enough way to chunk the code out and modify it. this is a context window/compute problem, and not one that's been solved as of today. HTML would be a bit easier, i would venture, but you would still hit the wall in some fashion.

latent fractal
cosmic fox
#

you can take a research approach and try something similar to a ViT (Vision Transformer): chunk the html out, then assign tokens broadly to chunks. i would need more weed to think more about this, i'll respond tomorrow.

#

that would require a massive amount of code from many complete systems. you would chunk their code out into tokens and use a transformer to provide the overall framework, and a 2nd model could then work within the tokens/chunks themselves

latent fractal
#

I am just thinking that while I may go down this rabbit hole, and it would be very fun, I will probably try the more traditional approaches first. One reason is that I am using Groq, and I don't think they offer fine-tuning. Another is cost and time.

I have been thinking about what you said though.

I wonder if you could convert HTML to something like BSON or Protobuf, or craft a binary format, and then those are your pre-tokens (a word I made up)! Make it "forgiving" so that a bad binary is still somewhat deserializable.

Then tokenize the binary, using stats from the training data.
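One way to read "tokenize the binary using stats from the training data" is a byte-pair-encoding style merge: count adjacent byte pairs across a corpus and give the most frequent pair its own new token ID. The sketch below is a swapped-in, simplified take on that technique and not something settled in this thread; `learn_merges` and the other names are hypothetical.

```python
from collections import Counter


def most_common_pair(seqs):
    """Return the most frequent adjacent pair across all sequences, or None."""
    counts = Counter()
    for seq in seqs:
        counts.update(zip(seq, seq[1:]))
    return counts.most_common(1)[0][0] if counts else None


def merge_pair(seq, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `seq` with `new_id`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_merges(corpus, num_merges):
    """Learn up to `num_merges` pair merges from a list of bytes objects.

    Returns (merges, seqs): a dict mapping each merged pair to its new token
    ID (starting at 256, just above the byte range) and the re-tokenized corpus.
    """
    seqs = [list(blob) for blob in corpus]
    merges = {}
    next_id = 256
    for _ in range(num_merges):
        pair = most_common_pair(seqs)
        if pair is None:
            break
        merges[pair] = next_id
        seqs = [merge_pair(seq, pair, next_id) for seq in seqs]
        next_id += 1
    return merges, seqs
```

With a binary HTML encoding as input, frequent structures (say, an open-list byte followed by an open-item byte) would end up as single tokens after enough merges.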

latent fractal
#

There are 142 tags, so you could encode them all in 8 bits. Then maybe encode those 142 plus common ASCII characters in the values up to 240, with slots 241..255 reserved to signify meaning, e.g. 241 means a null-terminated UTF-8 string will follow, and 242 is for a bunch of chunks.

More detail needed around attributes but that is the basic idea.
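A rough sketch of that byte layout, under several assumptions: the tag table is a tiny illustrative subset rather than the full list, 241 is the null-terminated UTF-8 string marker from the message above, and 243 as a "close tag" marker is invented here because the message does not say how elements end. Attributes are left out, as above. The decoder is deliberately "forgiving": unknown bytes are skipped and an unterminated string runs to the end of the data.

```python
# Illustrative subset only; the real table would list every HTML tag name.
TAGS = ["html", "head", "body", "div", "p", "span", "a", "ul", "li", "h1"]
TAG_TO_ID = {name: i for i, name in enumerate(TAGS)}

UTF8_STRING = 241  # from the thread: a null-terminated UTF-8 string follows
CLOSE_TAG = 243    # assumption: not specified in the thread


def encode(events):
    """Encode ("open", tag), ("text", str), ("close", None) events to bytes."""
    out = bytearray()
    for kind, value in events:
        if kind == "open":
            out.append(TAG_TO_ID[value])
        elif kind == "close":
            out.append(CLOSE_TAG)
        elif kind == "text":
            raw = value.encode("utf-8")
            if b"\x00" in raw:
                raise ValueError("text may not contain a null byte")
            out.append(UTF8_STRING)
            out += raw
            out.append(0)  # null terminator
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return bytes(out)


def decode(data):
    """Decode bytes back to events, tolerating damaged input."""
    events, i = [], 0
    while i < len(data):
        byte = data[i]
        if byte == UTF8_STRING:
            end = data.find(b"\x00", i + 1)
            if end == -1:          # forgiving: unterminated string
                end = len(data)
            text = data[i + 1:end].decode("utf-8", errors="replace")
            events.append(("text", text))
            i = end + 1
        elif byte == CLOSE_TAG:
            events.append(("close", None))
            i += 1
        elif byte < len(TAGS):
            events.append(("open", TAGS[byte]))
            i += 1
        else:                      # forgiving: skip bytes with no meaning yet
            i += 1
    return events
```

For example, `<div><p>héllo</p></div>` would be the events open div, open p, text, close, close, which this layout packs into 12 bytes versus 24 for the UTF-8 markup.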

cosmic fox
#

i recommend reading that. a big problem is just the compute required to pull this off. my apologies, i haven't responded. you would need to create sequences that start with a special token to signify chunks/objects, then function/method naming conventions would follow and make use of the brackets. you can build sequences of tokens like that, but i am not well versed enough in this area to provide anything more meaningful.

what you are describing is just a janky form of tokenization. think about the broader picture of how you would start/stop chunks and then fill in the blanks with additional tokens that represent valid strings of code and so forth

#

something like:

[OBJECT-START-TOKEN][KEYWORD(DEF)-FUNCTION NAME-(PARAMETERS)] [:-TOKEN] [TOKENS THAT LEAD TO FURTHER STRINGS OF CODE] [YADA YADA] [YADA] [RETURN-TOKEN] [RETURN-TOKEN + EXPRESSION-TOKEN] [OBJECT-END]

#

if you thought about it in terms of python

#

that is a janky example above but if you used a naming convention that followed single characters and stacked them from there, you could potentially get fairly far
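To make the bracketed sequence above concrete for Python, here is a small sketch that uses the standard `tokenize` module to flatten a function into structural tokens wrapped in start/end markers. The token spellings (`[KW:...]`, `[NAME:...]`, `[OBJECT-START]`, and so on) and the function name `structural_tokens` are invented for illustration; a real vocabulary would more likely use compact IDs than strings.

```python
import io
import keyword
import tokenize


def structural_tokens(source):
    """Flatten Python source into a list of bracketed structural tokens."""
    out = ["[OBJECT-START]"]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            # Keywords (def, return, ...) get their own class of token.
            kind = "KW" if keyword.iskeyword(tok.string) else "NAME"
            out.append(f"[{kind}:{tok.string}]")
        elif tok.type == tokenize.OP:
            out.append(f"[OP:{tok.string}]")
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            out.append(f"[LIT:{tok.string}]")
        elif tok.type == tokenize.INDENT:
            out.append("[INDENT]")
        elif tok.type == tokenize.DEDENT:
            out.append("[DEDENT]")
        # Newlines, comments, and the end marker are dropped in this sketch.
    out.append("[OBJECT-END]")
    return out
```

Calling it on `"def add(a, b):\n    return a + b\n"` gives a sequence that starts `[OBJECT-START] [KW:def] [NAME:add] [OP:(]` and ends `[KW:return] [NAME:a] [OP:+] [NAME:b] [DEDENT] [OBJECT-END]`, which lines up with the start token, keyword, name, parameters, body, return, end token shape.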