#Translation bot

gloomy laurel
#

What would you use as a translation service?

tawny light
#

Two options, DeepL or a Mac Mini running the Meta NLLB model.

#

Well there’s plenty more options, but alas.

subtle gorge
#

DeepL API isn't free iirc

tawny light
#

That’s correct, I currently run a free little service that runs on it. The cheaper option would be to run a self-hosted model as it isn’t subject to third-party fees. You’re just paying for electricity.

blazing knoll
#

it might even be possible to host the model on the same instance that hosts Nostra

tawny light
#

Depends how powerful it is, would be best to have something with an okay GPU so that it could translate relatively quickly.

#

I'm coming up with a POC right now.

blazing knoll
#

idk if things have changed with modern models, but iirc unless you wanna run big batches of translations, it's totally fine to run inference on CPUs

gloomy laurel
#

I would be okay with that as long as I host the code on my VPS, for security concerns

blazing knoll
#

might be able to help

tawny light
#

@blazing knoll

CPU vs. GPU, respectively.

{
    "translated_text": "Hello, what has prevented us so far from adding this kind of functionality is not the lack of bot but rather because we would not want this functionality. It would clutter the showrooms and introduce English to the \"French only\" showrooms. It is not a very big effort to copy and paste in an automatic translator",
    "translation_time": 11.155297994613647
}
#
{
    "translated_text": "Hello, what has prevented us so far from adding this kind of functionality is not the lack of bot but rather because we would not want this functionality. It would clutter the showrooms and introduce English to the \"French only\" showrooms. It is not a very big effort to copy and paste in an automatic translator",
    "translation_time": 1.233346939086914
}
#

This uses facebook/nllb-200-distilled-1.3B, which in my testing gave the best performance and the most accurate translations.

#

CPU is a 10700K, GPU is an RTX3070.
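A rough sketch of how timings like the two results above could be produced. This is not the POC's actual code: `timed_translation` and `load_nllb` are hypothetical names, the French-to-English language codes are an assumption based on the sample output, and `load_nllb` assumes the Hugging Face `transformers` package is installed.

```python
import time
from typing import Callable, Dict, Union


def timed_translation(
    translate: Callable[[str], str], text: str
) -> Dict[str, Union[str, float]]:
    """Run `translate` on `text` and report the result with wall-clock seconds."""
    start = time.perf_counter()
    translated = translate(text)
    elapsed = time.perf_counter() - start
    return {"translated_text": translated, "translation_time": elapsed}


def load_nllb(device: int = -1) -> Callable[[str], str]:
    """Build a French-to-English translator around the NLLB checkpoint.

    Assumption: `transformers` is installed. device=-1 selects the CPU,
    device=0 selects the first CUDA GPU.
    """
    from transformers import pipeline  # imported lazily; only needed for real runs

    translator = pipeline(
        "translation",
        model="facebook/nllb-200-distilled-1.3B",
        src_lang="fra_Latn",  # assumed source language
        tgt_lang="eng_Latn",  # assumed target language
        device=device,
    )
    return lambda text: translator(text, max_length=400)[0]["translation_text"]
```

Calling `timed_translation(load_nllb(-1), message)` and then `timed_translation(load_nllb(0), message)` would give one CPU and one GPU measurement for the same message. The first call on each device also pays for loading the model, so a warm-up call before timing is probably worthwhile.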

blazing knoll
#

damn looks like model sizes have increased dramatically since I last messed with NLP models

tawny light
#

This one is certainly quite meaty.

blazing knoll
#

ok welp i don't think hosting the model is a practical solution

#

i just realised that DeepL's API is pretty cheap, hosting a model will most likely end up costing more

tawny light
#

I'll sponsor the first $100 of translations so we can test the water, if we want.

#

Honestly that'll probably last a year. 😆

#

Probably more

tawny light
#

Here is a POC built with the NLLB model

#

but could easily be adapted not to use it

blazing knoll
#

just for reference, DeepL Pro costs 7,49 $CA per month 👀

tawny light
#

API has a cost-per-character attached to it.

#

So the flat rate + cost-per-char
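The "flat rate + cost-per-char" structure can be sketched as a small calculator. Every number you pass in is your own estimate; nothing here encodes DeepL's actual prices, and the function and parameter names are hypothetical.

```python
def monthly_api_cost(
    characters: int, flat_fee: float, price_per_million_chars: float
) -> float:
    """Flat monthly fee plus a usage charge proportional to characters sent."""
    return flat_fee + characters / 1_000_000 * price_per_million_chars


def months_covered(
    budget: float,
    characters_per_month: int,
    flat_fee: float,
    price_per_million_chars: float,
) -> float:
    """How many months a fixed budget lasts at a steady monthly volume."""
    monthly = monthly_api_cost(characters_per_month, flat_fee, price_per_million_chars)
    if monthly <= 0:
        raise ValueError("monthly cost must be positive")
    return budget / monthly
```

Plugging in the current plan's fee, its per-character rate, and a guess at the server's monthly character volume shows how long a sponsored budget such as the $100 mentioned earlier would stretch.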

blazing knoll
#

ah shit... I didn't notice that

tawny light
#

It supports both DeepL and an in-house translation model.

#

If the DEEPL_API_KEY environment variable is set, it will use DeepL.

#

Otherwise it will try to use the Python service hosted at whatever URI you specify.
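A minimal sketch of that selection rule, written in Python for illustration rather than copied from the bot. `DEEPL_API_KEY` comes from the message above; `TRANSLATION_SERVICE_URI` and `choose_backend` are hypothetical names, since the real variable holding the service URI isn't stated here.

```python
import os
from typing import Mapping, Optional, Tuple


def choose_backend(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (backend name, credential or URI), preferring DeepL when a key is set."""
    env = os.environ if env is None else env

    api_key = env.get("DEEPL_API_KEY", "")
    if api_key:
        return ("deepl", api_key)

    # Hypothetical variable name for the self-hosted Python service.
    uri = env.get("TRANSLATION_SERVICE_URI", "")
    if uri:
        return ("self-hosted", uri)

    raise RuntimeError("No translation backend configured")
```

Passing the environment in as a mapping keeps the rule easy to test without touching the real process environment.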

#

Built this quick message store as well: if multiple users translate the same message within a 10-minute period, it is only translated once and everyone gets that original translation. If users request it in quick succession, it is translated once and the translation is delivered to all of them at the same time.

If the message is edited and translated again, the translation runs again, since the cache is keyed on the MD5 hash of the message content.

https://github.com/ridafkih/francois/blob/main/src/utils/message-store.ts
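The idea behind the message store can be sketched as follows. This is an illustration in Python, not a port of the linked TypeScript file: `TranslationCache` and its methods are hypothetical names, and the injectable `clock` exists only to make expiry easy to exercise.

```python
import asyncio
import hashlib
import time
from typing import Awaitable, Callable, Dict, Tuple

TTL_SECONDS = 10 * 60  # the 10-minute window described above


class TranslationCache:
    """Cache translations by MD5 of the content and share in-flight requests."""

    def __init__(
        self,
        translate: Callable[[str], Awaitable[str]],
        ttl: float = TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._translate = translate
        self._ttl = ttl
        self._clock = clock
        # key -> (time the translation was started, task producing the result)
        self._entries: Dict[str, Tuple[float, "asyncio.Task[str]"]] = {}

    @staticmethod
    def key_for(content: str) -> str:
        # Edited content hashes differently, so it is translated afresh.
        return hashlib.md5(content.encode("utf-8")).hexdigest()

    async def get(self, content: str) -> str:
        key = self.key_for(content)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            # Finished or still running: every caller awaits the same task.
            return await entry[1]

        task = asyncio.create_task(self._translate(content))
        self._entries[key] = (now, task)
        try:
            return await task
        except Exception:
            # Do not keep a failed translation around for the whole window.
            current = self._entries.get(key)
            if current is not None and current[1] is task:
                del self._entries[key]
            raise
```

Storing the task rather than the finished string is what lets simultaneous requests share one translation: late callers simply await the task that is already running. Expired entries are replaced on the next request but never swept, so a long-running bot would likely want periodic cleanup as well.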

#

The Python service also has a language classifier built in, which is a much more lightweight model. This could be used to avoid wasting characters trying to translate non-French messages.
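One way that gate could look, as a sketch only. `translate_if_french` is a hypothetical helper, the classifier and translator are passed in as plain callables, and the `"fr"` label is an assumption about what the classifier returns.

```python
from typing import Callable, Optional


def translate_if_french(
    text: str,
    detect_language: Callable[[str], str],
    translate: Callable[[str], str],
    source_language: str = "fr",  # assumed label format of the classifier
) -> Optional[str]:
    """Translate only when the cheap classifier says the text is French."""
    if not text.strip():
        return None  # nothing worth classifying or translating
    if detect_language(text) != source_language:
        return None  # skip the paid or slow translation call entirely
    return translate(text)
```

Returning `None` for skipped messages lets the caller decide whether to stay silent or tell the user the message wasn't detected as French.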

blazing knoll
#

hey great stuff 👀

#

maybe we could try running it with a free API key

#

on this server i mean

#

I don't mean to criticise the code or anything, but wouldn't it make more sense for the translation service to make use of the message store instead of the other way around?

tawny light
#

It kind of just ended up that way.

#

I can change it later.

#

I will either change it entirely or just change it to « translationStore » instead and change the method names. 🤷‍♂️