#gemma-language-tuning | Kaggle | Page 1

rocky bolt Oct 7, 2024, 10:15 AM

#

Hey everyone.

#

Very cool competition!

#

I wanted to ask whether we could fine-tune for a language that's not on the list.

cobalt sable Oct 10, 2024, 7:11 AM

#

Hey anyone wants to form a team with me?

pale iris Oct 10, 2024, 3:13 PM

#

cobalt sable Hey anyone wants to form a team with me?

I'm a beginner if you don't mind

icy glade Oct 11, 2024, 8:29 PM

#

Hi. This is the first Kaggle competition I'm enrolling in, but I've been really fascinated with translation, especially since my Filipino/Tagalog is conversational at best and translation tools have been lacking for Filipino. I've dabbled in fine-tuning before, but this contest brings a new interest for me.

I can do this contest by myself, but I can definitely see where working on a team would also be useful. I'm building the dataset right now and fine-tuning in parallel. I've got some fine-tunes already and they're showing promising results. Still working on more data to improve. Let me know if you're interested in teaming up and what you can bring to the table.

rocky bolt Oct 16, 2024, 3:39 AM

#

rocky bolt I wanted to ask whether we could fine-tune a language that's not there in the li...

@amber knot so sorry for the ping, but is this invalid?

amber knot Oct 16, 2024, 4:12 AM

#

it's probably best to ask in the competition forum so that the host is able to answer directly (unless someone else here knows)

rocky bolt Oct 16, 2024, 4:37 AM

#

amber knot it's probably best to ask in the competition forum so that the host is able to a...

Thank you Meg. I'll try that.

bitter coral Oct 16, 2024, 9:08 PM

#

i see 2b and 9b models in the models tab at https://www.kaggle.com/competitions/gemma-language-tuning/models
should we use just 2b/9b or can we use 27b as well ?

Google - Unlock Global Communication with Gemma

Create Gemma model variants for a specific language or unique cultural aspect

#

we have 2 spots on the team , will be doing it over telugu/hindi depending on other 2 , feel free to reach out

pliant obsidian Oct 20, 2024, 12:23 AM

#

Hi, I'm a 16 year old who's been doing computer vision and llm projects/competitions for a bit now and am just getting started with finetuning Gemma for Mandarin and Chinese culture. I would love to work along with others for this competition and learn from how different people approach ml competitions. If anyone's interested in teaming up please let me know!
Also, I'm fine with switching to a different preferred language, but mandarin would be my top choice.

inland sleet Oct 20, 2024, 2:32 PM

#

Hi boys, I'm working with a friend on an English-to-Spanish fine-tune. Does anyone want to join in?

tiny snow Oct 21, 2024, 3:11 AM

#

I'm from Vietnam and working on English <=> Vietnamese finetune, wanna join? Things we are doing https://huggingface.co/Symato

tribal bone Oct 21, 2024, 1:56 PM

#

Can a Gemma model fine-tuned for math be submitted?

inland sleet Oct 23, 2024, 11:32 AM

#

Which of the Gemma models do we need to use: 2B, 9B, or 27B?

inland sleet Oct 23, 2024, 12:27 PM

#

If someone could help, it would be great

cinder violet Oct 23, 2024, 11:51 PM

#

inland sleet Which of the Gemma models we need to use,2B, 9B or 27 B?

whichever is more applicable to the use case you wish to demonstrate.

warped snow Oct 24, 2024, 3:55 AM

#

tribal bone will gemma fine tuned for math can be submitted?

I have this question too

tribal bone Oct 24, 2024, 3:56 AM

#

I did not receive any answer to this

inland sleet Oct 24, 2024, 8:15 AM

#

cinder violet whichever is more applicable to the usecase you wish to demonstrate.

Got you, so there's no issue with using the larger models then?

final comet Oct 28, 2024, 7:52 AM

#

I was also thinking along the lines of a language not on the list. My native language is Chibemba, if I can get someone interested.

rocky bolt Oct 29, 2024, 3:55 AM

#

final comet I was also thinking along the line of the language not on the list. My native la...

I asked about this in the Kaggle discussions section of the competition but got no answer. I'm also working on a language that hasn't been fine-tuned before.

warped snow Oct 29, 2024, 9:00 AM

#

How do I authenticate when using Colab? I already accepted the consent on the Kaggle Gemma model page.

final comet Oct 29, 2024, 9:11 AM

#

rocky bolt I asked regarding this in the Kaggle discussions section of the Competition, but...

Thanks for the heads up. I hope it is allowed!!

tiny snow Oct 30, 2024, 3:14 AM

#

rocky bolt I asked regarding this in the Kaggle discussions section of the Competition, but...

I think any language is important and the method can work across languages, so even if people aren't interested in your language (just because they don't know it yet), they may well be interested in your methods

#

Language-specific problems are also interesting; how a tokenization method affects languages differently is one example.
I'm quite curious, since the most popular BPE is made for English and similar languages
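
A rough way to see the tokenization point: byte-level schemes start from UTF-8 bytes, and scripts differ in how many bytes each character needs. The sketch below only measures that starting cost. It is not Gemma's tokenizer, and `utf8_bytes_per_char` and the sample words are hypothetical choices made for illustration.

```python
import unicodedata


def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per code point after NFC normalization."""
    normalized = unicodedata.normalize("NFC", text)
    if not normalized:
        raise ValueError("text must not be empty")
    return len(normalized.encode("utf-8")) / len(normalized)


# Sample words are illustrative only, not taken from any competition dataset.
samples = {
    "English": "hello",
    "Vietnamese": "Vi\u1ec7t",
    "Hindi": "\u0928\u092e\u0938\u094d\u0924\u0947",
    "Chinese": "\u4f60\u597d",
}

for language, word in samples.items():
    print(f"{language}: {utf8_bytes_per_char(word):.2f} bytes per character")
```

A higher ratio means a byte-level tokenizer has more raw units to merge before it reaches word-sized tokens, which is one reason the same method can behave differently across languages.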

vague spindle Nov 1, 2024, 4:46 PM

#

Hello, i'm new here, i need to join one team

plain elk Nov 3, 2024, 9:58 AM

#

@warped snow refer this notebook for authentication by gemma, pinned in competition code

warped snow Nov 3, 2024, 10:08 AM

#

plain elk @warped snow refer this notebook for authentication by gemma, pinned in...

It's working fine in a Kaggle notebook, but I get authentication issues when I try to use it in a Colab notebook
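
One possible cause, as a sketch rather than a confirmed fix: outside Kaggle the credentials are not injected automatically, and the Kaggle API client reads them from the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables. The helper name `set_kaggle_credentials` and the Colab secret names in the comment are assumptions.

```python
import os


def set_kaggle_credentials(username, key, env=None):
    """Copy Kaggle API credentials into environment variables.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    if not username or not key:
        raise ValueError("both a Kaggle username and an API key are required")
    env["KAGGLE_USERNAME"] = username
    env["KAGGLE_KEY"] = key
    return env


# In Colab the values could come from the Secrets panel, assuming secrets
# with these (user-chosen) names have been created there:
#
#   from google.colab import userdata
#   set_kaggle_credentials(userdata.get("KAGGLE_USERNAME"),
#                          userdata.get("KAGGLE_KEY"))
```

This needs to run before the model is downloaded, and the API key comes from the Kaggle account settings page.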

plain elk Nov 3, 2024, 10:10 AM

#

Then refer to any random YouTube video

gilded ermine Nov 6, 2024, 3:51 AM

#

I was wondering about some guidance to compete because this is my first time in such a competition

livid scarab Nov 8, 2024, 2:40 AM

#

Hi everyone ,

I’m currently working with the Gemma Instruct 2B EN model from keras_nlp and have been fine-tuning it using LoRA. However, I’m curious about other fine-tuning techniques that could be effective with this model. If anyone has experience with raw, hands-on code or can point me to resources or repositories where alternative methods (like full fine-tuning, adapter layers, etc.) are used with this model, it would be really helpful.

Any code examples or practical insights are greatly appreciated. Thanks in advance for your help!

tiny snow Nov 9, 2024, 11:22 AM

#

livid scarab Hi everyone , I’m currently working with the Gemma Instruct 2B EN model from ke...

some methods I know:

- full fine-tune (2b can fit on consumer GPUs, 24 GB VRAM for example)
- block expansion https://arxiv.org/abs/2401.02415: only fine-tune some newly added layers, freeze the rest
- mix of them: some layers frozen, some use LoRA, some use full fine-tune ...
- https://huggingface.co/docs/peft/developer_guides/lora#memory-efficient-layer-replication-with-lora is nice too ...
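
To make the block expansion idea concrete, here is a minimal sketch of the layer bookkeeping only, assuming one copied block is added on top of each group and only the copies are trained. The function name `block_expansion_plan` is hypothetical, and this builds no model; copying weights and initializing the copies so they start as identity mappings is left out.

```python
def block_expansion_plan(num_blocks: int, num_groups: int):
    """Return a list of (source_block_index, trainable) for the expanded stack.

    The original blocks are split into equal groups. After the last block of
    each group a copy of that block is inserted and marked trainable, while
    every original block stays frozen.
    """
    if num_groups <= 0 or num_blocks % num_groups != 0:
        raise ValueError("num_blocks must be a positive multiple of num_groups")
    group_size = num_blocks // num_groups
    plan = []
    for index in range(num_blocks):
        plan.append((index, False))      # original block, frozen
        if (index + 1) % group_size == 0:
            plan.append((index, True))   # newly added copy, trainable
    return plan


# Toy example: 4 blocks in 2 groups becomes a 6-block stack.
for position, (source, trainable) in enumerate(block_expansion_plan(4, 2)):
    state = "train" if trainable else "frozen"
    print(f"position {position}: copy of block {source} ({state})")
```

The "mix" option in the list is the same kind of plan with more states per layer (frozen, LoRA, or fully trainable).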

livid scarab Nov 12, 2024, 12:03 PM

#

tiny snow some methods I know: - full finetune (2b can fit consumer gpus, 24g vram for exa...

Thank you.. I really appreciate your help.

serene knoll Nov 13, 2024, 1:29 AM

#

Hi everyone, is anyone working on languages such as Urdu, Sindhi, or Balochi in the NLP domain? I want to connect, let me know in DM

rigid creek Nov 15, 2024, 4:04 PM

#

hi, are different Arabic dialects allowed in this contest or just Modern Standard Arabic?

solemn plover Nov 29, 2024, 6:20 PM

#

hey guys

fickle raven Nov 30, 2024, 1:13 PM

#

https://www.kaggle.com/datasets/mohamedramadan2040/arabic-customer-reviews

Arabic Egypt Customer Reviews

Arabic Customer Reviews Dataset: Sentiment and Company Feedback

carmine dawn Dec 2, 2024, 10:49 AM

#

Are we allowed to work on fine-tuning for programming languages?

#

like c/c++ /python etc

glacial mist Dec 3, 2024, 12:15 PM

#

hey 👋🏻 anyone out there looking for a teammate for this competition !

coarse canopy Dec 5, 2024, 7:35 PM

#

Hello, is it possible to use Gemma models to create a sequence to sequence model?

snow obsidian Dec 9, 2024, 3:26 AM

#

I fine-tuned gemma on brain rot language for fun - https://www.kaggle.com/code/shreeshabhat1004/finetuning-gemma-on-brain-rot-language-for-fun

Finetuning Gemma on Brain-Rot language for fun

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

quiet citrus Dec 9, 2024, 11:14 AM

#

snow obsidian I fine-tuned gemma on brain rot language for fun - https://www.kaggle.com/code/s...

Love the idea, but it sorta answers everything with ohio sigma rizz vibes? (Tbf, i do as well, but is that the behavior u want?)

snow obsidian Dec 9, 2024, 11:16 AM

#

quiet citrus Love the idea, but it sorta answers everything with ohio sigma rizz vibes? (Tbf,...

it's actually because I fine-tuned it on a very small dataset, and the dataset happens to contain a whole lot of Ohio sigma rizz... Or maybe if I change the output template it's gonna give out more than Ohio skibidi rizz

quiet citrus Dec 9, 2024, 11:17 AM

#

yeah :)) consider filtering and using the 4chan or reddit dataset, im sure youll find the brainrottest texts :()

snow obsidian Dec 9, 2024, 11:56 AM

#

quiet citrus yeah :)) consider filtering and using the 4chan or reddit dataset, im sure youll...

sure :😁

hybrid urchin Dec 9, 2024, 3:12 PM

#

https://www.kaggle.com/code/shreeshabhat1004/finetuning-gemma-on-brain-rot-language-for-fun

#

lol

carmine dawn Dec 15, 2024, 2:21 AM

#

Hello guys, I wanted to ask whether Sanskrit is one of the allowed languages. I see a lot of entries that fine-tune on Sanskrit, but it was not on the list

quiet citrus Dec 18, 2024, 10:52 AM

#

snow obsidian sure :😁

Hi just wanna ask how much gpu vram does it take to finetune gemma 2? Just so i can gauge resources needed

#

Im seeing 16GB for 2B params??!
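
A back-of-the-envelope sketch for the VRAM question: weights, gradients, and optimizer state each cost some bytes per parameter. The parameter count (roughly 2.6 billion for Gemma 2 2B) and the byte sizes in each scenario are assumptions. Activations, batch size, and framework overhead are ignored, so real usage will be higher.

```python
GIB = 2**30


def estimate_training_gib(n_params, weight_bytes, grad_bytes, optimizer_bytes):
    """Memory for weights + gradients + optimizer state, in GiB.

    Each *_bytes argument is the cost per parameter. Activations are ignored.
    """
    per_param = weight_bytes + grad_bytes + optimizer_bytes
    return n_params * per_param / GIB


# Assumed parameter count for Gemma 2 2B (approximate).
N_PARAMS = 2.6e9

# (weights, gradients, optimizer state) in bytes per parameter; all assumed.
scenarios = {
    "full fine-tune, fp32 + Adam": (4, 4, 8),
    "full fine-tune, bf16 + fp32 Adam moments": (2, 2, 8),
    "frozen bf16 base (LoRA adapter not counted)": (2, 0, 0),
}

for name, (w, g, o) in scenarios.items():
    print(f"{name}: ~{estimate_training_gib(N_PARAMS, w, g, o):.1f} GiB")
```

Under these assumptions a full fine-tune needs several times the memory of the weights alone, which would explain a large figure for a 2B model, while LoRA mostly pays for the frozen weights plus activations.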

graceful spire Dec 19, 2024, 11:10 AM

#

livid scarab Hi everyone , I’m currently working with the Gemma Instruct 2B EN model from ke...

I am trying to do the same thing but keep getting a "resources ran out" error

carmine dawn Dec 19, 2024, 4:45 PM

#

are we allowed to use any other methods ALONG with finetuning ? Like RAGs or pretraining

quiet citrus Dec 25, 2024, 5:00 AM

#

carmine dawn are we allowed to use any other methods ALONG with finetuning ? Like RAGs or pre...

Pretraining im not sure, that would be quite insane. But okay on the RAG

quiet citrus Dec 25, 2024, 5:04 AM

#

quiet citrus Pretraining im not sure, that would be quite insane. But okay on the RAG

But ya know, if u pretrain on a pretrained model, you're finetuning, so who cares lmao

ornate maple Dec 25, 2024, 5:05 PM

#

Hi, do we need to use Gemma from the models here https://www.kaggle.com/models/google/gemma-2/PyTorch/gemma-2-9b-pt or can we also use the Unsloth Gemma model?

Google | Gemma 2 | Kaggle

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.

opal vortex Dec 26, 2024, 7:03 PM

#

ornate maple Hi Do we need to use Gemma from models here https://www.kaggle.com/models/google...

Same question !!!!

rocky bolt Dec 30, 2024, 1:09 PM

#

Guys like how long did it take you to finish fine-tuning your model?

snow obsidian Jan 8, 2025, 5:23 AM

#

rocky bolt Guys like how long did it take you to finish fine-tuning your model?

mine took around 6 hours, the dataset was around 6 mb

rocky bolt Jan 8, 2025, 6:07 AM

#

Ah I see thanks man

warped snow Jan 9, 2025, 9:42 AM

#

Anyone tried multi dialect or multi lingual fine tuning ?

warped snow Jan 9, 2025, 2:44 PM

#

Has anyone tried fine-tuning on books?

quiet citrus Jan 10, 2025, 8:42 AM

#

Hi, does anyone know how to unload and reload a LoRA in Keras?

snow obsidian Jan 10, 2025, 11:38 AM

#

quiet citrus Hi anyone knows how to unload and reload a lora in Keras?

your_model.backbone.enable_lora(rank=lora_rank) enables lora

snow obsidian Jan 10, 2025, 11:40 AM

#

quiet citrus Hi anyone knows how to unload and reload a lora in Keras?

did you mean you want to load the lora weights into the model?

#

your_model.backbone.load_lora_weights(filepath) should do it

warped snow Jan 10, 2025, 12:57 PM

#

Is it strictly necessary to use the pretrained Gemma 2B model? Can we take open-source models that are already fine-tuned (and are not entries in the comp) and fine-tune them further for the comp?

quiet citrus Jan 11, 2025, 11:17 AM

#

snow obsidian did you mean you want to load the lora weights into the model?

No, I meant I already called load_lora_weights once. How do I remove those LoRA weights and then apply new ones?
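
One conservative way to do this, as a sketch and not a confirmed Keras recipe: rebuild the base model and load the other adapter into the fresh copy, using only the two backbone methods mentioned above (`enable_lora` and `load_lora_weights`). The helper `swap_lora_adapter`, the file names, and the preset name in the comment are hypothetical, and the rank is assumed to match the one the adapter was saved with.

```python
def swap_lora_adapter(build_base_model, lora_path, rank):
    """Return a freshly built base model with the adapter at `lora_path` loaded.

    `build_base_model` is any zero-argument callable that returns a model
    exposing `.backbone.enable_lora(rank=...)` and
    `.backbone.load_lora_weights(path)`.
    """
    model = build_base_model()                   # start from clean base weights
    model.backbone.enable_lora(rank=rank)        # add empty LoRA layers
    model.backbone.load_lora_weights(lora_path)  # fill them from the file
    return model


# Hypothetical usage with keras_nlp (preset and file names are assumptions):
#
#   import keras_nlp
#   build = lambda: keras_nlp.models.GemmaCausalLM.from_preset(
#       "gemma2_instruct_2b_en")
#   model = swap_lora_adapter(build, "adapter_b.lora.h5", rank=4)
```

Rebuilding costs a model reload each time, but it avoids relying on how repeated `load_lora_weights` calls behave on a model that already carries an adapter.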

rocky bolt Jan 13, 2025, 1:14 PM

#

All the best everyone, I'm not really participating in this competition but I'm looking forward to everything you folks have built.

snow obsidian Jan 14, 2025, 10:00 AM

#

It's the last day today, all the best to all the folks here

snow obsidian Jan 15, 2025, 2:16 AM

#

https://www.kaggle.com/code/shreeshabhat1004/non-binary-inclusive-gemma-finetuning Our submission focuses on inclusive communication with non binary people. We aim to constantly improve it, even after the competition is over. Suggestions or criticisms are most welcome. Thank you!

🌈Non-binary inclusive Gemma Finetuning

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

cloud orbit Jan 15, 2025, 6:47 AM

#

When will the winners be announced?

snow obsidian Jan 15, 2025, 7:05 AM

#

cloud orbit When will winners announce

It will probably take a few weeks for them to assess the notebooks and announce the winners

swift heath Jan 15, 2025, 9:19 AM

#

I have a query

Before the deadline, I made my notebook (https://www.kaggle.com/code/ayeshaimr/gemma-2-urdu-adaptation-a-health-centric-approach) public as well as the datasets used for finetuning and the finetuned model was also published on kaggle models. However on the competition page, in the code and models section, i don't see my notebook and model present (in "your work") because I didn't specifically link them to the competition (I didn't know how to do that or the fact that this was necessary). Is my submission still valid? All other conditions are fulfilled. Just hella worried now because my teammates and I spent a lot of effort and time on it and don't want it invalidated due to this simple reason.

Also if anyone knows how I can attach my submitted notebook (without changes) to the competition now, please let me know.

@modern cove

Gemma 2 Urdu Adaptation: A Health-Centric Approach

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

atomic temple Jan 15, 2025, 9:45 AM

#

You should have added the competition dataset to link the notebook

#

For the model, I don't think that is necessary

#

https://www.kaggle.com/code/williamalabi/lingogemrag My submission focuses on the combined use of LoRA, ReFT and RAG for efficient language model adaptation

LingoGemRAG

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

swift heath Jan 15, 2025, 10:56 AM

#

atomic temple You should have added the competition dataset to link the notebook

There was no dataset though?
I’m just wondering if my submission is still valid :((

halcyon galleon Jan 15, 2025, 11:28 AM

#

I worked on Swahili variations of Gemma 2 and we released a couple of models. I would really appreciate you checking this out and hearing your feedback 🙂
https://www.kaggle.com/code/alfaxadeyembe/introducing-gemma-2-swahili

Introducing Gemma 2 Swahili

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

atomic temple Jan 15, 2025, 11:35 AM

#

swift heath There was no dataset though? I’m just wondering if my submission is still valid...

Yes you still have to add it

atomic temple Jan 15, 2025, 11:36 AM

#

swift heath There was no dataset though? I’m just wondering if my submission is still valid...

Search for the competition name in the datasets section and add it

swift heath Jan 15, 2025, 1:44 PM

#

atomic temple Search for the competition name in the datasets section and add it

thank you! I’ll try that

snow obsidian Jan 15, 2025, 2:03 PM

#

swift heath I have a query Before the deadline, I made my notebook (https://www.kaggle.com/...

your submission should be valid, because the competition description didn't explicitly say the notebook had to be created from the competition page itself

swift heath Jan 15, 2025, 2:04 PM

#

snow obsidian your submission must be valid because they didnt mention explictly in descriptio...

Yeah I hope that is indeed the case!

keen cedar Jan 15, 2025, 6:13 PM

#

When will the winner be announced?

atomic temple Jan 15, 2025, 9:28 PM

#

snow obsidian your submission must be valid because they didnt mention explictly in descriptio...

It's there.

To participate in this competition, you must create and share a public Kaggle Notebook that demonstrates how to use the Gemma model for various languages and/or cultural contexts AND publish your variant to Kaggle models. Your Kaggle Notebook must be made public (along with any underlying data sources) and it should be attached to the official competition dataset.

#

How do you expect them to see your submission? They won't go to your profile to check. Except maybe if you made a discussion post about it

snow obsidian Jan 15, 2025, 11:19 PM

#

atomic temple How do you expect them to see your submission? They won't go to your profile to ...

Oh there was a google form to be submitted, you had to include notebook url and model url in it

atomic temple Jan 16, 2025, 2:34 AM

#

Yeah true

snow obsidian Jan 17, 2025, 11:04 AM

#

did any of you try tuning 9b or 27b models? Just curious

atomic temple Jan 17, 2025, 12:42 PM

#

I did but not on Kaggle

snow obsidian Jan 19, 2025, 9:49 AM

#

atomic temple I did but not on Kaggle

thats nice