#google-tunix-hackathon | Kaggle | Page 1

tidal spruce Nov 12, 2025, 3:26 AM

#

Hi

scenic orchid Nov 12, 2025, 7:09 AM

#

Hello

weak cove Nov 12, 2025, 7:24 AM

#

Hello everyone.

woven parrot Nov 12, 2025, 7:44 AM

#

hello

bitter turtle Nov 12, 2025, 8:44 AM

#

Hello

sand elk Nov 12, 2025, 12:35 PM

#

hello all

zealous dome Nov 12, 2025, 1:36 PM

#

hi

junior roost Nov 12, 2025, 4:18 PM

#

hello everyone

tough dawn Nov 12, 2025, 5:26 PM

#

Hi

primal flame Nov 12, 2025, 6:55 PM

#

Hello

odd parcel Nov 12, 2025, 7:57 PM

#

Hello

marble fiber Nov 12, 2025, 8:15 PM

#

Hello

calm monolith Nov 13, 2025, 2:40 AM

#

Hello all

tranquil mauve Nov 13, 2025, 5:04 AM

#

Hello

#

Who wants to join to team up with me ?

honest scaffold Nov 13, 2025, 7:03 AM

#

hii
anyone open to collab? looking for people who have some experience in fine-tuning LLMs!

bronze crescent Nov 13, 2025, 8:14 AM

#

I need 4 new people for my hackathon team. If anyone’s interested, just DM me!

autumn terrace Nov 13, 2025, 2:29 PM

#

Hello virtual_hug

coral patrol Nov 13, 2025, 5:26 PM

#

What is this hackathon about?

potent crest Nov 13, 2025, 7:21 PM

#

Hey Guys!!! I am new to hackathons and this would be my first one... Does anyone have any tips or just some points to keep in mind for me?

jagged pagoda Nov 13, 2025, 7:37 PM

#

Hello😍

tawdry iris Nov 14, 2025, 1:53 AM

#

potent crest Hey Guys!!! I am new to hackathons and this would be my first one... Does anyone...

You can try to join a team so that others can help you get started.

prisma barn Nov 14, 2025, 3:29 AM

#

potent crest Hey Guys!!! I am new to hackathons and this would be my first one... Does anyone...

There are public starter code notebooks you can use to get an idea of the technical concepts, then expand on them to make your write-up, etc.

graceful merlin Nov 14, 2025, 2:42 PM

#

hello everyone!! i'm new here!!! welcome to team up with me!!!

patent harbor Nov 14, 2025, 5:16 PM

#

hello everyone

vast wigeon Nov 14, 2025, 11:38 PM

#

Did I miss something? When I tried to download the dataset here: https://www.kaggle.com/competitions/google-tunix-hackathon/data it was empty.

#

Also, Hi!

verbal rapids Nov 15, 2025, 2:47 PM

#

Hi everyone

#

I can't figure out what we need to do in this hackathon. As the problem statement says, we need to post-train the LLM using Tunix so that it not only predicts the answer but also shows how the answer was reached. But when I check out the starter code, it already provides reasoning along with the answer, so what are we supposed to do? Please help me, as I am new to this type of hackathon.

unkempt harness Nov 15, 2025, 3:03 PM

#

Hi

ionic elk Nov 16, 2025, 3:52 AM

#

Hi, I tried to click on the gemma starter notebook but it 404'ed. Can someone on the admin team make an update to the correct link? https://github.com/google/tunix/blob/main/examples/grpo_demo.ipynb

golden raft Nov 16, 2025, 7:20 PM

#

hi everyone! will there be any office hours this week?

vivid veldt Nov 16, 2025, 8:14 PM

#

Hello everyone. Is model distillation from non Gemma teachers allowed?

ionic elk Nov 17, 2025, 12:29 AM

#

ionic elk Hi, I tried to click on the gemma starter notebook but it 404'ed. Can someone on...

I found some other example notebooks, just wondering if the demo one was different.

earnest isle Nov 18, 2025, 2:47 AM

#

anyone had any issue running tunix today? yesterday i did a full run, trained a gemma model, got good results like:

Model State              Accuracy
Raw baseline (no GRPO)   45–55%
After v1                 ≈98%

but today even re-running the same notebook i got so many issues with tunix, lora etc that i've been unable to do training
some update of sort maybe and nothing is compatible anymore i don't know, i'm very lost :_:

tawdry iris Nov 18, 2025, 4:22 AM

#

earnest isle anyone had any issue running tunix today? yesterday i did a full run, trained a ...

what error did you get? Are you using the right TPU image?

earnest isle Nov 18, 2025, 4:23 AM

#

i literally trained it with a notebook, then run it again and now always get errors

tawdry iris Nov 18, 2025, 4:23 AM

#

ionic elk Hi, I tried to click on the gemma starter notebook but it 404'ed. Can someone on...

Use the starter notebooks here: https://www.kaggle.com/competitions/google-tunix-hackathon/data

earnest isle Nov 18, 2025, 4:23 AM

#

without having changed anything

#

difference was from yesterday to today

#

so i thought maybe some update or something

tawdry iris Nov 18, 2025, 4:24 AM

#

verbal rapids I can't figure out what we need to do in this hackathon. As the problem statement s...

That starter notebook is only for math reasoning. If you ask a non-math question, it will fail. The goal of this competition is to train a general reasoning model, not just for math.

tawdry iris Nov 18, 2025, 4:25 AM

#

vast wigeon Did I miss something? When I tried to download the dataset here: https://www.kag...

There is no data provided. You have to come up with your own data.

tawdry iris Nov 18, 2025, 4:26 AM

#

earnest isle so i thought maybe some update or something

Can you be specific about the error(s) you are getting? Do the starter notebooks work for you?

earnest isle Nov 18, 2025, 4:27 AM

#

i was using this grpo_demo Gemma2 2B

#

i tried to run this one as well from someone else grpo_dual-stream_tunix

#

but i got the same problems

#

like with the shards and the flax version mismatch

#

and qwix lora and tunix version

#

i tried to downgrade but some wouldn't

#

so environment issues anyway

tawdry iris Nov 18, 2025, 4:43 AM

#

are you pinning your Tunix version like this '!pip install "google-tunix[prod]==0.1.3"'?

earnest isle Nov 18, 2025, 4:50 AM

#

Yes, I pinned Tunix to 0.1.3 and that version is incompatible with QWIX LoRA and that’s why I got the recursion errors.

Tunix 0.1.3 is required for the official GRPO training flow
But 0.1.3 is currently incompatible with QWIX LoRA
So the only working solution is: use 0.1.3 but disable LoRA, but that wasn't viable either

tawdry iris Nov 18, 2025, 4:50 AM

#

I just ran the Gemma3 1B starter notebook and it is fine (you have to get rid of wandb code since there is a known bug https://github.com/wandb/wandb/issues/10872).

earnest isle Nov 18, 2025, 4:51 AM

#

i wasnt using it

#

i disabled it

#

that's how i succeeded yesterday as well

#

but as i said the same notebook yesterday working, today no

#

so that's weird no?

#

anyway can you link me to the starter notebook u just tried

#

and i will start again from there

#

thank you, appreciated

tawdry iris Nov 18, 2025, 4:52 AM

#

https://www.kaggle.com/competitions/google-tunix-hackathon/data

earnest isle Nov 18, 2025, 4:52 AM

#

i only have 2 h tpu left this week lol

tawdry iris Nov 18, 2025, 4:52 AM

#

Always go back to the starter notebook to isolate the problem if something weird happens

earnest isle Nov 18, 2025, 4:52 AM

#

oh ok didnt think about that

#

thank you

#

anyway just as a quick question

#

i was confused because i wasnt sure of how the first training went

#

but basically baseline was like around 45% all 3 results

#

but after training i got like 55% 60% and 98% accuracy

#

with the 2b

#

which is not good because is supposed to be a general enhancement of score right?

#

mostly improved dramatically the format but not the rest

tawdry iris Nov 18, 2025, 4:54 AM

#

55% accuracy isn't bad; LLMs are known to be bad at math

#

98% format accuracy is normal

earnest isle Nov 18, 2025, 4:55 AM

#

ok perfect, yeah i couldn't understand if was abnormally good or just normal

#

thank you again, i'll treasure the tips about going back to the starter notebook!

#

have a good night!

tawdry iris Nov 18, 2025, 4:56 AM

#

you can see the result I got in the starter notebook as a reference

earnest isle Nov 18, 2025, 4:56 AM

#

please yeah

#

where?

earnest isle Nov 18, 2025, 5:17 AM

#

Found! Oh ok my run wasnt bad at all then

#

Hopefully I'll do better if i can make it work again lol

earnest isle Nov 18, 2025, 11:11 AM

#

@tawdry iris thank you now is working, damn i spent so many hours trying to fix it, start from scratch was way easier lol

lucid bison Nov 18, 2025, 6:03 PM

#

Hello everyone

paper pilot Nov 18, 2025, 9:22 PM

#

Hello guys

#

Anyone doing capstone?

warm trench Nov 19, 2025, 2:41 AM

#

hey everyone, i just found out about this and am really interested in joining a team. so pls let me in your team

spice tendon Nov 20, 2025, 2:51 PM

#

I have no credit or debit card, so how i can access Google cloud

upper geyser Nov 23, 2025, 4:30 PM

#

earnest isle @tawdry iris thank you now is working, damn i spent so many hours tryi...

heyy, can u run your notebook for now ?

#

I tried to run the baseline but it did not work

earnest isle Nov 23, 2025, 4:31 PM

#

Yes i had to restart from scratch and worked, couldn't find what was wrong with mine in the end but anyway working now

upper geyser Nov 23, 2025, 4:32 PM

#

earnest isle Yes i had to restart from scratch and worked, couldn't find what was wrong with ...

oh wow, did you try the Gemma 3 1B version ?

#

I just tried this version and it showed me an error telling me that the libraries are conflicting

earnest isle Nov 23, 2025, 4:32 PM

#

Tried both, but with Gemma 3 1B I ran out of hours before any results

#

Back on monday

#

But it was working

upper geyser Nov 23, 2025, 4:34 PM

#

earnest isle But it was working

can you run this cell:

# Policy model
lora_policy = get_lora_model(ref_model, mesh=mesh)
# nnx.display(lora_policy)

#

I tried multiple times but it did not work

earnest isle Nov 23, 2025, 4:34 PM

#

Atm i can't run anything

#

But when i could training was working yes

upper geyser Nov 23, 2025, 4:35 PM

#

yea I will try it again

#

just posted a post in Kaggle about this problem

#

anyway, thank you for your help

earnest isle Nov 23, 2025, 4:35 PM

#

As the guy said, try the basic notebook or restart from there and it works

upper geyser Nov 23, 2025, 4:36 PM

#

I will try the Gemma 2 2B version also

upper geyser Nov 23, 2025, 4:36 PM

#

earnest isle As the guy said, try the basic notebook or restart from there and it works

yup, I tried the notebook from the Data part in the competition

earnest isle Nov 23, 2025, 4:36 PM

#

Yeah when i restart it from there it worked

upper geyser Nov 23, 2025, 4:36 PM

#

did you change anything ?

earnest isle Nov 23, 2025, 4:36 PM

#

Otherwise i had so many libraries issues and versions

upper geyser Nov 23, 2025, 4:36 PM

#

I read you said that you pinned some versions

earnest isle Nov 23, 2025, 4:37 PM

#

The one from the basic notebook are the right ones

upper geyser Nov 23, 2025, 4:37 PM

#

yea, I will try it again

#

thank you so much

#

have a nice day xD

#

I ran the Gemma 2 2B version and it showed the same error again

#

@earnest isle if you run it in the future, can you please let me know if it is successful or not 🥹

earnest isle Nov 23, 2025, 4:51 PM

#

You run the notebook from the competition?

#

Or yours?

upper geyser Nov 23, 2025, 4:52 PM

#

earnest isle You run the notebook from the competition?

I ran the one from the competition

earnest isle Nov 23, 2025, 4:53 PM

#

And it didnt work? Weird

upper geyser Nov 23, 2025, 4:53 PM

#

yes

#

both of them

hallow totem Nov 23, 2025, 7:14 PM

#

hey guys i wanted help with

https://www.kaggle.com/competitions/google-tunix-hackathon/discussion/638533
please check it out, would rlly be appreciated

#

@upper geyser hey there you're here wow

upper geyser Nov 23, 2025, 7:40 PM

#

hallow totem @upper geyser hey there you're here wow

yea hello man

hallow totem Nov 23, 2025, 7:45 PM

#

upper geyser yea hello man

hey hello! you're a researcher? wow

tawdry iris Nov 24, 2025, 1:29 AM

#

spice tendon I have no credit or debit card, so how i can access Google cloud

You do not need to use Google Cloud. Kaggle already offers free TPUs.

upper geyser Nov 24, 2025, 5:23 AM

#

hallow totem hey hello! you're a researcher? wow

oh yea :>, trying to be actually

hallow totem Nov 24, 2025, 6:59 AM

#

@tawdry iris I've replied to you in my discussion thread about the 0% accuracy issue. I kindly request that you check it out.

earnest isle Nov 27, 2025, 1:10 PM

#

i'm a bit confused i have the tpu accelerator selected, but my notebook (training now) is only using cpu?_?

earnest isle Nov 27, 2025, 1:32 PM

#

oh it just... doesn't say that is being used.. so confusing lol, but i'm not getting error and tpu is active so i suppose it is working

tawdry iris Nov 29, 2025, 10:18 AM

#

earnest isle oh it just... doesn't say that is being used.. so confusing lol, but i'm not get...

This is a known Kaggle TPU limitation. If you are using CPU image incorrectly, it won't even run at all. There will be some error.
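
Since the resource panel may not show TPU usage, one way to confirm which backend is active is to ask JAX directly. This is a minimal sketch, assuming JAX is installed in the session; `describe_backend` is a hypothetical helper name, not part of Tunix or Kaggle.

```python
def describe_backend() -> str:
    """Report the backend JAX is using and how many devices it sees."""
    try:
        import jax  # imported lazily so the helper also runs where JAX is absent
    except ImportError:
        return "jax is not installed"
    devices = jax.devices()  # the devices of the default backend
    return f"backend={jax.default_backend()}, devices={len(devices)}"


print(describe_backend())
```

On a TPU session one would expect the backend to read `tpu`; if it reads `cpu`, training is not running on the accelerator.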

fallen seal Nov 30, 2025, 5:24 AM

#

Hi..I had couple of query if someone could help with

1. What level of evaluation is done? One of the domains mentioned is code. The hard thing here is that code reasoning and output is one domain where the token limit is hard to control; you can't summarize or compress it. I see on the competition page that a 1K output token length is fine (which makes sense, because a longer sequence length in training just adds more runtime). So my question is: for coding, can we limit ourselves to cases with smaller token usage and expect the eval to run on those? If so, I don't want to waste time (and TPU hours) trying to train on longer sequences.
2. The single-session 45 points and multi-session 15 bonus points were a bit confusing. Please let me know if I got this correct: for the 45 points in single-session mode, we need to do all the training within that session (and can't load any checkpoints we saved in a previous session). Basically, all the training the Gemma model gets from scratch has to be run in one 9h session. The 15 bonus points are for alternate models we trained (with more time and resources across multiple sessions); we just have to save that model to Kaggle and share its name at the end of the 9h notebook. These are mutually exclusive models and can't be loaded as intermediate stages in the 9h notebook.

Thanks in advance!!

median slate Nov 30, 2025, 9:56 AM

#

Looking for LLM Enthusiasts!

A new hackathon is live, and we are building a team to participate. If you have experience with LLMs or are interested in learning, join us to collaborate, share ideas, and work on exciting projects.

Hackathon:
Google Tunix Hack - Train a model to show its work
https://www.kaggle.com/competitions/google-tunix-hackathon/overview

My Kaggle Profile:
https://www.kaggle.com/muhammaddanyalmalik

simple aspen Nov 30, 2025, 8:04 PM

#

tawdry iris This is a known Kaggle TPU limitation. If you are using CPU image incorrectly, i...

but then how should we run it? takes 6 hours to just train, let alone test on a CPU!

#

how are you guys running the starter NB?

spare fog Nov 30, 2025, 10:21 PM

#

hey! has anyone been able to spin up a vLLM + Tunix run on Colab? if so, could you please share your rl cluster config and your pip freeze?

tawdry iris Dec 1, 2025, 1:29 AM

#

simple aspen how are you guys running the starter NB?

It should only take 2-3 hrs for the starter notebooks. Btw, if you are looking at the Gemma3 1B starter notebook, don't be fooled by the progress bar (which says it takes 5+ hrs), it actually finishes in half of that time.

tawdry iris Dec 1, 2025, 2:09 AM

#

spare fog hey! has anyone been able to spin up a vLLM + Tunix run on Colab? if so, could y...

Can you post this question on https://github.com/google/tunix? Our engineers can give you some guidance there.

spare fog Dec 1, 2025, 2:31 AM

#

Sure! Thanks 🙂

rotund wing Dec 1, 2025, 7:04 AM

#

tawdry iris Can you post this question on https://github.com/google/tunix? Our engineers can...

You are a judge for this competition, right?

tawdry iris Dec 1, 2025, 7:48 AM

#

I am.

tawdry iris Dec 1, 2025, 8:07 AM

#

fallen seal Hi..I had couple of query if someone could help with 1. What level of evaluati...

1. Verifiable tasks (math&coding) will have much lower weights because 1) the starter notebooks already cover math, and 1B or 2B models aren’t very good at math in general, especially without tools; 2) Gemma is not particularly well trained on code.
2. Correct for the single-session mode. You can only load one of the 2 stock Gemma models via official Tunix APIs and finish the training in one go (loading other checkpoints is not allowed and will be heavily penalized).

For the multi-session run (let's also call it 'unrestricted mode'), if you choose to participate, it will be a separate model. You can resume training from your single-session ckpt for it, or do whatever you want. No restriction.

We are working on a submission template and a FAQ. Please stay tuned.

fallen seal Dec 1, 2025, 2:33 PM

#

tawdry iris 1. Verifiable tasks (math&coding) will have much lower weights because 1) the st...

“Verifiable tasks have much lower weights” - Does this mean the evaluation of the model will be done with lower weight on verifiable tasks, OR did you mean the base Gemma model has low performance on verifiable tasks and needs more weights tuned?

If it’s the former, can you please let us know what the trained model will be evaluated on, so we can design our training to follow those preferences?

crude vortex Dec 1, 2025, 5:29 PM

#

https://media.discordapp.net/attachments/1444971360047726605/1445085758598938824/image1.gif?ex=692f107d&is=692dbefd&hm=94f18cd6e7350e7cc612826beb5d11a9fd125485a58ee1e39a16a03b6f9e2426&=&width=237&height=315
https://media.discordapp.net/attachments/1444971360047726605/1445085766937088000/image2.gif?ex=692f107f&is=692dbeff&hm=51e8429e6818b166e21485a613e8f0c706d64c765aefc93f65a7bcefa10907c2&=&width=864&height=1152
https://media.discordapp.net/attachments/1444971360047726605/1445085774562197535/image3.gif?ex=692f1081&is=692dbf01&hm=e520e8e4edd4eea02e82168a7059a868ea59c19d9b90c7c34402f7bb3616c76f&=&width=864&height=1152
https://media.discordapp.net/attachments/1444971360047726605/1445085781801566319/image4.gif?ex=692f1082&is=692dbf02&hm=bdc0715977fdcda4b7804916e5bfb36af1d3132f535d1b4327894a067fbfc769&=&width=725&height=907

rotund wing Dec 1, 2025, 6:03 PM

#

I have a question, do we have to fine-tune the gemma model only for 1 task reasoning? Like only mathematical reasoning or logical reasoning? Or we have to make the model do all type of reasoning tasks like mathematical reasoning + logical reasoning + Commonsense +.....?

tawdry iris Dec 2, 2025, 12:40 AM

#

rotund wing I have a question, do we have to fine-tune the gemma model only for 1 task reaso...

The final evaluation dataset we use will cover a range of domains, not just a single domain.

tawdry iris Dec 2, 2025, 12:41 AM

#

fallen seal “Verifiable tasks have much lower weights” - Does this mean the evaluation of mo...

It just means the final evaluation dataset will have fewer questions from verifiable domains (math+coding).

rotund wing Dec 2, 2025, 1:33 AM

#

tawdry iris The final evaluation dataset we use will cover a range of domains, not just a si...

Ok thanks 👍

spare fog Dec 2, 2025, 1:43 AM

#

tawdry iris 1. Verifiable tasks (math&coding) will have much lower weights because 1) the st...

A quick question on this: does multi-session mode then automatically render more points? I see they're an optional 15 additional points in the rubric

tawdry iris Dec 2, 2025, 5:11 AM

#

spare fog A quick question on this: does multi-session mode then automatically render more...

No. There is a model quality threshold; models below that won't get any point.

grave spruce Dec 2, 2025, 5:20 PM

#

https://imgur.com/TC6h8P4 https://imgur.com/iiKXKB5 https://imgur.com/JAkE28j https://imgur.com/keASgw9

inner thorn Dec 3, 2025, 1:51 PM

#

is anyone facing hf rate limit error?

hushed flower Dec 3, 2025, 2:05 PM

#

Hi @tawdry iris , Im running into error:
TypeError: set_metadata takes either 1 argument or 1 or more keyword arguments, got args=('sharding_names', ('fsdp', None)), kwargs={}

when trying to load lora model using:
lora_model = qwix.apply_lora_to_model(
    base_model, lora_provider, **model_input
)

my lib versions are:
flax: 0.12.0
jax: 0.8.0
google-tunix: 0.1.3

Can you please suggest any possible workaround to this?

#

anyone else facing same issue?

hushed flower Dec 3, 2025, 5:11 PM

#

inner thorn is anyone facing hf rate limit error?

yes i am today, till yesterday it was fine: WARNING:datasets.utils.file_utils:Got disconnected from remote data host.
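
Transient disconnects from a remote data host can sometimes be ridden out by retrying with growing delays. The sketch below is generic: `load_fn` is a placeholder for whatever call downloads the data (for example a `datasets.load_dataset(...)` call wrapped in a lambda), and the attempt count and delays are arbitrary assumptions. It will not lift a genuine rate limit.

```python
import time


def retry_with_backoff(load_fn, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call load_fn, waiting base_delay * 2**attempt seconds after each failure."""
    for attempt in range(max_attempts):
        try:
            return load_fn()
        except Exception as err:  # narrow this to the error you actually see
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            print(f"attempt {attempt + 1} failed ({err}); retrying in {delay:.0f}s")
            sleep(delay)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.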

warped wharf Dec 3, 2025, 9:28 PM

#

Hello, me and my team are working on a submission, I have two questions:

- Token limit enforcement: Is the "<1K output token" limit a hard stop during generation? If the model is mid-sentence and hits the limit, will the output be truncated immediately?
- Parser robustness: If a response is truncated due to the limit or if the model fails to generate the closing </answer> tag, how is this handled? Will the parser attempt to recover the content or will the submission automatically receive a zero score for that specific prompt due to invalid formatting?

These clarifications will help us better design our training strategy. 😄

olive ledge Dec 3, 2025, 11:32 PM

#

hushed flower Hi @tawdry iris , Im running into error: TypeError: set_metadata takes...

I'm not sure if this is the exact error I got initially, but I have a fully working Save & Run w/o errors. I had to roll back to earlier library versions: https://www.kaggle.com/code/philipkd/full-save-run-of-grpo-demo-checkpointing

tawdry iris Dec 4, 2025, 1:41 AM

#

hushed flower Hi @tawdry iris , Im running into error: TypeError: set_metadata takes...

Hmm, this usually happens when you use Flax 0.12.1. Can you double check?
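
To double check what is actually installed in a session (rather than what was requested with pip), the standard library can report the resolved versions. A minimal sketch: the distribution names in `PACKAGES` are assumptions and may need adjusting to the pip names used in your notebook, and `installed_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed pip distribution names; adjust to match your notebook's install cell.
PACKAGES = ["flax", "jax", "google-tunix", "qwix"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

Running the same cell in a working starter notebook and in the failing notebook gives two lists that can be compared line by line.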

tawdry iris Dec 4, 2025, 1:47 AM

#

warped wharf Hello, me and my team are working on a submission, I have two questions: - **Tok...

1. You can go beyond 1K. It's not a requirement at all. We put it there only because longer output sequences require more compute, which I'm sure a lot of ppl feel they don't have enough of.
2. We won't truncate it. The entire sequence will be used for eval.

hushed flower Dec 4, 2025, 5:56 AM

#

tawdry iris Hmm, this usually happens when you use Flax 0.12.1. Can you double check?

Hello @tawdry iris , I'm using 0.12 version :/

inner thorn Dec 4, 2025, 12:36 PM

#

hello @tawdry iris, regarding the model generation format, is text outside <reasoning>...</reasoning><answer>...</answer> allowed?

tawdry iris Dec 5, 2025, 12:50 AM

#

why do you need text beyond the closing tag?
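
For checking your own outputs against the `<reasoning>...</reasoning><answer>...</answer>` layout, a small parser can flag stray text and incomplete tags. This is only a sketch of one possible check: the official evaluation parser is not described in this thread, and `parse_response` is a hypothetical name.

```python
import re

# Matches the layout discussed above; text before or after the two blocks is
# reported separately rather than silently dropped.
PATTERN = re.compile(
    r"<reasoning>(?P<reasoning>.*?)</reasoning>\s*<answer>(?P<answer>.*?)</answer>",
    re.DOTALL,
)


def parse_response(text):
    """Return reasoning, answer and stray text, or None if the tags are incomplete."""
    match = PATTERN.search(text)
    if match is None:
        return None  # e.g. a missing or truncated closing </answer> tag
    return {
        "reasoning": match.group("reasoning").strip(),
        "answer": match.group("answer").strip(),
        "extra": (text[: match.start()] + text[match.end():]).strip(),
    }


print(parse_response("<reasoning>2 + 2 = 4</reasoning><answer>4</answer>"))
```

A non-empty `extra` field is a cheap signal that the model is writing outside the two blocks.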

tawdry iris Dec 5, 2025, 12:51 AM

#

hushed flower Hello @tawdry iris , I'm using 0.12 version :/

Can you run one of the starter notebooks and see if it works? Then you can compare yours w/ the starter notebook

hushed flower Dec 5, 2025, 4:17 AM

#

tawdry iris Can you run one of the starter notebooks and see if it works? Then you can compar...

just did "copy & edit" on the gemma3 notebook and that ran, but when I copy paste same code manually in a new notebook, that didn't seem to work.. im not sure what's the issue exactly 🤔

tawdry iris Dec 5, 2025, 7:31 AM

#

Did you choose the TPU image in the right side panel?

hushed flower Dec 5, 2025, 9:37 AM

#

yeah.. TPU v5e-8 session

tawdry iris Dec 5, 2025, 10:03 AM

#

Can you share your notebook?

#

@all, we published a submission notebook template and a FAQ. Please take a few minutes to read through them.

hushed flower Dec 5, 2025, 11:03 AM

#

tawdry iris Can you share your notebook?

yes sure, let me DM you

prisma crag Dec 6, 2025, 1:05 PM

#

hello everyone, i have some questions about competitions in kaggle,
Submissions to this competition must be made through Notebooks. In order for the "Submit" button to be active after a commit, the following conditions must be met:

CPU Notebook <= 9 hours run-time
GPU Notebook <= 9 hours run-time
Internet access disabled
Freely & publicly available external data is allowed, including pre-trained models
Submission file must be named submission.csv

Will these be assessed based on the notebook you use to train your (pre-trained) models? What if I use another platform like Colab Pro, save the best models, and then upload them to Kaggle for inference? Does that count as cheating or anything?

tawdry iris Dec 7, 2025, 1:33 AM

#

prisma crag hello everyone, i have some questions about competitions in kaggle, Submissions...

Please read the FAQ(https://www.kaggle.com/competitions/google-tunix-hackathon/discussion/651560) very carefully. For the single-session mode, your notebook needs to run on Kaggle; if we cannot reproduce your model on Kaggle, you get 0 points (out of 45). For the unrestricted mode, we don't really care how you train the model.

rotund wing Dec 7, 2025, 6:19 AM

#

I read the Submission Template. Is it compulsory to use LoRA?

grave socket Dec 7, 2025, 8:05 AM

#

Can i get some feedback on this? https://www.kaggle.com/competitions/google-tunix-hackathon/discussion/653942

tawdry iris Dec 7, 2025, 11:08 AM

#

rotund wing I read the Submission Template. Is it compulsory to use LoRA?

Technically you don't have to. But what else are you going to use?

tawdry iris Dec 7, 2025, 11:08 AM

#

grave socket Can i get some feedback on this? https://www.kaggle.com/competitions/google-tun...

Answered

rotund wing Dec 7, 2025, 12:45 PM

#

tawdry iris Technically you don't have to. But what else are you going to use?

Ok thx 👍. I have some other ideas that I believe will work properly.

prisma crag Dec 8, 2025, 10:55 AM

#

tawdry iris Please read the FAQ(https://www.kaggle.com/competitions/google-tunix-hackathon/d...

thanks a lot

main jungle Dec 17, 2025, 9:24 AM

#

hi guys, i have 4 questions regarding this hackathon:

1. It is stated that tool use is not necessary, but do I understand correctly that it is allowed?
2. Can I use everything that's possible within a Kaggle notebook with internet access enabled (for example Google search, LLM APIs, pretrained models as judge, pretrained models for distillation)?
3. Does the workflow have to be User Input -> Reasoning -> Answer, or can I do something like User Input -> Planning -> Tool Calling -> Reasoning -> Tool Calling -> Answer?
4. Can I use all Gemma 3 1B variations (like instruct or quantized models)?

zealous dome Dec 18, 2025, 9:11 PM

#

@tawdry iris So I can make my own dataset for the competition as long as the results are fully reproducible within my submission? Also in the "model quality across multiple kaggle sessions" evaluation description it says we can use private data. Does that mean we don't have to reveal our training data used for a run across multiple sessions, we just have to provide the checkpoints and explain what we did?

tawdry iris Dec 19, 2025, 12:32 AM

#

zealous dome @tawdry iris So I can make my own dataset for the competition as long ...

Yes, it's kind of expected that you come up with your own training data. And correct, in the multi-session (or unrestricted) mode, we don't really care how you train the model. Just provide the checkpoint and describe what you did at a high level.

tawdry iris Dec 19, 2025, 12:36 AM

#

main jungle hi guys, i have 4 questions regarding this hackathon: 1. It is stated that tool ...

1. Correct. Tool use will not be part of our evaluation though.
2. Correct. Just make sure we can reproduce your results.
3. Technically you can. But 'planning' or 'tool call' output might interfere with our evaluation system, so I do not recommend making things complicated unless you really have to.
4. Definitely the instruct model. A quantized model might work, but it will need extra work.

hallow totem Dec 20, 2025, 11:08 AM

#

also I have one more question, talking abt the correct answer accuracy what's the best someone could even reach roughly?

hallow totem Dec 20, 2025, 4:14 PM

#

@tawdry iris
also I'm very confused abt how exactly is the code evaluated? good format + accuracy?

but then I'm confused, because Gemma 2 2B, as shown in the starter notebook, clearly outperforms Gemma 3 1B with just the same default parameters, reward fns and yk a little training

then what if I just use Gemma 2 2B? I mean, I don't exactly get what the goal is (is it just abt final model performance?)

#

pls answer my questions whenever you have time
Thank you

zealous dome Dec 21, 2025, 6:13 PM

#

@tawdry iris Although the quality of a reasoning block is subjective, will the answers to questions for evaluation be short and objective? For example will the model only be expected to output one word/number for the final answer between <answer></answer> tags or could it be a long sentence or paragraph? And will there always be exactly one correct answer?

hallow totem Dec 21, 2025, 8:48 PM

#

zealous dome @tawdry iris Although the quality of a reasoning block is subjective, ...

only final answer between answer tags

and yeah there will always be exactly one question

hallow totem Dec 21, 2025, 9:07 PM

#

what dataset are we supposed to use for training is it the maths dataset in the starter notebook? like are we just supposed to train the model on that dataset and that's it? @tawdry iris

willow stump Dec 21, 2025, 11:59 PM

#

hallow totem what dataset are we supposed to use for training is it the maths dataset in the ...

we are supposed to come up with a dataset of our own

#

@tawdry iris for some reason I always get permission denied every time I try to load the model from my Colab Notebook. I checked, I am correctly authorized, and I accepted the license agreement. Will be grateful for help in resolving this issue.

hallow totem Dec 22, 2025, 5:53 AM

#

willow stump we are supposed to come up with a dataset of our own

of our own? like am I supposed to train it on an another dataset picked from kaggle? (coz where else am i even gonna get a dataset from, if I'm not wrong) but which dataset? and what domain?

also if we're supposed to come up with our own then why's everyone else using the math 8k dataset

dark forge Dec 22, 2025, 5:10 PM

#

hallow totem of our own? like am I supposed to train it on an another dataset picked from kag...

there's lots from huggingface hub, or you can generate your own, or mix and match. The main point with a custom dataset is that for your model to generalise to multiple reasoning domains it's very unlikely that math 8k is going to be sufficient
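
The "mix and match" idea can be sketched as weighted sampling across domains. Everything here is illustrative: the prompts are placeholders standing in for real datasets, the weights are arbitrary, and `mix_datasets` is a hypothetical helper, not something from the starter notebooks.

```python
import random


def mix_datasets(sources, weights, n_examples, seed=0):
    """Sample n_examples prompts, picking a domain by weight for each draw."""
    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    names = list(sources)
    mixed = []
    for _ in range(n_examples):
        domain = rng.choices(names, weights=[weights[n] for n in names])[0]
        mixed.append({"domain": domain, "prompt": rng.choice(sources[domain])})
    return mixed


# Placeholder prompts standing in for real datasets.
sources = {
    "math": ["What is 12 * 7?", "Solve x + 3 = 10."],
    "logic": ["If all A are B and all B are C, are all A C?"],
    "commonsense": ["Why do people carry umbrellas when it rains?"],
}
weights = {"math": 1, "logic": 2, "commonsense": 2}

for example in mix_datasets(sources, weights, n_examples=5):
    print(example)
```

Seeding the sampler matters for the single-session mode, where the training run has to be reproducible.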

hallow totem Dec 22, 2025, 5:11 PM

#

dark forge there's lots from huggingface hub, or you can generate your own, or mix and matc...

is there any dataset which you are aware of and don't mind recommending?

dark forge Dec 22, 2025, 5:14 PM

#

hallow totem is there any dataset which you are aware of and don't mind recommending?

you will want to do your own research -- a good starting point I find is https://asta.allen.ai/ and searching for recent papers on llm reasoning

hallow totem Dec 22, 2025, 5:15 PM

#

dark forge you will want to do your own research -- a good starting point I find is https:/...

alright thanks a lot for the guidance

#

im actually pretty prepared with the notebook and my own reward system

#

just gotta get a proper dataset now

hallow totem Dec 22, 2025, 7:41 PM

#

dark forge you will want to do your own research -- a good starting point I find is https:/...

hey i just have one more question if you don't mind answering.
as we are supposed to use a dataset covering reasoning from multiple domains, there just won't be a single numeric or single word answer anymore provided by the model right and might be a text prolly 1-2 lines in the final answer tag

vivid veldt Dec 23, 2025, 4:02 AM

#

dark forge you will want to do your own research -- a good starting point I find is https:/...

Olmo trace is also really awesome

dark forge Dec 23, 2025, 10:44 AM

#

hallow totem hey i just have one more question if you don't mind answering. as we are suppose...

yeah, I expect so

olive ledge Dec 23, 2025, 6:01 PM

#

hallow totem hey i just have one more question if you don't mind answering. as we are suppose...

yeah. formatting issues are still a problem when you do text-based reasoning, even when trained on the GSM8K dataset, so you can easily reward formatting related components, even if you don't reward having "the right answer." You could also use an LLM judge to verify the answer, but I found it too slow for GRPO with the notebooks. Some people mention in the discussion board about using the Gemini API, so that's another option for having an LLM judge as your "verifier"
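
A minimal sketch of a format-only reward like the one described above, assuming the `<reasoning>...</reasoning>` then `<answer>...</answer>` layout from the competition page. The name `format_reward` and the partial-credit values are illustrative assumptions, not part of Tunix or the starter notebook.

```python
import re

# Assumed layout: <reasoning>...</reasoning> followed by <answer>...</answer>.
FULL_FORMAT = re.compile(
    r"^\s*<reasoning>(.+?)</reasoning>\s*<answer>(.+?)</answer>\s*$",
    re.DOTALL,
)
TAGS = ("<reasoning>", "</reasoning>", "<answer>", "</answer>")


def format_reward(completion: str) -> float:
    """Score formatting only; the correctness of the answer is not checked."""
    if FULL_FORMAT.match(completion):
        return 1.0
    # Partial credit (assumed values): 0.1 for each tag that appears exactly once.
    return 0.1 * sum(completion.count(tag) == 1 for tag in TAGS)
```

A reward like this can run on every sampled completion because it is pure string matching, which is what makes it cheap compared with an LLM judge.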

hallow totem Dec 23, 2025, 6:28 PM

#

olive ledge yeah. formatting issues are still a problem when you do text-based reasoning, ev...

i prepared my own dataset but the trainer isnt running at all it's just loading and loading

#

i just am using 20k rows with 1 epoch num iteration as 4 num generation as 4 train micro batch size as 4

#

do you got any idea where im going wrong if u dont mind helping me @olive ledge

rotund wing Dec 24, 2025, 3:23 PM

#

Hello @tawdry iris (sorry for the ping if it disturbs you).

I had a quick clarification question. The description mentions that “evaluation will cover both the reasoning trace and the final answer.”

Should the model’s output reasoning be a concise explanation that justifies the answer, or should it include a detailed, step-by-step chain-of-thought showing the model’s internal reasoning process?

I want to make sure the model’s outputs are aligned with what judges find most useful and readable.

willow stump Dec 25, 2025, 2:54 AM

#

@here is anyone else also having issues downloading the model directly from kagglehub after accepting the license agreement and authentication?

willow stump Dec 25, 2025, 7:53 PM

#

also, I will be very grateful if someone @tawdry iris or @everyone could clarify:

if the answer does not adhere to the format

<reasoning>model_thinking_trace</reasoning>
<answer>model_answer</answer>

will it get a partial credit, or just 0 , and thus in the 45 points that are awarded for this section will contribute as (0* number_of_answers_that_do_not_comply_with_format + score*number_of_answers_that_comply_with_format)/ total_number_of_questions ?

rotund wing Dec 26, 2025, 1:23 PM

#

willow stump also, I will be very grateful if someone @tawdry iris or @everyone co...

I think that you should follow the given format.

hallow totem Dec 26, 2025, 3:22 PM

#

@tawdry iris @olive ledge @dark forge can we train for more than 2 sessions?

olive ledge Dec 26, 2025, 3:24 PM

#

hallow totem @tawdry iris @olive ledge @dark forge can we train...

yes, but it only counts for the multi-session bonus points, not the single-session one.

hallow totem Dec 26, 2025, 3:45 PM

#

olive ledge yes, but it only counts for the multi-session bonus points, not the single-sessi...

if I'm doing multi session then I'll be getting
45 (single session) + 15 (multi session) points right?

olive ledge Dec 26, 2025, 3:47 PM

#

hallow totem if I'm doing multi session then I'll be getting 45 (single session) + 15 (multi ...

yeah, as long as you have one notebook that goes from vanilla Gemma (2 or 3) to something useful, that's your single session. then you add a line at the end with your multi-session trained model ID for the bonus points.

hallow totem Dec 26, 2025, 3:48 PM

#

olive ledge yeah, as long as you have one notebook that goes from vanilla Gemma (2 or 3) to ...

ok so let's say I have
3 notebooks
notebook for session 1
notebook for session 2
notebook for session 3

so then I gotta upload the model on kaggle and add model id at the end of session 3 notebook??

olive ledge Dec 26, 2025, 3:51 PM

#

hallow totem ok so let's say I have 3 notebooks notebook for session 1 notebook for session...

They're only going to run one of your notebooks all the way through. Pick that notebook as your single-session one. In your case, probably only the notebook for session 1 would qualify. And then add a cell at the end of that notebook that specifies your model id for what you produced at the end of session 3. From the competition website regarding multi-sesh: "Participants must explicitly provide a Kaggle model name/ID at the end of the notebook as the submission for this item."

hallow totem Dec 26, 2025, 3:53 PM

#

olive ledge They're only going to run one of your notebooks all the way through. Pick that n...

ooh so I can just provide model id at the end of all session notebooks then that works perfectly right?

#

also one more doubt
when I, let's say, have run the whole notebook and got an output file in the output section of my notebook

now after getting that let's say I make some changes in the maybe markdown and do quick save -> save an output for this version

even after this why does the output file from output section disappear

#

@olive ledge could u pls help me with this

olive ledge Dec 26, 2025, 4:02 PM

#

hallow totem also one more doubt when I let's save have run whole notebook and got an output ...

that's just the way kaggle works. quick saves don't save the output files. but those output files are stored in the old versions. my pattern is to click the "..." next to the output and click "create new dataset," and then use that in another notebook. There may be a better way, though.

hallow totem Dec 26, 2025, 4:02 PM

#

hallow totem also one more doubt when I let's save have run whole notebook and got an output ...

should I be bothered abt those staying in output section or should I not be bothered coz judges would re run the code

#

entire note bhot

#

notebook*

hallow totem Dec 26, 2025, 4:03 PM

#

olive ledge that's just the way kaggle works. quick saves don't save the output files. but t...

yeah yeah that's wt i do but I was only worried coz it disappears from the output section

olive ledge Dec 26, 2025, 4:03 PM

#

hallow totem should I be bothered abt those staying in output section or should I not be both...

if you're 100% confident your changes don't affect your code (like markdown), then yeah, quick save is fine. but the way to be 100% sure is that your final version is a S&R (save and run)

hallow totem Dec 26, 2025, 4:04 PM

#

yesyes at the end I'll save and run obviously coz I need checkpoints

#

thanks a lott

#

you rlly help a lot honestly

olive ledge Dec 26, 2025, 4:13 PM

#

hallow totem you rlly help a lot honestly

sure thing. thanks for sharing your CUREgrpo notebook so early in the comp.!

hallow totem Dec 26, 2025, 5:44 PM

#

olive ledge sure thing. thanks for sharing your CUREgrpo notebook so early in the comp.!

ay u checked it out? wow I'm glad thank you! im still actively working on it

blazing needle Dec 27, 2025, 12:51 PM

#

Hello, I am an AI engineering professional with knowledge of ML & RL, looking forward to joining a team for Tunix.

willow stump Dec 28, 2025, 3:25 AM

#

what is the max allowed size for the dataset that we can create?

hallow totem Dec 28, 2025, 5:46 AM

#

willow stump what is the max allowed size for the dataset that we can create?

I don't think there's any max size just that you gotta complete training inside session time which is ≤ 9 hrs

hallow totem Dec 28, 2025, 5:47 AM

#

blazing needle Hello, I am AI Engineer Professional with knowledge of ML & RL Looking forward t...

isn't it a bit too late for finding a team now? ig?

blazing needle Dec 28, 2025, 6:10 AM

#

hallow totem isn't it a bit too late for finding a team now? ig?

I know, but my only options are to regret it or to try to join a team, so I preferred the second one.

hallow totem Dec 28, 2025, 11:18 AM

#

blazing needle I know but either I can regret or could try to join any team is only option I ha...

not rlly the only option, you still got 15 days lock urself in and start grinding alone you can do it alone as well don't need a team

hallow totem Dec 28, 2025, 12:16 PM

#

is anyone else facing the error

module pyarrow has no attribute PyExtensionType

#

@olive ledge

#

it doesn't happen when running the starter notebook else wise it happens and depends really on luck for me sometimes it runs sometimes gives out this issue

blazing needle Dec 28, 2025, 2:07 PM

#

hallow totem not rlly the only option, you still got 15 days lock urself in and start grindin...

one of the best pieces of advice someone has given me. I will. Thank you.

hallow totem Dec 28, 2025, 2:17 PM

#

blazing needle one of best pieces of advice someone has given me. I will. Thank you.

npp honestly i never even felt the need of a team eve

#

ever*

blazing needle Dec 28, 2025, 6:44 PM

#

hallow totem npp honestly i never even felt the need of a team eve

because u r already capable of that but okay I understood your meaning to do it alone.

rotund wing Dec 29, 2025, 4:47 AM

#

rotund wing Hello @tawdry iris (sorry for the ping if it disturbs you). I had a q...

Can someone please answer this question? Idk why windmaple_87628 is not coming online.

lethal galleon Dec 29, 2025, 4:49 AM

#

Hi, I haven't been able to use a TPU in Kaggle. I am always at least #44 in the queue. I tried going to Colab, my code runs but it runs out of RAM.

Is anyone else facing this issue? What have you done?
If you have been at #40 or higher, how long has it taken for you to get a TPU session? Have you gotten one?
Any suggestion?

Thanks a lot!
P.s. this is my first Kaggle competition, any help is appreciated.

rotund wing Dec 29, 2025, 4:50 AM

#

lethal galleon Hi, I haven't been able to use a TPU in Kaggle. I am always at least #44 in the ...

Yes, I also got the same thing when I started using TPU. it will take you around 15-20 minutes

lethal galleon Dec 29, 2025, 4:52 AM

#

rotund wing Yes, I also got the same thing when I started using TPU. it will take you around...

Thanks for the answer! And after that I get it for a decent amount of time? If I have to restart the kernel for some reason (around the same time), will I have to wait in the queue again?

rotund wing Dec 29, 2025, 7:39 AM

#

lethal galleon Thanks for the answer! And after that I get it for a decent amount of time? If ...

Maybe you will have to wait again.

dark forge Dec 29, 2025, 12:43 PM

#

lethal galleon Thanks for the answer! And after that I get it for a decent amount of time? If ...

you get max 9hrs per tpu session, but if you are inactive and nothing's running on the notebook for something like 10-15 mins you get disconnected. Restarting your session (e.g. if you install something and need a restart) doesn't boot you off the machine, but if you 'terminate' your session it does. If you're running out of VRAM on colab, tweaking stuff like batch sizes and max sequence lengths might help (or if you mean 'normal' RAM, you need to look at your data loading strategy)

rotund wing Dec 29, 2025, 2:11 PM

#

rotund wing Hello @tawdry iris (sorry for the ping if it disturbs you). I had a q...

Can someone please answer this question ASAP.

hallow totem Dec 29, 2025, 3:58 PM

#

rotund wing Yes, I also got the same thing when I started using TPU. it will take you around...

wt r u on man?

hallow totem Dec 29, 2025, 3:58 PM

#

lethal galleon Thanks for the answer! And after that I get it for a decent amount of time? If ...

no way

#

if u r #44 in the queue minimum 1 hrs u gotta wait

#

best case 50 mins there's no way u getting in 15-20 mins

#

if u r 44 in the queue currently u r lucky coz I always be in like 70-80 or even 100+ most of the time

rotund wing Dec 29, 2025, 4:01 PM

#

hallow totem wt r u on man?

I was on #34 and it took me about 20 min to run the TPU. IDK if it took you 1 hr.

hallow totem Dec 29, 2025, 4:01 PM

#

rotund wing Hello @tawdry iris (sorry for the ping if it disturbs you). I had a q...

it gotta be a good reasoning including proper step by step chain of thought

rotund wing Dec 29, 2025, 4:02 PM

#

hallow totem it gotta be a good reasoning including proper step by step chain of thought

Ok, thanks 🙏🏻

hallow totem Dec 29, 2025, 4:02 PM

#

rotund wing I was on #34 and it took me about 20 min to run the TPU. IDK if it took you 1 hr...

NAH man I'm always 70-80 or even 100+ in the queue takes me 2 hrs or smth

hallow totem Dec 29, 2025, 4:02 PM

#

rotund wing Ok, thanks 🙏🏻

just don't push the model to generate too much extra

#

it just gotta be as much needed

#

u also don't want ur output to get truncated

rotund wing Dec 29, 2025, 4:03 PM

#

hallow totem just don't push the model to generate too much extra

ok, thx

hallow totem Dec 29, 2025, 4:03 PM

#

yepp np

rotund wing Dec 29, 2025, 5:51 PM

#

hallow totem it gotta be a good reasoning including proper step by step chain of thought

Hello 👋, again thanks for helping me out.
But I have one question regarding this "step by step CoT":-
What should the output look like?:-
1st:-
Question: find x in 7x+15=4x+45
<reasoning>

7x+15=4x+45
7x-4x=45-15
3x = 30
x = 30/3 = 10
</reasoning>
<answer>
X = 10
</answer>

2nd:-
<reasoning>
Ok, so the equation is 7x+15=4x+45.

First I need to bring 4x to left side and +15 to right side. So the equation becomes 7x-4x=45-15

Now, I need to subtract them. So the equation becomes 3x=30.

Now divide 30 by 3 so the value of x will be 10

I will type the solution in sequence for the user to understand
</reasoning>
<answer>

7x+15=4x+45
7x-4x=45-15
3x = 30
x = 30/3 = 10
</answer>

Which of these is an example of proper step by step CoT?

rotund wing Dec 29, 2025, 5:53 PM

#

rotund wing Hello 👋, again thanks for helping me out. But I have one question regarding thi...

I am asking this because it is written: "train to show its work" and I asked this same question to different AIs. All are giving their own theory, some are saying 1st one is better because it is clear, some are saying 2nd one is better because it is written in a good format. And I am confused about what I should keep: steps or CoT

hallow totem Dec 29, 2025, 6:15 PM

#

rotund wing Hello 👋, again thanks for helping me out. But I have one question regarding thi...

the reasoning should be like 2nd one but the answer tag gotta be like:
<answer>10</answer>

#

2nd reading is what's called chain of thought basically and reasoning

rotund wing Dec 29, 2025, 6:16 PM

#

hallow totem the reasoning should be like 2nd one but the answer tag gotta be like: <answer>...

Only the answer in answer tag?

hallow totem Dec 29, 2025, 6:16 PM

#

the first one just solving maths step by step

hallow totem Dec 29, 2025, 6:16 PM

#

rotund wing Only the answer in answer tag?

yep

rotund wing Dec 29, 2025, 6:16 PM

#

hallow totem 2nd reading is what's called chain of thought basically and reasoning

I also thought the same. But just to confirm I asked that

rotund wing Dec 29, 2025, 6:16 PM

#

hallow totem yep

Ok thx again

hallow totem Dec 29, 2025, 6:17 PM

#

so if question is like mary has 10 apples she gave 2 to her sis how many she has so model gotta be like

<reasoning>
The person initially had 10 apples
then she gives 2 to her sis that means she now has 2 apples less which means she now has 10-2 = 8 apples
</reasoning>

#

@rotund wing

hallow totem Dec 29, 2025, 6:18 PM

#

rotund wing Ok thx again

npp

rotund wing Dec 29, 2025, 6:18 PM

#

hallow totem so if question is like mary has 10 apples she gave 2 to her sis how many she has...

Thanks! I understood that 👍

hallow totem Dec 29, 2025, 6:18 PM

#

npp

brittle rover Dec 30, 2025, 1:37 PM

#

Is anyone still looking for a team to join? I have a cool problem. Please DM if interested

rotund wing Dec 30, 2025, 2:10 PM

#

Is anyone experiencing lag in the notebook while fine-tuning the model??

I am experiencing some bugs: when I run a cell that has to use the GPU for a high-load task, the cell stays in running mode with no output. Even the CPU, RAM, GPU and memory usage remain at 0%.

If you know any method to fix this problem, plz answer

dark forge Dec 30, 2025, 3:39 PM

#

rotund wing Is anyone expressing lag in notebook while fine-tuning the model?? I am expres...

I find that adding

import nest_asyncio
nest_asyncio.apply()

helps (for the more generic cases), but if it's very gpu specific, you need to turn off cuda async mode (set CUDA_LAUNCH_BLOCKING=1 in env vars) If it's tpu specific , I have no idea...
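
A hedged sketch of how both suggestions above might look in a single first notebook cell. `CUDA_LAUNCH_BLOCKING` is a standard CUDA environment variable; setting it at the top of the notebook, and the fallback when `nest_asyncio` is not installed, are assumptions for illustration.

```python
import os

# Must be set before any CUDA-backed library is imported,
# so that kernel launches run synchronously and errors surface at the failing call.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

try:
    import nest_asyncio  # third-party package: pip install nest_asyncio

    nest_asyncio.apply()  # allow nested event loops inside the notebook's own loop
    patched = True
except ImportError:
    patched = False  # package not installed; the env var above still applies

print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
print("nest_asyncio applied:", patched)
```

Synchronous launches slow training down, so this is a debugging setting rather than something to leave on for a full run.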

rotund wing Dec 30, 2025, 3:41 PM

#

dark forge I find that adding ``` import nest_asyncio nest_asyncio.apply() ``` helps (for ...

I will try it. Thanks 👍

hallow totem Dec 31, 2025, 7:03 PM

#

@dark forge @olive ledge hey! i was actually training gemma 3 on coding dataset but what's happening is
till 1000 smth seconds model is getting trained (i know it coz I got debugging outputs of response just to keep note of training)

after that the output response stops, but session keeps running till like 9000 secs but then session gets cancelled by itself

(im doing save and run all) does anyone got a hint here that what might be going wrong?

#

(sorry for the ping btw)

quick carbon Dec 31, 2025, 11:53 PM

#

I fine tuned a Gemma model with sft but the lora(rank64,alpha64) makes the model answer an empty response on "hi" or "hello" prompt, is this a catastrophic forgetting? I only use 2e-5 learning rate, ~1.5 epoch, 400 steps.

hallow totem Jan 1, 2026, 6:09 AM

#

Hello, please mind checking out this question posted by me in Discussion.
It's regarding Google Tunix Hack
Thank you

https://www.kaggle.com/competitions/google-tunix-hackathon/discussiobn/665386nbb

quick carbon Jan 1, 2026, 8:17 AM

#

hallow totem Hello, please mind checking out this question posted by me in Discussion. It's r...

i think it's a good idea to share the notebook codes tho, someone probably knows what's the problem

hallow totem Jan 1, 2026, 11:08 AM

#

quick carbon i think it's a good idea to share the notebook codes tho, someone probably knows...

actually lowering the NUM_BATCHES worked without losing accuracy so it's fine now

#

i will make the notebook public by end of the day

silk patrol Jan 2, 2026, 6:00 AM

#

is it necessary to disclose our private dataset used for training. And does the generation time for the dataset also count in the original 9 hrs of given time for a single session?

hallow totem Jan 2, 2026, 9:02 AM

#

silk patrol is it necessary to disclose our private dataset used for training. And does the ...

as far as I know, we don't have to disclose our private dataset (feel free to correct me if I'm wrong)

and about dataset generation, the 9 hours only counts actual notebook runtime.
you can just generate your dataset offline or outside kaggle and then save the file, import it as dataset in /kaggle/input and use it in ur notebook
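
A minimal sketch of reading such a pre-generated file inside the notebook, assuming the dataset was saved as JSON Lines. The dataset slug `my-reasoning-data` and the file name `train.jsonl` are hypothetical; only the `/kaggle/input` prefix comes from the message above.

```python
import json
from pathlib import Path

# Hypothetical path: attached datasets appear under /kaggle/input/<dataset-slug>/.
DATA_FILE = Path("/kaggle/input/my-reasoning-data/train.jsonl")


def load_jsonl(path: Path) -> list[dict]:
    """Read one JSON object per non-empty line."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Guarded so the cell does not crash when run outside Kaggle.
if DATA_FILE.exists():
    rows = load_jsonl(DATA_FILE)
    print(f"loaded {len(rows)} rows")
```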

silk patrol Jan 2, 2026, 10:11 AM

#

ohk thanks!

bright scaffold Jan 2, 2026, 1:01 PM

#

cant access any gemma model through HF token

#

is it not allowed and we have to download it ?

dark forge Jan 2, 2026, 4:02 PM

#

hallow totem as per I know, we don't have to disclose our private dataset (feel free to corre...

you still need to make the dataset public though, even if it's just as a dataset in your public notebook-- in the FAQs

(45 pts) ...We will be running your notebook in a single TPU session (9hrs) and reproduce the model before sending it over to eval. If you use private data or tools that we cannot access, our reproduction training cannot finish and you get 0 pt.

dark forge Jan 2, 2026, 4:07 PM

#

bright scaffold cant access any gemma model through HF token

make sure you got the correct read permissions on your HF token and that you have agreed to the gemma T&Cs on huggingface

hallow totem Jan 2, 2026, 6:30 PM

#

dark forge you still need to make the dataset public though, even if it's just as a dataset...

ooh thanks for pointing that out

#

works if i public it on 12th jan right

#

man wt is this suddenly now being in the queue for 4-5 hrs

#

earlier i was 5 hrs in the queue and then notebook failed just because i typed "checkpoints" instead of "checkpoint"

hallow totem Jan 3, 2026, 6:16 AM

#

Hello, could somebody please mind checking out my doubt in discussion regarding Google Tunix Hack
Thank you

https://www.kaggle.com/competitions/google-tunix-hackathon/discussion/665685

hallow totem Jan 3, 2026, 7:54 AM

#

dark forge you still need to make the dataset public though, even if it's just as a dataset...

sorry but wt about unrestricted mode (multi-session) are we supposed to public our dataset for that version too? or only final model matters?

hushed flower Jan 3, 2026, 1:20 PM

#

Hello @tawdry iris , is it possible to do anything about the waiting queue? Even during off-peak timings waiting is #40+ and the average waiting queue is #120+

It takes 2-5 hrs of waiting and the session discontinues if we take 15 minutes to think about code.. I'm concerned how to experiment with the model given this situation, especially because we've all been working for 1 month+..

Would highly appreciate any help

hushed flower Jan 3, 2026, 1:21 PM

#

hallow totem the reasoning should be like 2nd one but the answer tag gotta be like: <answer>...

Hey are you sure the <answer> tag should only include one numeric answer?

Since this is general/open-ended fine-tuning for any domain, I thought it could contain text/paras too

hushed flower Jan 3, 2026, 1:28 PM

#

hallow totem @dark forge @olive ledge hey! i was actually training gemma 3...

+1
The training stops at step 350 for example, but notebook will keep running

This only started happening recently.. anyone else facing this?

hallow totem Jan 3, 2026, 2:05 PM

#

hushed flower Hey are you sure the <answer> tag should only include one numeric answer? Sinc...

That was just about math domain.
Talking abt other domains such as general reasoning, it can definitely contain final answer text / para between <answer> </answer>

hallow totem Jan 3, 2026, 2:07 PM

#

hushed flower +1 The training stops at step 350 for example, but notebook will keep running ...

earlier i fixed this by reducing my NUM_BATCHES to 800 but when i re-ran my notebook yesterday, it stopped working even at 800 and now just working with 600

hushed flower Jan 3, 2026, 2:09 PM

#

Yep same, my earlier version ran for 8 hrs for 1200 batches.. but now even 500 is failing silently

hushed flower Jan 3, 2026, 2:09 PM

#

hallow totem That was just about math domain. Talking abt other domains such as general reaso...

I think even for math it would work if it had text + numeric answer? 🤔

hallow totem Jan 3, 2026, 2:15 PM

#

hushed flower Yep same, my earlier version ran for 8 hrs for 1200 batches.. but now even 500 i...

idk what's the issue

hallow totem Jan 3, 2026, 2:16 PM

#

hushed flower I think even for math it would work if it had text + numeric answer? 🤔

well yeah but i'd rather let it be only numeric answer for math, seems cleaner to me

#

unless they gonna evaluate it on integration and stuff (i hope not)

hallow totem Jan 3, 2026, 2:17 PM

#

hallow totem Hello, could somebody please mind checking out my doubt in discussion regarding ...

@olive ledge

#

@dark forge

dark forge Jan 3, 2026, 4:27 PM

#

hallow totem @olive ledge

in the FAQs they say

We will be using the Gemma2 2B/Gemma3 1B modelling code in Tunix to load this model up for evaluation.
so presumably if it's loadable by tunix it's fine

dark forge Jan 3, 2026, 4:30 PM

#

hushed flower Yep same, my earlier version ran for 8 hrs for 1200 batches.. but now even 500 i...

if you open the console/terminal does it say anything? My guess is there's either some silent error that's not fed back to the notebook, or jax got stuck in some async process. You might also want to make sure that your dataset is deterministic -- it could be some weird row in the data that's causing issues, which shuffling is hiding
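
One way the "weird row" check above could be sketched: validate every row up front and shuffle with a fixed seed, so a stall at a given step points at the same row on every run. The field names `question` and `answer` and both helper names are assumptions for illustration.

```python
import random


def find_bad_rows(rows, required=("question", "answer")):
    """Return indices of rows with a missing, non-string, or empty required field."""
    bad = []
    for i, row in enumerate(rows):
        for key in required:
            value = row.get(key) if isinstance(row, dict) else None
            if not isinstance(value, str) or not value.strip():
                bad.append(i)
                break  # one problem is enough to flag this row
    return bad


def deterministic_order(n_rows, seed=0):
    """Shuffle row indices with a fixed seed so the order is identical across runs."""
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)
    return order
```

If training stalls at step k with a fixed order, the rows feeding that step can be inspected directly instead of guessing.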

hallow totem Jan 3, 2026, 4:43 PM

#

dark forge in the FAQs they say > We will be using the Gemma2 2B/Gemma3 1B modelling code ...

okay thank you

hallow totem Jan 3, 2026, 4:45 PM

#

dark forge if you open the console/terminal does it say anything? My guess is either there'...

no no i verify that this is an actual issue and there's no issue with dataset or anything.
some days back I ran on 800 NUM_BATCHES and when i went for running the same code same dataset with no changes again on 800 batches it didn't run and it ran only when i reduced to 600 batches

#

also one more issue: i get an error when i do from datasets import load_dataset, and also get some version error.

This error occurs even when i import windmaple's notebook as it is, with no changes made, in a new notebook.
But when i run the same google's notebook i get no such error

#

due to which all my notebooks are copy-edits of windmaple's notebook: i deleted all the cells and then updated the code my own way, which doesn't cause this error

#

there are like multiple issues with no explanation pretty confusing

dark forge Jan 3, 2026, 5:03 PM

#

hallow totem also one more issue i get error when i do `from load_dataset import dataset` and...

in case this helps -- this has fixed a lot of dependency headaches for me (but this assumes you are not using vllm)

%pip install python-dotenv
%pip install google-metrax
%pip install "jax-ai-stack==2025.10.28" "jax[tpu]==0.8.0"
%pip install transformers datasets huggingface_hub wandb numba omegaconf sentencepiece tqdm
%pip install --no-deps git+https://github.com/google/tunix
%pip install --no-deps git+https://github.com/google/qwix

hallow totem Jan 3, 2026, 6:06 PM

#

dark forge in case this helps -- this has fixed a lot of dependency headaches for me (but t...

wow thx a lot but imma now pass on this coz im already done with running all my notebook and everything. I don't think it's an issue right if the notebook r copy edit of windmaple's coz the code reward fns etc are of my own and most of the things are different from initial notebook apart from core notebook stuff

dark forge Jan 3, 2026, 6:10 PM

#

hallow totem wow thx a lot but imma now pass on this coz im already done with running all my ...

I guess the main thing is to make sure that the notebook can run all the way through without random crashes due to dependency errors coming from installing tunix/jax stack (since the kaggle notebook preinstalled packages can change under you unless you have pinned the env)
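
A small sketch of a fail-fast version check that could sit at the top of a notebook. The two pins are copied from the install cell earlier in this thread; the `EXPECTED` table and both helper names are illustrative assumptions.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins taken from the install cell earlier in this thread; adjust to your own notebook.
EXPECTED = {"jax": "0.8.0", "jax-ai-stack": "2025.10.28"}


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def mismatches(expected):
    """Return {package: (expected, installed)} for every pin that is not satisfied."""
    installed = installed_versions(expected)
    return {
        name: (want, installed[name])
        for name, want in expected.items()
        if installed[name] != want
    }
```

Calling `mismatches(EXPECTED)` right after the install cell and stopping if the result is non-empty surfaces a changed environment in the first minute rather than hours into a run.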

hallow totem Jan 3, 2026, 7:15 PM

#

dark forge I guess the main thing is to make sure that the notebook can run all the way thr...

okk alright makes sense thank you!

#

unfortunately, my model gonna perform very poorly on coding dataset..im literally not able to train above 600 NUM_BATCHES

#

no idea what's wrong

dark forge Jan 3, 2026, 7:58 PM

#

hallow totem unfortunately, my model gonna perform very poorly on coding dataset..im literall...

it doesn't affect too much (again from faqs)

One thing we could mention is that verifiable tasks (math&coding) will have much lower weights because 1) the starter notebooks already cover math and 1B or 2B models aren’t very good with math in general, especially without tools 2) Gemma is not particularly well trained with code.

hallow totem Jan 3, 2026, 7:59 PM

#

dark forge it doesn't affect too much (again from faqs) > One thing we could mention is tha...

yeah that's why i let it be but yk sad part is just with 800 NUM_BATCHES i was able to push the code accuracy from 16% (base model) to 66.75%

#

with 600 num batches it just hits ig 33% accuracy

hallow totem Jan 3, 2026, 8:29 PM

#

@dark forge hey just one last question pls mind replying to it
what license should i keep of my final notebook

#

MIT or Gemma license?

dark forge Jan 3, 2026, 8:36 PM

#

hallow totem @dark forge hey just one last question pls minding replying to it wha...

probably MIT (and just state that your models are distributed under the gemma license)

hallow totem Jan 3, 2026, 9:23 PM

#

dark forge probably MIT (and just state that your models are distributed under the gemma li...

okayy thank you

hushed flower Jan 4, 2026, 3:32 PM

#

dark forge if you open the console/terminal does it say anything? My guess is either there'...

Thanks for this! It could be possibly the dataset, I'll have to see how to isolate this thing

hushed flower Jan 4, 2026, 3:33 PM

#

hallow totem also one more issue i get error when i do `from load_dataset import dataset` and...

Yep +1111

hushed flower Jan 4, 2026, 3:34 PM

#

hallow totem there are like multiple issues with no explanation pretty confusing

+1

hushed flower Jan 4, 2026, 3:35 PM

#

hallow totem unfortunately, my model gonna perform very poorly on coding dataset..im literall...

Can you please share if you figure out the issue? Now it's not even going above 400 batches, but one time it ran till 500.. now I'm in #200+ waiting queue

hallow totem Jan 4, 2026, 3:42 PM

#

hushed flower Can you please share if you figure out the issue? Now it's not even going above ...

about queue, you can't do anything you gotta be in it.
about batches, i gave up on it coz it's clearly not an issue of the code or dataset so i don't think there's much you or me could do on this.
i gave up as my model was trained perfectly on other domains and on coding anyways the weightage of coding is less so i chose to not bother much

willow stump Jan 5, 2026, 2:07 AM

#

does anyone else here have repeating <end_of_turn> at the end of the generation? and will it somehow disqualify if the format <reasoning>reasoning </reasoning>
<answer>answer</answer>

Is otherwise correct, just a lot of repeating <end_of_turn> in the end?

@dark forge @tawdry iris @everyone @here

rotund wing Jan 5, 2026, 7:43 AM

#

willow stump does anyone else here have repeating <end_of_turn> at the end of the generation?...

The model format should be:-

<reasoning>
REASONING
</reasoning>
<answer> 
ANSWER
</answer>

And this is only written in https://www.kaggle.com/competitions/google-tunix-hackathon/overview/evaluation

#

I have some confusion regarding the reasoning trace.

Should it contain Paragraphs or Step by step?

Paragraph example:-

Alright, so first the problem is ...... 
Let me think of......

Step by step example:-

1. This problem is ......
2. I need to find out .....

Which one is better format?

dark forge Jan 5, 2026, 11:14 AM

#

willow stump does anyone else here have repeating <end_of_turn> at the end of the generation?...

I don't think it will 'disqualify', but you will probably lose points for model quality. This sort of repeating can be an indication that something has gone a bit wrong in your training. But it might be worth checking that you have set up your inference correctly as well (e.g. if you haven't supplied the correct stop token to your inference code it will just make the model keep generating even when it should have stopped)
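
A minimal post-processing sketch for checking what the output looks like once generation is cut at the first `<end_of_turn>`. This is plain string handling, not a Tunix API, and it does not replace passing the stop token to the inference code; the helper name is an assumption.

```python
STOP_TOKEN = "<end_of_turn>"  # the marker reported in the question above


def truncate_at_stop(text: str, stop: str = STOP_TOKEN) -> str:
    """Keep everything before the first stop marker and drop trailing whitespace."""
    head, _, _ = text.partition(stop)
    return head.rstrip()
```

If the truncated text is a clean `<reasoning>`/`<answer>` pair, the repetition is more likely an inference-side stop-token problem than a training problem.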

hushed flower Jan 5, 2026, 11:36 AM

#

Hello @tawdry iris , for the past few days I've noticed that the training session stops without any error at a random batch number.

I trained up to 1200 steps once; yesterday it stopped at 500 steps, and now it doesn't even reach 200. I've tested this with the same configuration as well.

I think some others are also facing the same.

Can you please share if you have any idea what might be happening?

willow stump Jan 5, 2026, 6:06 PM

#

dark forge I don't think it will 'disqualify', but you will probably lose points for model...

thank you!

tawdry iris Jan 6, 2026, 1:40 AM

#

Sorry, I was out for a couple of weeks. Coming back online to address the questions here. And thx to the folks who were here to help each other 😀 . Good community spirit!

tawdry iris Jan 6, 2026, 1:41 AM

#

rotund wing I am have confusion regarding the reasoning trace. Should it contain Paragraph...

Either is fine. Both are reasonable reasoning traces as far as I can tell. There is no format requirement on the reasoning trace/answer text, as long as they are coherent.

tawdry iris Jan 6, 2026, 1:44 AM

#

hushed flower Hello @tawdry iris , since a few days I've noticed the training/sessio...

The trainer will terminate when EITHER the data iterator runs out of data OR it reaches the max step#. I'm guessing you set a small max step# in your notebook.
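
The two stopping conditions described here can be illustrated with a generic loop. This is a minimal sketch, not Tunix's trainer; `run_training` and `train_step` are hypothetical names.

```python
def run_training(batches, max_steps, train_step):
    """Call train_step on each batch until the iterator is exhausted or max_steps is reached."""
    steps_done = 0
    for batch in batches:
        if steps_done >= max_steps:
            break  # the step budget ran out before the data did
        train_step(batch)
        steps_done += 1
    # Falling out of the for-loop without hitting the break means the data ran out first.
    return steps_done
```

With 5 batches and `max_steps=10` the loop stops after 5 steps; with 100 batches and `max_steps=7` it stops after 7. Either way it ends without raising an error.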

tawdry iris Jan 6, 2026, 1:53 AM

#

hushed flower Hello @tawdry iris , is it possible to do anything about the waiting q...

Unfortunately there is a chip shortage everywhere, for which there is nothing we can do. I generally recommend running notebooks in the background so that you don't have to wait online. If you really need an interactive session for debugging/etc, try:

- load model weights in bfloat16 instead of float32 and reduce sequence length; this might fit on a Colab TPU
- use Gemma3 270M on free-tier Colab TPU, but only do this for debugging; 270M is not an acceptable model for eval
- WWYMAK has a GPU notebook (https://www.kaggle.com/code/wwymak/sft-with-qlora-and-the-jax-stack-gpus-edition#LoRA-&-QLoRA-Demo) that might help a bit. But do note that we will only use TPU for eval, so again this is for debugging
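
The bfloat16 suggestion comes down to bytes per parameter. The arithmetic below is a rough sketch that assumes a nominal 1 billion parameters for the 1B model and counts weights only; activations, optimizer state, and the KV cache are not included.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gib(num_params, dtype):
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Nominal parameter count, used here only for illustration.
for dtype in ("float32", "bfloat16"):
    print(dtype, round(weight_memory_gib(1_000_000_000, dtype), 2), "GiB")
```

Under that assumption the weights take roughly 3.7 GiB in float32 and about half that in bfloat16.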

tawdry iris Jan 6, 2026, 2:03 AM

#

rotund wing Hello @tawdry iris (sorry for the ping if it disturbs you). I had a q...

Ideally the output wrapped in the <reasoning> and </reasoning> tag will be the step-by-step CoT, similar to how humans work out the questions.

hallow totem Jan 6, 2026, 5:47 AM

#

tawdry iris The trainer will terminate when EITHER the data iterator runs out of data OR it ...

No, this is definitely not the case. If it were, it wouldn't happen that the notebook runs 800 batches one day and then the same notebook doesn't get past 600 batches the next day.

hushed flower Jan 6, 2026, 6:33 AM

#

Yeah Exactly, I just tried again, yesterday my notebook ran till 500 batches, today it silently stopped at 132 steps on similar configuration

hushed flower Jan 6, 2026, 10:31 AM

#

I had a question regarding correctness reward function.. apart from using llm as judge, how do we check for correctness for domains other than math/logic?

dark forge Jan 6, 2026, 4:05 PM

#

hushed flower I had a question regarding correctness reward function.. apart from using llm as...

I have two ideas that I want to test at some point: using semantic similarity (i.e. embeddings), or trying models that are trained specifically for this task, e.g. https://huggingface.co/TIGER-Lab/general-verifier (ofc, you can also argue that the 2nd option is also an LLM as a judge...)
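
The similarity idea can be sketched with a toy stand-in for embeddings. Everything here is an assumption for illustration: `bag_of_words` replaces a real embedding model with simple word counts, and `similarity_reward` is a hypothetical name, not something from Tunix or the verifier linked above.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_reward(prediction, reference):
    """Reward in [0, 1]: higher when the prediction shares more words with the reference."""
    return cosine_similarity(bag_of_words(prediction), bag_of_words(reference))
```

Swapping `bag_of_words` for a sentence-embedding model would give the semantic version; word overlap alone cannot tell a correct paraphrase from a wrong answer that reuses the same words.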

hushed flower Jan 6, 2026, 4:10 PM

#

That's interesting .. thanks for sharing this!

hushed flower Jan 6, 2026, 7:05 PM

#

If the model outputs new line between reasoning and answer like:

<reasoning>
reasoning_trace
</reasoning>

<answer>
answer
</answer>

Instead of:
<reasoning>
reasoning_trace
</reasoning>
<answer>
answer
</answer>

Would that count as invalid response?

#

Or does it have to be all in same line like:
<reasoning>reasoning_trace</reasoning>
<answer>answer</answer>

hallow totem Jan 6, 2026, 7:58 PM

#

hushed flower If the model outputs new line between reasoning and answer like: <reasoning> re...

you definitely gotta stop overthinking

hushed flower Jan 6, 2026, 7:58 PM

#

😂

hallow totem Jan 6, 2026, 7:59 PM

#

haha fr man

hallow totem Jan 6, 2026, 7:59 PM

#

hushed flower 😂

chill it's not counted as an invalid response

hushed flower Jan 6, 2026, 7:59 PM

#

Gotcha

scarlet spear Jan 7, 2026, 3:44 PM

#

dark forge I find that adding ``` import nest_asyncio nest_asyncio.apply() ``` helps (for ...

THANKS DUDEE

lethal galleon Jan 7, 2026, 5:53 PM

#

Hi, will the models be tested using tokenizer.apply_chat_template()? Meaning we should expect the evals to have tags like "<bos><start_of_turn>user...

This changes the final sampler output a lot. For instance, when I don't use <bos> (I trained with it), my model doesn't output anything. It is very sensitive to that.
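
For reference, the Gemma turn markers mentioned above can be assembled by hand. This is a sketch with a hypothetical helper name; whether the evaluation prepends `<bos>` or builds prompts this way is not stated in the thread.

```python
def format_gemma_prompt(user_message, add_bos=True):
    """Wrap one user message in Gemma-style turn markers, ending where the model starts generating."""
    prefix = "<bos>" if add_bos else ""
    return (
        f"{prefix}<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )
```

Many tokenizers add the beginning-of-sequence token themselves, so `add_bos` is left as a switch to avoid ending up with two of them.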

modern maple Jan 7, 2026, 10:52 PM

#

Could I please have some clarification? Should we be using gemma-3-1b-it or gemma-3-1b? The provided template notebook reads, "Use instruction-tuned Gemma2 2B or Gemma3 1B (other models are not allowed)", yet a previous discussion post reads, "The challenge of this hackathon is not around instruction tuning." Thank you!

tawdry iris Jan 8, 2026, 1:36 AM

#

use instruct-tuned model

tawdry iris Jan 8, 2026, 1:38 AM

#

lethal galleon Hi, will the models be tested using tokenizer.apply_chat_template()? Meaning we ...

We are not using HF lib for eval.

rotund wing Jan 8, 2026, 8:01 AM

#

Is anyone else facing this error:
module 'flax.nnx' has no attribute 'ModelAndOptimizer'

I'm asking this not because I want the correct code: I copied the installation and imports directly from the official documentation (https://tunix.readthedocs.io/en/latest/_collections/examples/qlora_gemma.html), but it still results in this error. So if anyone else is facing the same error, that means I need to install other versions of these libraries.

#

Please answer as soon as possible, because I haven't even started training the model because of these errors, and only 4-5 days are left.

tawdry iris Jan 8, 2026, 8:58 AM

#

This is due to a Flax NNX change from flax.nnx.Optimizer to flax.nnx.ModelAndOptimizer

#

You should upgrade Flax to 0.12+
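
A quick way to see whether the installed Flax meets the version mentioned above is to read the package metadata. The helper names below are hypothetical, and the `(0, 12)` threshold is taken from this message rather than from the Flax changelog.

```python
import re
from importlib import metadata


def major_minor(version):
    """Turn a version string such as '0.12.1' into a (major, minor) tuple of ints."""
    parts = (re.findall(r"\d+", version) + ["0", "0"])[:2]
    return int(parts[0]), int(parts[1])


def flax_is_new_enough(installed, minimum=(0, 12)):
    """True if the installed version is at least the minimum (major, minor)."""
    return major_minor(installed) >= minimum


def installed_flax_version():
    """Return the installed Flax version string, or None if Flax is not installed."""
    try:
        return metadata.version("flax")
    except metadata.PackageNotFoundError:
        return None
```

In a notebook, `flax_is_new_enough(installed_flax_version())` can gate the rest of the setup once the `None` case is handled.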

rotund wing Jan 8, 2026, 11:20 AM

#

So it means that the libraries are getting updated... are jax, tunix, and qwix also getting updated?

hallow totem Jan 8, 2026, 6:06 PM

#

tawdry iris You should upgrade Flax to 0.12+

ive actually already submitted my unrestricted model and all the notebooks, writeups everything so now am i supposed to go to every notebook and change the code and upgrade flax or should I let it be and it would be handled by the judges?

vivid veldt Jan 8, 2026, 6:11 PM

#

hallow totem ive actually already submitted my unrestricted model and all the notebooks, writ...

Maybe use pip cell logs to pin versions?

vivid veldt Jan 9, 2026, 3:46 AM

#

@tawdry iris hello! What is the expected strategy for kagglehub authentication to download gemma weights? As in, I shouldn't share an api key to ensure download lol.

tawdry iris Jan 9, 2026, 6:01 AM

#

vivid veldt @tawdry iris hello! What is the expected strategy for kagglehub authen...

DO NOT share your key. We will use our own.

vivid veldt Jan 9, 2026, 6:01 AM

#

So it should be called KAGGLE_API_KEY?

tawdry iris Jan 9, 2026, 6:02 AM

#

I use 'KAGGLE_KEY', but 'KAGGLE_API_KEY' is fine
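
Reading the key from the environment keeps it out of the notebook source. This sketch only shows the lookup, using the two variable names from this exchange; `read_kaggle_key` is a hypothetical helper, and how the key is then handed to kagglehub is not covered here.

```python
import os


def read_kaggle_key(env=None):
    """Return the key from KAGGLE_KEY or KAGGLE_API_KEY, or None if neither is set."""
    env = os.environ if env is None else env
    for name in ("KAGGLE_KEY", "KAGGLE_API_KEY"):
        value = env.get(name)
        if value:
            return value
    return None
```

The optional `env` argument exists so the lookup can be tried with a plain dictionary instead of the real environment.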

tawdry iris Jan 9, 2026, 6:04 AM

#

hallow totem ive actually already submitted my unrestricted model and all the notebooks, writ...

It's prob. a good idea to pin your lib versions in the notebook. You do not need to do a full run of it. Just do a quick save.
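
One way to find out which versions to pin is to print what is currently installed. This is a sketch with a hypothetical helper; the package names passed in are assumptions, since distribution names on PyPI can differ from import names.

```python
from importlib import metadata


def pinned_requirements(packages, lookup=metadata.version):
    """Return one 'name==version' line per package, or a comment line if it is not installed."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={lookup(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name} is not installed")
    return lines


# Example names only; adjust them to the distributions the notebook really installs.
print("\n".join(pinned_requirements(["jax", "flax"])))
```

The printed lines can then be pasted into the notebook's install cell so a later library release does not change its behaviour.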

hushed flower Jan 9, 2026, 8:44 AM

#

There's no limit on max_generation_steps in inference, unlike for temperature - which is 1e-4, right?

tawdry iris Jan 9, 2026, 9:54 AM

#

For temp, set it to None for inference. See the latest submission template.

tawdry iris Jan 9, 2026, 9:54 AM

#

hushed flower There's no limit on max_generation_steps in inference, unlike for temperature - ...

Correct. No restriction on max_generation_steps.

vivid veldt Jan 10, 2026, 1:56 AM

#

hushed flower There's no limit on max_generation_steps in inference, unlike for temperature - ...

hey! have you noticed a cap on max_generation_steps?

vivid veldt Jan 10, 2026, 2:16 AM

#

ok I see now this setting lives in

sampler.CacheConfig(cache_size)
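
If the cache has to hold both the prompt and the generated tokens, which is how key-value caches generally work, the room left for generation is simple subtraction. This is an assumption about the accounting, not something confirmed for Tunix's sampler, and the function name is hypothetical.

```python
def max_new_tokens_that_fit(cache_size, prompt_tokens):
    """Tokens left for generation once the prompt occupies part of the cache (never negative)."""
    return max(cache_size - prompt_tokens, 0)
```

For example, a cache of 1024 tokens with a 256-token prompt would leave 768 tokens for generation under this assumption.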

hushed flower Jan 10, 2026, 3:45 AM

#

Submissions are open till 12th Jan 11.59 PM (midnight) right?

hallow totem Jan 10, 2026, 6:34 AM

#

hushed flower Submissions are open till 12th Jan 11.59 PM (midnight) right?

mention your timezone

#

for a clear clarification

hushed flower Jan 10, 2026, 6:37 AM

#

GMT+5:30 (IST)

bright scaffold Jan 10, 2026, 3:46 PM

#

Using gemma 2b is acceptable or we have to explicitly use 2-2b ?

dark forge Jan 10, 2026, 4:43 PM

#

hushed flower Submissions are open till 12th Jan 11.59 PM (midnight) right?

Yes, in UTC

rotund wing Jan 10, 2026, 4:57 PM

#

hushed flower GMT+5:30 (IST)

In IST that's until Jan 13, 5:29 AM
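
The conversion from 11:59 PM UTC on Jan 12 to IST (UTC+5:30) can be checked with the standard library. The year is taken from the message timestamps in this log.

```python
from datetime import datetime, timedelta, timezone

# IST as a fixed offset of +5:30, so no timezone database is needed.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")

deadline_utc = datetime(2026, 1, 12, 23, 59, tzinfo=timezone.utc)
deadline_ist = deadline_utc.astimezone(IST)

print(deadline_ist.isoformat())  # 2026-01-13T05:29:00+05:30
```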

hushed flower Jan 10, 2026, 4:59 PM

#

Alright

rotund wing Jan 10, 2026, 4:59 PM

#

@tawdry iris If I am unable to submit the working model at last, but give a good writeup, video and datasets so will I get some points?

quick carbon Jan 10, 2026, 11:32 PM

#

bright scaffold Using gemma 2b is acceptable or we have to explicitly use 2-2b ?

It is either Gemma 3 (instruct, 1B) or Gemma 2 (instruct, 2B)

topaz marsh Jan 12, 2026, 5:11 AM

#

is it possible to extend the deadline by 1 day more ?

hallow totem Jan 12, 2026, 6:27 AM

#

topaz marsh is it possible to extend the deadline by 1 day more ?

depends on hosts

#

but idk, it doesn't seem like a valid argument

fathom silo Jan 12, 2026, 3:17 PM

#

@tawdry iris there seems to be a real crunch in TPU availability; is it possible to extend the deadline by a day or two? Also, I noticed the number of submissions is still quite low. It would be great if the deadline could be extended.

muted mantle Jan 12, 2026, 3:27 PM

#

Hey guys, I recently found out about this competition and I really want to do it, but there are only 9 hours left, so I know it's not possible. I still want to keep trying if late submissions are possible, because I want to focus on learning. Does anybody want to do it with me?

desert minnow Jan 12, 2026, 3:35 PM

#

fathom silo @tawdry iris there seems to be lot of crunch in availability of TPU ; ...

I agree, since the TPU availability is the real concern for us right now.

warped wharf Jan 12, 2026, 3:43 PM

#

fathom silo @tawdry iris there seems to be lot of crunch in availability of TPU ; ...

I agree

hushed flower Jan 12, 2026, 3:54 PM

#

topaz marsh is it possible to extend the deadline by 1 day more ?

This would help! I just corrected an error, wanted to make a last end to end run

green oar Jan 12, 2026, 5:48 PM

#

can we please extend the competition deadline by a few hours (11:59pm pst)? i had queued a TPU run overnight but didn't know that queued runs were prioritized less over interactive sessions because kaggle doesn't say anything

#

now im #198 in queue

rotund wing Jan 12, 2026, 6:17 PM

#

fathom silo @tawdry iris there seems to be lot of crunch in availability of TPU ; ...

Maybe, at the end moment they will not.

rotund wing Jan 12, 2026, 6:18 PM

#

fathom silo @tawdry iris there seems to be lot of crunch in availability of TPU ; ...

I agree. I am also facing the same problem. We have to wait for hours for a single TPU session each time.

rotund wing Jan 12, 2026, 6:21 PM

#

green oar now im #198 in queue

That will take you 2-4 hours. Only 5 hours are left.

desert minnow Jan 12, 2026, 6:30 PM

#

green oar now im #198 in queue

"TPUs are popular right now. You are #183 in the queue. You can wait, try connecting again later, or use another accelerator." same here

#

Looking forward for any solution please.

rotund wing Jan 12, 2026, 6:32 PM

#

desert minnow Looking forward for any solution please.

No solution, you have to wait for hours

#

The solution is only with the competition host.

desert minnow Jan 12, 2026, 7:11 PM

#

rotund wing The solution is only with the competition host.

https://www.kaggle.com/competitions/google-tunix-hackathon/discussion/667200

#

upvote this might help, I guess

rotund wing Jan 12, 2026, 7:24 PM

#

desert minnow upvote this might help, I guess

Done. I hope that the deadline will extend. But only 4 hrs are leftover.

vivid veldt Jan 13, 2026, 12:11 AM

#

submissions show up under "Code" correct?

#

@tawdry iris I do not think I submitted my notebook properly.

#

I have been working on this project 6-8 hours a day since November.

#

My username on Kaggle is Echo9Zulu, can you confirm my submission was submitted? I will be very upset if I lose all this time because I made a silly mistake.

#

Would appreciate easing my nerves very much.

austere steeple Jan 13, 2026, 1:43 AM

#

Hello everyone,

I hope you are doing great. Wanted to share our work. We GRPOd Gemma3-1B-it to be a good chemistry reasoning model. We had some cool results and the model generalizes well across other domains. Feel free to check our work and share your thoughts.

Writeup: https://www.kaggle.com/competitions/google-tunix-hackathon/writeups/introducing-gemmax

Notebook: https://www.kaggle.com/code/alfaxadeyembe/gemmax-1b

Inspiration: https://www.futurehouse.org/research-announcements/ether0-a-scientific-reasoning-model-for-chemistry

hallow totem Jan 13, 2026, 7:28 AM

#

Hey everyone,

I just published a writeup on my CURE-GRPO method for the Google Tunix Hackathon.
It explores using self-critique + GRPO to push better reasoning in LLMs, with practical insights from building and experimenting with Tunix + Gemma models.

Would love for you to check it out and drop an upvote if you find it useful
Thank you

https://www.kaggle.com/competitions/google-tunix-hackathon/writeups/new-writeup-1766255740138

fading jungle Jan 13, 2026, 4:21 PM

#

I have been skimming the notebooks... some really interesting projects! Great work everyone.

I noticed that a few use external API keys during the 9-hour training run (e.g., calling a proprietary model for LLM-as-judge). I thought any LLM-as-judge would need to be an open-source model loaded in memory.

vivid veldt Jan 13, 2026, 8:11 PM

#

hallow totem Hey everyone, I just published a writeup on my CURE-GRPO method for the Google ...

Fantastic writeup! My first thought, is CURE grpo trained gemma interesting in chat use? Does it generalize well outside of training tasks?

prisma barn Jan 13, 2026, 9:01 PM

#

Subject: Kaggle Google Tunix Gemma 3 Unicorn-1B Writeup: SOTA Formatting Fix + No-API Reward Functions

Just finalized our submission, Unicorn-1B.

Seeing the discussion on API keys/Judges—we managed to solve the reasoning alignment 100% On-Device (TPU v5e) without calling external APIs during the loop. We engineered a 6-Signal Composite Reward Function (RegEx/String matching) to enforce strict structure without network overhead.

Also, for anyone who was hitting the 100GB RAM / XLA Compilation Graph crash when training on 1B+ tokens: we found the fix was moving from dynamic iterators to Static Dataset Objects. It drops compilation time from 60m to 8m.

Hope the writeup helps anyone debugging JAX OOMs!
https://www.kaggle.com/competitions/google-tunix-hackathon/writeups/novel-sft-rl-pipeline
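
A composite format reward built from string and regex checks can be sketched as below. The four signals and all names are hypothetical illustrations; they are not the six signals from the writeup, which are not listed in this message.

```python
import re


def has_tag_pair(text, tag):
    """True if the text contains exactly one opening and one closing tag of this name."""
    return text.count(f"<{tag}>") == 1 and text.count(f"</{tag}>") == 1


def composite_format_reward(text):
    """Average of several 0/1 format signals, giving a score in [0, 1]."""
    signals = [
        has_tag_pair(text, "reasoning"),
        has_tag_pair(text, "answer"),
        # The reasoning block closes before the answer block opens.
        0 <= text.find("</reasoning>") < text.find("<answer>"),
        # The answer block contains at least one non-whitespace character.
        bool(re.search(r"<answer>\s*\S.*?</answer>", text, re.DOTALL)),
    ]
    return sum(signals) / len(signals)
```

Since every signal is a local string check, the reward needs no network call or judge model, which is the property the message highlights.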

#

YouTube Demo of Unicorn Gemma 3 1b: If anyone wants to check out Unicorn Gemma 3 1b running live and the SFT to RL GRPO process creating Unicorn Gemma 3 1b: https://youtu.be/z7BGR2XGksI?si=enx9OazBfw2kxCc8

fallen seal Jan 13, 2026, 10:43 PM

#

fading jungle I have been skimming the notebooks... some really interesting projects! Great wo...

https://www.kaggle.com/competitions/google-tunix-hackathon/discussion/651560 It's mentioned here and in a few other discussions that we can use LLMs as long as they're accessible to the judges (probably Gemini-based)

hallow totem Jan 14, 2026, 6:44 AM

#

vivid veldt Fantastic writeup! My first thought, is CURE grpo trained gemma interesting in c...

thanks a lot!
despite the packed training, it actually does well. I've also trained it for general reasoning and have run inference on it.

ionic elk Jan 14, 2026, 4:24 PM

#

I would love to see a part 2 of this competition. I was looking forward to contributing with an educational video but alas life decided priorities needed to be shifted elsewhere. Either way had fun and looking forward to seeing what the winners have to offer. 🙂

vivid veldt Jan 14, 2026, 5:35 PM

#

hallow totem thanks a lot! despite the packed training, it actually does well I've also train...

You should consider porting to safetensors so people can more easily try in llama.cpp/other projects. My model Shadows-Gemma is much more experimental, but yours seems like it could be useful in practical ways! Name it and post it

hallow totem Jan 14, 2026, 5:52 PM

#

vivid veldt You should consider porting to safetensors so people can more easily try in llam...

Thanks! That makes sense. I’ll look into converting it to safetensors and making it easier to try on their projects . Appreciate the encouragement

#

Honestly it was pretty difficult to manage with this competition.
Coz I'm currently a first year bachelor's engineering student

and hence managing college + the beautiful TPU queue on kaggle😂

vivid veldt Jan 14, 2026, 5:55 PM

#

Oh yes. I am full time data engineer which = long nights, long weekends and even longer queues lol. Very intense project

hallow totem Jan 14, 2026, 5:55 PM

#

vivid veldt You should consider porting to safetensors so people can more easily try in llam...

you mind sharing yours? writeup link?

hallow totem Jan 14, 2026, 5:56 PM

#

vivid veldt Oh yes. I am full time data engineer which = long nights, long weekends and even...

ooh a real engineer haha interesting
yeah indeed very intense project

vivid veldt Jan 14, 2026, 5:56 PM

#

https://www.kaggle.com/competitions/google-tunix-hackathon/writeups/shadow-gemma-1b#3389318

hallow totem Jan 14, 2026, 5:56 PM

#

im very glad that I rushed and completed 15 days before the deadline

vivid veldt Jan 14, 2026, 5:56 PM

#

Lol barely. I am data monkey

hallow totem Jan 14, 2026, 5:56 PM

#

coz I saw ppl seeing hell during last days

vivid veldt Jan 14, 2026, 5:56 PM

#

Dude i had first stable checkpoint in pytorch week and half ago and had to port to tunix in like two days

#

Using custom loss and unsupported training strategy

hallow totem Jan 14, 2026, 5:58 PM

#

vivid veldt https://www.kaggle.com/competitions/google-tunix-hackathon/writeups/shadow-gemma...

hey looks interesting but u haven't attached your notebook?

vivid veldt Jan 14, 2026, 5:58 PM

#

I believe I nuked my submission this way.

hallow totem Jan 14, 2026, 5:58 PM

#

vivid veldt Dude i had first stable checkpoint in pytorch week and half ago and had to port ...

wow man I'm glad I wasn't in this situation

#

i mean I'm not winning either but still just for the sake of submitting

vivid veldt Jan 14, 2026, 5:59 PM

#

Waiting to hear back from @tawdry iris

#

Of course!

hallow totem Jan 14, 2026, 5:59 PM

#

oohh ah sorry to hear that

vivid veldt Jan 14, 2026, 5:59 PM

#

Your solution was a well engineered pipeline.

#

Many of the other solutions were heavily generated. One was an obvious prompt injection attack. Curegrpo is cool

hallow totem Jan 14, 2026, 6:00 PM

#

really? i struggled a lot but haha thank youu

vivid veldt Jan 14, 2026, 6:00 PM

#

Yeah man you went full send

hallow totem Jan 14, 2026, 6:00 PM

#

vivid veldt Many of the other solutions were heavily generated. One was an obvious prompt in...

heyy that rlly made me glad hearing that from a real engineer

hallow totem Jan 14, 2026, 6:00 PM

#

vivid veldt Yeah man you went full send

full send??

vivid veldt Jan 14, 2026, 6:01 PM

#

Haha I am mostly obsessed with Ai and learning more.

#

I have not trained models before this competition

hallow totem Jan 14, 2026, 6:01 PM

#

ooh wow

#

I've been learning abt ai since last two years

#

basically since the last year of my highschool

vivid veldt Jan 14, 2026, 6:01 PM

#

hallow totem full send??

It is clear you made effective use of compute and made a complicated pipeline which worked lol

hallow totem Jan 14, 2026, 6:02 PM

#

vivid veldt It is clear you made effective use of compute and made a complicated pipeline wh...

hey thanks a lot for the appreciation

vivid veldt Jan 14, 2026, 6:03 PM

#

Are you familiar with Huggingface ecosystem? Kaggle is fantastic, but HF is where your models will be most visible to the community of people who may be interested to try your 1b.

#

Plus

hallow totem Jan 14, 2026, 6:04 PM

#

vivid veldt Are you familiar with Huggingface ecosystem? Kaggle is fantastic, but HF is wher...

yesyes definitely

vivid veldt Jan 14, 2026, 6:04 PM

#

At some point the evals you made, plus your model, may be useful to further evaluate shadow tokens vs base model

hallow totem Jan 14, 2026, 6:04 PM

#

lemme share you my

#

github

#

wait

#

https://github.com/aaghaazkhan

vivid veldt Jan 14, 2026, 6:05 PM

#

Interestingly shadow tokens were highest in basemodel completions where the model explains itself after answering.

vivid veldt Jan 14, 2026, 6:08 PM

#

hallow totem https://github.com/aaghaazkhan

https://github.com/SearchSavior/OpenArc

#

For conversion between formats I used deepwiki to help. Tunix APIs have changed a few times since November.

hallow totem Jan 14, 2026, 6:19 PM

#

hallow totem https://github.com/aaghaazkhan

haven't posted much of my work here but yeah

hallow totem Jan 14, 2026, 6:20 PM

#

vivid veldt Interestingly shadow tokens were highest in basemodel completions where the mode...

yeah i saw ur writeup a bit it seemed interesting also the name to me tbh

hallow totem Jan 14, 2026, 6:23 PM

#

vivid veldt For conversion between formats I used deepwiki to help. Tunix APIs have changed ...

deepwiki? never heard of it tbh

vivid veldt Jan 14, 2026, 6:23 PM

#

Maybe sit down before looking up

#

You may faint

hallow totem Jan 14, 2026, 6:23 PM

#

no way not im interested 😂

vivid veldt Jan 14, 2026, 6:24 PM

#

https://deepwiki.com/searchsavior/

hallow totem Jan 14, 2026, 6:24 PM

#

lemme see

vivid veldt Jan 14, 2026, 6:24 PM

#

Oh it added my github to link? Unintentional lol

hallow totem Jan 14, 2026, 6:27 PM

#

vivid veldt https://deepwiki.com/searchsavior/

all of ur work says

#

repo not indexed

vivid veldt Jan 14, 2026, 6:28 PM

#

https://deepwiki.com/google/tunix

hallow totem Jan 14, 2026, 6:28 PM

#

wait a sec

#

the OpenArc

#

is ur original repo

#

?

vivid veldt Jan 14, 2026, 6:28 PM

#

Yes

#

Haha

hallow totem Jan 14, 2026, 6:29 PM

#

no way

#

273 stars

#

no way you're popular

vivid veldt Jan 14, 2026, 6:29 PM

#

No way you happen to have arc gpu?

#

Or intel cpu

hallow totem Jan 14, 2026, 6:29 PM

#

intel cpu ampere gpu

#

rtx 3050

#

man damn i thought the OpenArc on ur profile is just a forked repo

#

woww

#

have you ever done CUDA programming?

#

wait lemme upvote ur writeup i liked it

#

done

vivid veldt Jan 14, 2026, 6:38 PM

#

hallow totem have you ever done CUDA programming?

No I have not. Tbh there is slim chance of me getting cuda capable hardware anyway

vivid veldt Jan 14, 2026, 6:44 PM

#

hallow totem done

Your pipeline has the unstructured phase. What inspired this?

hallow totem Jan 14, 2026, 6:49 PM

#

vivid veldt No I have not. Tbh there is slim chance of me getting cuda capable hardware anyw...

ooh no way

#

it's cheap

hallow totem Jan 14, 2026, 6:50 PM

#

vivid veldt Your pipeline has the unstructured phase. What inspired this?

unstructured phase??

#

there's no unstructured phase, r u talking abt phase 2?

vivid veldt Jan 14, 2026, 6:59 PM

#

hallow totem there's no unstructured phase, r u talking abt phase 2?

I believe so. I will give your writeup another read tonight est

hallow totem Jan 14, 2026, 7:00 PM

#

vivid veldt I believe so. I will give your writeup another read tonight est

okay well if you're talking abt phase 2 then it's not unstructured it's just not following the CURE structure

#

but following the normal reasoning tags and answer tags structure

#

i mainly had to remove it coz phase 2 was based on general reasoning and also creative reasoning, and for such tasks the output used too many tokens while following the CURE structure, due to which the output was getting truncated and the model was collapsing coz of the reward penalty

#

but the model is well trained on CURE structure so if the judges got a workaround and eval on higher output tokens limit then it'll follow CURE structure even on the general task well (I've done the inference earlier and yeah it did it perfectly except for sometimes output getting truncated due to higher tokens)

tawdry iris Jan 15, 2026, 2:06 AM

#

vivid veldt Waiting to hear back from @tawdry iris

I cannot find your notebook in the writeup

vivid veldt Jan 15, 2026, 2:12 AM

#

tawdry iris I cannot find your notebook in the writeup

I added it to the "code" page on the main tunix competition page, which I guess was incorrect. When I went to hit submit to the competition, it wouldn't complete because there was no output file. I guess the notebook I developed in was missing something related to the competition, which was lost when I tried to import.

#

Discord isn't allowing me to send an image. On the main page under "Code" if you search "Echo" it comes up

#

Is there some way to appeal? I have been working on project almost everyday since November

tawdry iris Jan 15, 2026, 3:11 AM

#

Can you add your notebook link as a comment in your writeup?

vivid veldt Jan 15, 2026, 4:26 AM

#

tawdry iris Can you add your notebook link as a comment in your writeup?

Done!

hallow totem Jan 17, 2026, 8:30 AM

#

hello @tawdry iris could u pls give us a rough idea about when we can expect the results to be announced

tawdry iris Jan 18, 2026, 6:32 AM

#

It depends on the bandwidth of the judges, so it's hard to say. I think at least March.

hallow totem Jan 18, 2026, 6:48 AM

#

tawdry iris It depends on the bandwidth of the judges, so it's hard to say. I think at least...

alright thank you

hallow totem Jan 18, 2026, 11:17 AM

#

@tawdry iris sorry but were we supposed to add our PR in our writeup for our submission to be valid?

either way, I opened a PR on December 2 2025, on an issue i found during the time I was working on this competition

I think you might remember the issue, it was that I was getting 0% format accuracy as well as answer accuracy for both pre training and post training evaluation.

I figured out the issue later and hence opened a PR for that particular issue

here's the link to the PR: https://github.com/google/tunix/pull/820

also I hope it's perfect if I add this PR in the comments of my writeup right?

here's my writeup link as well:
https://www.kaggle.com/competitions/google-tunix-hackathon/writeups/new-writeup-1766255740138

tawdry iris Jan 19, 2026, 6:44 AM

#

That is fine.

hallow totem Jan 19, 2026, 7:25 AM

#

thank you

hallow totem Jan 19, 2026, 11:24 AM

#

@tawdry iris is it possible for you to let me know why my PR isn't reviewed

according to me, what I found was that the batch slicing isn't done as expected when the train micro batch size is less than num generations

eternal hill Jan 23, 2026, 11:30 AM

#

Hello. Can you please let us know if the winners' declaration for the Google Tunix hack has been postponed? Would we get an official update in case the results have been postponed?

rotund wing Jan 23, 2026, 2:27 PM

#

eternal hill Hello. Can you please let us know if the winners declaration for the google tuni...

Results will be announced in March

rotund wing Jan 23, 2026, 2:28 PM

#

tawdry iris It depends on the bandwidth of the judges, so it's hard to say. I think at least...

As mentioned here

tawdry iris Jan 30, 2026, 2:43 AM

#

eternal hill Hello. Can you please let us know if the winners declaration for the google tuni...

https://www.kaggle.com/competitions/google-tunix-hackathon/discussion/670878

eternal hill Jan 30, 2026, 2:49 AM

#

@tawdry iris Thank you. 😁

astral cairn Feb 2, 2026, 6:56 PM

#

blazing needle Hello, I am AI Engineer Professional with knowledge of ML & RL Looking forward t...

Hi,

Are you still looking for a project?

eternal hill Feb 20, 2026, 4:37 PM

#

Hello @tawdry iris I accidentally saved my model as a dataset. Would it affect in grading my submission for the inference section of the hackathon. My writeup is - https://www.kaggle.com/competitions/google-tunix-hackathon/writeups/new-writeup-1763935485708

hallow totem Mar 4, 2026, 4:05 PM

#

any updates on when results will be out? @tawdry iris

hushed flower Mar 6, 2026, 7:18 AM

#

@tawdry iris any updates when can we expect the results?