#playground-series-s6e2
Kinda lost on how to progress from 0.95397
All my ensembles are just overfitting to OOF noise and translate to a worse LB. I've tried all sorts of training tricks and different models, but I've been stuck for 5 days
A few things that helped me:
- Stop chasing tiny OOF gains and check stability.
- Reduce redundancy: try fewer, more diverse models rather than more of the same family.
- Blend conservatively: median / p20–p30 often generalizes better than aggressive weights.
- If you're doing OOF stacking, keep the meta model simple (ridge/logistic) and avoid heavy tuning.
- If you have a strong single model, sometimes a small ensemble around it beats a large stack.
If you want, share your current model list and CV/LB gaps, happy to take a look.
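A rough sketch of the "check stability" and "blend conservatively" points, in case it helps. It assumes binary labels, one row of predictions per model, and a fold id per training row. The names `auc`, `blend` and `fold_auc_spread` are made up for illustration and are not from any notebook in this thread.

```python
import numpy as np

def auc(y_true, scores):
    """Rank-based AUROC (Mann-Whitney U); tied scores share an average rank."""
    y_true = np.ravel(np.asarray(y_true)).astype(bool)
    scores = np.ravel(np.asarray(scores, dtype=float))
    _, inv, counts = np.unique(scores, return_inverse=True, return_counts=True)
    end = np.cumsum(counts)                      # last 1-based rank of each tie group
    ranks = (end - (counts - 1) / 2.0)[inv]      # average rank per sample
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    return (ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def blend(preds, q=50):
    """Per-sample percentile across models; q=50 is the median.

    preds has shape (n_models, n_samples).
    """
    return np.percentile(np.asarray(preds, dtype=float), q, axis=0)

def fold_auc_spread(y, oof, fold_ids):
    """Mean and std of per-fold AUROC; every fold must contain both classes."""
    y, oof, fold_ids = (np.asarray(a) for a in (y, oof, fold_ids))
    scores = [auc(y[fold_ids == f], oof[fold_ids == f]) for f in np.unique(fold_ids)]
    return float(np.mean(scores)), float(np.std(scores))
```

If an OOF gain from a new blend is smaller than the fold-to-fold std, it is probably noise, which is what "stop chasing tiny OOF gains" is getting at. `blend(preds, q=25)` would be one way to try the p20–p30 idea.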
I put together a simple OOF-first baseline here. It's meant as a clean, easy-to-iterate-on starting point rather than a final solution:
https://www.kaggle.com/code/jacksaleeby/oof-6-model-ensemble-cpu-easy-to-iterate-on
Thanks for taking the time to respond!
I stopped weight searching too aggressively and I now have a new best blend of just two models with 0.95398 LB.
do you want a 0.954
I have a 0.954 pinned to my profile
follow to see future updates
Ye same
Is yours just a blind blend or do you have an actual OOF as second submission?
I'd be down to team up fs
What's your best single model performance? My GeoXGB model gets 0.95342 on the test data, on train/test it gets a bit higher than that. I'm now exploring potential MoE, ensemble and blending approaches with logistic regression, XGBoost, CatBoost, SVM (made feasible through HVRT)
My CV XGBoost ensemble produced a test AUROC of 0.95381 and a CV AUROC of 0.95566
Interesting. Yeah I've updated HVRT to 2.5.0 so it uses vectorization for reduction and expansion across partitions, updating GeoXGB to leverage that. It will allow for more rigorous efficient HPO.
There's also a faster HPO variant that GeoXGB was using, but I haven't checked whether the optimizer module's approach would have a higher ceiling with the slower but more accurate settings enabled during the search instead of only after it.
I posted my PS S6E2 solution writeup (20th, Private 0.95533). It focuses on an OOF-first workflow and two simple final submissions (Ridge stacking + weighted rank-average), plus what I avoided to reduce overfitting. Feedback welcome!
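For anyone who hasn't tried those two blend types, here is a minimal sketch of the general idea. It is not the code from the writeup. It assumes OOF and test predictions are stored as (n_samples, n_models) arrays, and `rank_normalize`, `weighted_rank_average`, `fit_ridge_stacker` and `predict_stacker` are hypothetical helper names.

```python
import numpy as np

def rank_normalize(x):
    """Average ranks scaled into (0, 1]; tied scores share a rank."""
    x = np.ravel(np.asarray(x, dtype=float))
    _, inv, counts = np.unique(x, return_inverse=True, return_counts=True)
    end = np.cumsum(counts)
    return (end - (counts - 1) / 2.0)[inv] / len(x)

def weighted_rank_average(test_preds, weights):
    """Weighted mean of per-model ranks; weights are normalized to sum to 1."""
    test_preds = np.asarray(test_preds, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    ranks = np.column_stack(
        [rank_normalize(test_preds[:, j]) for j in range(test_preds.shape[1])]
    )
    return ranks @ weights

def fit_ridge_stacker(oof, y, alpha=1.0):
    """Closed-form ridge on OOF predictions with an unpenalized intercept."""
    oof = np.asarray(oof, dtype=float)
    y = np.asarray(y, dtype=float)
    col_mean, y_mean = oof.mean(axis=0), y.mean()
    xc = oof - col_mean                                   # center so the intercept is free
    gram = xc.T @ xc + alpha * np.eye(xc.shape[1])
    coef = np.linalg.solve(gram, xc.T @ (y - y_mean))
    return coef, y_mean - col_mean @ coef

def predict_stacker(test_preds, coef, intercept):
    """Apply the fitted meta model to test predictions of the same models."""
    return np.asarray(test_preds, dtype=float) @ coef + intercept
```

The stacker is fit only on OOF predictions, so the meta model never sees a prediction made by a base model that trained on that row. Rank averaging only uses the ordering of each model's scores, which is enough for AUROC and sidesteps calibration differences between models.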