#playground-series-s4e1


brave viper
#

Hi, I was wondering what the best way to submit predictions in this challenge is:
either I train my models on the whole train dataset and submit, or I do a 5-fold split of the train dataset, train a model on each 4/5 of the data, and average the predictions of the 5 resulting models to make my submission
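
To make the two options concrete, here is a minimal sketch of both. It assumes scikit-learn, a synthetic dataset from `make_classification` standing in for the competition data, and a placeholder `LogisticRegression` model; none of these come from the actual challenge setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the competition data (assumption, not the real dataset).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, y_train = X[:800], y[:800]
X_test = X[800:]


def make_model():
    # Placeholder model; swap in whatever you actually use.
    return LogisticRegression(max_iter=1000)


# Option A: one model fit on the whole training set.
full_model = make_model().fit(X_train, y_train)
pred_full = full_model.predict_proba(X_test)[:, 1]

# Option B: five models, each fit on 4/5 of the data, test predictions averaged.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_preds = []
for fit_idx, _ in skf.split(X_train, y_train):
    model = make_model().fit(X_train[fit_idx], y_train[fit_idx])
    fold_preds.append(model.predict_proba(X_test)[:, 1])
pred_avg = np.mean(fold_preds, axis=0)
```

Option A gives each model more training data, while option B averages five slightly different models, which is roughly where the bias/variance trade-off shows up. Which one scores better probably depends on the model and the data size.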

#

I'm also interested in the question in general

#

I feel like there's an interesting bias-variance question here but I'm not 100% sure

barren epoch
#

I usually do some experiments with cross validation, then once I find good parameters and data transformations, I train the model on the whole dataset.

brave viper
#

thanks for your insight, that's my routine as well, I'm interested to see if there are other opinions!

brave viper
#

I have a question
for the challenge my pipeline is the following:

  1. I have a file that takes care of feature generation, it generates many features with various methods
  2. I have a file that trains many models with cross validation and generates oof predictions as well as test predictions
  3. I have a file that stacks predictions using optuna: it creates 5 folds like in cross validation, finds good weights with optuna using the score on the train part of each fold, and the final cross val score is computed on the validation part of each fold
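
Step 3 can be sketched roughly like this. It is only an illustration under stated assumptions: the OOF predictions are synthetic, the metric is assumed to be ROC AUC, and a plain random search over weights (`search_weights`, a hypothetical helper) stands in for the optuna study so the sketch only needs numpy and scikit-learn.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): labels plus OOF predictions from 3 models.
n = 2000
y = rng.integers(0, 2, size=n)
noise_levels = [0.8, 1.0, 1.5]
oof = np.column_stack([y + rng.normal(0, s, size=n) for s in noise_levels])


def search_weights(preds, target, n_trials=200):
    # Random search over convex weights; a stand-in for an optuna study.
    best_w, best_score = None, -np.inf
    for _ in range(n_trials):
        w = rng.dirichlet(np.ones(preds.shape[1]))
        score = roc_auc_score(target, preds @ w)
        if score > best_score:
            best_w, best_score = w, score
    return best_w


skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_scores, fold_weights = [], []
for fit_idx, val_idx in skf.split(oof, y):
    # Weights are tuned on the train part of the fold only.
    w = search_weights(oof[fit_idx], y[fit_idx])
    fold_weights.append(w)
    # The score is computed on the held-out validation part of the fold.
    fold_scores.append(roc_auc_score(y[val_idx], oof[val_idx] @ w))

cv_score = float(np.mean(fold_scores))
# One possible way to get a single set of weights for the test predictions.
final_weights = np.mean(fold_weights, axis=0)
```

Averaging the per-fold weights at the end is just one choice made for this sketch; refitting the weights on all OOF predictions would be another.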
#

I had a submission that scored 0.890 on the public leaderboard

#

then I added some features, trained more models and calculated the cross val score in file number 3. it was significantly higher but the score on the public leaderboard is significantly lower (0.884)

#

I do not understand how this is possible: how can I overfit if I only ever look at scores on validation sets that were not used for training? seems odd to me.

#

during part 2 I didn't train the models with any parameters that were fine-tuned by cross validation on the new features

#

and I only calculate cross validation once in part 3 to stack predictions, I don't optimise it with multiple repetitions

hasty igloo
brave viper
#

interesting

#

so you should only look at cross val score when making final submissions ?

primal bolt
#

I also have more or less the same issue. My first model got 0.888 on the test set, but went down to 0.886 on the leaderboard. Then I did some feature extraction (mainly combining geography and gender, and binning age, salary, and credit) and model improvements, and got 0.890 but 0.885 on the leaderboard. The numbers of FP/FN in the confusion matrix are the same in all cases. I'm sharing my notebook, which is public and has a lot of detailed explanations. Any feedback is welcome, and I hope it helps others to learn and improve.

https://www.kaggle.com/code/bmart80/bank-churn-dataset-votingclassifier

If you like it please upvote and feel free to connect with me
https://www.linkedin.com/in/benitomzh/

hasty igloo
maiden grotto
# brave viper interesting

This happened on another competition that recently closed - Predicting Writing Quality. There was a HUGE change from the public to private leaderboard and the people who trusted their CV scores ended up on top in the end. I'm not sure if this is always the case.

versed fable
radiant coyote
#

that makes it fun

broken sapphire
primal bolt
#

was total fun, thanks for hosting, looking forward to the next one 🙂

thick heath