#playground-series-s5e2
I was wondering the same thing! I checked out a couple of the top notebooks and most of them didn't use it either... which is weird, because that dataset is massive compared to the original one, so wouldn't it be more useful?
original data set has very little signal in it (do a couple of boxplots of the features in train.csv alone and you'll see why folks are slow to start this competition).... looks like training_extra was added today, maybe? but at least one forum post makes it seem like the signal is about as clear in the new data as in the old
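For anyone who wants to run that check without plotting: a boxplot is basically the quartiles of the target per feature value, so a rough sketch like the one below shows the same thing as numbers. The column names `feature_a` and `target` and the toy rows are made up for illustration, not the real columns in train.csv.

```python
import statistics
from collections import defaultdict


def box_stats_by_group(rows, feature, target):
    """Return {feature value: (q1, median, q3)} of `target`, i.e. the box of a boxplot."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(float(row[target]))

    stats = {}
    for key, values in groups.items():
        if len(values) < 2:
            continue  # quartiles are not meaningful for a single point
        q1, median, q3 = statistics.quantiles(values, n=4)
        stats[key] = (q1, median, q3)
    return stats


# Toy rows standing in for csv.DictReader output (hypothetical columns and values).
toy_rows = [
    {"feature_a": "x", "target": "1"},
    {"feature_a": "x", "target": "2"},
    {"feature_a": "x", "target": "3"},
    {"feature_a": "x", "target": "4"},
    {"feature_a": "y", "target": "2"},
    {"feature_a": "y", "target": "3"},
]

for value, (q1, median, q3) in box_stats_by_group(toy_rows, "feature_a", "target").items():
    print(value, q1, median, q3)
```

If the boxes for every feature value overlap almost completely, that feature carries little signal about the target on its own, which is roughly what the boxplots would show.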
Right, I'm with you. @neat nest is right, there's now an extra 300MB training file in the dataset. Compared to the 30MB file we were given, wow!
Curious to see how it unfolds from here. Will the additional file completely change the spirit of the challenge, or will it just be more noisy records with barely enough signal to help us beat the mean Weight models out there scoring under 39?
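For reference, here is a minimal sketch of what a "predict the mean" baseline looks like, assuming the score is RMSE (the thread doesn't actually say which metric the 39 refers to) and that the baseline is just the training-set mean of the target. The numbers are toy values, not competition data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mean_baseline_rmse(y_train, y_valid):
    """Score a model that predicts the training mean for every validation row."""
    mean_prediction = sum(y_train) / len(y_train)
    return rmse(y_valid, [mean_prediction] * len(y_valid))


# Toy target values (made up): any real model should beat this number.
print(mean_baseline_rmse([10.0, 20.0, 30.0], [15.0, 25.0]))
```

A model that can't get meaningfully below this number on held-out data isn't picking up any signal beyond the average.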
i did an extremely basic entry and it initially placed 81/590 which was higher than i was expecting
so was the extra training set added because it was too hard before?
Yeah, that or too noisy. Some discussions on the forums showed folks not wanting to engage in such a competition. Hopefully the added data is a positive and transforms the competition a bit. I love seeing everyone's creations, and so far the data has been very uninteresting, which dampens a lot of that.