#playground-series-s5e2

dire plank
#

tumbleweed

...so, bit strange this competition has two training data sets? is there an obvious reason for that? should you just smush them together?

dark sail
#

I was wondering the same too! I checked out a couple of the top notebooks and most of them didn't use it either... which is weird because that dataset is massive compared to the original one so wouldn't it be more useful?
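For anyone who does want to try smushing them together, here's a rough sketch of one way to do it while keeping track of where each row came from, so you can still validate on the original rows only. It uses plain Python on a few made-up rows; the `id` and `price` columns and the `source` tag are hypothetical, not taken from the competition files.

```python
def combine_with_source(train_rows, extra_rows):
    """Concatenate two lists of row dicts, tagging each row with its origin."""
    combined = [dict(row, source="train") for row in train_rows]
    combined += [dict(row, source="extra") for row in extra_rows]
    return combined

# Tiny made-up rows (hypothetical columns); real rows could come from
# csv.DictReader on each of the two training files.
train_rows = [{"id": 0, "price": 40.0}, {"id": 1, "price": 90.0}]
extra_rows = [
    {"id": 2, "price": 55.0},
    {"id": 3, "price": 120.0},
    {"id": 4, "price": 70.0},
]

combined = combine_with_source(train_rows, extra_rows)

# The tag makes it easy to score on the original rows alone later on.
original_only = [row for row in combined if row["source"] == "train"]
print(len(combined), len(original_only))
```

Whether the combined set actually helps is something to check with validation rather than assume.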

neat nest
#

original data set has very little signal in it (do a couple boxplots of the features in train.csv alone and you'll see why folks are slow to start this competition).... looks like training_extra was added today, maybe? but at least one forum post makes it seem like the signal is about as clear in the new data as the old
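A minimal sketch of the boxplot check, done as numbers instead of a plot: group the target by one categorical feature and print the quartiles a boxplot would draw. The data here is synthetic and the column names `color` and `price` are hypothetical stand-ins, not the real columns of train.csv. If the boxes for every level sit in nearly the same place, that feature carries little signal on its own.

```python
import random
import statistics
from collections import defaultdict

def box_stats(rows, feature, target):
    """Return the quartiles of `target` for each level of `feature`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(float(row[target]))
    stats = {}
    for level, values in sorted(groups.items()):
        # quantiles(n=4) returns the three cut points: Q1, median, Q3
        q1, median, q3 = statistics.quantiles(values, n=4)
        stats[level] = {"n": len(values), "q1": q1, "median": median, "q3": q3}
    return stats

# Synthetic stand-in: the target is drawn independently of the feature,
# which is roughly what a "no signal" feature looks like.
rng = random.Random(0)
rows = [
    {"color": rng.choice(["red", "green", "blue"]), "price": rng.uniform(15, 150)}
    for _ in range(3000)
]

for level, s in box_stats(rows, "color", "price").items():
    print(f"{level:>6}  n={s['n']:4d}  q1={s['q1']:6.1f}  "
          f"median={s['median']:6.1f}  q3={s['q3']:6.1f}")
```

To try it on the real file, the rows could be loaded with `csv.DictReader`, skipping any rows where the chosen columns are empty.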

lunar stream
dire plank
#

i did an extremely basic entry and it initially placed 81/590 which was higher than i was expecting

#

so is it that the extra training set was added because it was too hard before

lunar stream
#

Yeah, that or too noisy. Some discussions on the forums showed folks not wanting to engage in such a competition. Hopefully the data add was positive and transforms the competition a bit. I love seeing everyone's creations, and thus far the data is very uninteresting, dampening a lot of that.