As I understand it, if we have a set of predictions submitted for round_x but are unable to submit on round_x+1 then Numerai uses our last submission to fill the slot. Is that correct? If so, how many days will that carry forward? And are the previous submission values being used in their original order, or is Numerai internally mapping the ID value to the new ID value for each security?
#Regarding a "fallback" submission...
If you submit after round x is closed, those predictions are "queued" and used for round x + 1. If you submit for round x while it is open, then there is no submission for x + 1 unless you submit again after the round is closed. People have asked for submissions to carry forward without the need for re-submission, but I think it's not yet implemented.
I see, thank you. Even in the scenario you mention regarding a "queued" submission, do we know if the values are being remapped to the new IDs internally? Or does it just use the values in the order they were submitted and hope for the best?
I can't say for sure other than fall back to "they must be remapping them" since there's nothing we can do on our end. Assuming our predictions are in a certain order, even if it's the order they provide, is bound to just create unnecessary errors. There's also the case of the ticker universe changing throughout the week, so we wouldn't have any way to account for that other than hope they are appended.
That's what I figured as well. But, theoretically, it seems I could queue submissions, then wait for those to be submitted the next day, then download my predictions for the day they were submitted and be able to map backwards to link securities between eras.
Perhaps. The team might keep your submissions untouched and do the mapping during processing to avoid that. If not, the cat's out of the bag and they surely will now!
I think they have to provide the values back in the correct order for the era or we wouldn't be able to process metrics locally on them. It's only a day's worth of data, though one might be able to string together a system for a longer period. I'm not sure how it would be helpful on the live data though. Seems like a lot of effort for no real gain. I was just curious.
I'm not sure why the in-era order matters. I know the MMC calc in scoring.py does a sort at the start, so the in-era order doesn't matter for that. I don't remember about numerai_corr, though.
I suppose you mean if you download your submissions and try to do local metrics with the correct target ID's, it will all fail if they haven't mapped them? 🤔
yes.
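To make the point concrete, here is a minimal sketch of why row order is harmless but ID mapping is not. It uses a plain rank correlation as a stand-in (it is not Numerai's numerai_corr or the MMC code in scoring.py), and the helper name `corr_by_id`, the IDs, and the values are all made up for illustration.

```python
import numpy as np
import pandas as pd


def corr_by_id(preds: pd.Series, target: pd.Series) -> float:
    """Rank correlation of predictions vs. targets, aligned on the id index.

    Hypothetical helper: a simple stand-in metric, not Numerai's scoring code.
    """
    # Inner join on the index, so only ids present in both series are scored.
    joined = pd.concat({"pred": preds, "target": target}, axis=1, join="inner")
    if joined.empty:
        raise ValueError("no shared ids between predictions and targets")
    ranked = joined["pred"].rank(pct=True)
    return float(np.corrcoef(ranked, joined["target"])[0, 1])


# Synthetic ids and values, purely for illustration.
ids = ["a", "b", "c", "d", "e"]
target = pd.Series([0.0, 0.25, 0.5, 0.75, 1.0], index=ids)
preds = pd.Series([0.1, 0.4, 0.3, 0.8, 0.9], index=ids)

# Same predictions, rows in a different order.
shuffled = preds.sample(frac=1.0, random_state=0)

same = corr_by_id(preds, target)
also_same = corr_by_id(shuffled, target)
```

Because the join is on IDs, `same` and `also_same` agree regardless of row order. If the downloaded IDs belong to a different era, the join comes back empty and the helper raises instead of silently scoring misaligned rows.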
Sorry took me a minute, I'm on board now.
Do you know where to download the submissions on the website? I just uploaded today's preds to a vacant slot; I'll check it tomorrow.
I've only done it through the API.
Alright
Even in the extreme case where, say, I have one model which I always submit late (queuing it up for the next day) and then download the round's predictions in the future, keeping track of my original vs. processed, I could only, at best, create a map of the securities through time on resolved data. It would be interesting, but I don't see a way to use that for any live round.
@vagrant bolt we only do the id translation between eras internally and don't modify the original submission. That being said, if you can prove there is a leak and send a report to [email protected], I'll send you a sizeable bounty per our bug bounty program
thanks for the info @ark I'll explore a little more and let you know.
That's when you match the through-time IDs to Signals targets 😛
I can barely keep my plants alive. That's too much for me.
You can track (most of the) rows across adjacent eras with high probability just by looking at Euclidean distance (nearest neighbor in the other era). Correlation probably works also. It's quite trivial. Is that a leak? Still can't ID the stocks.
not really the same as getting an actual mapping
No, but even if you could, it doesn't seem like that big of a deal. You can confidently get 99% of the rows, though: they almost all have a single nearest neighbor at a relatively small distance, and the second nearest is 3x-4x farther (a distance much closer to that of all the other rows). Then there are often a small number of obviously new/different rows (so you can also usually tell which rows are not in the new era, or have maybe changed greatly) and a handful of unclear ones (probably the same stock just with some larger-than-normal feature changes: larger distance than usual, but still well short of the second nearest neighbor). Of course this is nothing new... we could always do this; even on weekly data most rows are pretty obvious.
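A minimal sketch of the nearest-neighbor matching described above, with a ratio check so that rows whose best match is not clearly better than the runner-up are flagged rather than matched. The function name `match_rows` and the `max_ratio` threshold of 0.5 are assumptions for illustration, and the data is synthetic noise, not Numerai features.

```python
import numpy as np


def match_rows(prev: np.ndarray, curr: np.ndarray, max_ratio: float = 0.5) -> np.ndarray:
    """For each row of `curr`, return the index of its nearest row in `prev`.

    A match is accepted only when the nearest neighbor is clearly closer than
    the second nearest (d_first <= max_ratio * d_second); otherwise -1.
    `prev` must contain at least two rows.
    """
    # Pairwise squared Euclidean distances: |a|^2 - 2 a.b + |b|^2
    sq = (curr**2).sum(axis=1)[:, None] - 2.0 * curr @ prev.T + (prev**2).sum(axis=1)[None, :]
    dist = np.sqrt(np.maximum(sq, 0.0))  # clip tiny negatives from rounding

    order = np.argsort(dist, axis=1)
    rows = np.arange(len(curr))
    nearest, second = order[:, 0], order[:, 1]
    d_first, d_second = dist[rows, nearest], dist[rows, second]

    confident = d_first <= max_ratio * d_second
    return np.where(confident, nearest, -1)


# Synthetic example: "curr" is a shuffled, slightly perturbed copy of "prev".
rng = np.random.default_rng(0)
prev = rng.normal(size=(50, 20))
perm = rng.permutation(50)
curr = prev[perm] + 0.01 * rng.normal(size=(50, 20))
matched = match_rows(prev, curr)

# A row halfway between two known rows has no single clear neighbor.
ambiguous = match_rows(prev, ((prev[0] + prev[1]) / 2.0)[None, :])
```

On this toy data every perturbed row maps back to its source row, while the halfway row comes back as -1. Real eras would need the threshold tuned, and the full distance matrix grows with the product of the two era sizes.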
So if somebody wanted to put these rows together for training purposes and then somehow incorporate that into their predictions (say using the live era and the era before at prediction time even if they couldn't match 100% of the rows) that might actually be worth doing, I don't know. But trying to id the stocks from that would still be pretty hard (and you'd probably only be able to get a few of them confidently) and then so what? What does it get you? Free data for a few stocks for short periods? Not worth the bother.
yeah it's not a worry about someone exploiting it to get better payouts (if you can still predict the future accurately, that's OK); it's more about what we are legally allowed to give out for free, which is why we would pay you if you can figure out how to break the encryption
Yes, I know you can't put it together for us (or even have it be reasonably possible to actually figure out specific stocks).
@feral tiger I did run a test submitting late for round 676, so the predictions went into effect for round 677. When I downloaded the predictions submitted for round 677, the order and IDs matched my original submission. While this is good news in terms of not showing a data leak, it does mean that we cannot run any sort of analysis on the late submission, because the IDs do not match anything from the published round it was actually used in. I suppose I would categorize that as a bug; however, "fixing" it would then constitute a "leak".
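One way to notice this situation before running any analysis is to check how many of the downloaded IDs actually exist in the round's live data. This is a small sketch under assumptions: the helper name `id_overlap` is hypothetical, the `id` column name is assumed, and the IDs below are made up.

```python
import pandas as pd


def id_overlap(submission: pd.DataFrame, live_ids: pd.Index, id_col: str = "id") -> float:
    """Fraction of submission ids that also appear in the round's live ids."""
    if submission.empty:
        return 0.0
    return float(submission[id_col].isin(live_ids).mean())


# Made-up ids: half of the submitted ids exist in this round's live data.
submission = pd.DataFrame({"id": ["a", "b", "c", "d"], "prediction": [0.1, 0.4, 0.6, 0.9]})
live_ids = pd.Index(["c", "d", "e"])
overlap = id_overlap(submission, live_ids)
```

An overlap near zero for the round a submission was scored in would suggest it was queued from the previous round, so local metrics against that round's IDs cannot be computed from the downloaded file.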
Interesting, thanks for the follow-up. Personally I wouldn't classify it as a bug; I think the user has to take the responsibility here. They could simply not support submissions if the rounds are not open. It's a nicety they provide, and I guess the mismatched IDs are something the users must live with in order to use it. Since we're "downloading submissions" and those are the IDs we submitted, it seems correct to me.
yes. The only outlying concern would be during automation, perhaps you miss knowing that the submission was late. Keeping track of that would be rather difficult. In which case if you tried to download historical predictions and run tests on the data it would be very hard to untangle what was going wrong. I agree it's probably not a big issue, and non-existent as long as submissions are on time.
I know on the website side, after the "pending" submission is submitted, its status is changed to "on-time." I'm not sure if there's a way in the API to know if it was truly in-round or if it was a queued submission, aside from checking date stamps.
It would be nice if there was a flag "was_queued" or something instead. There might be (this isn't something I've checked), but that would help when pulling down submissions to inspect.
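Until something like that exists, the date-stamp check could be sketched as below. This is only an illustration: `was_queued`, `submitted_at`, and `round_open` are hypothetical names, not fields of the Numerai API, and the timestamps are invented. The assumption is that a queued submission is uploaded before the round it is scored in has opened.

```python
from datetime import datetime, timezone


def was_queued(submitted_at: datetime, round_open: datetime) -> bool:
    """True when the upload predates the opening of the round it was scored in.

    Both datetimes should be timezone-aware; comparing a naive datetime with
    an aware one raises TypeError.
    """
    return submitted_at < round_open


# Invented timestamps for illustration only.
round_open = datetime(2024, 1, 10, 13, 0, tzinfo=timezone.utc)
uploaded_before_open = datetime(2024, 1, 9, 20, 30, tzinfo=timezone.utc)
uploaded_in_round = datetime(2024, 1, 10, 13, 20, tzinfo=timezone.utc)

queued_flag = was_queued(uploaded_before_open, round_open)
in_round_flag = was_queued(uploaded_in_round, round_open)
```

Storing this flag alongside each downloaded submission at upload time would avoid having to untangle it later.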
Yeah, I think this was mainly a courtesy (and one that I kept pushing for because all of my submissions take a long time so it is the only way I can do daily submissions) and let's not complain about it. I think @feral tiger has it in his head that late submissions are slightly "worse" than on-time ones because of the time lag, but with my models anyway that's not been shown to be true at all, and I don't want them to take away this ability for possible performance reasons or any other reason. (One hour window just isn't enough for some of us.)
There is one thing maybe you'd like to know about: if you submit late for round X and there were no other submissions for round X on that slot (either an on-time submission or a late submission from the previous day), then that submission will show up both as late under round X (where it won't count for staking, but you'll still get scores) and also as on-time for round X+1 (where it was queued and that counts for staking). However, if you just submit everything late like I do every day on a slot, you'll only see it on round X+1 (because round X was taken already). So if you want to do experiments with late/queued submissions, do it on a slot you haven't used for the late round you're submitting for, and you can see the scores for both the actual round X and X+1.
(Although for a sustained daily experiment you'd have to use two slots either way, but it does do that.)
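The rule described above can be written down as a tiny sketch. It only encodes the behavior as reported in this thread, not anything confirmed from Numerai's code, and the names `Appearance` and `late_submission_appearances` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appearance:
    round_number: int
    status: str  # "late" or "on-time"
    counts_for_staking: bool


def late_submission_appearances(round_x: int, slot_already_used_for_x: bool) -> list[Appearance]:
    """Where a submission uploaded late for round X shows up, per the thread."""
    appearances = []
    if not slot_already_used_for_x:
        # Free slot: scored as a late entry on round X, but not staked.
        appearances.append(Appearance(round_x, "late", False))
    # Either way it is queued and treated as on-time for round X+1.
    appearances.append(Appearance(round_x + 1, "on-time", True))
    return appearances


fresh_slot = late_submission_appearances(676, slot_already_used_for_x=False)
used_slot = late_submission_appearances(676, slot_already_used_for_x=True)
```

With a fresh slot the submission appears on both rounds, which is what makes a same-predictions comparison of round X against X+1 possible.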
I can believe it's possible to submit predictions that are resistant to the lag, but I have yet to see it.
Are those the same preds submitted 1 day late or all week?
old_data_datestamp = 1 day lagged v4_example_preds
Wait. This seems to show the opposite
is green the lagged data? it looks better than ontime
Will leave this here for some insight. Lagged predictions are better on this time span specifically.
If you look at the profile overviews they show the lagged predictions are worse over some periods:
https://numer.ai/old_data_datestamp
https://numer.ai/v4_example_preds
I haven't been able to find anything close to a significant difference with my own models. But that may just be my models. And I would also suggest that anything you're looking at in a big way over a longer time period would by definition be models that would never need to be submitted late, because their predictions can be generated quickly, and so what I'm doing is possibly just a different class. OR... I just can't look at enough data because it takes too long to generate. But I do have at least a good number of months of live data to look at. Even if I can't compare apples to apples all the time (same round to itself as 1-day late; I can only see that one day a week), in theory, since I submit the weekend round on-time and the other four late, my mid-week submissions should all perform slightly worse than the weekend, all else being equal. But just the daily noise of the results is more than enough to cancel out any effect; I just can't detect anything like that. So I don't worry about it. But I love the feature!
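For anyone who wants to check this on their own rounds, a simple permutation test on per-round scores is one option. This is a hedged sketch, not how anyone in this thread did the comparison: `permutation_pvalue` is a made-up helper, the numbers are placeholders rather than real scores, and the test treats rounds as exchangeable, which may not hold if the scores of nearby rounds are correlated.

```python
import numpy as np


def permutation_pvalue(on_time, late, n_perm: int = 2000, seed: int = 0):
    """Two-sided permutation test on the difference of mean scores.

    Returns (observed_difference, p_value).
    """
    rng = np.random.default_rng(seed)
    on_time = np.asarray(on_time, dtype=float)
    late = np.asarray(late, dtype=float)

    observed = on_time.mean() - late.mean()
    pooled = np.concatenate([on_time, late])
    n = len(on_time)

    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of on-time vs. late
        diff = pooled[:n].mean() - pooled[n:].mean()
        if abs(diff) >= abs(observed):
            hits += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Placeholder numbers only, not real round scores.
no_gap = permutation_pvalue([0.01, 0.02], [0.01, 0.02])
big_gap = permutation_pvalue(np.arange(10) + 100.0, np.arange(10.0))
```

With only a few months of daily rounds, a large p-value mostly says the effect is smaller than the noise, which matches the experience described above.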