I didn't upload predictions over the weekend and, being in the UK, was banking on getting them done in the few hours I had available this morning. How wrong I was. It has taken several hours so far, with many restarts needed each time I realised the API had hung, and the uploads still aren't done, so some submissions are probably now late. I've noticed this at times in the past, though it's not usually an issue when uploading on Sunday. It shouldn't be an issue at any time on any day of the week, and it should be resolved.
#Upload pipe feels like it's a straw
I’ve run into a bunch of timeout issues as well
A frustrating thing is that the numerai library seems to just sit there, with no sense that something's wrong, no mitigation strategy, and no thought given to robustness or resiliency. If I kill the process and restart, my code starts again at the upload it was working on, and today that upload might then go through quite quickly, but the next model's upload could just as easily hang.
hi moggie, sorry you're running into issues with upload speed. We don't hear this complaint often, but here are a few things that could help determine what the issue is:
- check your upload/download speed with a speed-test tool to make sure your ISP isn't limiting it
- make sure you're only uploading the live era; this should only be a handful of megabytes
- try uploading in parquet format, which will greatly reduce the size of the file
- try out Model Uploading, which handles submissions internally in our system and is fully supported by the Numerai staff
Also, I'm sorry numerapi does not handle timeouts well. It is an open-source package maintained by @fallow mauve, so if you feel you could improve it in some way, offering your expertise would be greatly appreciated by everyone.
I'll put it in our backlog to try to offer upload points in more regions so it's less of a worry, but I can't guarantee it'll be prioritized soon.
The file you are trying to upload is supposed to be tiny, since it should just contain live predictions. Is that the case? It may be worth double-checking. numerapi times out after a while and throws an error. If that duration isn't enough because of your internet speed, you can manually increase the timeout by passing the timeout parameter to upload_predictions.
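For anyone wanting to try the timeout suggestion, here is a minimal sketch. It assumes `api` is an already-authenticated numerapi client and that the `timeout` keyword takes a number of seconds; the helper name `upload_with_timeout` and the 1800-second default are made up for illustration, so check the numerapi docs for the accepted values.

```python
def upload_with_timeout(api, file_path, model_id, timeout_seconds=1800):
    """Upload one predictions file, allowing a longer wait before giving up.

    Assumption: `api` behaves like a numerapi client whose upload_predictions
    accepts a `timeout` keyword, as mentioned in the post above.
    """
    # The 1800-second default is an arbitrary illustrative value, not a
    # recommendation from the library.
    return api.upload_predictions(
        file_path,
        model_id=model_id,
        timeout=timeout_seconds,
    )
```

A longer timeout only helps if the transfer is slow but still moving; it will not rescue a connection that has stalled completely.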
The uploads in question are the old 310 (or whatever it was) files. I think they have everything in them that there needs to be, nothing less and nothing more, AFAIK anyway. I also do daily uploads with v4.1 and those work better, but downloads are occasionally problematic, with invalid data downloaded. What I do, though, is check that the downloaded files are actually valid, and loop round to download again if not. So it works a lot of the time, but the library doesn't seem to have any resiliency against failures or adverse conditions.
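The check-and-redownload loop described above could look roughly like this. It is only a sketch: `download` and `is_valid` are placeholder callables you would supply yourself (for example a wrapper around your dataset download call and a function that tries to read the file), and `non_empty_file` is a deliberately crude stand-in for a real validity check.

```python
import os


def download_until_valid(download, path, is_valid, max_attempts=5):
    """Call download(path) until is_valid(path) passes; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        download(path)
        if is_valid(path):
            return attempt
        # Remove the bad file so a stale copy is never mistaken for a good one.
        if os.path.exists(path):
            os.remove(path)
    raise RuntimeError(f"{path} still invalid after {max_attempts} attempts")


def non_empty_file(path):
    """Crude validity check: the file exists and is not empty."""
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

Capping the number of attempts means a persistent failure ends in an error rather than an endless loop.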
Thanks for the tips. I did check the connection and it was fine, but there are thousands of miles between us 🙂 Based on what worked with manual intervention, detecting a stall and initiating a retry seems like it could be an effective strategy for the library, and starting uploads the day before would help too when possible 🙂
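As a rough illustration of the retry idea, here is a small wrapper you could put around an upload call in your own script. It is not part of numerapi: `upload_with_retries` and its parameters are hypothetical, and it assumes a stall eventually surfaces as an exception (for instance via a client-side timeout) rather than hanging forever.

```python
import time


def upload_with_retries(upload, max_attempts=4, base_delay=5.0, sleep=time.sleep):
    """Call upload() until it succeeds, waiting longer after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return upload()
        except Exception as exc:  # a timeout raised by the client lands here too
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the last error
            # Exponential backoff: base_delay, then 2x, 4x, ...
            delay = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed ({exc!r}); retrying in {delay:.0f}s")
            sleep(delay)
```

Here `upload` would be a zero-argument callable, such as a `lambda` wrapping the upload call for one model, so each model gets its own retry budget.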