#leap-atmospheric-physics-ai-climsim
Hello @rigid whale is it correct that you are the (technical) host of https://www.kaggle.com/competitions/leap-atmospheric-physics-ai-climsim/ ?
where should we post technical questions about the dataset?
Simulate higher resolution atmospheric processes within E3SM-MMF, a climate model supported by the U.S. Department of Energy
Sorry I know nothing about that contest! I am a host of the basketball prediction contest but I think maybe the Discord roles on here don't make a fine distinction about who is host for which contest...
@atomic hound The best place to reach the host of any competition is the Kaggle forum for that competition.
Curiosity here... I'm not sure whether I've missed something or whether it was planned this way.
However, I want to get better performance by using nearest-neighbor (lat/lon) influences for the predicted values. The documentation and the data provided don't seem to offer an easily identifiable way to sort the data for this 'proximity correlation', at least in the low-resolution data subset provided. There is the high-resolution data, and the paper describes the dataset more clearly, but the data itself does not contain this information.
My thinking here is that predictions such as wind and liquid mixing ratios are more closely related to Navier–Stokes and fluid dynamics than to single-patch layer influences. So it seems like we are ignoring a lot of data that would be useful for modeling the climate accurately, particularly when there are no correlated zonal influences.
So, do I continue on this path to submission, accepting that there is a potential flaw in my experiments, or do I attempt to extract the lat/lon, develop a prediction model based on this generalization of how the climate works, and hope that the additional resources provide significant aid in the convergence towards the output predictions?
Now that they've found a way to extract location and time data, they say that adding lat/lon columns supposedly doesn't improve performance because those are predictable from the data. But I'm wondering whether adding information from nearest neighbors improves performance, and whether the test set has simultaneous observations from every point, so that the model could take multiple inputs from different locations observed at the same time.
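A minimal sketch of the nearest-neighbor idea, not anything confirmed by the hosts. It assumes you have already recovered one lat/lon pair per grid column, and that for a given timestep you hold all columns at once in a fixed order (whether the test set allows this is exactly the open question above). The function names and array layout here are hypothetical, chosen only for illustration.

```python
import numpy as np

def haversine_matrix(lat_deg, lon_deg):
    """Pairwise great-circle distances, in radians on a unit sphere."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))[:, None]
    lon = np.radians(np.asarray(lon_deg, dtype=float))[:, None]
    dlat = lat - lat.T
    dlon = lon - lon.T  # sin^2(dlon/2) handles longitude wraparound
    a = np.sin(dlat / 2) ** 2 + np.cos(lat) * np.cos(lat.T) * np.sin(dlon / 2) ** 2
    return 2 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def nearest_neighbors(lat_deg, lon_deg, k):
    """Indices of the k closest other columns for every column, shape (n, k)."""
    d = haversine_matrix(lat_deg, lon_deg)
    np.fill_diagonal(d, np.inf)  # a column is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def add_neighbor_mean(x, nbr_idx):
    """Append the mean of each column's neighbors to its own features.

    x:       (n_columns, n_features) array for ONE timestep (assumed layout)
    nbr_idx: (n_columns, k) array from nearest_neighbors
    returns: (n_columns, 2 * n_features)
    """
    nbr_mean = x[nbr_idx].mean(axis=1)  # (n, k, f) -> (n, f)
    return np.concatenate([x, nbr_mean], axis=1)

# Toy usage: 4 columns, 2 features each (made-up numbers, not ClimSim data).
toy_lat = [0.0, 0.0, 0.0, 80.0]
toy_lon = [0.0, 10.0, 350.0, 0.0]
toy_x = np.arange(8, dtype=float).reshape(4, 2)
toy_idx = nearest_neighbors(toy_lat, toy_lon, k=2)
toy_out = add_neighbor_mean(toy_x, toy_idx)
```

The neighbor index only depends on the grid, so it can be computed once and reused for every timestep. Averaging is just the simplest aggregation; passing the raw neighbor features (or a distance-weighted mean) to the model would be a natural variation.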
We are looking for a teammate. We have several models, including a single model with a score of 0.743 using 1.2 million samples (training takes over 30 min) and one with a score of 0.754 using 6 million samples (training takes over 1 h 30 min), which we believe is due to a simple and effective trick. We also score over 0.75 with architectures like transformers and LSTMs. We have not yet experimented with lat/lon or feature engineering (FE). We got single models of 0.760 and 0.763, up from the previous 0.771 ensemble.
We have a score of 0.787 using just over 70 million samples.
Check DM
nice to meet you. check DM plz
Is there no voice channel?
Correct, we have no open voice channels on the Kaggle discord.