Feature Neutralization Performance lower than Hello Numerai

winged leaf Apr 20, 2025, 6:55 PM

What's the point of the feature neutralization tutorial notebook if it consistently performs worse than Hello Numerai based on the live scores?

vale spear Apr 26, 2025, 11:33 AM

I am not sure that it is consistently worse. I have Kaggle versions of the tutorial notebooks symbolically staked, and the 1Y return of the feature neutralization model (https://numer.ai/jos_kaggle_medium_fn) is better than that of Hello Numerai. Hello Numerai is also the worst in CORR (maybe its MMC score is helped by its uniqueness: few people stake a simple LGBM model trained on a small feature set).

The real advantage comes when you combine both techniques, FN (feature neutralization) and TE (target ensemble), as in https://numer.ai/jos_kaggle_sunshine. There the 1Y return is well above average and ranks 230 of 4039, which is not bad at all.
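
For anyone following along who has not looked at FN yet, here is a minimal sketch of the idea, assuming plain linear neutralization with least squares. The function name `neutralize`, the `proportion` argument, and the random data are illustrative assumptions, not code copied from the tutorial notebook or from my Kaggle notebooks.

```python
import numpy as np


def neutralize(predictions, features, proportion=1.0):
    """Subtract `proportion` of the part of `predictions` that is
    linearly explained by `features` (hypothetical helper)."""
    # Least-squares fit of the predictions on the feature columns
    coefs, *_ = np.linalg.lstsq(features, predictions, rcond=None)
    exposure = features @ coefs
    return predictions - proportion * exposure


# Synthetic stand-in data (assumption: real features would come from the dataset)
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 5))
features = features - features.mean(axis=0)

# Raw predictions that lean heavily on the first feature
raw_preds = 0.5 * features[:, 0] + rng.normal(size=500)

neutral_preds = neutralize(raw_preds, features, proportion=1.0)

# After full neutralization the predictions are orthogonal to every feature column
max_abs_dot = np.abs(features.T @ neutral_preds).max()
```

With `proportion=1.0` all linear feature exposure is removed; smaller values remove only part of it, which is the knob you would tune when trading raw CORR against feature risk.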

winged leaf Apr 26, 2025, 2:24 PM

vale spear I am not sure that it is consistently worse. I have Kaggle versions of tutorial ...

I was looking at the tutorial model 1Y scores on the benchmark models account:
https://numer.ai/~benchmark_models

Did you train with the shallow parameters or deep parameters in the tutorial notebooks?

vale spear Apr 26, 2025, 6:21 PM

winged leaf I was looking at the tutorial model 1Y scores on the benchmark models account: h...

You can check for yourself. The source code of each model is a public notebook referenced on its model page, e.g. https://www.kaggle.com/code/svendaj/numerai-feature-neutralization. I would say that these are the shallow parameters:

import lightgbm as lgb

# Shallow LightGBM parameters used in the Kaggle notebook
model = lgb.LGBMRegressor(
    n_estimators=2000,
    learning_rate=0.01,
    max_depth=5,
    num_leaves=2 ** 5,
    colsample_bytree=0.1,
    verbosity=-1,
    num_threads=4,
)

But there might be more differences. The example model says that it has been "submitting since late 2023", whereas my models are retrained whenever new data becomes available (you can see this in the notebook version history). Also, my submitted models are trained on the medium feature set (except https://numer.ai/jos_kaggle_hello).

winged leaf Apr 26, 2025, 6:32 PM

vale spear You can check for yourself. Each source code is public notebook referenced in mo...

Yeah, those are shallow. Also, I think the tutorial notebooks have a num_leaves of 31 rather than 32.
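
To make the comparison concrete, here is a rough sketch that diffs two parameter sets. The `diff_params` helper is hypothetical, and the tutorial values are an assumption based on my recollection above (only `num_leaves` changed), not something checked against the notebook.

```python
def diff_params(a, b):
    """Return {key: (a_value, b_value)} for every key whose value differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Parameters quoted from the Kaggle notebook earlier in this thread
kaggle_params = dict(
    n_estimators=2000,
    learning_rate=0.01,
    max_depth=5,
    num_leaves=2 ** 5,
    colsample_bytree=0.1,
)

# Assumption: tutorial parameters as recalled here, identical except num_leaves
tutorial_params = dict(kaggle_params, num_leaves=2 ** 5 - 1)

param_diff = diff_params(kaggle_params, tutorial_params)
print(param_diff)
```

A one-leaf difference like this seems unlikely to explain a gap in live scores on its own, so the retraining schedule is probably the more interesting thing to compare.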

winged leaf Apr 26, 2025, 6:34 PM

vale spear But there might be more differences. Example model says that it is "submitting s...

I think retraining on new data could be the difference, but I am pretty sure the tutorial models are also trained on the medium feature set.
