#fide-google-efficiency-chess-ai-challenge

steep estuary
#

you blasted your code in outer space?

wide oak
#

Unless you want to SPSA the 4 million games we did to tune the search to your engine.

steep estuary
#

ah spsa

#

yeah, that breaks things

wide oak
#

No. I added the net to Pere's engine, but every engine hot-swap just loses.

steep estuary
#

well, you could still try to see big gaps

wide oak
#

I put pere net in my engine, I lose elo. he puts it in his engine, he loses elo.

#

can't decide much.

steep estuary
#

isn't pere net very barebones?

wide oak
#

yeah.

steep estuary
#

oh well, guess we'll never know

#

i have better uses for 4m games

wide oak
#

the cursed C-centric non-templated version:

    void add_sub_add_sub(int16_t* dest, const int16_t* src, int16_t* weights, size_t size,
                         bool ADD1, bool SUB1, bool ADD2, bool SUB2,
                         size_t add1, size_t sub1, size_t add2, size_t sub2) {

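
The signature above is cut off in the log, so the real body is not known. A minimal sketch of what such a non-templated accumulator update might look like, assuming the weights are stored as one row of `size` values per feature and that the four bools are runtime stand-ins for what would be template parameters in a C++ version. The row layout and the loop body are assumptions, and `weights` is taken as `const` here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical body; the log shows only the signature.
 * Assumed layout: weights holds one row of `size` values per feature,
 * so the row for feature f starts at weights[f * size]. */
void add_sub_add_sub(int16_t *dest, const int16_t *src, const int16_t *weights,
                     size_t size,
                     bool ADD1, bool SUB1, bool ADD2, bool SUB2,
                     size_t add1, size_t sub1, size_t add2, size_t sub2) {
    assert(dest != NULL && src != NULL && weights != NULL);
    for (size_t i = 0; i < size; i++) {
        int v = src[i];  /* do the arithmetic in int, narrow once at the end */
        if (ADD1) v += weights[add1 * size + i];
        if (SUB1) v -= weights[sub1 * size + i];
        if (ADD2) v += weights[add2 * size + i];
        if (SUB2) v -= weights[sub2 * size + i];
        dest[i] = (int16_t)v;  /* assumes the result fits in int16_t */
    }
}
```

A quiet move would be called with ADD1 and SUB1 set, a capture with SUB2 set as well, and castling with all four. Checking the flags inside the loop is the "cursed" part: a templated version resolves them at compile time.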
steep estuary
#

i'll open source the code if we somehow end up placing in a top 5 spot

#

it's particularly bad code that doesn't need to see the light of day otherwise

#

ultimately it's just C alex with stdlib stuff removed and worse uci support

wide oak
#

gotta pay the openbench tax if top-5

steep estuary
#

ob tax, bullet tax, actual taxes

#

i hope i don't win

#

i might lose money on it at this point

wide oak
#

see if I can trade the 10k for 20k in compute credits 👀

steep estuary
#

if it helps i did my best to not sprt because i think closed source dev is pure villainy

#

but in the end i caved in and i think we clocked in a couple dozen sprts

wide oak
#

It was a wild 3 weeks.

steep estuary
#

the test count is 100ish but it's highly misleading because most would just fail

#

OB got stuck in the awaiting artifacts stage

#

and the test needed to be remade

#

i think that happens if you create it before the workflow is actually done

wide oak
#

old openbench version?

steep estuary
#

no clue, gedas set it up

#

i'd assume it's as recent as the day he set it up

wide oak
#

I will say, I have NO issues with that on Torchbench.

steep estuary
#

so a couple months old

wide oak
#

I did not use the private system though for this :/

steep estuary
#

we had it for every single test so it's not a niche bug

wide oak
#

Something sussy then.

steep estuary
#

there's a chance that waiting more would've fixed the issue

wide oak
#

I just made a 2nd github account. And slapped a private token into the src code of OB.

steep estuary
#

but after a couple minutes it was easier to nuke the test and recreate it

wide oak
#

you should be able to create the test before the workload finishes.

steep estuary
#

well that's not what we experienced

#

it might be user error

#

ngl i haven't super looked into it

wide oak
#

gunicorn? native django? nginx reverse proxy?

steep estuary
#

i would guess native django, but it's a guess

wide oak
#

ah.

steep estuary
#

it's hosted on pythonanywhere, but that shouldn't be too relevant

wide oak
#

That is very relevant actually.

#

pythonanywhere is a WSGI app, not native-django.

#

prior to that, pythonanywhere users could not use the pgn feature (nor the artifact watcher), since the threads would simply never be spawned.

steep estuary
#

hmmm

#

did geddy use a version of ob that is that old?

wide oak
#

And since OpenBench will attempt to find the artifacts ONCE upon creation; and otherwise defer to the watcher...

#

that explains all your problems.

steep estuary
#

i can check when we set it up

#

that has to be it tbh

wide oak
#

is the server up?

steep estuary
#

i think so(?), let me check

#

it is

wide oak
#

you have a client sitting around still? what's the version atop worker.py

steep estuary
#

we set it up at the start of january

#

i have messages of me trying to debug the token stuff dated 10/01

#

CLIENT_VERSION = 35 # Client version to send to the Server

wide oak
#

eh.

#

maybe still bugs.

#

hope to remove the private system stuff eventually.

steep estuary
#

yeah i didn't love using it

#

it felt wrong to me

wide oak
#

I never wanted to create it 🙂

steep estuary
#

well as long as one regrets their own mistake, it's fine

#

congrats on the hardware btw

steep estuary
#

if i get some money i'll match the hw too lol

atomic karma
#

I hope you finish 4th as Andrew said

#

(but no higher, the other 2 being 33.33% chance each is enough for my heart)

#

and I can attest that the secret of that team's success is not my hw, I even questioned whether it was in spite of it xD u should see some of the stupid things I asked Andrew

steep estuary
#

ah dw, i wasn't commenting on the competition

#

i think you gave hw to other devs in the past and i'm vaguely aware of it

#

thanks for the service

solemn night
#

this is very hard to believe, maybe the test is bugged..

#

it's less than 5 Elo here, and I remember it was so when SF introduced it, I'd imagine the default is even harder to pass with weaker engines

steep estuary
#

Just tweaking it easily gained 10 elo

#

And I mean a very minor tweak

solemn night
#

well, weird things happen
In Stockfish, I discovered a hack long ago that if you skip rootDepths by 2 in iterative deepening instead of 1 you get an instant STC gain
I used this trick and another trick of not resolving failhighs which are ++ers in this time control
These rootDepth tweaks alone are +20 Elo or something

#

literally if you send go infinite
you'd get depth 2 then depth 4 printing
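
The rootDepth trick described above can be sketched roughly as follows. This is only an illustration of stepping the root depth by 2 instead of 1, not the actual patch: `search_root` is a hypothetical stand-in for a real search, and `iterative_deepening` with its `step` parameter is an assumed name and shape.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a real root search; returns a dummy score. */
static int search_root(int depth) {
    return 10 * depth;
}

/* Iterative deepening that advances the root depth by `step` each pass.
 * Writes the searched depths to depths_out (at most cap entries) and
 * returns how many iterations ran. */
int iterative_deepening(int max_depth, int step, int *depths_out, int cap) {
    assert(step >= 1 && depths_out != NULL);
    int n = 0;
    for (int depth = step; depth <= max_depth && n < cap; depth += step) {
        int score = search_root(depth);
        printf("info depth %d score cp %d\n", depth, score);
        depths_out[n++] = depth;
    }
    return n;
}
```

With `step` set to 1 this is the usual loop; with `step` set to 2 the first lines printed are for depth 2 and then depth 4, matching the output described for go infinite.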

steep estuary
#

Yeah when I talk about tests note that it's at fucked time controls

#

Because we were expecting increment

#

And it just never came

solemn night
#

all our tests are with inc

#

to avoid noise

#

even when we knew

#

it will never come

steep estuary
#

Did you also tune with Inc or at sudden death?

solemn night
#

We tuned search and eval with inc, only TM with sudden death

steep estuary
#

Ah a split tune, we just bundled stuff

#

Tunes are 700k games of the 1.2M we used lol

solemn night
#

last tune on search we wanted to try sudden death after 100k games

#

it failed even passing

#

sudden death

steep estuary
#

Sudden death is such a mistake

solemn night
#

We did a lot of SPSA

steep estuary
#

As a concept

steep estuary
#

Tbf I didn't know anything about tuning

#

Since I had never tuned Alex before

#

And that's why I then tuned Alex for a month

#

It was interesting

solemn night
#

I never tuned in SF before, I have 1 tune in fishtest

#

I also properly learned how to do it here

#

basically saw it as a dull thing

#

I always wanted to try ideas-patches in SF

#

but here I really wanted Elo only

#

not necessarily interesting stuff

#

how big btw is your network?
We ended up having a 45kb network

#

Idk why I assumed your net is much smaller

#

HL32 16bit?

steep estuary
#

hl80

#

(x2, with hm)

#

admittedly with some novel compression

#

since i figured lebstuff wouldn't be worth it (it might be, honestly i was lazy)

solemn night
#

raw net is 45kb

steep estuary
#

yeah ours is 3.5x the size then

solemn night
#

insane

#

did you remove a lot of search features?

steep estuary
#

0

solemn night
#

alex is probably compact

steep estuary
#

compared to alex master it only lacks probcut because probcut was introduced later

#

and i didn't backport it

#

we also had 1kb to spare so you could add it i reckon

steep estuary
#

but yeah it's fairly compact

steep estuary
#

i checked the 96x2 one by mistake (we never got it to fit, we needed 6-7 kbs for it)

solemn night
#

I'd be very interested in testing that net if it gets open sourced, hopefully you and gedas finish in top 5

steep estuary
#

note that it requires specific code to read it

#

aside from the arch itself

#

it's 8bit quantized with a block scheme

#

so you can't just read it as is

#

but if we open source it the requantizer itself and the init code are part of it ofc
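
The actual format is not described in the chat, so the following is only a generic illustration of why a block-quantized 8-bit net cannot be read "as is". It assumes a hypothetical layout in which every block of `block_len` int8 values shares one int16 scale, and a weight is recovered as value times scale. The function name, the layout and the scale type are all assumptions, not the team's scheme:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block scheme: n int8 values split into blocks of block_len,
 * with one int16 scale per block. Recovered weight = q[i] * scale of its block. */
void dequantize_blocks(const int8_t *q, const int16_t *scales,
                       size_t n, size_t block_len, int16_t *out) {
    assert(q != NULL && scales != NULL && out != NULL && block_len > 0);
    for (size_t i = 0; i < n; i++) {
        int v = q[i] * scales[i / block_len];
        out[i] = (int16_t)v;  /* assumes the product fits in int16_t */
    }
}
```

A reader that skips this step and treats the raw bytes as weights gets values that are off by a different factor in every block, which is why the init code has to ship with the net.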

steep estuary
#

and in the same way 96x2 8bit -> 80x2 full size, as far as i could tell

#

a thing i wanted to test but never got around to was yeeting static eval from tt and just calling eval on each node, since the nets are very small

#

with so low hash i think it's a good idea

solemn night
#

interesting idea, never thought of that

steep estuary
#

in newbish engines storing static eval often doesn't gain until the net is big enough

solemn night
#

I thought of making a staticeval table instead of putting that in TT but a smaller one

steep estuary
#

and they all start with at least 64x2

#

and with nowhere near the hash pressure you get on kaggle

#

where you can have at most 1.5mb

solemn night
#

we have HL64 8bit with 8 buckets, we figured that most of the 16bit weights already fit in 8bits
a very vanilla arch and inference, but a lot of consecutive improvements of the same net based on finding the best data and filtering
pushing it to probably the absolute best it can be with this arch including SPSA sessions after training
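
The observation that most 16-bit weights already fit in 8 bits can be checked with something like the sketch below. It is an assumed helper pair, not the team's code: one function counts the weights already inside the int8 range, the other narrows by clamping the outliers (clamping is one possible choice; the chat does not say how outliers were handled).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts how many 16-bit weights are already inside [-128, 127]. */
size_t count_fits_int8(const int16_t *w, size_t n) {
    assert(w != NULL);
    size_t fits = 0;
    for (size_t i = 0; i < n; i++) {
        if (w[i] >= INT8_MIN && w[i] <= INT8_MAX) fits++;
    }
    return fits;
}

/* Narrows to int8, clamping the few values that fall outside the range. */
void narrow_to_int8(const int16_t *w, int8_t *out, size_t n) {
    assert(w != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int16_t v = w[i];
        if (v > INT8_MAX) v = INT8_MAX;
        if (v < INT8_MIN) v = INT8_MIN;
        out[i] = (int8_t)v;
    }
}
```

If the count is close to `n`, the narrowed net loses very little, and the file is half the size.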

#

did you SPSA your net?

steep estuary
#

we spsa'd the engine/tm twice, the second one did nothing

#

took around 600k games

#

that's all the spsaing that went on

solemn night
#

SPSAing the net weights and biases gained us 3Elo each time
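
For readers who have not run one, a single SPSA step looks roughly like the sketch below. It is the textbook update, not either team's tuner: `toy_objective` is a hypothetical stand-in for "score of a match played with these parameters" (in real tuning that measurement is noisy and costs games), and `a`, `c` and the fixed `delta` signs are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef double (*objective_fn)(const double *theta, size_t n);

/* Toy objective standing in for a match result; it peaks at theta[i] = 3. */
double toy_objective(const double *theta, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = theta[i] - 3.0;
        s -= d * d;
    }
    return s;
}

/* One SPSA ascent step. delta holds the +1/-1 perturbation signs, which are
 * normally drawn at random each iteration. Returns 0 on success, -1 if the
 * scratch buffers could not be allocated. */
int spsa_step(double *theta, const int *delta, size_t n,
              double a, double c, objective_fn f) {
    assert(theta != NULL && delta != NULL && f != NULL && n > 0 && c > 0.0);
    double *plus = malloc(n * sizeof *plus);
    double *minus = malloc(n * sizeof *minus);
    if (plus == NULL || minus == NULL) {
        free(plus);
        free(minus);
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        plus[i] = theta[i] + c * delta[i];
        minus[i] = theta[i] - c * delta[i];
    }
    /* Two evaluations give a gradient estimate for every parameter at once. */
    double diff = f(plus, n) - f(minus, n);
    for (size_t i = 0; i < n; i++) {
        theta[i] += a * diff / (2.0 * c * delta[i]);
    }
    free(plus);
    free(minus);
    return 0;
}
```

The cost per step is two evaluations no matter how many parameters there are, which is what makes it feasible to point it at net weights and biases at all.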

steep estuary
#

tbqh i didn't want to waste hw resources i could divert to alex dev

#

so my worker was off limits for most of the time

solemn night
#

it's also probably a bit bigger to spsa also

#

but I'd do it if resources were there

steep estuary
#

we didn't have the code for it in the first place really

#

alex tuning code is very c++, since the switch to c gedas wrote some basic stuff so we could tune some things

#

but there's no infrastructure to tune the net

#

you'd need ad-hoc code for it

#

it would have 5x'd the effort easily

#

at that point it's better to implement ponder

solemn night
#

pondering is also interesting, we have no ponderhit logic, I wonder if telling the engine that opponent played the ponder move gains Elo somehow
It's in SF but we don't have it

#

I looked briefly at Andrew's code

steep estuary
#

it has to

solemn night
#

and he has it

steep estuary
#

so you do the same thing as us huh

#

just fill the TT and hope

solemn night
#

yes

steep estuary
#

it works surprisingly well for such a makeshift solution

#

but i think it's several tens of elo worse than the real one

#

(note that this is speculation, if i had the "real" one i would've used it, i clearly don't so i can't test)

solemn night
#

Haha I guess so also

#

I assume it could be a game changer, idk

#

btw it's fascinating that our submission and linrock found the same leela data for those small nets that works best, Test77

#

what data did you use?

#

probably for your bigger net

#

this Test77 data is not important

steep estuary
#

i used the same binpack i use for alex

#

which is pre transformer t80 stuff

#

i didn't try other leela binpacks

#

the only thing i tried was trying to get monty data to work

#

the idea of showing up with non sf + non leela data was funny to me

#

but it wasn't good enough

steep estuary
#

tcec had it on not that long ago

#

for a one off, sure, but still

solemn night
#

I learned some general stuff about nnue, not that much of an nnue guy
but I believe making an nnue only tournament is also unfair

#

it could create some novelties

#

and it might be the most interesting

#

but still I like to emphasize that search is very interesting also

steep estuary
#

copy pasting sf isn't as interesting

steep estuary
#

perhaps if this was about search in poker, sure

#

nnue only was a (shortsighted perhaps) suggestion to severely limit the ability of people to just clone and repackage someone else's code

#

you should not be able to get a top 10 spot by cloning and blindly maiming cfish

#

the state of the art of search is just too explored to allow for a fun, fair, interesting competition

#

exhibit A: places 1-3 and 5-10 all showing up with cfish

#

(allegedly)

proven finch
#

I have Obsidian with a 768->128x2->16->32->1 nnue, and no pondering at all.

steep estuary
#

tbh that's still repackaging existing search, even we did it by using Alex

proven finch
#

Yes ofc

steep estuary
#

the point of the state of the art being too explored and only allowing for copy pasting with some minor tweaking holds

#

congrats on fitting such a huge net tho, i assume you did some novel compression stuff

solemn night
#

but it's also repackaging the existing nnue archs and inferences and techniques and beginning from there
I don't see how that is different if all have the same cfish to begin with

steep estuary
#

mostly because no one constrains themselves to such a small net, it just makes no sense to

#

while most good search is just "good", no matter what

#

i've seen at least 4-5 different net archs in the top 10

wide oak
#

Virtually 100% of my time was spent on nets, and shrinking code.
Search patches are just uninteresting for the very reasons you've mentioned.

steep estuary
#

but ultimately chess is dead, you are right about that

wide oak
#

To some of the above convos:
I never tried any pondering experimentation. I simply did not have the drive to code it up in OpenBench correctly, and local tests were a bit hard to run due to flagging on such fast games with high concurrency. So I just kept the existing ponderhit scheme, and hoped for the best. The only thing I did is one submission (not posted on github yet) that has this code, along with -march=broadwell.

    if (strcmp(str, "ponderhit") == 0) {
        Threads.ponder = false;

        // When safe to do so, try to utilize the SlowDelay on Ponderhits
        #ifdef KAGGLE
            if (Time.totalTime > 2000)
                Time.maximumTime = max_int(Time.maximumTime, time_elapsed() + 85);
        #endif
    }
#

For data: 100% Ethereal data. All self-play or adversarial against SF, with the SF evals tossed in the trash. Tried Lc0 data for a few minutes, tried SF relabeled data, tried Ethereal relabeled data. Was unimpressed. Maybe it's better, but it's such a hard time to just insert a new net trained on a different data distribution, that I did not go any further.

#

UHHHHHHHH

#

Go to the leaderboard...

#

Guess we can't adjust the K-factor without restarting...? Maybe will be fixed lol.
Which makes those higher K-factor games obsolete.

steep estuary
#

AHAHAHAHAAHHA

wide oak
#

Hilarious, but anxiety inducing.

#

Either flubbed K-factor change, or full restart for double-error fix? who knows.

#

(assuming it was fixed. We don't know that)

steep estuary
#

i just hope this doesn't postpone the end

solemn night
#

^ me too

steep estuary
#

i really want to be done with this

wide oak
#

Yeah I got bills to pay smh

atomic karma
#

One thing we do know is we will never know why or what Kaggle did, just won't happen

wide oak
#

well idk maybe someone has stepped in, in the last 6 hrs.

steep estuary
#

you are doubting kaggle openness and ability to communicate? shame on you honestly

wide oak
#

Well it's still bad like.

#

If someone steps in now, and fixes all the memory problems, and the double errors, and swaps to increment....

#

cool........ but lost a week of time, and also now all the submissions are suboptimal again lol

steep estuary
#

i think i like the current leaderboard

#

they should freeze it

wide oak
#

No games yet played by my entries lol.

steep estuary
#

not a problem for me

wide oak
#

it does look like they already lowered the k-factor. So I actually do think this was an accidental reset.

atomic karma
#

so essentially all the games run this far with higher k-factor are now useless xD

wide oak
#

I'll feign optimism. They fixed all the bugs and did a restart.

steep estuary
#

Postponed

#

Oh well

modern fulcrum
#

haha

solemn night
#

c-number wanted it to get postponed, but the submission being locked is the difference

atomic karma
#

I am really tempted to set up a round robin tournament of interested entries (provided they are willing to open source their submissions and make them OpenBench compatible), proper UHO, proper increment

wide oak
#

For the record, I reported this three weeks ago to staff running the event and the support team.

#

I've still not gotten a response to those emails.

atomic karma
#

Andrew u r gonna get 'warned for bad word usage' again soon 😛

solemn night
#

Easy solution, increase cores/machines running the leaderboard games, call it earlier :p

wide oak
#

Well fingers crossed that the fixed version does not introduce more bugs -- and we still see the same top-3 as we've seen the entire event thus far.

atomic karma
#

I mean if google were really still involved things would be different is my opinion, kaggle i dunno what they're doing xD

steep estuary
#

kaggle is owned by google

#

i assume that's what they meant

atomic karma
#

but hands off

#

clearly...

steep estuary
#

doesn't mean they can't get hw from it

#

lol

atomic karma
#

true

#

I mean if there was a will to.. sure

#

i do get the point xD

#

relax

#

I got hw from them for compute xD so

atomic karma
#

My understanding: Any change in environment post the final submission date would be detrimental to competitors as none of them would've been able to test on 'the' environment, but it's already changed twice since, i think

steep estuary
#

What

wide oak
#

yeah the bot is not very good.

atomic karma
#

xD

wide oak
#

I like the part where it says you've been warned, but not what for.

#

LOL

atomic karma
#

hahahahahaha

solemn night
#

wait did he really get banned, what did he say?

wide oak
#

Probably. I'm gonna ditch this discord, take conversations to SF kaggle so I can speak freely.

#

GL to all.

solemn night
#

good one

muted horizon
#

Apologies for the auto-ban, we are constantly dealing with spammers and the autoban stops just a few words that spammers constantly use. Specifically @ everyone, gift, cryptocurrency, bitcoin, ethereum, and whatsapp - because the vast majority of spam is crypto scams

terse ether
#

Hi! I am new here and to Kaggle competitions, and I would like to know why it isn't possible to do late submissions for this competition? Thanks!

solemn night
#

Hi, any updates about the current state of the competition?
not entirely sure how things follow up, regarding medals, prizes, winners in general since the leaderboard has this warning
The competition has ended. The private leaderboard is preliminary and will be finalized after the results are verified.

atomic karma
#

I find it alarming that no one bothered to respond to one of the top 3 place holders of the said leaderboards. Unfortunate. Upgrading it to Disappointed.

wide oak
#

Should I simply consider this competition as abandoned, and assume there will be no finalized leader board, no medals awarded, and no prize money distributed? I'm told this duration from closure without notice or updates is considered to be highly unusual for Kaggle.

muted horizon
#

Please leave comments in the forum, the discord channels are not followed by Kaggle staff (as described in the discord rules).

#

It is not uncommon for competition finalization to take multiple weeks. Among other things, this period of time is used to investigate all cheating allegations. This is a normal process on Kaggle.

atomic karma
#

If I'm being honest, using the past as an indicator, I find it hard to believe the forum is frequented by staff. We never really got any answers on the forum either. Some high level update here, from time to time till conclusion, would help...