UE Threat Inputs for AB | Stockfish | Page 4

rocky vigil Apr 5, 2025, 7:01 PM

#

I'll try to get transpose to work in reading

#

and see what fixed nodes results are

formal smelt Apr 5, 2025, 7:18 PM

#

You can adjust the init settings

#

You should try adjusting the ft init because by default they’ll be tiny

#

I don’t think it’s an issue in bullet. There’s always a risk of dead init, and the default method yielding a dead init for this net arch is just unlucky

rocky vigil Apr 5, 2025, 8:11 PM

#

@stray reef Score of stockfish-plentychess-1024 vs stockfish-linrock-512: 313 - 320 - 367 [0.496] 1000 ... stockfish-plentychess-1024 playing White: 262 - 62 - 176 [0.700] 500 ... stockfish-plentychess-1024 playing Black: 51 - 258 - 191 [0.293] 500 ... White vs Black: 520 - 113 - 367 [0.704] 1000 Elo difference: -2.4 +/- 17.1, LOS: 39.0 %, DrawRatio: 36.7 %
(25k nodes, UHO4060v2)
btw since your impl is more advanced than mine rn maybe consider working with linrock/viren to test multilayer
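
For reference, the reported Elo difference follows from the match score through the standard logistic formula. The sketch below is only an illustration; the function names are made up here and are not taken from any testing tool.

```cpp
#include <cmath>

// Match score from a win/loss/draw record, counting a draw as half a point.
double score_from_wld(int wins, int losses, int draws) {
    return (wins + 0.5 * draws) / static_cast<double>(wins + losses + draws);
}

// Logistic Elo difference implied by a score strictly between 0 and 1.
double elo_from_score(double score) {
    return -400.0 * std::log10(1.0 / score - 1.0);
}
```

With the record above (313 wins, 320 losses, 367 draws) the score is 0.4965, which gives roughly -2.4 Elo, in line with the quoted result.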

#

in the meanwhile I'll be trying to optimize my impl in sf

stray reef Apr 5, 2025, 8:13 PM

#

wow that is a terrible result lmao

#

for me i mean

#

yeah I'm down to test or code stuff, or train some nets

rocky vigil Apr 5, 2025, 8:15 PM

#

you can check https://github.com/sscg13/Stockfish/commit/0140236ea10a9b5557cd45438218602a7e5a3533 but I think I got everything right

stray reef Apr 5, 2025, 8:16 PM

#

if there was an inference issue it would be far worse. can't check today anymore but I trust it's correct

rocky vigil Apr 5, 2025, 8:20 PM

#

stray reef yeah I'm down to test or code stuff, or train some nets

yeah I think viren really wanted to test an L1=1536 net at fixed nodes vs SF master (plentychess can't do this directly but it'll be very helpful in experiments, since linrock suggests there has to be a lot of data tweaking/etc. for large nets)
and I also want to test threats+king buckets if possible (my plan is to UE the two accumulators separately, then combine them at eval time; idk how much slowdown there is)

twilit oriole Apr 5, 2025, 8:23 PM

#

There's 4x4090 if you want to do parallel experiments. I guess we can train a new baseline and then threat nets using Leela data to get around the data bottleneck

rocky vigil Apr 5, 2025, 8:25 PM

#

yeah we need some input from linrock on this (whether bullet supports all the data parsing options now)

#

plentychess also has verbatim / mmap right

#

so hopefully local stc will be more accurate as well

formal smelt Apr 5, 2025, 9:57 PM

#

rocky vigil yeah we need some input from linrock on this (whether bullet supports all the da...

I started doing it but am blocked until #nnue-dev message is resolved

stray reef Apr 6, 2025, 2:27 PM

#

512 L1 on the threat-inputs-full branch is between 1.7M and 1.75M nps on my machine, so faster than the 1.6M nps of SF master

#

for some reason 256 L1 nets only produce nonsense right now... not sure what's wrong. so I don't have data on 256 -> 512 (you could send me a net though).
But going from 512 to 1024 decreases speed by roughly 21% in this impl

stray reef Apr 6, 2025, 3:51 PM

#

With pairwise and some minor optimisations, I can match my master speed with a (80624 -> 256)x2 -> (16 -> 32 -> 1)x8 net

stray reef Apr 8, 2025, 8:16 AM

#

--------------------------------------------------
Results of Threats256PWLayers vs Main (20000 nodes, 1t, 16MB, Pohl.epd):
Elo: -17.94 +/- 6.51, nElo: -26.61 +/- 9.63
LOS: 0.00 %, DrawRatio: 41.52 %, PairsRatio: 0.74
Games: 5000, Wins: 1402, Losses: 1660, Draws: 1938, Points: 2371.0 (47.42 %)
Ptnml(0-2): [161, 677, 1038, 507, 117], WL/DD Ratio: 1.75
--------------------------------------------------

Not a bad try I would say... I did 1000 superbatches with 13B positions for this one. This is all the data I have, but it must be possible to squeeze some 30-ish fixed nodes Elo from better training procedures. Speed is about 5% slower than main, but can probably be still improved a bit

rocky vigil Apr 8, 2025, 4:09 PM

#

Is main (16x768 -> 1536) for you? If so this is a really good result

rocky vigil Apr 8, 2025, 4:15 PM

#

stray reef for some reason 256 L1 nets only produce nonsense right now... not sure what's w...

The 256/512 ones in my branches should also be available to download from fishtest, if you still need them

#

They’re already transposed

#

340 scale, 255/64 quant

stray reef Apr 8, 2025, 4:20 PM

#

rocky vigil Is main (16x768 -> 1536) for you? If so this is a really good result

9 king buckets only, but yes, L1 1536, and multilayer

stray reef Apr 8, 2025, 4:20 PM

#

rocky vigil The 256/512 ones in my branches should also be available to download from fishte...

I'll try those too then.

#

Currently I'm giving that net a second train for 1000 SBs with a lower LR

formal smelt Apr 8, 2025, 4:30 PM

#

stray reef Results of Threats256PWLa...

oh you got pairwise init working?

#

I did merge a small fix in the default kaiming initialisation recently but im not sure if it would have made a noticeable difference

stray reef Apr 8, 2025, 4:33 PM

#

yep. for the sake of staying with TrainerBuilder I modified new_affine_custom to

```rust
pub fn new_affine_custom(&self, id: &str, input_size: usize, output_size: usize, bias_cols: usize) -> Affine {
  let wid = format!("{}w", id);
  let stdev = (1.0 / (input_size as f32 * bias_cols as f32).sqrt()).max(0.05);
  let init = InitSettings::Normal { mean: 0.0, stdev: stdev };
  let weights = self.new_weights(&wid, Shape::new(output_size, input_size), init);
  let bias = self.new_weights(&format!("{}b", id), Shape::new(output_size, bias_cols), InitSettings::Zeroed);

  Affine { weights, bias }
}
```
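
The stdev rule in that function is a floor on `1 / sqrt(input_size * bias_cols)`. A small sketch of just the arithmetic, with a hypothetical helper name that is not part of bullet:

```cpp
#include <algorithm>
#include <cmath>

// Mirrors the stdev expression above: 1 / sqrt(input_size * bias_cols),
// but never smaller than floor_value (0.05 in the snippet).
double init_stdev(double input_size, double bias_cols, double floor_value = 0.05) {
    return std::max(1.0 / std::sqrt(input_size * bias_cols), floor_value);
}
```

For an 80624-input layer the unclamped value is about 0.0035, so the 0.05 floor is what actually gets used; a 100-input layer would get 0.1.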

formal smelt Apr 8, 2025, 4:33 PM

#

you can seed the weights without doing that btw

#

```rust
trainer.optimiser_mut().graph.get_weights_mut("l0w").seed_random(0.0, 0.05, true).unwrap();
```

stray reef Apr 8, 2025, 4:35 PM

#

I see, that's good to know

#

Not sure if there's something better than 0.05, but that's what the formula works out to for my master net (at least that's what I remember), so I just tried it and loss was fine

formal smelt Apr 8, 2025, 4:37 PM

#

https://github.com/cosmobobak/bullet/blob/viridithas4/examples/viridithas.rs#L123 cosmo uses this

stray reef Apr 8, 2025, 5:16 PM

#

--------------------------------------------------
Results of PlentyThreats256PWLayers-0091 vs PlentyLinrock256SingleLayer-nn-23507ff7848b.nnue (20000 nodes, 1t, 16MB, Pohl.epd):
Elo: 1.11 +/- 6.46, nElo: 1.66 +/- 9.63
LOS: 63.20 %, DrawRatio: 43.16 %, PairsRatio: 0.99
Games: 5000, Wins: 1525, Losses: 1509, Draws: 1966, Points: 2508.0 (50.16 %)
Ptnml(0-2): [127, 587, 1079, 557, 150], WL/DD Ratio: 1.63
--------------------------------------------------
Results of PlentyThreats256PWLayers-0091 vs PlentyLinrock256SingleLayer-nn-23507ff7848b.nnue (6+0.06, 1t, 16MB, Pohl.epd):
Elo: 9.34 +/- 5.48, nElo: 17.46 +/- 10.23
LOS: 99.96 %, DrawRatio: 50.09 %, PairsRatio: 1.20
Games: 4428, Wins: 1218, Losses: 1099, Draws: 2111, Points: 2273.5 (51.34 %)
Ptnml(0-2): [18, 485, 1109, 564, 38], WL/DD Ratio: 1.09
--------------------------------------------------
Results of PlentyThreats256PWLayers-0091 vs PlentyLinrock256SingleLayer-nn-23507ff7848b.nnue (30+0.3, 1t, 64MB, Pohl.epd):
Elo: 5.87 +/- 7.75, nElo: 12.44 +/- 16.43
LOS: 93.11 %, DrawRatio: 56.46 %, PairsRatio: 1.17
Games: 1718, Wins: 463, Losses: 434, Draws: 821, Points: 873.5 (50.84 %)
Ptnml(0-2): [2, 170, 485, 201, 1], WL/DD Ratio: 1.16
--------------------------------------------------

Seems like multilayer just about balances out linrock's improved training setup and data at fixed nodes.
STC and LTC are looking similar too. Tomorrow I will test against plenty main

formal smelt Apr 10, 2025, 3:34 PM

#

stray reef Results of Threats256PWLa...

One thing that could be worth trying is factorising the threat inputs,
E.g. for each threat input you could add a factoriser for just the target square and piece

#

Obviously it would be a rather significant training speed hit

lofty cedar Apr 12, 2025, 8:14 AM

#

How's it even going? Looks like the idea takes pretty long to implement...

stray reef Apr 12, 2025, 8:15 AM

#

for me at least, it's too slow currently... it'd either need to be much stronger for the speed difference, or much faster with no strength loss

lofty cedar Apr 12, 2025, 8:16 AM

#

I see. It's quite hard to implement as an idea.

stray reef Apr 12, 2025, 4:31 PM

#

I think next week I'll try a much simpler threat input set, essentially 768x2x2, which just encodes for every piece if it's attacked and if it's protected. That should be much faster with regards to UE. Though it will also be a lot worse at fixed nodes compared to even simplified threat inputs
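
One way such a 768x2x2 input set could be indexed is sketched below. The layout (piece-square index first, then the attacked bit, then the defended bit) is purely an assumption for illustration; the thread does not specify an ordering.

```cpp
#include <cstddef>

constexpr std::size_t kNumFeatures = 768 * 2 * 2;  // 3072 inputs in total

// color: 0 or 1, piece_type: 0..5, square: 0..63.
// attacked / defended: whether any enemy / friendly piece attacks the square.
constexpr std::size_t feature_index(int color, int piece_type, int square,
                                    bool attacked, bool defended) {
    // Plain 768 piece-square index, 0..767.
    const std::size_t psq = (static_cast<std::size_t>(color) * 6 + piece_type) * 64 + square;
    // Append the two status bits.
    return (psq * 2 + (attacked ? 1 : 0)) * 2 + (defended ? 1 : 0);
}
```

With this layout a piece always activates exactly one input, so a status change for a piece costs one feature removal and one addition.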

formal smelt Apr 12, 2025, 4:33 PM

#

stray reef I think next week I'll try a much simpler threat input set, essentially 768x2x2,...

That is what Monty used to do

#

Before big threat inputs

stray reef Apr 12, 2025, 4:34 PM

#

do you have any data on the fixed nodes strength difference?

#

An advantage of the 768x2x2 feature set is that it should be doable to king bucket...

formal smelt Apr 12, 2025, 4:36 PM

#

stray reef do you have any data on the fixed nodes strength difference?

Large

#

Like 50 elo or something

#

Actually it was 50 elo at stc

#

The new threat input had halved L1 compared to the old threat input net

stray reef Apr 12, 2025, 4:59 PM

#

Hm alright. It'll definitely be stronger than plain 768 :P

stray reef Apr 15, 2025, 7:55 PM

#

I had another idea for reducing the number of updates.

Basically, I feel like the net should be able to figure out everything from all the threat features, so we only need to activate each standard 768 feature if the corresponding piece is not attacked or defended at all

#

Since, especially in the middlegame, pieces pretty much always move between squares that some piece already has vision on, that should mostly get rid of the updates required for the 768 features, at hopefully a very minor fixed nodes loss

rocky vigil Apr 15, 2025, 10:06 PM

#

This would significantly reduce the number of input changes

#

Should scale better with L1 increases (if it works)

formal smelt Apr 15, 2025, 10:56 PM

#

stray reef I had another idea for reducing the number of updates. Basically, I feel like t...

that was the original idea that Viren outlined and it was way worse when we tested in Monty

#

i think you might be underestimating the difficulty of having to deduce piece value from some combination of threats given/received

stray reef Apr 16, 2025, 7:13 AM

#

I was hoping the net might figure it out even though it sounds hard

#

But if you already tested it then nevermind

rocky vigil Apr 25, 2025, 12:02 AM

#

i think on average the psq terms have much bigger influence

#

which makes sense

rocky vigil Apr 25, 2025, 12:03 AM

#

rocky vigil i think on average the psq terms have much bigger influence

(i.e. borking psq terms causes evals to be complete nonsense, but borking threat terms will still get smth less than 1000cp away)

stray reef Apr 29, 2025, 9:32 AM

#

I'm implementing the 79856+768xK arch in bullet rn but I'm not sure I'm using Factorised / Factorises quite correctly. I plan to merge myself, so I didn't implement merge_factoriser. Loss looks alright definitely, but before I waste hours or days of compute, @formal smelt could you take a look at https://pastebin.com/9YWK3xp9 if that looks reasonable?

formal smelt Apr 29, 2025, 10:19 AM

#

stray reef I'm implementing the 79856+768xK arch in bullet rn but I'm not sure I'm using Fa...

lgtm

stray reef Apr 29, 2025, 10:22 AM

#

perfect, thanks

#

i'm not 100% sure about the layout of the input weights in raw.bin though. are the factorised weights at the very beginning (before the threat feature weights)? surely they must be, because i didn't tell bullet they should start at 79856

formal smelt Apr 29, 2025, 10:29 AM

#

yeah they're put at the beginning

stray reef May 1, 2025, 11:38 AM

#

#top-dev-chill message

stray reef May 1, 2025, 12:17 PM

#

Comparing the (79856+768x12 -> 2048)x2 -> (32 -> 64 -> 1)x8 against the (79856+768x1 -> 2048)x2 -> (16 -> 32 -> 1)x8 I trained a few weeks ago, there's at least a 50 elo fixed nodes difference here. Of course I don't know how much of it comes from the king buckets vs. the larger later layers. But I do think that UEing the king buckets together with threats is the way to go to make this work

formal smelt May 1, 2025, 1:13 PM

#

stray reef Comparing the `(79856+768x12 -> 2048)x2 -> (32 -> 64 -> 1)x8` against the `(7985...

How big is this net lol

#

Like in mb

stray reef May 1, 2025, 1:13 PM

#

quantised it's 365.4MB
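
A rough size check for the input layer, assuming 16-bit feature-transformer weights (an assumption; the quantisation widths are not stated in the thread). The helper names are hypothetical.

```cpp
#include <cstdint>

// Weights in the input layer: threat features plus king_buckets copies of the
// 768 piece-square features, each connected to l1 neurons.
constexpr std::uint64_t ft_weight_count(std::uint64_t threat_inputs,
                                        std::uint64_t king_buckets,
                                        std::uint64_t l1) {
    return (threat_inputs + 768 * king_buckets) * l1;
}

// Storage for those weights at the given width in bytes.
constexpr std::uint64_t ft_weight_bytes(std::uint64_t threat_inputs,
                                        std::uint64_t king_buckets,
                                        std::uint64_t l1,
                                        std::uint64_t bytes_per_weight) {
    return ft_weight_count(threat_inputs, king_buckets, l1) * bytes_per_weight;
}
```

For 79856 threat inputs, 12 king buckets and L1 = 2048 this is 182,419,456 weights, or about 364.8 MB at two bytes each, so the input layer accounts for nearly all of the quoted file size.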

daring wren May 1, 2025, 1:41 PM

#

💀

twilit oriole May 1, 2025, 4:07 PM

#

It compresses well though

stray reef May 21, 2025, 10:13 AM

#

finished UE for threat inputs + king buckets

#

my GPU is busy for another 2 days but then i'll try to find some arch that has chances at real TCs

#

does anyone know how much king buckets can gain at fixed nodes, against an already mirrored net?

formal smelt May 21, 2025, 1:58 PM

#

4 buckets was +20 in akimbo iirc

#

Over HM

formal smelt May 21, 2025, 1:59 PM

#

stray reef my GPU is busy for another 2 days but then i'll try to find some arch that has c...

Btw are you still doing pairwise with a tiny HL?

#

When I was messing about with multilayer nets, using pairwise with a HL of 256 lost a lot of elo compared to not using it

#

Presumably because 128->256 is a lot more elo than 768->1536 or whatever most people have now

stray reef May 21, 2025, 2:01 PM

#

ohhh good point, yes i am

#

gonna try 256 L1 without pairwise then, it's gonna be slower than master for sure but should be stronger at fixed nodes

#

simd should allow for steps of 64, but 192 may be too weak

stray reef May 21, 2025, 3:25 PM

#

@formal smelt do you think a threat inputs net of that size can be trained on capture positions too?

formal smelt May 21, 2025, 3:59 PM

#

stray reef @formal smelt do you think a threat inputs net of that size can be train...

🤷‍♂️

rocky vigil May 21, 2025, 6:46 PM

#

stray reef @formal smelt do you think a threat inputs net of that size can be train...

afaik linrock tried it and it didn't go too well (in Yukari at least)

#

up here

stray reef May 21, 2025, 6:51 PM

#

alright thx

stray reef May 23, 2025, 7:38 AM

#

This arch trains almost 4x faster than my master arch kekgasm

formal smelt May 23, 2025, 12:28 PM

#

stray reef This arch trains almost 4x faster than my master arch kekgasm

eta?

stray reef May 23, 2025, 12:29 PM

#

probably it won't finish before I sleep. but i can test an almost-fully trained version in like 4-8h

stray reef May 23, 2025, 4:48 PM

#

Comparing not against the master net rn, since they have different training schedules. Instead comparing against a master arch net 0102 that uses the same training schedule as the threat inputs net 0103, at the same point in training (after stage 2 finished)

--------------------------------------------------
Results of 0103r vs Main-0102r (20000 nodes, 1t, 16MB, Pohl.epd):
Elo: -38.04 +/- 8.17, nElo: -58.73 +/- 12.51
LOS: 0.00 %, DrawRatio: 42.00 %, PairsRatio: 0.54
Games: 2962, Wins: 725, Losses: 1048, Draws: 1189, Points: 1319.5 (44.55 %)
Ptnml(0-2): [108, 449, 622, 262, 40], WL/DD Ratio: 1.60
--------------------------------------------------

Not looking great. STC is running

#

I've actually used dual activation for L2 -> L3 here without thinking about it. But I doubt that'll make it any weaker, even with a small L1

stray reef May 23, 2025, 5:26 PM

#

--------------------------------------------------
Results of 0103r vs Main-0102r (5+0.05, 1t, 16MB, Pohl.epd):
Elo: -32.15 +/- 8.48, nElo: -63.80 +/- 16.72
LOS: 0.00 %, DrawRatio: 51.87 %, PairsRatio: 0.47
Games: 1658, Wins: 345, Losses: 498, Draws: 815, Points: 752.5 (45.39 %)
Ptnml(0-2): [13, 258, 430, 125, 3], WL/DD Ratio: 0.99
--------------------------------------------------

STC is holding up though!

rocky vigil May 23, 2025, 5:27 PM

#

this is ( -> 256 (no pairwise))x2 -> (16 (dual activation) -> 32 -> 1)x8?

stray reef May 23, 2025, 5:27 PM

#

There is a good chance this training schedule is absolute trash for threat input nets. So no matter how this holds up when fully trained, I'll give it another attempt

stray reef May 23, 2025, 5:27 PM

#

rocky vigil this is ( -> 256 (no pairwise))x2 -> (16 (dual activation) -> 32 -> 1)x8?

yes

rocky vigil May 23, 2025, 5:27 PM

#

ah interesting the scaling looks decent

stray reef May 23, 2025, 5:28 PM

#

yeah the speed is very good also

#

just a matter of making this arch strong, I think

#

"just"

daring wren May 23, 2025, 5:28 PM

#

🚀

rocky vigil May 23, 2025, 5:30 PM

#

stray reef yeah the speed is very good also

I think L1 and threat tracking probably take comparatively more time compared to L2 so the loss of pairwise speed probably doesn't hit as hard

formal smelt May 23, 2025, 5:35 PM

#

stray reef Results of 0103r vs Main-...

are the king buckets factorised?

stray reef May 23, 2025, 5:36 PM

#

rocky vigil I think L1 and threat tracking probably take comparatively more time compared to...

perf for depth = 20 bench of main (left) vs new (right)

stray reef May 23, 2025, 5:36 PM

#

formal smelt are the king buckets factorised?

yes

stray reef May 23, 2025, 5:37 PM

#

stray reef perf for depth = 20 bench of main (left) vs new (right)

overhead of evaluate() goes down (since smaller L1 -> L2), but incremental updates of threats, + makemove and related threat tracking is slower

rocky vigil May 23, 2025, 5:43 PM

#

yes it looks like the threat tracking is more expensive than the actual evaluation

#

I had smth similar in SF though that was without incremental threat tracking

stray reef May 24, 2025, 7:31 AM

#

Final results vs main

--------------------------------------------------
Results of 0103rr vs Main (20000 nodes, 1t, 16MB, Pohl.epd):
Elo: -13.28 +/- 6.35, nElo: -20.14 +/- 9.63
LOS: 0.00 %, DrawRatio: 42.84 %, PairsRatio: 0.81
Games: 5000, Wins: 1427, Losses: 1618, Draws: 1955, Points: 2404.5 (48.09 %)
Ptnml(0-2): [146, 644, 1071, 533, 106], WL/DD Ratio: 1.75
--------------------------------------------------
Results of 0103rr vs Main (5+0.05, 1t, 16MB, Pohl.epd):
Elo: -37.02 +/- 9.19, nElo: -71.43 +/- 17.59
LOS: 0.00 %, DrawRatio: 50.20 %, PairsRatio: 0.44
Games: 1498, Wins: 313, Losses: 472, Draws: 713, Points: 669.5 (44.69 %)
Ptnml(0-2): [17, 242, 376, 111, 3], WL/DD Ratio: 1.09
--------------------------------------------------

Scaling is worse against main, maybe due to the training schedule, not sure.

Second attempt is underway, ETA 24h

Another idea would be to go for L1=192, but a bigger L2? like 32 or 64? no idea if that would be stronger at a similar speed

stray reef May 25, 2025, 8:21 AM

#

Next attempt

--------------------------------------------------
Results of 0104r vs Main (20000 nodes, 1t, 16MB, Pohl.epd):
Elo: -7.30 +/- 6.30, nElo: -11.16 +/- 9.63
LOS: 1.16 %, DrawRatio: 43.08 %, PairsRatio: 0.88
Games: 5000, Wins: 1451, Losses: 1556, Draws: 1993, Points: 2447.5 (48.95 %)
Ptnml(0-2): [128, 628, 1077, 555, 112], WL/DD Ratio: 1.66
--------------------------------------------------
Results of 0104r vs Main (5+0.05, 1t, 16MB, Pohl.epd):
Elo: -33.55 +/- 9.50, nElo: -63.68 +/- 17.92
LOS: 0.00 %, DrawRatio: 49.03 %, PairsRatio: 0.47
Games: 1444, Wins: 312, Losses: 451, Draws: 681, Points: 652.5 (45.19 %)
Ptnml(0-2): [12, 239, 354, 110, 7], WL/DD Ratio: 1.13
--------------------------------------------------

Still not enough. At STC the slowdown kicks in (idk why it didn't here: #1336647760388034610 message), and even fixed nodes isn't quite good enough... at least with this training setup

lofty cedar May 25, 2025, 4:39 PM

#

Is the UE threat input still being tried in Stockfish?

#

Or have the Stockfish devs moved past this idea?

formal smelt May 25, 2025, 4:46 PM

#

presumably if Yoshie, as the most serious attempt at threat inputs in a/b engines thus far, gets a gainer net, then it will encourage people to try it seriously in SF
SF has not even had a properly trained threat input net tried yet afaik

formal smelt May 25, 2025, 4:47 PM

#

lofty cedar Or have the Stockfish devs moved past this idea?

also, having moved past the idea is not the only explanation for people not currently trying something in SF

twilit oriole May 25, 2025, 4:49 PM

#

I think the underestimated drawback was the threat tracking overhead which has ended up much higher than initial expectations

#

@stray reef How does your threat tracking work?

#

And what branch is it on

#

https://github.com/Yoshie2000/PlentyChess/blob/threat-inputs-full-layers-pairwise-kingbuckets/src/threat-inputs.cpp i guess

twilit oriole May 25, 2025, 6:46 PM

#

Where do I get the net also

stray reef May 25, 2025, 6:51 PM

#

which net do you want exactly? I've uploaded some past nets to my net repo but not these recent ones

stray reef May 25, 2025, 6:51 PM

#

twilit oriole @stray reef How does your threat tracking work?

yeah that branch you just linked is the correct one, the threat updating etc is done like in yukari

#

and the feature calculation in the file you linked obv

twilit oriole May 25, 2025, 6:52 PM

#

stray reef Next attempt: Results of 0...

This one I guess?

stray reef May 25, 2025, 6:56 PM

#

Network: https://github.com/Yoshie2000/PlentyNetworks/releases/tag/0104r
Branch using it: https://github.com/Yoshie2000/PlentyChess/tree/0104r

#

if you want a verbatim version of the net, run make normally and it'll be put at processed.bin

stray reef May 25, 2025, 7:40 PM

#

is that clang or gcc?

#

compiler not supported? wait

#

what is your compiler / os setup

twilit oriole May 25, 2025, 7:41 PM

#

Thats mingw

#

oh it uses clang lmao

stray reef May 25, 2025, 7:46 PM

#

mmm

#

i haven't compiled on mingw in a while maybe i broke smth

#

xD

#

gcc should work tho

twilit oriole May 25, 2025, 8:00 PM

#

```
g++.exe (Rev3, Built by MSYS2 project) 14.1.0

g++ -std=c++17 -Wall -pedantic -Wextra -fcommon -pthread -O3 -g -ggdb -DARCH_X86 -march=native -lstdc++ -static -Wl,--no-as-needed -DEVALFILE=\"processed.bin\"  -c src/engine.cpp -o src/engine.o
In file included from src/uci.h:5,
                 from src/engine.cpp:2:
src/nnue.h:472:3: error: '__attribute_noinline__' does not name a type
  472 |   __attribute_noinline__ void resetAccumulator(Board* board, Accumulator* acc);
      |   ^~~~~~~~~~~~~~~~~~~~~~
src/nnue.h:476:3: error: '__attribute_noinline__' does not name a type
  476 |   __attribute_noinline__ void calculateAccumulators();
      |   ^~~~~~~~~~~~~~~~~~~~~~
src/nnue.h:479:3: error: '__attribute_noinline__' does not name a type
  479 |   __attribute_noinline__ void refreshPieceFeatures(Accumulator* acc, KingBucketInfo* kingBucket);
      |   ^~~~~~~~~~~~~~~~~~~~~~
src/nnue.h:481:3: error: '__attribute_noinline__' does not name a type
  481 |   __attribute_noinline__ void refreshThreatFeatures(Accumulator* acc);
      |   ^~~~~~~~~~~~~~~~~~~~~~
src/nnue.h:484:3: error: '__attribute_noinline__' does not name a type
  484 |   __attribute_noinline__ void incrementallyUpdatePieceFeatures(Accumulator* inputAcc, Accumulator* outputAcc, KingBucketInfo* kingBucket);
      |   ^~~~~~~~~~~~~~~~~~~~~~
src/nnue.h:486:3: error: '__attribute_noinline__' does not name a type
  486 |   __attribute_noinline__ void incrementallyUpdateThreatFeatures(Accumulator* inputAcc, Accumulator* outputAcc, KingBucketInfo* kingBucket);
      |   ^~~~~~~~~~~~~~~~~~~~~~
make[1]: *** [Makefile:164: src/engine.o] Error 1
make[1]: Leaving directory '/c/Users/Viren/Documents/Github/PlentyChess-0104r/PlentyChess-0104r'
make: *** [Makefile:153: all] Error 2
```

#

```
CXXFLAGS = -std=c++17 -Wall -pedantic -Wextra -fcommon -pthread -O3 \
           -D'__attribute_noinline__=__attribute__((noinline))'
CXXFLAGS_EXTRA =
```
I put this at the top of my Makefile to fix g++ for now

stray reef May 25, 2025, 8:01 PM

#

right. those aren't necessary anyway, just put them there for profiling

rocky vigil May 25, 2025, 8:04 PM

#

lofty cedar Is the UE threat input still being tried in Stockfish?

there is an impasse because linrock wants to see sufficient speed optimization before seriously trying to make a net and we want to see sufficient fixed nodes before seriously trying to speed optimize

twilit oriole May 25, 2025, 8:05 PM

#

stray reef Next attempt ``` -------------------------------------------------- Results of 0...

Hm i think this fixed nodes is good enough? Im starting to speed optimize from it

stray reef May 25, 2025, 8:07 PM

#

we'd need to increase L1 to 320 I think. Not sure. With this exact arch I don't think I can squeeze much more than 10 elo fixed nodes without extreme effort

twilit oriole May 25, 2025, 8:07 PM

#

yeah sure

stray reef May 25, 2025, 8:07 PM

#

320 L1 would easily pass fixed nodes then ofc

twilit oriole May 25, 2025, 8:08 PM

#

I dont know about inference tricks but i think there are some tricks with the threats themselves. Like I think there are situations where you know you can terminate the calculation of threats early because there cant be further threats

stray reef May 25, 2025, 8:08 PM

#

yeah that's the stuff I didn't really put much thought into

#

there's also probably many moves that add and remove the same index

#

and i don't check for that rn either
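
A minimal sketch of that check, assuming the per-move threat changes are collected as two index lists before touching the accumulator. `FeatureDelta` and `cancel_matching` are hypothetical names, not code from any of the branches discussed here.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

struct FeatureDelta {
    std::vector<int> added;    // feature indices to add to the accumulator
    std::vector<int> removed;  // feature indices to subtract from it
};

// Drops every index that appears in both lists, since adding and then
// removing the same weights column is a no-op. Each match cancels once.
void cancel_matching(FeatureDelta& delta) {
    std::sort(delta.added.begin(), delta.added.end());
    std::sort(delta.removed.begin(), delta.removed.end());
    std::vector<int> keep_added, keep_removed;
    std::set_difference(delta.added.begin(), delta.added.end(),
                        delta.removed.begin(), delta.removed.end(),
                        std::back_inserter(keep_added));
    std::set_difference(delta.removed.begin(), delta.removed.end(),
                        delta.added.begin(), delta.added.end(),
                        std::back_inserter(keep_removed));
    delta.added = std::move(keep_added);
    delta.removed = std::move(keep_removed);
}
```

Whether the sorting cost pays for itself depends on how many indices typically cancel, which would need measuring.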

rocky vigil May 25, 2025, 8:09 PM

#

well yeah especially if you do capture sequence

candid ivy May 25, 2025, 8:10 PM

#

rocky vigil there is an impasse because linrock wants to see sufficient speed optimization b...

if you write me a bullet config for a "serious" net together with some datasets I can also bake something

rocky vigil May 25, 2025, 8:10 PM

#

see you would have to consult linrock on that

#

the issue is you are running say L1 = N but with a speed of N+1024 or smth

twilit oriole May 25, 2025, 8:13 PM

#

SF has L1 3072. So a pretty large threat input net should be possible at equal speed

#

Maybe an L1 1024 threat input net even

rocky vigil May 25, 2025, 8:15 PM

#

well with my last attempt we only got 256 to barely be faster (without incremental threat computation)

#

so we need a major overhaul in sf

#

the second issue is that we never got the bullet -> sf arch working

#

something goes wrong in the transpose or whatever

#

the custom kernels are almost certainly significantly better than my autovec'd for loops

#

(even for single layer)

#

upstream has had a major NNUE code refactor since the last time I worked on threat inputs in sf btw

#

so basically the next attempt will be almost from scratch

twilit oriole May 25, 2025, 8:20 PM

#

I think what we will have to do is forget about SF for now. Train two leeler nets for use in a plentychess branch, L1 3072 regular net and L1 1024 threat input net, and then show how much better the threat input net is in this closest representation of what it would be like in SF. Then the idea will be fully proven finally

rocky vigil May 25, 2025, 8:21 PM

#

yeah that makes sense

#

and once everything is known (for speed optimization) I can work on adding it

stray reef May 25, 2025, 8:28 PM

#

rocky vigil well with my last attempt we only got 256 to barely be faster (without increment...

this arch should be equivalent in speed with like a 2048 L1 net, in the current impl

rocky vigil May 25, 2025, 8:31 PM

#

ah

twilit oriole May 26, 2025, 8:00 AM

#

@stray reef Could you lazy update the threat generation itself? Like only walk through the threat index updates when evaluate is needed

stray reef May 26, 2025, 8:02 AM

#

you mean what currently happens in addPiece, removePiece or movePiece?

#

sounds like it could save some time yeah

lofty cedar Jul 16, 2025, 6:07 AM

#

Bumping back this thread... byteboard representation might actually speed up NNUE with threat input.

stray reef Jul 16, 2025, 8:00 AM

#

The one @plain flower is working on? I guess so, currently incremental threat updates take up 5+% of the total runtime

rocky vigil Aug 6, 2025, 1:32 AM

#

What I am more concerned about is how going from L1=256 to L1=512 was basically neutral at STC and only +6 elo at LTC, despite a big fixed nodes gain. That result may have something to do with either speed optimization or the fact that output buckets were messed up at the time of that test (using (pieces - 1)/4 instead of (pieces - 2)/4)
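
To see how far the two output-bucket formulas diverge, here is a small sketch. It reads the "-2" above as (pieces - 2)/4, and the function names are made up for illustration.

```cpp
// Output bucket chosen from the total piece count (2..32), integer division.
constexpr int bucket_minus1(int pieces) { return (pieces - 1) / 4; }
constexpr int bucket_minus2(int pieces) { return (pieces - 2) / 4; }

// Number of piece counts in 2..32 where the two formulas disagree.
constexpr int count_mismatches() {
    int mismatches = 0;
    for (int pieces = 2; pieces <= 32; ++pieces)
        if (bucket_minus1(pieces) != bucket_minus2(pieces))
            ++mismatches;
    return mismatches;
}
```

The formulas disagree only when the piece count is 5, 9, 13, ..., 29, so such a mismatch would pick the wrong output bucket in a minority of positions rather than all of them.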

twilit oriole Aug 6, 2025, 1:34 AM

#

Data starvation seems more likely to me

rocky vigil Aug 6, 2025, 1:34 AM

#

idk the details of linrock's training

twilit oriole Aug 6, 2025, 1:34 AM

#

Well, that's one possibility. I don't know how much data and training time it had

#

Oh it's Linrock. Then it won't be that lol

rocky vigil Aug 6, 2025, 1:36 AM

#

there is definitely still a significant amount that linrock could probably gain with training routine

#

iirc he did a second stage of the L1=256 and gained 6 elo on top

#

@plain flower can I learn more about incremental threat tracking?

#

the simplest working method for threat inputs, would be, given a position and a move, compute the added and removed threats

#

we can ignore pins, making this simpler

#

I don't think we necessarily need full attack table knowledge (in particular, we may be able to save some computation), but I am not the expert on this

lofty cedar Aug 6, 2025, 1:50 AM

#

Maybe we can try several versions, with pins incorporated or ignored.

rocky vigil Aug 6, 2025, 1:57 AM

#

the network is trained ignoring pins, so it would be best if we also run inference ignoring pins

#

we do not need all the functions necessary for movegen, we only need enough functionality to know what is attacked

lofty cedar Aug 6, 2025, 2:45 AM

#

Yes... I mean each version would need its own network as well.

#

But might as well extract as much information as possible to feed into the net if it helps.

#

For now though, let's just ignore pins.

rocky vigil Aug 6, 2025, 4:27 AM

#

At least if you compare my impl with yukari you see that despite having L1=384 Yukari is way faster in midgame positions

#

(well Yukari also uses simplified threat inputs but afaik it should be around the same speed as with full threat inputs?)

plain flower Aug 6, 2025, 7:50 PM

#

rocky vigil @plain flower can I learn more about incremental threat tracking?

tbh doing superpiece rays from src/dest and updating relevant sliders is pretty much it

rocky vigil Aug 6, 2025, 7:53 PM

#

hmm so basically

#

do superpiece from src

#

update all sliders attacking src

#

do superpiece from dest

#

update all sliders attacking dest?

plain flower Aug 6, 2025, 7:54 PM

#

yeah

#

in attack tables this would be called slider extension / slider retraction respectively
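
A rough sketch of the "superpiece" lookup on a plain mailbox board: walk the eight queen directions outward from a square and collect the sliders whose rays reach it. When a piece leaves the square those sliders' rays extend; when a piece arrives they retract. The board type and names are assumptions for illustration only, not how Clockwork or Yukari implement it.

```cpp
#include <array>
#include <vector>

enum Piece : char { Empty = 0, Bishop, Rook, Queen, Other };
using Board = std::array<Piece, 64>;  // index = rank * 8 + file

// Squares of sliders that currently see sq, found by scanning each of the
// eight directions from sq until the first occupied square.
std::vector<int> sliders_seeing(const Board& board, int sq) {
    static constexpr int dr[8] = {1, -1, 0, 0, 1, 1, -1, -1};
    static constexpr int df[8] = {0, 0, 1, -1, 1, -1, 1, -1};
    std::vector<int> result;
    for (int d = 0; d < 8; ++d) {
        const bool diagonal = d >= 4;
        int r = sq / 8 + dr[d];
        int f = sq % 8 + df[d];
        while (r >= 0 && r < 8 && f >= 0 && f < 8) {
            const Piece p = board[r * 8 + f];
            if (p != Empty) {
                // Only a slider moving along this direction sees sq.
                if (p == Queen || (diagonal ? p == Bishop : p == Rook))
                    result.push_back(r * 8 + f);
                break;  // the first blocker ends the ray
            }
            r += dr[d];
            f += df[d];
        }
    }
    return result;
}
```

A bitboard engine would get the same set from slider attack lookups instead of a loop; the point is only that the affected sliders are found from the src and dest squares alone.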

rocky vigil Aug 6, 2025, 7:54 PM

#

I was thinking of this but it looked painful to implement

#

esp. since we don't have a way of only doing a single ray-direction

#

it would be more convenient if we had file-only attacks, for instance

rocky vigil Aug 6, 2025, 7:55 PM

#

plain flower tbh doing superpiece rays from src/dest and updating relevant sliders is pretty ...

what about e.g. castling (especially frc castling?)

plain flower Aug 6, 2025, 7:57 PM

#

rocky vigil what about e.g. castling (especially frc castling?)

code from Clockwork:
```cpp
switch (m.flags()) {
case MoveFlags::Normal:
    new_pos.incrementally_move_piece(color, from, to, src);
    // ...
    break;
case MoveFlags::CaptureBit:
    new_pos.incrementally_remove_piece(color, src.id(), from);
    new_pos.incrementally_mutate_piece(!color, dst.id(), to, color, src);

    // ...
    break;
case MoveFlags::Castle: {
    // ...

    // TODO: Optimize further (slider updates can be elided in some cases).
    new_pos.incrementally_remove_piece(color, king_id, king_from);
    new_pos.incrementally_remove_piece(color, rook_id, rook_from);
    new_pos.incrementally_add_piece(color, king_place, king_to);
    new_pos.incrementally_add_piece(color, rook_place, rook_to);

    // ...
    break;
}
case MoveFlags::EnPassant: {
    // ...

    new_pos.incrementally_remove_piece(!color, victim.id(), victim_sq);
    new_pos.incrementally_move_piece(color, from, to, src);

    // ...
    break;
}
case MoveFlags::PromoKnight:
case MoveFlags::PromoBishop:
case MoveFlags::PromoRook:
case MoveFlags::PromoQueen: {
    // ...

    new_pos.incrementally_move_piece(color, from, to, new_place);

    // ...
    break;
}
case MoveFlags::PromoKnightCapture:
case MoveFlags::PromoBishopCapture:
case MoveFlags::PromoRookCapture:
case MoveFlags::PromoQueenCapture: {
    // ...

    new_pos.incrementally_remove_piece(color, src.id(), from);
    new_pos.incrementally_mutate_piece(!color, dst.id(), to, color, new_place);

    // ...
    break;
}
}
```

#

where move does extension at src and retraction at dst
add_piece just does retraction
and remove_piece just does extension

#

mutate doesn't do any slider updates

rocky vigil Aug 6, 2025, 7:59 PM

#

hmm for threat inputs we don't want to have to mutate separately since that means changing all the corresponding inputs of attackers of that piece

plain flower Aug 6, 2025, 8:00 PM

#

yukari doesn't have mutate

#

it just does a remove then add

rocky vigil Aug 6, 2025, 8:00 PM

#

rocky vigil At least if you compare my impl with yukari you see that despite having L1=384 Y...

btw can someone independently verify this

#

at least on my laptop yukari seems 2x faster in a typical midgame position

#

which is honestly just sad

plain flower Aug 6, 2025, 8:01 PM

#

yukari doesn't even use bitboards or byteboards lol

rocky vigil Aug 6, 2025, 8:01 PM

#

yeah but afaik for L1=256 threat tracking takes up like half the total runtime

#

I would like to reduce that significantly

#

since I know it should be possible lol

plain flower Aug 6, 2025, 8:02 PM

#

i had a patch that does bitrays for SEE implemented for AVX2 and AVX512 which could be adapted for threat updates

rocky vigil Aug 6, 2025, 8:03 PM

#

yeah I believe that vector stuff can make this faster

plain flower Aug 6, 2025, 8:05 PM

#

https://github.com/official-stockfish/Stockfish/compare/master...87flowers:Stockfish:faster-see-3

rocky vigil Aug 6, 2025, 8:14 PM

#

stray reef perf for depth = 20 bench of main (left) vs new (right)

reviewing this again it seems concerning that updating the threat feature accumulator is 4x more expensive than piece features, since from my measurements the average number of changed features should be comparable, not 4x as many

#

btw I also was not able to compile PlentyChess-0104r because of these weird issues:
src/nnue.h:472:3: error: unknown type name '__attribute_noinline__'
472 | __attribute_noinline__ void resetAccumulator(Board* board, Accumulator* acc);
    | ^
src/nnue.h:476:3: error: unknown type name '__attribute_noinline__'
476 | __attribute_noinline__ void calculateAccumulators();
    | ^
src/nnue.h:479:3: error: unknown type name '__attribute_noinline__'
479 | __attribute_noinline__ void refreshPieceFeatures(Accumulator* acc, KingBucketInfo* kingBucket);
    | ^
src/nnue.h:481:3: error: unknown type name '__attribute_noinline__'
481 | __attribute_noinline__ void refreshThreatFeatures(Accumulator* acc);
    | ^
src/nnue.h:484:3: error: unknown type name '__attribute_noinline__'
484 | __attribute_noinline__ void incrementallyUpdatePieceFeatures(Accumulator* inputAcc, Accumulator* outputAcc, Ki...
    | ^
src/nnue.h:486:3: error: unknown type name '__attribute_noinline__'
486 | __attribute_noinline__ void incrementallyUpdateThreatFeatures(Accumulator* inputAcc, Accumulator* outputAcc, K...
    | ^

#

is this a compiler issue on my end

stray reef Aug 6, 2025, 8:17 PM

#

ah just remove the __attribute_noinline__, i think it doesn't work on all compilers, it's just there so the function is forced to show up in the profiler
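
For context, `__attribute_noinline__` is, as far as I know, a convenience macro from glibc's `<sys/cdefs.h>`, which would explain why it is missing on Windows toolchains. A portable alternative might look like the sketch below; `PROFILER_NOINLINE` and `add_feature` are hypothetical names for this example and are not part of PlentyChess.

```cpp
// Hypothetical portable "do not inline" macro, so a function stays a
// separate symbol and remains visible in profiler output.
#if defined(__GNUC__) || defined(__clang__)
#define PROFILER_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define PROFILER_NOINLINE __declspec(noinline)
#else
#define PROFILER_NOINLINE  // unknown compiler: fall back to no annotation
#endif

// Toy function standing in for an accumulator update routine.
PROFILER_NOINLINE int add_feature(int acc, int weight) {
    return acc + weight;
}
```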

rocky vigil Aug 6, 2025, 8:18 PM

#

oh I see

#

is there a noticeable speed diff

stray reef Aug 6, 2025, 8:18 PM

#

not measurable with these functions

rocky vigil Aug 6, 2025, 8:18 PM

#

ah

#

so I do that and just standard make right

#

am trying to speed compare on my laptop

#

arch is (threats + 12x768) -> 256 -> (16 -> 32 -> 1)?

stray reef Aug 6, 2025, 8:19 PM

#

rocky vigil reviewing this again it seems concerning that updating the threat feature accumu...

i think there should be more threat updates than piece updates, don't have my numbers anymore, but iirc the average total update was like 7.X, whereas without threat inputs it's 2.X

rocky vigil Aug 6, 2025, 8:19 PM

#

hmm

stray reef Aug 6, 2025, 8:20 PM

#

still not a 4x increase of course

stray reef Aug 6, 2025, 8:20 PM

#

rocky vigil arch is (threats + 12x768) -> 256 -> (16 -> 32 -> 1)?

yeah

rocky vigil Aug 6, 2025, 8:22 PM

#

am still very curious about how Yukari can be so much faster with L1=384

#

I am pretty sure there is a negligible speed difference between simplified vs full threat inputs

stray reef Aug 6, 2025, 8:23 PM

#

yep

#

well yukari doesn't have multilayer

rocky vigil Aug 6, 2025, 8:23 PM

#

yeah but my single-layer impl in sf is like,

#

:((( slow

#

can you try running yukari release vs https://github.com/sscg13/Stockfish/tree/threat-inputs and let me know the nps's of a few positions?

stray reef Aug 6, 2025, 8:25 PM

#

can hopefully do it in 30min

rocky vigil Aug 6, 2025, 8:25 PM

#

thanks a lot

#

i mean for testing I can always just set time odds but I think linrock thought that wasn't sound

rocky vigil Aug 6, 2025, 8:27 PM

#

stray reef i think there should be more threat updates than piece updates, don't have my nu...

after 100M node search from startpos:
Number of accumulator updates: 168218824
Number of positions looped through: 342768578
Number of feature indices looped through: 1840654489
but iirc feature indices counts both psq + threat

#

I think the number here is (feature indices) / (2 * acc updates)

#

which comes out to be around 5.5?

#

at least much less than 7.X

rocky vigil Aug 6, 2025, 8:40 PM

#

stray reef yeah

the inference seems broken on my compiled exe but it looks to be 50% faster than my single layer threats -> 256 -> 1 lmao

stray reef Aug 6, 2025, 8:41 PM

#

broken? that's not good

rocky vigil Aug 6, 2025, 8:41 PM

#

does it work locally on your computer?

stray reef Aug 6, 2025, 8:42 PM

#

yep

#

what's your bench output?

rocky vigil Aug 6, 2025, 8:42 PM

#

huh

#

```
Nodes searched  : 1078738
Nodes/second    : 1112101
```

stray reef Aug 6, 2025, 8:42 PM

#

yeah that's broken... on my machine it matches the number in the commit

#

what CPU arch, and what platform/ compiler are you using?

rocky vigil Aug 6, 2025, 8:43 PM

#

all I did was remove the __attribute_noinline__ and run make

rocky vigil Aug 6, 2025, 8:43 PM

#

stray reef what CPU arch, and what platform/ compiler are you using?

1165-G7 (Intel, Tiger Lake (11th gen, AVX512) mobile), Windows, clang

stray reef Aug 6, 2025, 8:44 PM

#

wow i have the exact same config on my laptop, lemme try there

rocky vigil Aug 6, 2025, 8:44 PM

#

I did get lld: error: unknown argument: --no-as-needed so I executed the final link without -Wl,--no-as-needed

#

(I have the same issue compiling sf though, and this workaround has never messed up sf compilation for me)

stray reef Aug 6, 2025, 8:46 PM

#

rocky vigil after 100M node search from startpos: ```Number of accumulator updates: 16821882...

i added some prints in incrementallyUpdateThreatFeatures / incrementallyUpdatePieceFeatures to see how many add/sub/addsub calls there are per incremental update. result:

1.38653 for piece features
8.76375 for threat features
(10M nodes from startpos)

#

piece features are mostly fused in addsub, therefore < 2

#

this is uhhh

#

pretty bad

rocky vigil Aug 6, 2025, 8:47 PM

#

does this multiply by 2 since two accumulators per position

#

actually I'm trolling

stray reef Aug 6, 2025, 8:47 PM

#

well yeah every feature is applied to both accumulators

rocky vigil Aug 6, 2025, 8:48 PM

#

rocky vigil after 100M node search from startpos: ```Number of accumulator updates: 16821882...

yeah for this just 1840654489/168218824 should be correct

stray reef Aug 6, 2025, 8:48 PM

#

but the number doesn't multiply by 2

rocky vigil Aug 6, 2025, 8:48 PM

#

so it is 10.x
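
The arithmetic in this exchange can be checked with a few lines. The two counters are the ones quoted from the 100M-node search, and the two per-update averages are the ones reported for the 10M-node run; the function and constant names are made up for this sketch. Note the two runs measure slightly different things (feature indices versus add/sub/addsub calls), so they are only expected to be roughly comparable.

```cpp
#include <cstdint>

// Counters quoted from the 100M-node search from startpos.
constexpr std::uint64_t kAccumulatorUpdates = 168218824ULL;
constexpr std::uint64_t kFeatureIndices = 1840654489ULL;

// Per-update call counts quoted from the 10M-node run.
constexpr double kPieceOpsPerUpdate = 1.38653;
constexpr double kThreatOpsPerUpdate = 8.76375;

// Average number of changed feature indices per accumulator update.
constexpr double features_per_update(std::uint64_t indices, std::uint64_t updates) {
    return static_cast<double>(indices) / static_cast<double>(updates);
}

// features_per_update(kFeatureIndices, kAccumulatorUpdates) is about 10.9,
// i.e. the "10.x" figure; dividing by 2 gives the earlier estimate of
// roughly 5.5. The 10M-node counts sum to about 10.15.
```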

stray reef Aug 6, 2025, 8:48 PM

#

rocky vigil 1165-G7 (Intel, Tiger Lake (11th gen, AVX512) mobile), Windows, clang

mingw clang or "native" clang?

rocky vigil Aug 6, 2025, 8:48 PM

#

(the psq is not fused since I literally did the simplest for loop autovec)

#

mingw clang I think

#

or

#

actually for me clang lives in msys64/clang64/bin

#

idk

stray reef Aug 6, 2025, 8:50 PM

#

that's mingw clang then

rocky vigil Aug 6, 2025, 8:50 PM

#

ok

stray reef Aug 6, 2025, 8:51 PM

#

what do you want me to run on your sf branch and yukari?

rocky vigil Aug 6, 2025, 8:51 PM

#

uh

#

can you just pull up the nps values for a single (LTC) game from startpos

#

between the two

#

or is that too complicated

#

maybe cutechess-ob works for this

stray reef Aug 6, 2025, 8:53 PM

#

i have no idea how cutechess cli works

rocky vigil Aug 6, 2025, 8:53 PM

#

oh uh

stray reef Aug 6, 2025, 8:53 PM

#

does fastchess output nps in the pgns?

rocky vigil Aug 6, 2025, 8:53 PM

#

good question

#

I don't have fastchess

#

I stick with cutechess bc shatranj

rocky vigil Aug 6, 2025, 8:55 PM

#

stray reef i have no idea how cutechess cli works

maybe try .\cutechess-ob.exe -engine cmd="engine1-path" tc=60+1 name=engine1-name proto=uci -engine cmd="engine2-path" tc=60+1 name=engine2-name proto=uci -games 1 -rounds 1 -pgnout "pgn-file"

#

I think all file paths need to be absolute

stray reef Aug 6, 2025, 9:01 PM

#

damn, yukari doesn't output nps...

doing the calculations from time and nodes searched, for a 10s think from startpos, yukari is roughly 3x faster (7900X, 1 thread)

#

i'll see if some llm can quickly make a script to calculate this for the PGN...

rocky vigil Aug 6, 2025, 9:04 PM

#

oh shoot yukari does output nps (e.g. run game in cutechess GUI, or maybe cutechess auto-calculates it???) but I think cutechess-ob only prints time and nodes searched

#

which is good enough theoretically

rocky vigil Aug 6, 2025, 9:05 PM

#

stray reef damn, yukari doesn't output nps... doing the calculations from time and nodes s...

lmao 3x faster with L1=384 vs L1=256

#

is comparable to result on my laptop

stray reef Aug 6, 2025, 9:08 PM

#

i think yukari doesn't report final nodes during hard cutoffs, at least i think so since the nps are very inconsistent

rocky vigil Aug 6, 2025, 9:08 PM

#

huh

stray reef Aug 6, 2025, 9:08 PM

#

it's only roughly 50% faster on average during the game, for this reason

rocky vigil Aug 6, 2025, 9:09 PM

#

we regain our speed in the endgame I know this

#

but in the midgame the speed tanks

stray reef Aug 6, 2025, 9:09 PM

#

alright i gtg, ping me if there's anything else

rocky vigil Aug 6, 2025, 9:10 PM

#

aight thanks for your help

#

the "vibe" way to compare nps is to just load up a game in cutechess gui and eyeball the nps ratios lol

stray reef Aug 6, 2025, 9:10 PM

#

that requires having some chess gui installed :P

rocky vigil Aug 6, 2025, 9:10 PM

#

I see :P

rocky vigil Aug 11, 2025, 7:17 AM

#

@formal smelt @hollow crystal turns out I actually did stc with fixed output buckets, but no ltc, anyways it's still basically neutral: https://tests.stockfishchess.org/tests/view/67e1f1d38888403457d87680
I'll rebase everything and run some again after mmap I think

#

can estimate the current ltc diff to be in the range of 7-9

#

but it will change a lot with speed optimization and mmap

rocky vigil Aug 11, 2025, 9:08 PM

#

stray reef piece features are mostly fused in addsub, therefore < 2

can you not addsub threat changes as well or is there a limitation to this

stray reef Aug 11, 2025, 9:10 PM

#

iirc i tried variations of this without success, but looking at it now I'm not sure i did it right... i'll put it on my todo list
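
As a rough picture of what "fused addsub" means here, the sketch below compares two separate passes over an accumulator with one combined pass. It assumes a 256-wide int16 accumulator and plain scalar loops; `Accumulator`, `add`, `sub` and `addsub` are illustrative names, not PlentyChess's actual functions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kL1 = 256;  // assumed accumulator width
using Accumulator = std::array<std::int16_t, kL1>;

// Unfused: each call loads and stores the whole accumulator once.
inline void add(Accumulator& acc, const Accumulator& w) {
    for (std::size_t i = 0; i < kL1; ++i)
        acc[i] = static_cast<std::int16_t>(acc[i] + w[i]);
}

inline void sub(Accumulator& acc, const Accumulator& w) {
    for (std::size_t i = 0; i < kL1; ++i)
        acc[i] = static_cast<std::int16_t>(acc[i] - w[i]);
}

// Fused: one added and one removed feature applied in a single pass, so the
// accumulator is loaded and stored only once per element.
inline void addsub(Accumulator& acc, const Accumulator& added, const Accumulator& removed) {
    for (std::size_t i = 0; i < kL1; ++i)
        acc[i] = static_cast<std::int16_t>(acc[i] + added[i] - removed[i]);
}
```

Pairing threat changes this way only helps when an update has both additions and removals to pair up, which may or may not be common for threat features.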

rocky vigil Aug 14, 2025, 10:40 PM

#

i wonder if there's any eta on mmap

#

once that is merged I'll rebase the basic ue to see how it affects the L1 scaling

violet badger Aug 15, 2025, 5:03 AM

#

the branch is in a usable state..

#

but not in a mergeable state yet

twilit oriole Aug 28, 2025, 7:11 AM

#

stray reef Comparing the `(79856+768x12 -> 2048)x2 -> (32 -> 64 -> 1)x8` against the `(7985...

How much data did you train these nets with? I don't seem able to replicate this result and I think it may have something to do with not being data starved so the full threat inputs can saturate

#

Since the additions in the latter net are much less sparse than full threat inputs

#

For context we are at 50B+ positions with L1 3072 and still not fully saturating full threat inputs

stray reef Aug 28, 2025, 7:32 AM

#

twilit oriole How much data did you train these nets with? I don't seem able to replicate this...

This must have been around 7B positions each

rocky vigil Aug 28, 2025, 9:06 AM

#

twilit oriole How much data did you train these nets with? I don't seem able to replicate this...

what kind of result are you getting (the later layers of the first net are also twice as large)

twilit oriole Aug 28, 2025, 9:09 AM

#

So far I just checked increasing L2 size and regular piece output buckets, not much was going on there. The king buckets test will happen soon

rocky vigil Aug 28, 2025, 9:11 AM

#

ah in monty?

twilit oriole Aug 28, 2025, 9:12 AM

#

Ye

rocky vigil Aug 28, 2025, 9:12 AM

#

surprising that increasing L2 isn't that good

twilit oriole Aug 28, 2025, 9:12 AM

#

This is some short writeup on how threat inputs progressed in monty also (and what the performance is like there)

twilit oriole Aug 28, 2025, 9:12 AM

#

rocky vigil surprising that increasing L2 isn't that good

I think the full threats suck the Elo out of the later layers basically

#

Linrock didn't gain with output buckets either with full threats

#

In SF

rocky vigil Aug 28, 2025, 9:13 AM

#

i thought it was like +5 elo or smth

#

idk my branch had them so...

#

single layer + small L1 maybe makes it different

twilit oriole Aug 28, 2025, 9:13 AM

#

https://tests.stockfishchess.org/tests/view/67d4935e517865b4a2dfcf8d

#

They don't actually work with threat inputs as far as I can tell

rocky vigil Aug 28, 2025, 9:15 AM

#

tbh when the commit message says "what is going wrong" i might've borked smth

#

was there any later test

twilit oriole Aug 28, 2025, 9:15 AM

#

No but there is a diff. If you screwed smth it is wrong in both sides

#

The diff is very simple

rocky vigil Aug 28, 2025, 9:15 AM

#

nvm it's basically the end of the branch

rocky vigil Aug 28, 2025, 9:17 AM

#

twilit oriole https://tests.stockfishchess.org/tests/view/67d4935e517865b4a2dfcf8d

he gained like 15 elo with more involved training right after this test lol

twilit oriole Aug 28, 2025, 9:18 AM

#

Yeah but output buckets shouldn't really benefit much from that

rocky vigil Aug 28, 2025, 9:18 AM

#

yeah i guess we actually should just not have them whoops

twilit oriole Aug 28, 2025, 9:19 AM

#

I mean it's not the only thing there's probably a lot of small things to tweak. It just wasn't the focus

rocky vigil Aug 28, 2025, 9:20 AM

#

if increasing L2 doesn't gain much then it might also be worth testing decreasing L2 to 8

#

or smth

twilit oriole Aug 28, 2025, 9:20 AM

#

I put this summary of how threats progressed in monty

rocky vigil Aug 28, 2025, 9:20 AM

#

although idk if it would screw with the nnz or anything

twilit oriole Aug 28, 2025, 9:21 AM

#

Just to get an idea of the value of full threat inputs

#

They are great tbh, just too bad fast threat gen is so hard lol

rocky vigil Aug 28, 2025, 9:21 AM

#

sorry what is being compared to a standard 768 -> 3072

twilit oriole Aug 28, 2025, 9:21 AM

#

The 80624 -> 3072

#

Full threats Vs none at all

#

Is about 300 UHO (if you set midpoint anchors so you don't hit book limit)

rocky vigil Aug 28, 2025, 9:23 AM

#

oh wait fixing the indexing bug was worth that much

#

interesting

twilit oriole Aug 28, 2025, 9:23 AM

#

At least

#

It's still training

rocky vigil Aug 28, 2025, 9:23 AM

#

is it not in monty main yet

twilit oriole Aug 28, 2025, 9:23 AM

#

No. Since it is training lol

rocky vigil Aug 28, 2025, 9:24 AM

#

i guess that's smth exciting to be looking forward to

twilit oriole Aug 28, 2025, 9:24 AM

#

I took the +40 measurement midway through the run

rocky vigil Aug 28, 2025, 9:24 AM

#

since it should be much better than the current value net right

twilit oriole Aug 28, 2025, 9:24 AM

#

Yeah

#

rocky vigil Aug 28, 2025, 9:26 AM

#

twilit oriole For context we are at 50B+ positions with L1 3072 and still not fully saturating...

as far as i can tell linrock did 220 sb training so there's probably some big gain there as well

twilit oriole Aug 28, 2025, 9:26 AM

#

Nah. He trained a much smaller net right

formal smelt Aug 28, 2025, 9:26 AM

#

twilit oriole

remember to coauthor me and sscg (he found one of the bugs)

twilit oriole Aug 28, 2025, 9:26 AM

#

Yeah

rocky vigil Aug 28, 2025, 9:27 AM

#

wait i become monty contributor from this

#

lezgo

twilit oriole Aug 28, 2025, 9:27 AM

#

🚀

rocky vigil Aug 28, 2025, 9:28 AM

#

idk what the training time scaling laws are

twilit oriole Aug 28, 2025, 9:28 AM

#

Linear with L1 size is what I found

#

Assuming same arch ofc

rocky vigil Aug 28, 2025, 9:28 AM

#

hmm so if you are doing 4000 with 3072 then I guess 350 for 256 is good enough
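
The estimate above follows from the linear rule of thumb. A tiny sketch, with a hypothetical helper name and the training length left in whatever unit the 4000 figure uses:

```cpp
// Scale a training length linearly with L1 size, per the rule of thumb
// "linear with L1 size, assuming same arch". Units are those of base_length.
constexpr double scaled_training_length(double base_length, int base_l1, int target_l1) {
    return base_length * static_cast<double>(target_l1) / static_cast<double>(base_l1);
}

// scaled_training_length(4000.0, 3072, 256) is about 333, so 350 leaves a
// small margin.
```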

twilit oriole Aug 28, 2025, 9:29 AM

#

MCTS has longer training because we take LR much lower

rocky vigil Aug 28, 2025, 9:30 AM

#

btw if vondele can get within 10 elo to master net then resuming threat input training is feasible right

twilit oriole Aug 28, 2025, 9:31 AM

#

I mean I thought about what about just temporarily shoving threat inputs into NNUE pytorch kek

#

But it doesn't solve the issue of not having fast threat gen

#

I don't even know how incremental threat gen works

rocky vigil Aug 28, 2025, 9:32 AM

#

yeah i am decently convinced having acceptable speed requires a major change to sf position framework

#

anyways I don't really want to rebase until either that or mmap is worked out

#

but looks like it will be quite a wait

twilit oriole Aug 28, 2025, 9:34 AM

#

Yeah. There's the plenty branch, if someone sends me some configs for bigger nets I can train that. We can simulate it with L1 3072 base Vs L1 1024 threats and Leela data for both or smth

rocky vigil Aug 28, 2025, 9:34 AM

#

ah true

#

plenty is decently optimized (at least +50% including multilayer)

twilit oriole Aug 28, 2025, 9:36 AM

#

I mean I think a L1 3072 base Vs L1 1024 threats in plenty will already work tbh without additional optimization

#

Like the threats will already be superior in that comparison

rocky vigil Aug 28, 2025, 9:37 AM

#

yeah I am pretty convinced as well but somehow yoshie never found the speed / data to make it work selfgen

twilit oriole Aug 28, 2025, 9:38 AM

#

Because his base net is L1 1536 and the threats have fixed overhead is what I think

#

Like in the 3072 Vs 1024 that's a 2048 delta already

#

SF is unique in that it's somehow managed to work out how to allow eval taking a large fraction of total time already

rocky vigil Aug 28, 2025, 9:40 AM

#

fixed overhead is identical to increasing l1 by 512 I think (or, 1024 in my impl lmao)

#

personally am more concerned why L1 = 512 to L1 = 256 didn't work in stc (in fact, slower threat impl should make this more favorable to the larger net)

rocky vigil Aug 28, 2025, 9:41 AM

#

twilit oriole https://tests.stockfishchess.org/tests/view/67d4935e517865b4a2dfcf8d

tbf i borked the output buckets initially and only realized later, see https://tests.stockfishchess.org/tests/view/67df73348888403457d874df

#

so it needs to be redone eventually

twilit oriole Aug 28, 2025, 9:42 AM

#

Yeah perhaps. But the Elo delta is very small regardless, output buckets usually yields more I thought

rocky vigil Aug 28, 2025, 9:43 AM

#

yeah +10-20 is normal whereas here suggests it's +3 or smth

twilit oriole Aug 28, 2025, 9:44 AM

#

https://github.com/official-monty/montytrain/tree/fixed-threat-inputs-out-buc soon we will attempt output buckets with piece and threat count in monty, will see how that goes

rocky vigil Aug 28, 2025, 9:44 AM

#

rocky vigil personally am more concerned why L1 = 512 to L1 = 256 didn't work in stc (in fac...

might be an artifact of training tbh

twilit oriole Aug 28, 2025, 9:44 AM

#

twilit oriole <https://github.com/official-monty/montytrain/tree/fixed-threat-inputs-out-buc> ...

Might as well if we have the threat count already is what I'm thinking

#

There's no overhead then really

#

It's segmented like this

#

Counts were checked also to make sure it never gets too low

#

So all buckets get trained properly for sure

#

Just waiting on some new montytrain operations impls to do it

rocky vigil Aug 28, 2025, 9:46 AM

#

ah interesting

rocky vigil Aug 28, 2025, 9:48 AM

#

rocky vigil might be an artifact of training tbh

like the L1=256 has +6 training advantage as far as i can tell

twilit oriole Aug 28, 2025, 9:49 AM

#

The NN inference might be slow also

rocky vigil Aug 28, 2025, 9:49 AM

#

true I autovec'd it

twilit oriole Aug 28, 2025, 9:49 AM

#

Which would have close to twice the impact at double L1

rocky vigil Aug 28, 2025, 9:50 AM

#

rocky vigil true I autovec'd it

dunno how to write simd since I've actually not done it in Prolix

#

the biggest impact is probably in the screlu affine

#

there is probably also some nontrivial gain from fusing addsub

#

that's smth that someone else needs to do

#

unless we get sf arch to work in bullet anyhow and I can go back to the already written code
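
For reference, the core of a SCReLU output layer is small enough to write as a scalar loop, which is roughly what an autovectorized version starts from. The sketch assumes a 256-wide int16 accumulator and a clamp bound of 255; `kHidden`, `kQA` and `screlu_dot` are names chosen for this example, and the bias and dequantization steps of the affine layer are left out.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kHidden = 256;  // assumed L1 size
constexpr std::int32_t kQA = 255;     // assumed activation clamp bound

// Scalar reference: sum over i of clamp(acc[i], 0, QA)^2 * weights[i].
inline std::int64_t screlu_dot(const std::array<std::int16_t, kHidden>& acc,
                               const std::array<std::int16_t, kHidden>& weights) {
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < kHidden; ++i) {
        const std::int32_t x =
            std::max<std::int32_t>(0, std::min<std::int32_t>(acc[i], kQA));
        sum += static_cast<std::int64_t>(x) * x * weights[i];
    }
    return sum;
}
```

A hand-written SIMD version would compute the same sum, so a scalar loop like this is also handy as a correctness check.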

twilit oriole Aug 30, 2025, 1:30 PM

#

stray reef Comparing the `(79856+768x12 -> 2048)x2 -> (32 -> 64 -> 1)x8` against the `(7985...

@stray reef how did you do this with bullet? So I can try in monty adding the king bucketed piece square inputs to our full threat net also

#

If you have the config would be useful

stray reef Aug 30, 2025, 1:36 PM

#

twilit oriole <@415167192296849409> how did you do this with bullet? So I can try in monty add...

https://github.com/Yoshie2000/bullet/blob/53cd9b832b6dfb4999be6cb1df8d1c455eb51805/examples/plenty/0104.rs#L843

twilit oriole Aug 30, 2025, 1:39 PM

#

Perfect thx that's very useful

rocky vigil Sep 1, 2025, 5:24 AM

#

twilit oriole <@415167192296849409> how did you do this with bullet? So I can try in monty add...

How long does it take Monty nets to train

twilit oriole Sep 1, 2025, 5:25 AM

#

Currently around 4 days on a 4090

rocky vigil Sep 1, 2025, 5:26 AM

#

Oh that’s quite long…

violet badger Sep 1, 2025, 5:27 AM

#

actually not too different from a SF master net on H100.

rocky vigil Sep 1, 2025, 5:28 AM

#

Actually is a 4090 more effective

#

Since vram not a concern

#

Might depend on dataloader speed as well

twilit oriole Sep 1, 2025, 5:29 AM

#

No

#

vram bandwidth is definitely a concern

rocky vigil Sep 1, 2025, 5:31 AM

#

Huh

#

Interesting

#

How effective is a 5090 vs a 4090 then

#

Since the 5090 is supposed to have way more bandwidth

twilit oriole Sep 1, 2025, 5:33 AM

#

A lot more. Depends on your exact arch and how sparse it is

stray reef Sep 17, 2025, 7:10 PM

#

@rocky vigil yukari is not multilayer

#

i'm gonna try a net of yukaris arch rq (training for 1 superbatch) to compare speeds

rocky vigil Sep 17, 2025, 7:21 PM

#

Alright

rocky vigil Sep 17, 2025, 7:22 PM

#

stray reef <@693549181838819338> yukari is not multilayer

Yep I’m aware

#

But my single layer speed sucks as well

#

Because I thought the progression was going to be fixed nodes then speed optimization

stray reef Sep 17, 2025, 7:35 PM

#

Wait what the hell?

I think bench speeds aren't that comparable because of different positions, but even from startpos, yukari gets
3.7M nps during a 10s search, plenty gets
2.3M nps during a 10s search...

#

granted, this plenty arch still has king buckets. let me get rid of those rq

#

i need to take a deep dive in yukari again it seems...

#

yeah it's not that much better without king buckets

rocky vigil Sep 17, 2025, 7:43 PM

#

Wait how bad is the king bucket slowdown again?

stray reef Sep 17, 2025, 7:45 PM

#

10%-ish it seems

rocky vigil Sep 17, 2025, 7:46 PM

#

Hmm it seems Yukari maybe counts moves that get see pruned/lmp/whatever

#

Idk though

#

Rust is not my specialty

#

https://github.com/yukarichess/yukari/blob/trunk/yukari/src/search.rs#L581

stray reef Sep 17, 2025, 7:48 PM

#

ah good point. lemme change that

#

ah! :P 1.7M nps now. plenty is faster

rocky vigil Sep 17, 2025, 7:52 PM

#

Oh interesting

twilit oriole Sep 17, 2025, 7:52 PM

#

Yeah we already know about that issue where the Nps isnt comparable

rocky vigil Sep 17, 2025, 7:53 PM

#

Ok so that inflated nps quite a bit

twilit oriole Sep 17, 2025, 7:53 PM

#

Since the counting is different
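
To make the counting issue concrete, here is a toy search with two node counters, one bumped before pruning and one after. It is not Yukari's or PlentyChess's code; the fixed-branching tree, `NodeCounters` and `toy_search` are invented for illustration, and the pruning rule is just a stand-in for SEE pruning or LMP.

```cpp
#include <cstdint>

struct NodeCounters {
    std::uint64_t counted_before_pruning = 0;  // every candidate move
    std::uint64_t counted_after_pruning = 0;   // only moves actually searched
};

// Toy search: every node has `branching` candidate moves, and all but the
// first `kept` are pruned without being searched.
inline void toy_search(int depth, int branching, int kept, NodeCounters& c) {
    if (depth == 0) return;
    for (int move = 0; move < branching; ++move) {
        ++c.counted_before_pruning;
        if (move >= kept) continue;  // stand-in for SEE pruning / LMP
        ++c.counted_after_pruning;
        toy_search(depth - 1, branching, kept, c);
    }
}
```

With depth 3, 10 candidate moves and 3 kept per node, the first counter reaches 130 and the second 39 for the same amount of real work, which is the kind of gap that makes nps figures from different engines hard to compare.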

rocky vigil Sep 17, 2025, 7:53 PM

#

Ngl wasn’t aware

#

Am attempting to compile Plentychess on new laptop

#

https://github.com/Yoshie2000/PlentyChess/tree/threat-inputs-full-layers-pairwise-kingbuckets is still the right one?

stray reef Sep 17, 2025, 7:53 PM

#

twilit oriole Yeah we already know about that issue where the Nps isnt comparable

it feels like we stumbled upon this already and i forgot yeah

stray reef Sep 17, 2025, 7:54 PM

#

rocky vigil <https://github.com/Yoshie2000/PlentyChess/tree/threat-inputs-full-layers-pairwi...

best one is 0118 currently

#

though the net there isn't being downloaded correctly, the one you have should work out of the box

rocky vigil Sep 17, 2025, 7:55 PM

#

rocky vigil <https://github.com/Yoshie2000/PlentyChess/tree/threat-inputs-full-layers-pairwi...

ah

#

why does it never work

#

is the net not processed correctly

stray reef Sep 17, 2025, 7:56 PM

#

let me check the branch rq

twilit oriole Sep 17, 2025, 7:56 PM

#

Did the experiment to add more data to training do anything?

#

For monty master net i measured king buckets + factoriser was -20 and L2 16 to 128 was +25. fixed nodes elo

#

those are the final values

rocky vigil Sep 17, 2025, 7:57 PM

#

wait what

#

king buckets lost elo

#

fixed nodes?

#

in monty?

twilit oriole Sep 17, 2025, 7:57 PM

#

yes

stray reef Sep 17, 2025, 7:57 PM

#

twilit oriole Did the experiment to add more data to training do anything?

styx wanted to do the training, not yet done unfortunately

rocky vigil Sep 17, 2025, 7:57 PM

#

very very strange

twilit oriole Sep 17, 2025, 7:57 PM

#

i would do but your format is too big :p

stray reef Sep 17, 2025, 7:58 PM

#

yes ik it's bad

#

i could also do it myself, it's pretty quick

rocky vigil Sep 17, 2025, 7:58 PM

#

considering it should be a strict generalization

#

of the threats + psq

twilit oriole Sep 17, 2025, 7:58 PM

#

the king buckets was on just the psq

rocky vigil Sep 17, 2025, 7:58 PM

#

yeah

#

but it's still more representative power

stray reef Sep 17, 2025, 7:59 PM

#

rocky vigil ah

oh it's broken for me too. whoops

twilit oriole Sep 17, 2025, 7:59 PM

#

rocky vigil but it's still more representative power

well it thought things were dependent on king position when they were not

#

threats is a more useful signal

rocky vigil Sep 17, 2025, 7:59 PM

#

hmmm

#

yeah maybe it requires like more fancy training setups

#

like start with psq

#

and then do king buckets on much lower lr

stray reef Sep 17, 2025, 8:00 PM

#

rocky vigil <https://github.com/Yoshie2000/PlentyChess/tree/threat-inputs-full-layers-pairwi...

this one has L1=2048. so it's probably the wrong branch / net combo. what exactly did you want to test?

rocky vigil Sep 17, 2025, 8:00 PM

#

stray reef this one has L1=2048. so it's probably the wrong branch / net combo. what exactl...

essentially plentychess speed

#

vs main

twilit oriole Sep 17, 2025, 8:00 PM

#

well the loss came in lower. it did fit the data better. its just king position isnt that useful for our net so it confused it or whatever

stray reef Sep 17, 2025, 8:01 PM

#

is L1=384 multilayer fine? that would be easiest for me to push rn

rocky vigil Sep 17, 2025, 8:01 PM

#

sure

#

yeah

#

that's nice

#

huh

#

i guess less loss really doesn't mean better net 💀

twilit oriole Sep 17, 2025, 8:04 PM

#

well it is L1=3072. could be different for smaller L1s

#

our net is too clever even output buckets and stuff are rubbish

stray reef Sep 17, 2025, 8:04 PM

#

rocky vigil that's nice

https://github.com/Yoshie2000/PlentyChess/tree/0120

rocky vigil Sep 17, 2025, 8:05 PM

#

ok nice

#

```
Nodes searched  : 2104883
Nodes/second    : 1390279
```
nice it works

#

lemme get plentychess main as well

#

```
Nodes searched  : 1855539
Nodes/second    : 1511025
```

#

main

#

so maybe 384 threat inputs competitive with 1792 standard

stray reef Sep 17, 2025, 8:11 PM

#

it's really strange, the speed of the 384 and 256 seem to be almost the same

rocky vigil Sep 17, 2025, 8:11 PM

#

do you get similar results?

stray reef Sep 17, 2025, 8:11 PM

#

yeah more or less, faster overall but similar speed loss

rocky vigil Sep 17, 2025, 8:12 PM

#

i mean it's good if scaling L1 incurs less speed loss :p

stray reef Sep 17, 2025, 8:13 PM

#

i'm just gonna train a full L1=384 net

rocky vigil Sep 17, 2025, 8:13 PM

#

wdym by "full" sorry

stray reef Sep 17, 2025, 8:14 PM

#

like the full training schedule, not just a few SBs for fun

rocky vigil Sep 17, 2025, 8:14 PM

#

ohhhh

#

yeah good idea

#

btw do you know how L1=3072 would compare with 1792

#

i assume something on the order of -30%

stray reef Sep 17, 2025, 8:15 PM

#

rocky vigil i mean it's good if scaling L1 incurs less speed loss :p

maybe it's because with very small L1s the fixed overhead of doing lots of updates is comparable to the cost of actually doing the updates, so if the update itself is 50% longer wrt. cpu instructions, the memory overhead etc. is much lower now in comparison
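
One way to picture this argument is a toy cost model where each update pays a fixed overhead plus a per-element cost. Everything here is an assumption for illustration: the cost units are arbitrary, the numbers in the comments are made up rather than measured, and `update_cost` and `slowdown` are hypothetical helpers.

```cpp
// Toy model: cost of one accumulator update in arbitrary, made-up units.
constexpr double update_cost(double fixed_overhead, double per_element, int l1) {
    return fixed_overhead + per_element * static_cast<double>(l1);
}

// Relative slowdown when growing L1 from l1_small to l1_large.
constexpr double slowdown(double fixed_overhead, double per_element,
                          int l1_small, int l1_large) {
    return update_cost(fixed_overhead, per_element, l1_large) /
           update_cost(fixed_overhead, per_element, l1_small);
}

// With no fixed overhead, 256 -> 384 costs the full 1.5x.
// With a (hypothetical) overhead worth 768 elements, it costs only 1.125x.
```

The larger the fixed overhead is relative to the per-element work, the less a wider L1 shows up in the total, which would match the observation that 256 and 384 run at nearly the same speed.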

stray reef Sep 17, 2025, 8:15 PM

#

rocky vigil btw do you know how L1=3072 would compare with 1792

I do not unfortunately

twilit oriole Sep 17, 2025, 8:15 PM

#

I did recommend scaling L1 lol. We found same in monty, scaling L1 of the full threat net has less speed loss than expected

rocky vigil Sep 17, 2025, 8:15 PM

#

stray reef maybe it's because with very small L1s the overhead of doing lots of updates is ...

yeah i think the ultimate dream is to have L1=1024 be competitive

stray reef Sep 17, 2025, 8:16 PM

#

okay then hopefully this will yield good results...

rocky vigil Sep 17, 2025, 8:16 PM

#

how close is vondele to master?

#

might not be bad to give it a try again

stray reef Sep 17, 2025, 8:16 PM

#

i saw some -8 i think?

rocky vigil Sep 17, 2025, 8:16 PM

#

wait that's really good

twilit oriole Sep 17, 2025, 8:16 PM

#

Yeah but SPSA gives 8

#

So he is about pre SPSA net level

stray reef Sep 17, 2025, 8:16 PM

#

#nnue-dev message

#

this is all i read

rocky vigil Sep 17, 2025, 8:17 PM

#

yeah so he can replicate the tech pretty much

#

so maybe new sf nets are back on the menu

twilit oriole Sep 17, 2025, 8:17 PM

#

Yeah need to shove threat inputs into NNUE pytorch

rocky vigil Sep 17, 2025, 8:17 PM

#

which would be easier

#

unironically

stray reef Sep 17, 2025, 8:17 PM

#

need to shove bullet nets into SF :P

twilit oriole Sep 17, 2025, 8:17 PM

#

rocky vigil which would be easier

Threat inputs in NNUE pytorch ofc

rocky vigil Sep 17, 2025, 8:18 PM

#

shoving bullet nets in sf just went wrong when we tried so idk...

stray reef Sep 17, 2025, 8:18 PM

#

yeah ik

#

long term it would be so much better

twilit oriole Sep 17, 2025, 8:18 PM

#

Yeah but it failed after many months

#

So best is just work with NNUE pytorch I think

rocky vigil Sep 17, 2025, 8:18 PM

#

idk how nnue-pytorch works, is it as simple as defining new features.py

twilit oriole Sep 17, 2025, 8:19 PM

#

Also Bruno tried with a simple single layer to get bullet net on par with NNUE pytorch and failed by some 40 Elo

#

So there's that aspect also

rocky vigil Sep 17, 2025, 8:19 PM

#

single layer surely would be -20 elo

#

but -40 elo is anomalous

stray reef Sep 17, 2025, 8:19 PM

#

with leela data all that filtering is worth a lot ofc

twilit oriole Sep 17, 2025, 8:19 PM

#

No I mean both single

rocky vigil Sep 17, 2025, 8:19 PM

#

oh

twilit oriole Sep 17, 2025, 8:19 PM

#

And using Leela data

rocky vigil Sep 17, 2025, 8:19 PM

#

wait

twilit oriole Sep 17, 2025, 8:19 PM

#

Etc etc

rocky vigil Sep 17, 2025, 8:19 PM

#

strange

twilit oriole Sep 17, 2025, 8:20 PM

#

It was made to test if bullet nets are on par with NNUE pytorch

#

So everything was constant if it could be

#

Only trainer change was the idea

#

Anyways it failed terribly lmao

#

And nobody knows why

rocky vigil Sep 17, 2025, 8:21 PM

#

rocky vigil idk how nnue-pytorch works, is it as simple as defining new features.py

if this, it isn't that much of a stretch to port it

twilit oriole Sep 17, 2025, 8:21 PM

#

Yeah it's easy. Speed might be shit but oh well

rocky vigil Sep 17, 2025, 8:21 PM

#

~~surely vondele H200 or whatever cancels out the effect~~

twilit oriole Sep 17, 2025, 8:28 PM

#

I mean it's easy now. Shove the threat inputs into features.py, train a L1 1024 net using same schedule as SF master net, yoink the plenty threat UE stuff

#

That's all the steps

#

You can probably just ask vondele to do all the training even. Since he already did it once

rocky vigil Sep 17, 2025, 8:31 PM

#

yeah gimme a bit to figure out how nnue-pytorch works

#

idk how to interpret this line https://github.com/official-stockfish/nnue-pytorch/blob/master/features/halfka_v2_hm.py#L78

twilit oriole Sep 17, 2025, 8:37 PM

#

See https://github.com/official-stockfish/nnue-pytorch/blob/master/training_data_loader.cpp

#

There's where the features actually are

rocky vigil Sep 17, 2025, 8:37 PM

#

oh cmon

#

you mean this python stuff is like

#

red herring

#

bruh

twilit oriole Sep 17, 2025, 8:37 PM

#

So you would be adding it there. It's actually easier since it is c++ already lol

rocky vigil Sep 17, 2025, 8:37 PM

#

that's true

green moat Sep 17, 2025, 8:40 PM

#

rocky vigil how close is vondele to master?

https://tests.stockfishchess.org/tests/view/68c59a846d91bee0e315c893

rocky vigil Sep 17, 2025, 8:43 PM

#

yeah that's very good

violet badger Sep 18, 2025, 5:31 AM

#

and I'll share the one-liner needed to train that this weekend.

lofty cedar Sep 18, 2025, 11:35 AM

#

Only 10 elo? We're getting pretty close!

#

Wait... how much of the net is training vs post-training SPSA?

#

I mean... there already is a significant possibility that with enough SPSA tune and search tune tailored for this net, it could even vs master.

#

The question, however, is whether or not we should do that right now.

torn lagoon Sep 18, 2025, 12:51 PM

#

lofty cedar I mean... there already is a significant possibility that with enough SPSA tune ...

If it would be even, there's no point

lofty cedar Sep 18, 2025, 12:51 PM

#

torn lagoon If it would be even, there's no point

Oops... I meant beat.

#

But I mean tuning and all takes a lot of resources, and once you do that, you're kinda partially locked in.

#

So, it's probably better to train the net as best as you can first before tuning.

twilit oriole Sep 18, 2025, 1:24 PM

#

lofty cedar Wait... how much of the net is training vs post-training SPSA?

If you read I have already mentioned this

lofty cedar Sep 18, 2025, 1:30 PM

#

Oh, I see. Thanks.

violet badger Sep 18, 2025, 4:28 PM

#

also, that's both the master and small net combined, and at LTC.

lofty cedar Sep 19, 2025, 1:09 AM

#

Do we start tuning?

#

Or do we have some more training to do first?

frosty imp Sep 19, 2025, 1:35 AM

#

Hopefully we won’t need tuning

rocky vigil Sep 19, 2025, 1:38 AM

#

stray reef okay then hopefully this will yield good results...

how did the experiment go (if it's concluded)?

stray reef Sep 19, 2025, 7:10 AM

#

rocky vigil how did the experiment go (if it's concluded)?

--------------------------------------------------
Results of Threats-384-0120rrr vs Main-0119rr (20000 nodes, 1t, 16MB, UHO_4060_v2.epd):
Elo: 12.32 +/- 4.51, nElo: 18.87 +/- 6.90
LOS: 100.00 %, DrawRatio: 43.56 %, PairsRatio: 1.19
Games: 9734, Wins: 3126, Losses: 2781, Draws: 3827, Points: 5039.5 (51.77 %)
Ptnml(0-2): [183, 1072, 2120, 1201, 291], WL/DD Ratio: 1.73
--------------------------------------------------
Results of Threats-384-0120rrr vs Main-0119rr (5+0.05, 1t, 16MB, UHO_4060_v2.epd):
Elo: -17.03 +/- 5.12, nElo: -32.84 +/- 9.85
LOS: 0.00 %, DrawRatio: 50.36 %, PairsRatio: 0.68
Games: 4778, Wins: 1086, Losses: 1320, Draws: 2372, Points: 2272.0 (47.55 %)
Ptnml(0-2): [23, 684, 1203, 462, 17], WL/DD Ratio: 0.96
--------------------------------------------------
Results of Threats-384-0120rrr vs Main-0119rr (30+0.3, 1t, 64MB, UHO_4060_v2.epd):
Elo: -10.10 +/- 5.22, nElo: -20.93 +/- 10.82
LOS: 0.01 %, DrawRatio: 54.52 %, PairsRatio: 0.78
Games: 3958, Wins: 932, Losses: 1047, Draws: 1979, Points: 1921.5 (48.55 %)
Ptnml(0-2): [5, 502, 1079, 389, 4], WL/DD Ratio: 0.98
--------------------------------------------------
Results of Threats-384-0120rrr vs Main-0119rr (5+0.05, 12t, 192MB, UHO_4060_v2.epd):
Elo: -2.42 +/- 5.16, nElo: -5.04 +/- 10.74
LOS: 17.90 %, DrawRatio: 55.02 %, PairsRatio: 0.94
Games: 4020, Wins: 984, Losses: 1012, Draws: 2024, Points: 1996.0 (49.65 %)
Ptnml(0-2): [4, 462, 1106, 434, 4], WL/DD Ratio: 0.96
--------------------------------------------------
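
For readers checking the numbers above: the reported Elo values are consistent with the usual logistic conversion from the score fraction. A minimal sketch of that arithmetic follows; the function names are made up for illustration and this is not the testing tool's actual code.

```rust
/// Score fraction from win/draw/loss counts (a draw counts as half a point).
fn score_from_wdl(wins: u32, draws: u32, losses: u32) -> f64 {
    let games = (wins + draws + losses) as f64;
    (wins as f64 + 0.5 * draws as f64) / games
}

/// Converts a score fraction (0 < score < 1) into a logistic Elo difference.
fn elo_from_score(score: f64) -> f64 {
    -400.0 * (1.0 / score - 1.0).log10()
}
```

With the fixed-nodes result above, `elo_from_score(score_from_wdl(3126, 3827, 2781))` comes out at roughly 12.3, and the 5+0.05 single-thread result gives roughly -17.0. The error bars and nElo need the pentanomial counts and are not covered by this sketch.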

#

it is crazy close...

#

and in fact, i forgot to rebase on main, so it's missing a couple search gainers

rocky vigil Sep 19, 2025, 7:10 AM

#

wait it actually scales better ??

stray reef Sep 19, 2025, 7:10 AM

#

probably due to the slowdown, but yes!

rocky vigil Sep 19, 2025, 7:10 AM

#

unexpected

#

interesting

stray reef Sep 19, 2025, 7:11 AM

#

wondering if i should try L1=512 next

rocky vigil Sep 19, 2025, 7:11 AM

#

stc smp result is also interesting

#

do you know what could be going on with that

stray reef Sep 19, 2025, 7:11 AM

#

it just hints at better scaling imo

#

i ran this test because

- it's a higher TC than 30+0.3
- i ran it with concurrency 1, so there's no memory bottleneck from multiple processes (i do have verbatim but eh, it's closer to tournament conditions this way)

#

the latter may be part of the good performance

desert tree Sep 19, 2025, 7:17 AM

#

its also within error of being neutral scaling

twilit oriole Sep 19, 2025, 7:17 AM

#

hm its not lol

stray reef Sep 19, 2025, 7:17 AM

#

not if you include the smp test

desert tree Sep 19, 2025, 7:17 AM

#

oh i missed one test

#

woops yeah

twilit oriole Sep 19, 2025, 7:18 AM

#

do u want a green. u can send rebased 16 thread 10+0.1 and i put worker on it

stray reef Sep 19, 2025, 7:18 AM

#

not yet. i want it to pass under my normal (V)LTC conditions

twilit oriole Sep 19, 2025, 7:19 AM

#

hm ok. smp stc seems just as valuable to me tbh

stray reef Sep 19, 2025, 7:20 AM

#

yes ofc, if i would care less about the spcc performance i'd do it

#

this will definitely work at ccc/tcec, that's for sure

#

and with some tweaking under shorter conditions as well

rocky vigil Sep 19, 2025, 7:23 AM

#

stray reef wondering if i should try L1=512 next

it's definitely worth trying i think

stray reef Sep 19, 2025, 7:26 AM

#

gonna train the first stage i think, and then compare at fixed nodes & stc

twilit oriole Sep 19, 2025, 7:28 AM

#

So if it is working here imagine what it would be like in SF with L1 1024...

#

I think these results indicate it would be both stronger at fixed nodes and faster lol

stray reef Sep 19, 2025, 7:41 AM

#

i can train 1 SB of an L1=1024 and compare speeds with SF master

stray reef Sep 19, 2025, 8:10 AM

#

Plenty with L1=1024: 1.2M nps
SF Master: 1.5M nps

#

(single core bench, ao5)

stray reef Sep 19, 2025, 8:29 AM

#

oh, and i forgot to tell, this is without pairwise, at this size it probably makes sense to use it again

rocky vigil Sep 19, 2025, 9:15 AM

#

stray reef Plenty with L1=1024: 1.2M nps SF Master: 1.5M nps

actually not super uncompetitive

stray reef Sep 19, 2025, 9:16 AM

#

it might well be stronger at fixed nodes

#

so yeah

#

pairwise speedup

rocky vigil Sep 19, 2025, 9:18 AM

#

yep it's looking exciting

#

the fun part is always dreaming

twilit oriole Sep 19, 2025, 3:21 PM

#

Yeah I was assuming pairwise

violet badger Sep 19, 2025, 7:05 PM

#

FYI, we now have a fully described pipeline to train the SF net, to near master strength #nnue-dev message ... I hope we can use that to facilitate developing and testing some of the ideas discussed here, and e.g. compare bullet to nnue-pytorch.

stray reef Sep 19, 2025, 8:45 PM

#

Results of stage 1 of 4 of the L1=512 net against stage 1 of 4 of the L1=384 net

--------------------------------------------------
Results of Threats-0122 vs Threats-0120 (20000 nodes, 1t, 16MB, UHO_4060_v2.epd):
Elo: 18.54 +/- 7.02, nElo: 28.49 +/- 10.77
LOS: 100.00 %, DrawRatio: 42.89 %, PairsRatio: 1.32
Games: 3996, Wins: 1274, Losses: 1061, Draws: 1661, Points: 2104.5 (52.67 %)
Ptnml(0-2): [67, 425, 857, 526, 123], WL/DD Ratio: 1.41
--------------------------------------------------
Results of Threats-0122 vs Threats-0120 (5+0.05, 1t, 16MB, UHO_4060_v2.epd):
Elo: -1.91 +/- 7.62, nElo: -3.67 +/- 14.59
LOS: 31.12 %, DrawRatio: 50.41 %, PairsRatio: 0.97
Games: 2178, Wins: 534, Losses: 546, Draws: 1098, Points: 1083.0 (49.72 %)
Ptnml(0-2): [11, 263, 549, 259, 7], WL/DD Ratio: 0.91
--------------------------------------------------

running LTC over night while continuing training. if it's ready in time i'll send it to tcec, else i'll send L1=384

rocky vigil Sep 20, 2025, 8:17 AM

#

when is the tcec deadline?

#

i am lil busy now but will try to make progress on nnue pytorch etc. over this weekend

stray reef Sep 20, 2025, 8:20 AM

#

Updates will be run when the current bonus is over, so in ~21h. Unfortunately that means I'll have to send the smaller net, unless it's extended last minute

rocky vigil Sep 20, 2025, 8:21 AM

#

ah

#

that's a shame

#

how did the LTC go?

#

if it's done

desert tree Sep 20, 2025, 8:22 AM

#

stray reef Updates will be run when the current bonus is over, so in ~21h. Unfortunately th...

not finished training or lacking hw to test

stray reef Sep 20, 2025, 8:22 AM

#

--------------------------------------------------
Results of Threats-0122 vs Threats-0120 (30+0.3, 1t, 64MB, UHO_4060_v2.epd):
Elo: 1.02 +/- 3.41, nElo: 2.11 +/- 7.09
LOS: 72.05 %, DrawRatio: 54.27 %, PairsRatio: 1.02
Games: 9220, Wins: 2256, Losses: 2229, Draws: 4735, Points: 4623.5 (50.15 %)
Ptnml(0-2): [3, 1039, 2502, 1060, 6], WL/DD Ratio: 0.90
--------------------------------------------------

stray reef Sep 20, 2025, 8:22 AM

#

desert tree not finished training or lacking hw to test

not finished

desert tree Sep 20, 2025, 8:22 AM

#

oof

#

cant help then

#

rip

#

maybe you can ask for an extension since this is Really Cool™?

stray reef Sep 20, 2025, 8:23 AM

#

can try :P

desert tree Sep 20, 2025, 8:23 AM

#

🙏

rocky vigil Sep 20, 2025, 8:23 AM

#

yeah within error margin oh well

#

i guess like this is first checkpoint only out of 4

stray reef Sep 20, 2025, 8:23 AM

#

512 over 384 is an instamerge tbh

desert tree Sep 20, 2025, 8:23 AM

#

if u need hw to test it ill gladly help on that front

#

but itd have to be on OB

stray reef Sep 20, 2025, 8:24 AM

#

yes fs, ty, will let you know when it's ready for ob

#

the net is gigantic (184MB), i'll have to implement leb compression
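
A rough sketch of what "leb compression" could look like, assuming it refers to signed LEB128 variable-length encoding of the quantised weights. The function names are hypothetical and this is not the engine's actual loader. The idea is that small-magnitude weights, which dominate a sparse threat-input feature transformer, take one byte instead of two.

```rust
/// Appends the signed LEB128 encoding of `value` to `out`.
fn encode_sleb128(mut value: i32, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7; // arithmetic shift keeps the sign
        let sign_bit_clear = byte & 0x40 == 0;
        // Stop once the remaining bits are pure sign extension.
        let done = (value == 0 && sign_bit_clear) || (value == -1 && !sign_bit_clear);
        if done {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80); // continuation bit: more bytes follow
    }
}

/// Decodes one signed LEB128 value starting at `*pos`, advancing `*pos` past it.
fn decode_sleb128(bytes: &[u8], pos: &mut usize) -> i32 {
    let mut result: i32 = 0;
    let mut shift = 0;
    loop {
        let byte = bytes[*pos];
        *pos += 1;
        result |= ((byte & 0x7f) as i32) << shift;
        shift += 7;
        if byte & 0x80 == 0 {
            if shift < 32 && byte & 0x40 != 0 {
                result |= -1i32 << shift; // sign-extend negative values
            }
            return result;
        }
    }
}
```

Values in -64..=63 fit in a single byte, so a weight array that is mostly near zero shrinks to close to half its raw `i16` size before any general-purpose compression is applied.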

desert tree Sep 20, 2025, 8:24 AM

#

alright

twilit oriole Sep 20, 2025, 8:24 AM

#

I already offered testing HW so it's not an issue kek. He wants it to be within his normal conditions

desert tree Sep 20, 2025, 8:24 AM

#

ah okok

rocky vigil Sep 20, 2025, 8:25 AM

#

yeah note that the FT size is like

desert tree Sep 20, 2025, 8:25 AM

#

gigantic ik

rocky vigil Sep 20, 2025, 8:25 AM

#

4x that of 32 bucket

stray reef Sep 20, 2025, 8:25 AM

#

well some 32 or 64 thread data would also be nice for tcec. but to merge into main it needs to pass at least VLTC

twilit oriole Sep 20, 2025, 8:25 AM

#

rocky vigil 4x that of 32 bucket

It compresses very well though

#

Binary size isn't too bad

rocky vigil Sep 20, 2025, 8:25 AM

#

true

#

is probably necessary to uh

#

get past the 128 MB limit or whatever

#

on fishtest

twilit oriole Sep 20, 2025, 8:26 AM

#

It's easy to bypass that limit. I do it all the time

rocky vigil Sep 20, 2025, 8:26 AM

#

oh for montytest right

twilit oriole Sep 20, 2025, 8:28 AM

#

With an L1 1024 for SF. Binary size will actually go down. So like I said before this isn't a real issue

stray reef Sep 20, 2025, 11:45 AM

#

#503163384875974656 message

desert tree Sep 20, 2025, 11:46 AM

#

stray reef https://discord.com/channels/479003439125495819/503163384875974656/1418925469545...

what channel is this?

stray reef Sep 20, 2025, 11:46 AM

#

Compression has been implemented, so network downloads are now a lot smaller. That means we are OB-ready
@twilit oriole @desert tree Would you be interested in a high-concurrency L1=384 test (the one being sent to TCEC) against main, or do you prefer waiting for the larger L1 to be done?

stray reef Sep 20, 2025, 11:46 AM

#

desert tree what channel is this?

Ah sorry, it's the dev channel, forgot it's not public. I asked aloril & kan to use the threat inputs branch

desert tree Sep 20, 2025, 11:47 AM

#

ah yeah nw

#

i have no roles in the tcec disc

green moat Sep 20, 2025, 11:56 AM

#

rocky vigil when is the tcec deadline?

FRD4 engine submission deadline Friday 2025-09-19T12:00 UTC

#

Deadline passed so I guess it is now the end of Altsufi Kibitzer Bonus

desert tree Sep 20, 2025, 12:03 PM

#

stray reef Compression has been implemented, so network downloads are now a lot smaller. Th...

id like to see the 512hl net

#

if u get the required extension of course

#

which i rly hope u do

stray reef Sep 20, 2025, 12:07 PM

#

i think the rules are pretty clear unfortunately

desert tree Sep 20, 2025, 12:07 PM

#

oof

stray reef Sep 20, 2025, 12:07 PM

#

https://wiki.chessdom.org/TCEC_FRD_rules

Under no circumstances are updates and fixes to engines allowed once the FRD tournament has started.

desert tree Sep 20, 2025, 12:07 PM

#

welp

stray reef Sep 20, 2025, 12:07 PM

#

384 is gonna play at least as well as main so

twilit oriole Sep 20, 2025, 5:45 PM

#

stray reef Compression has been implemented, so network downloads are now a lot smaller. Th...

how long is left for the large one

#

i mean can just do both ngl

stray reef Sep 20, 2025, 6:22 PM

#

twilit oriole how long is left for the large one

around 15h

stray reef Sep 21, 2025, 3:24 PM

#

L1=512 is up on furybench now. First running fixed nodes & STC against L1=384, to confirm the results of the tests I ran of the first stage.
Then I'll run some tests against main, including SMP, though I'm not sure yet what conditions are best

desert tree Sep 21, 2025, 3:28 PM

#

stray reef L1=512 is up on furybench now. First running fixed nodes & STC against L1=384, t...

ayy

stray reef Sep 21, 2025, 3:30 PM

#

I'm thinking something like 8th 60+0.6, potentially more threads and less TC

desert tree Sep 21, 2025, 3:30 PM

#

sgtm

#

0.8 mnps / core seems reasonable right

#

for zen4

stray reef Sep 21, 2025, 3:32 PM

#

yeah lgtm (and thanks a lot!)

#

@twilit oriole also paging you

desert tree Sep 21, 2025, 3:32 PM

#

(not using smt fwiw)

stray reef Sep 21, 2025, 3:37 PM

#

mh actually, the fact that you're using 16 cute chess sockets might be biasing the test a little in favor of the smaller net. but not sure if this is significant, just something to potentially keep in mind

desert tree Sep 21, 2025, 3:37 PM

#

stray reef mh actually, the fact that you're using 16 cute chess sockets might be biasing t...

i can drop it if you want

#

8?

#

and why would it favor either net?

stray reef Sep 21, 2025, 3:38 PM

#

ah nevermind

#

i was for some reason imagining that verbatim nets don't work between cutechess instances

#

which is of course wrong

desert tree Sep 21, 2025, 3:39 PM

#

i think nets should be shared the same way regardless of what number of cutechess instances is running

#

ah ok

#

ill just leave it as is

#

lmk if theres any issue

stray reef Sep 21, 2025, 3:49 PM

#

the STC (https://furybench.com/test/3001/, which is carried by your worker) is definitely producing worse results than what I ran locally after the first training stage (#1336647760388034610 message)

#

i don't know if such a large machine still has more problems with memory contention, even with verbatim nets?

#

given that the fixed nodes test is similar, speed seems to be the main thing that could cause this

desert tree Sep 21, 2025, 3:52 PM

#

in terms of memory contention this should be close-ish to tcec conditions

#

cause its 2 sockets with 128c each

#

idk how many memory channels

#

ill check after what memory speeds im getting

stray reef Sep 21, 2025, 4:05 PM

#

ngl i'm gonna repeat this test without your worker. -23.84 +- 2.77 vs -1.91 +/- 7.62 is way too big of a difference.
for now i'll let it run the SMP test

#

uhm @desert tree your worker now has 0.11M nps, that's a bit strange

desert tree Sep 21, 2025, 4:15 PM

#

wtf

#

its consistent too what the hell

#

yeah somethings wrong with it

#

i sure hope it didnt poison the other result

stray reef Sep 21, 2025, 4:18 PM

#

it probably did, but it was only that one STC, i'll just re-run it

#

can't really poison fixed nodes, and hasn't played any SMP games yet

desert tree Sep 21, 2025, 4:19 PM

#

and it disconnected

#

i think the host fucked something up

#

it went completely offline now

stray reef Sep 21, 2025, 4:19 PM

#

oh damn

desert tree Sep 21, 2025, 4:19 PM

#

ill see if i can get another worker

#

best i can find are zen3 workers

#

which wont be representative for TCEC

stray reef Sep 21, 2025, 4:21 PM

#

that's fine, we'll wait with the SMP test then

desert tree Sep 21, 2025, 4:21 PM

#

alr

#

nah its quick no waiting

#

few minutes at most

#

?

stray reef Sep 21, 2025, 4:22 PM

#

you mean finding another worker is quick?

desert tree Sep 21, 2025, 4:22 PM

#

ah i see lol

#

yeah

#

im sure if u ask styx hell help with this, too

stray reef Sep 21, 2025, 4:23 PM

#

true, @split warren i'm running some threat input tests on OB right now, mind helping out with the SMP LTC test?

desert tree Sep 21, 2025, 4:24 PM

#

2x 7Y83, should be up in a sec

stray reef Sep 21, 2025, 4:26 PM

#

cool

#

yeah STC is looking a lot nicer now

desert tree Sep 21, 2025, 4:34 PM

#

nice

split warren Sep 21, 2025, 4:40 PM

#

I am scrambling with the baby atm, I will come back and do my best

twilit oriole Sep 21, 2025, 6:15 PM

#

Mine will be on within an hour

#

(2x EPYC 9654)

twilit oriole Sep 21, 2025, 6:33 PM

#

@stray reef it is on

stray reef Sep 21, 2025, 6:33 PM

#

awesome tysm

#

Hm i think @desert tree your worker is still giving different results. maybe it's just due to high concurrency. but if you look at the finished STC https://furybench.com/test/3003/ and the currently running LTC (-16.81 +- 6.22) https://furybench.com/test/3004/ and look at the individual elo of the worker (-26.92 +- 12.32)... it doesn't seem right

desert tree Sep 21, 2025, 6:47 PM

#

damn

#

i can take it off

#

idk what im doing wrong

#

:(

stray reef Sep 21, 2025, 6:47 PM

#

looking at the bench numbers, it matches the small workers (75% speed of main roughly)

#

i think it's not your fault this time, it must be due to concurrency

desert tree Sep 21, 2025, 6:48 PM

#

alright

#

i set concurrency to equal physical core count

#

aka no smt

stray reef Sep 21, 2025, 6:49 PM

#

yeah idk maybe there is still some effect we aren't thinking about rn. i'm not knowledgable enough in that regard

twilit oriole Sep 21, 2025, 6:49 PM

#

my worker loving the threat net kek

desert tree Sep 21, 2025, 6:49 PM

#

im thinking its just sss

#

if youd prefer i can turn it off

twilit oriole Sep 21, 2025, 6:50 PM

#

hm

#

well idk maybe my worker will hate it at ltc also

stray reef Sep 21, 2025, 6:50 PM

#

i think i want an LTC with just the small workers. it seems too far off

desert tree Sep 21, 2025, 6:50 PM

#

fairs

#

ill kill it for now

#

lmk if u want it back

twilit oriole Sep 21, 2025, 6:51 PM

#

nah keep it

#

kill the ltc for now

#

cos my worker is also there

stray reef Sep 21, 2025, 6:51 PM

#

ok i'll let everything do the SMP test then

twilit oriole Sep 21, 2025, 6:52 PM

#

this stuff is due to threads of test : threads of worker ratio i observed before. if u want favourable results especially on larger worker u should keep STC and crank the threads

stray reef Sep 21, 2025, 6:53 PM

#

so you're saying i should be running something like... 8+0.08 32th?

twilit oriole Sep 21, 2025, 6:54 PM

#

yep

stray reef Sep 21, 2025, 6:54 PM

#

alright

desert tree Sep 21, 2025, 6:54 PM

#

so ill put it back up then

stray reef Sep 21, 2025, 6:56 PM

#

unfortunately there is no good way to prevent one of the big workers from jumping back to the LTC

#

i'll try my best by starting/stopping it if it happens

twilit oriole Sep 21, 2025, 7:03 PM

#

the workers already started diverging on the SMP

twilit oriole Sep 21, 2025, 7:28 PM

#

stray reef i'll try my best by starting/stopping it if happens

big worker is on the ltc

#

and instantly lost kek

#

tbh i can just take my pgns at the end and run results through elo tool or smth. simpler

stray reef Sep 21, 2025, 7:40 PM

#

twilit oriole big worker is on the ltc

increased workload size and moved it back

#

yeah can easily filter the few games out at the end

twilit oriole Sep 21, 2025, 7:44 PM

#

```
  File "/home/neural/FuryBench/Client/worker.py", line 1282, in run_openbench_worker
    if config.workload: complete_workload(config)
  File "/home/neural/FuryBench/Client/worker.py", line 1023, in complete_workload
    rr.send_errors(timestamp, cutechess_cnt)
  File "/home/neural/FuryBench/Client/worker.py", line 698, in send_errors
    for header, moves in PGNHelper.slice_pgn_file(fname):
  File "/home/neural/FuryBench/Client/worker.py", line 567, in slice_pgn_file
    raise utils.OpenBenchMisssingPGNException(reason)
utils.OpenBenchMisssingPGNException: Unable to find PGNs/3007.35002.1758483719.0.pgn. Cutechess exited with no finished games.
```
Lol u somehow managed to error the worker stopping that task

#

kekw

stray reef Sep 21, 2025, 7:44 PM

#

gigachad

twilit oriole Sep 21, 2025, 7:44 PM

#

started again

twilit oriole Sep 21, 2025, 8:53 PM

#

How much did the extra data help btw. Was there a measurement old 384 to new 384

stray reef Sep 21, 2025, 9:12 PM

#

i don't remember if i trained a 384 net before. but i don't think so

#

the best thing would be to simply add some more data now and see

rocky vigil Sep 21, 2025, 10:39 PM

#

the new LTC definitely looks better but won't be positive yet it seems

twilit oriole Sep 21, 2025, 11:00 PM

#

Yeah. More data + More L1 (768) and maybe pairwise

#

Might do it

rocky vigil Sep 21, 2025, 11:04 PM

#

has pairwise been tested with L1=512 (I recall it was tested at 256 and was negative, maybe?)

#

this seems maybe logical next step

#

although my suspicion is that the time taken for L1 -> L2 is not that big relative to the whole network

#

but who knows

#

actually yeah pairwise both halves L1 and doubles sparsity count
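
To make the "halves L1" point concrete, here is a minimal sketch of a pairwise-multiplied clipped ReLU. It assumes the common layout where element `i` of the first half of the accumulator is multiplied with element `i` of the second half; the function name and the `qa` clipping bound are illustrative, not taken from any particular engine.

```rust
/// Pairwise-multiplied clipped ReLU: `n` accumulator values yield `n / 2`
/// outputs, each the product of two values clamped to `0..=qa`.
fn pairwise_crelu(acc: &[i16], qa: i16) -> Vec<i32> {
    let half = acc.len() / 2;
    (0..half)
        .map(|i| {
            let a = acc[i].clamp(0, qa) as i32;
            let b = acc[i + half].clamp(0, qa) as i32;
            a * b // zero whenever either input is clamped to zero
        })
        .collect()
}
```

Because an output is zero as soon as either factor is zero, the inputs handed to the next layer tend to be sparser than with plain crelu, on top of being half as many.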

#

@stray reef do you have a bullet feature input set for (factorized) threat inputs + king buckets?
I am going to try (simplified threats + 2x768) -> 64 in shatranj because I think it'll actually finish training in a reasonable time with bullet-main single thread

#

I expect shatranj speed to be more favorable

#

since most of the pieces are leapers etc.

#

(rook is only slider)

#

and no special cases like castling/en passant

split warren Sep 22, 2025, 12:32 AM

#

i would recommend connecting with -T 120 -N 8 for a 128c, the cutechess overhead is significant with that concurrency, and it does consume quite a bit of CPU

#

maybe that was the issue?

#

god damn it @stray reef , can u lower the prio of ur other test or something? I put my machine for the test and it picked the other one

twilit oriole Sep 22, 2025, 12:58 AM

#

split warren god damn it @stray reef , can u lower the prio of ur other test or som...

i can lower it but i think the threat input test is basically done? lol

split warren Sep 22, 2025, 1:01 AM

#

cool i will just go back to Reckless datagen then

twilit oriole Sep 22, 2025, 3:22 AM

#

ok so i tried running plentychess datagen. it failed for some reason and then the focusing got ignored and it just started another task when i have explicitly inputted I do not want to run other tasks. so i had enough kek

#

https://furybench.com/event/22483/

#

/usr/bin/ld: src/fathom/src/tbprobe.o: relocation R_X86_64_32 against `.rodata' can not be used when making a PIE object; recompile with -fPIE
/usr/bin/ld: failed to set dynamic section sizes: bad value

#

Also i feel like running an ob worker shouldnt require u to be a dev and do debugging

stray reef Sep 22, 2025, 5:54 AM

#

rocky vigil has pairwise been tested with L1=512 (I recall it was tested at 256 and was nega...

not yet

stray reef Sep 22, 2025, 5:55 AM

#

twilit oriole ``` /usr/bin/ld: src/fathom/src/tbprobe.o: relocation R_X86_64_32 against `.roda...

wtf

stray reef Sep 22, 2025, 5:56 AM

#

rocky vigil @stray reef do you have a bullet feature input set for (factorized) th...

https://github.com/Yoshie2000/bullet/blob/plenty/examples/plenty/0120.rs
this is the config for the L1=384 net

#

but it's not factorised yet, i'm still working on that / need to see if what i did produces reasonable results elo-wise

#

so yeah pairwise and factorisation need to be tested next, hopefully they'll gain 5-6 LTC elo

#

and since L1=512 gained even at STC over 384, it's a no-brainer to go even bigger once everything else is figured out imo

tender fractal Sep 22, 2025, 7:59 AM

#

What is threat input exactly ?
Like, I understood that we put the threat in the input layer, but I haven't found anything on what are the threats

candid ivy Sep 22, 2025, 8:00 AM

#

read the first messages of this thread

rocky vigil Sep 22, 2025, 8:23 AM

#

alright I'm gonna try out simplified threats + 2 buckets for shatranj soon™ and see how it goes

rocky vigil Sep 22, 2025, 8:43 AM

#

will be heavily reduced L1(=64) since 1) i only get to use single CPU thread on bullet main and 2) only 840M pos of data

#

vs main currently 2 buckets L1=512

rocky vigil Sep 22, 2025, 10:32 AM

#

stray reef <https://github.com/Yoshie2000/bullet/blob/plenty/examples/plenty/0120.rs> this ...

random question: this is just using crelu, no pairwise right? if I read the config correctly

stray reef Sep 22, 2025, 10:33 AM

#

yes

rocky vigil Sep 22, 2025, 10:33 AM

#

ah interesting

#

is crelu multilayer or screlu multilayer better

stray reef Sep 22, 2025, 10:34 AM

#

no idea tbh

#

that's worth testing if pairwise does not work

rocky vigil Sep 22, 2025, 10:36 AM

#

wait also why do you just randomly discard 5% of data

#

is this like some tech

stray reef Sep 22, 2025, 10:36 AM

#

it gained a handful elo

#

not with threat inputs explicitly, this is just my main training schedule

rocky vigil Sep 22, 2025, 10:37 AM

#

bullet legacy user attempts to parse bullet main:

rocky vigil Sep 22, 2025, 10:37 AM

#

rocky vigil wait also why do you just randomly discard 5% of data

wait is this by like superbatch

#

or idk

stray reef Sep 22, 2025, 10:37 AM

#

per datapoint

#

but ofc every epoch the discarded data will be different

#

otherwise it'd just be 5% less data which would be bad
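
A minimal sketch of per-datapoint random discarding as described above, assuming each position is dropped independently with probability 0.05 and the random stream keeps advancing across epochs. The tiny xorshift generator and all names are illustrative stand-ins so the sketch needs no external crates; this is not the trainer's actual data loader.

```rust
/// Minimal xorshift64* generator (state must be non-zero).
struct XorShift64(u64);

impl XorShift64 {
    /// Returns a pseudo-random value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let x = self.0.wrapping_mul(0x2545_F491_4F6C_DD1D);
        (x >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Returns the indices of the datapoints kept for one pass over the data.
fn kept_indices(num_points: usize, discard_prob: f64, rng: &mut XorShift64) -> Vec<usize> {
    (0..num_points)
        .filter(|_| rng.next_f64() >= discard_prob)
        .collect()
}
```

Calling `kept_indices` once per epoch with the same `rng` yields a different subset each time, so over many epochs every position is still seen, which is the difference from simply training on 5% less data.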

rocky vigil Sep 22, 2025, 10:38 AM

#

ah interesting

#

```
    for side in [Side::WHITE, Side::BLACK] {
        for piece in Piece::PAWN..=Piece::KING {
            let pc = 6 * side + piece - 2;
            map_bb(bbs[side] & bbs[piece], |sq| pieces[sq] = pc);
        }
    }
```

#

what does this code do?

#

in

#

map_features

#

because I just copied the old simplified threat input code

#

and used that instead

#

and that doesn't have this

stray reef Sep 22, 2025, 10:39 AM

#

looks like it builds a mailbox from the bitboards
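
A standalone sketch of that mailbox construction, under the assumption that `bbs[0]` and `bbs[1]` are the two colour occupancies and `bbs[2..8]` are pawn through king (which is what the `6 * side + piece - 2` indexing suggests). The function name and the `-1` empty marker are illustrative choices.

```rust
/// Builds a 64-entry mailbox from bitboards. Entries are `6 * side + piece - 2`
/// (0..=11 for white pawn .. black king), or -1 for an empty square.
fn build_mailbox(bbs: &[u64; 8]) -> [i8; 64] {
    let mut mailbox = [-1i8; 64];
    for side in 0..2usize {
        for piece in 2..8usize {
            // Squares holding this piece type for this side.
            let mut bb = bbs[side] & bbs[piece];
            while bb != 0 {
                let sq = bb.trailing_zeros() as usize;
                mailbox[sq] = (6 * side + piece - 2) as i8;
                bb &= bb - 1; // clear the lowest set bit
            }
        }
    }
    mailbox
}
```

Full threat inputs need this lookup because the feature index depends on the type of the attacked piece, which a bitboard-only representation does not give directly for a destination square.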

rocky vigil Sep 22, 2025, 10:40 AM

#

ohhh right

#

for

#

full threats

stray reef Sep 22, 2025, 10:40 AM

#

all this is mostly taken from the montytrain branch, and adapted as necessary fwiw

rocky vigil Sep 22, 2025, 10:40 AM

#

ok I shouldn't need it then yea

#


```
    let occ = bbs[0] | bbs[1];

    for side in [Side::WHITE, Side::BLACK] {
        let side_offset = offsets::END * side;
        let opps = bbs[side ^ 1];

        for piece in Piece::PAWN..=Piece::KING {
            map_bb(bbs[side] & bbs[piece], |sq| {
                let threats = match piece {
                    Piece::PAWN => Attacks::pawn(sq, side),
                    Piece::KNIGHT => Attacks::knight(sq),
                    Piece::BISHOP => Attacks::bishop(sq, occ),
                    Piece::ROOK => Attacks::rook(sq, occ),
                    Piece::QUEEN => Attacks::queen(sq, occ),
                    Piece::KING => Attacks::king(sq),
                    _ => unreachable!(),
                } & occ;

                count += 1;
                map_bb(threats, |dest| {
                    let enemy = (1 << dest) & opps > 0;
                    if let Some(idx) = map_piece_threat(piece, sq, dest, pieces[dest], enemy) {
                        f(side_offset + idx);
                        count += 1;
                    }
                });
            });
        }
    }
```
wait where's the psq feature in this then

stray reef Sep 22, 2025, 10:42 AM

#

pieces[dest] is passed to map_piece_threat? not sure, i'm on mobile rn

rocky vigil Sep 22, 2025, 10:43 AM

#

hangon lemme attempt to figure this stuff out

#


```
    let occ = bbs[0] | bbs[1];

    for side in [Side::WHITE, Side::BLACK] {
        let side_offset = offsets::END * side;
        let opps = bbs[side ^ 1];

        for piece in Piece::PAWN..=Piece::KING {
            map_bb(bbs[side] & bbs[piece], |sq| {
                let threats = match piece {
                    Piece::PAWN => Attacks::pawn(sq, side),
                    Piece::KNIGHT => Attacks::knight(sq),
                    Piece::BISHOP => Attacks::bishop(sq),
                    Piece::ROOK => Attacks::rook(sq, occ),
                    Piece::QUEEN => Attacks::queen(sq),
                    Piece::KING => Attacks::king(sq),
                    _ => unreachable!(),
                } & occ;

                f(TOTAL_THREATS + [0, 384][side] + 64 * (piece - 2) + sq);
                count += 1;
                map_bb(threats, |dest| {
                    let enemy = (1 << dest) & opps > 0;
                    if let Some(idx) = map_piece_threat(piece, sq, dest, enemy) {
                        f(side_offset + idx);
                        count += 1;
                    }
                });
            });
        }
    }
```
this is what montytrain simplified threat inputs has

stray reef Sep 22, 2025, 10:45 AM

#

yes it's not needed here as simple threat inputs only distinguish if the threatened piece is an enemy or not, it doesn't care about the type

rocky vigil Sep 22, 2025, 10:45 AM

#

yeah but this has a f([0, 384][side] + 64 * (piece - 2) + sq) that corresponds to the psq feature

#

idk where the other code has that
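For reference, a minimal sketch of that psq index on its own. It assumes the piece numbering implied by the snippet above (pawn = 2 through king = 7, so `piece - 2` lands in 0..6) and `side` being 0 for white, 1 for black. `psq_index` and the constants are illustrative names, not montytrain's API:

```rust
/// Number of piece types (pawn..=king).
const PIECE_TYPES: usize = 6;
/// Assumed numeric value of the pawn; pieces run PAWN..PAWN + 6.
const PAWN: usize = 2;

/// Index of the (side, piece, square) feature inside a 768-wide psq block.
/// Mirrors `[0, 384][side] + 64 * (piece - 2) + sq` from the pasted code.
fn psq_index(side: usize, piece: usize, sq: usize) -> usize {
    debug_assert!(side < 2 && (PAWN..PAWN + PIECE_TYPES).contains(&piece) && sq < 64);
    [0, 384][side] + 64 * (piece - PAWN) + sq
}
```

The trainer then adds `TOTAL_THREATS` on top, so the psq block sits after all threat features in that layout.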

stray reef Sep 22, 2025, 10:46 AM

#

oh that's what you mean. sorry

#

i moved that into the main method i think, as this input type has factorised king buckets, there must be something like Chess768::map_features

rocky vigil Sep 22, 2025, 10:48 AM

#

ah this?

    fn map_features<F: FnMut(usize, usize)>(&self, pos: &Self::RequiredDataType, mut f: F) {
        let get = |ksq| (if ksq % 8 > 3 { 7 } else { 0 }, 768 * self.buckets[usize::from(ksq)]);
        let (stm_flip, stm_bucket) = get(pos.our_ksq());
        let (ntm_flip, ntm_bucket) = get(pos.opp_ksq());

        Chess768.map_features(pos, |stm, ntm| {
            let bucketed_offset = 768 + TOTAL_THREATS;
            f(bucketed_offset + stm_bucket + (stm ^ stm_flip), bucketed_offset + ntm_bucket + (ntm ^ ntm_flip)); // bucketed feature
            f(stm ^ stm_flip, ntm ^ ntm_flip) // factorised feature
        });

stray reef Sep 22, 2025, 10:48 AM

#

yes exactly
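A stripped-down sketch of what that closure computes for one perspective, in case it helps to see the two indices side by side. `num_threats` stands in for `TOTAL_THREATS` and `buckets` for the 64-entry expanded king-bucket table; the function name and signature are made up for illustration and are not bullet's API:

```rust
/// For one psq feature `feat` (0..768) seen from a king on `ksq`, return
/// (bucketed index, factorised index).
fn bucketed_and_factorised(
    feat: usize,
    ksq: usize,
    buckets: &[usize; 64],
    num_threats: usize,
) -> (usize, usize) {
    // King on files e-h: mirror horizontally. XOR with 7 flips the three
    // low bits of the index, which are the file bits of the square.
    let flip = if ksq % 8 > 3 { 7 } else { 0 };
    let mirrored = feat ^ flip;
    // As in the pasted code, the bucketed block starts at 768 + threats,
    // with one 768-wide slice per king bucket.
    let bucketed_offset = 768 + num_threats;
    (bucketed_offset + 768 * buckets[ksq] + mirrored, mirrored)
}
```

Each piece emits both indices, which is how the factorised weights get shared across all king buckets during training.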

rocky vigil Sep 22, 2025, 10:48 AM

#

stray reef i moved that into the main method i think, as this input type has factorised kin...

so does that mean I don't have to add psq features in the map_features function

stray reef Sep 22, 2025, 10:49 AM

#

you can do it either way, but imo for factorised king buckets this makes it a lot easier

#

you need to have it somewhere, and exactly once

rocky vigil Sep 22, 2025, 10:50 AM

#

ok

#

so I'm just gonna go with whatever your code has in main method

#

meaning I should remove that from the map_features function I think

#

also what is
```
impl ThreatInputsBucketsMirrored {
    pub fn new(buckets: [usize; 32]) -> Self {
        let num_buckets = get_num_buckets(&buckets);

        let mut expanded = [0; 64];
        for (idx, elem) in expanded.iter_mut().enumerate() {
            *elem = buckets[(idx / 8) * 4 + [0, 1, 2, 3, 3, 2, 1, 0][idx % 8]];
        }

        Self { buckets: expanded, num_buckets }
    }
}
```

stray reef Sep 22, 2025, 10:52 AM

#

just some code that mirrors the bucket layout from a 32 element array into a 64 element array
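The same expansion as a standalone function, to make the mirroring concrete. This is a sketch that only reproduces the loop from the pasted `new`; `expand_mirrored` is a hypothetical name:

```rust
/// Expand a 32-entry half-board bucket table (files a-d, 4 entries per rank)
/// into a 64-entry table by mirroring files e-h onto d-a.
fn expand_mirrored(buckets: [usize; 32]) -> [usize; 64] {
    let mut expanded = [0; 64];
    for (idx, elem) in expanded.iter_mut().enumerate() {
        let rank = idx / 8;
        let file = idx % 8;
        // Files 0..=3 map to themselves, files 4..=7 map to 3..=0.
        let mirrored_file: usize = [0, 1, 2, 3, 3, 2, 1, 0][file];
        *elem = buckets[rank * 4 + mirrored_file];
    }
    expanded
}
```

So with a half table holding 0..32, the first rank comes out as 0, 1, 2, 3, 3, 2, 1, 0, and a king on e1 shares its bucket with a king on d1.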

rocky vigil Sep 22, 2025, 10:52 AM

#

oh i see

#

ok the last thing I think I need to fiddle with is the settings

#

yay

#

how do I uh

#

run

#

bullet

#

with CPU backend?

stray reef Sep 22, 2025, 11:03 AM

#

--features cpu maybe? idk

rocky vigil Sep 22, 2025, 11:04 AM

#

nope that attempts to compile cudarc and fails

#

ah maybe --no-default-features

formal smelt Sep 22, 2025, 11:09 AM

#

rocky vigil nope that attempts to compile cudarc and fails

If you have a recent bullet commit it should tell you to add this

rocky vigil Sep 22, 2025, 11:09 AM

#

i am using main latest

#

well

#

no gpu anyways

formal smelt Sep 22, 2025, 11:11 AM

#

https://github.com/jw1912/bullet/blob/main/crates/bullet_cuda_backend/build.rs#L1

#

Like the error should be “bro disable default features”

rocky vigil Sep 22, 2025, 11:11 AM

#

what

stray reef Sep 22, 2025, 11:12 AM

#

ah yes i added the rand crate to the bullet_lib Cargo.toml iirc (forgive me jw :P)

#

idk about the mismatched types

rocky vigil Sep 22, 2025, 11:13 AM

#

wait i don't see it in https://github.com/Yoshie2000/bullet/blob/plenty/Cargo.toml

#

or am I blind

stray reef Sep 22, 2025, 11:13 AM

#

https://github.com/Yoshie2000/bullet/blob/plenty/crates/bullet_lib/Cargo.toml

rocky vigil Sep 22, 2025, 11:13 AM

#

ohhh

#

yeah idk about mismatched types either

formal smelt Sep 22, 2025, 11:16 AM

#

It should be an &str

rocky vigil Sep 22, 2025, 11:22 AM

#

mm hmm

#

well then

#

@formal smelt i assume this means it can't be done on cpu backend

formal smelt Sep 22, 2025, 11:37 AM

#

Yes

rocky vigil Sep 22, 2025, 11:41 AM

#

what a shame

#

time to ask kevlu to do it ig

#

btw here's the actual edited config

📎 simple.rs

#

@stray reef @formal smelt does it look good

twilit oriole Sep 22, 2025, 11:47 AM

#

Spend $2 on vast.ai time?

rocky vigil Sep 22, 2025, 11:49 AM

#

or that

#

my parents might mald bout it tho

stray reef Sep 22, 2025, 3:01 PM

#

How many threat updates could a single move cause at most? (preferably even split into add+sub counts) Has anyone put thought into this yet?

rocky vigil Sep 22, 2025, 10:00 PM

#

At most 32 add (8 from the moving piece, 8 from uncovering sliders, 16 from attacks to the dest) and same for subtract

#

Actually if deduplication is taken into account that 32 is lower

#

Maybe 20

#

Because you get at most 4 from uncovering sliders

#

And the other 8+16 is reduced to 16
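Writing the estimate down as constants, plus a scratch buffer sized for the pessimistic figure. The numbers are the rough per-direction guesses from the messages above, not proven bounds, and `ThreatDelta` is a hypothetical type, not something from an existing engine:

```rust
/// Per-direction (adds or subs) estimates quoted above.
const FROM_MOVING_PIECE: usize = 8;
const FROM_UNCOVERED_SLIDERS: usize = 8;
const FROM_ATTACKS_ON_DEST: usize = 16;

/// Naive figure: every source counted at its own maximum.
const NAIVE_MAX: usize = FROM_MOVING_PIECE + FROM_UNCOVERED_SLIDERS + FROM_ATTACKS_ON_DEST;

/// Deduplicated guess: at most 4 from uncovered sliders, and the
/// moving-piece and destination attacks together capped at 16.
const DEDUP_MAX: usize = 4 + 16;

/// Fixed-capacity list of added threat features for one move.
struct ThreatDelta {
    adds: [u32; NAIVE_MAX],
    len: usize,
}

impl ThreatDelta {
    fn new() -> Self {
        Self { adds: [0; NAIVE_MAX], len: 0 }
    }

    /// Returns false instead of panicking if the estimate turns out too low.
    fn push(&mut self, feature: u32) -> bool {
        if self.len == NAIVE_MAX {
            return false;
        }
        self.adds[self.len] = feature;
        self.len += 1;
        true
    }
}
```

Sizing for 32 rather than 20 costs a few bytes and leaves slack in case the deduplication argument misses a case.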

rocky vigil Sep 22, 2025, 11:37 PM

#

rocky vigil Maybe 20

this is still very large i think

#

similar cost as a refresh

rocky vigil Sep 25, 2025, 3:39 AM

#

😔

#

we dreamed

#

for 2 moves

frosty imp Sep 25, 2025, 4:13 AM

#

threat inputs weakness at king safety? Kappa

stray reef Sep 25, 2025, 7:52 AM

#

gonna run a DFRC test of threat inputs actually, just out of curiosity

stray reef Sep 25, 2025, 8:20 AM

#

yeah it's about the same strength diff to master as in normal chess

#

which means it scales well but doesn't actually play much better DFRC than clover and stormphrax

stray reef Sep 26, 2025, 8:53 AM

#

Pairwise tests are now up on furybench (fixed nodes + STC)

rocky vigil Sep 26, 2025, 12:03 PM

#

lmao neutral fixed nodes

#

and +4 stc

rocky vigil Sep 26, 2025, 12:28 PM

#

Bench is anywhere from 0 to 10% faster

stray reef Sep 26, 2025, 12:31 PM

#

Fixed nodes

Elo   | 0.03 +- 3.07 (95%)
Conf  | N=20000 Threads=1 Hash=16MB
Games | N: 20140 W: 5862 L: 5860 D: 8418
Penta | [439, 2372, 4427, 2412, 420]

https://furybench.com/test/3080/
STC

Elo   | 4.28 +- 2.48 (95%)
SPRT  | 8.0+0.08s Threads=1 Hash=16MB
LLR   | 2.90 (-2.25, 2.89) [0.00, 2.50]
Games | N: 19550 W: 4940 L: 4699 D: 9911
Penta | [50, 2200, 5042, 2425, 58]

https://furybench.com/test/3083/

Nice result, and should make it even easier to increase L1 for an LTC gain
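For anyone wanting to sanity-check the headline numbers, the Elo values follow from the W/L/D counts via the usual logistic formula. A minimal sketch (`elo_from_wld` is an illustrative helper; the error bars come from the pentanomial counts and are not reproduced here):

```rust
/// Logistic Elo difference implied by a win/loss/draw record.
/// Returns None when no games were played or the score is 0 or 1,
/// where the Elo difference is unbounded.
fn elo_from_wld(wins: u64, losses: u64, draws: u64) -> Option<f64> {
    let games = (wins + losses + draws) as f64;
    if games == 0.0 {
        return None;
    }
    // Draws count as half a point.
    let score = (wins as f64 + 0.5 * draws as f64) / games;
    if score <= 0.0 || score >= 1.0 {
        return None;
    }
    Some(-400.0 * (1.0 / score - 1.0).log10())
}
```

Feeding in `5862, 5860, 8418` and `4940, 4699, 9911` lands within rounding of the 0.03 and 4.28 reported above.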

#

i would have expected it to be a bit worse at fixed nodes, ngl

#

I am training stage 1 of the factorised threat features now, it's very slow, but it might be worth it

rocky vigil Sep 26, 2025, 12:33 PM

#

stray reef i would have expected it to be a bit worse at fixed nodes, ngl

Same but I guess

#

Threat inputs is a little different

stray reef Sep 26, 2025, 12:35 PM

#

not sure if there are any other results of people trying pairwise with such small L1s

#

(without threat inputs)

rocky vigil Sep 26, 2025, 12:43 PM

#

If you have time could you try screlu multilayer as well

stray reef Sep 26, 2025, 12:48 PM

#

now that i have pairwise, screlu is no longer usable due to quantisation (probably, haven't thought a lot about it)

rocky vigil Sep 26, 2025, 12:51 PM

#

Ah

#

I meant screlu w/o pairwise

#

Maybe that still affects quantization though

stray reef Sep 26, 2025, 1:05 PM

#

no, but it would be slower and scale worse with L1 size, i don't think that's worth trying, seeing this pairwise result
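A float-only sketch of the two activations being compared, as they are commonly defined for NNUE accumulators. The quantised integer versions in an actual engine look different, and the [0, 1] clamp range here is an assumption for illustration:

```rust
/// SCReLU: clamp to [0, 1], then square. Output length equals input length.
fn screlu(acc: &[f32]) -> Vec<f32> {
    acc.iter()
        .map(|&x| {
            let c = x.clamp(0.0, 1.0);
            c * c
        })
        .collect()
}

/// Pairwise multiplication: clamp to [0, 1], then multiply element i of the
/// first half by element i of the second half. Output is half the input length.
fn pairwise(acc: &[f32]) -> Vec<f32> {
    let half = acc.len() / 2;
    let (lo, hi) = acc.split_at(half);
    lo.iter()
        .zip(hi)
        .map(|(&a, &b)| a.clamp(0.0, 1.0) * b.clamp(0.0, 1.0))
        .collect()
}
```

Pairwise halves what the next layer has to read for the same L1 width, which is one reason it tends to be the cheaper option as L1 grows.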

rocky vigil Sep 26, 2025, 1:20 PM

#

Ok

#

Fair enough

rocky vigil Sep 26, 2025, 4:46 PM

#

Seems like higher quantization is quite effective

#

That should put it only what 5 STC elo away?

stray reef Sep 26, 2025, 5:46 PM

#

probably neutral at LTC

twilit oriole Sep 26, 2025, 5:50 PM

#

Eh test that. Threat inputs are more resistant to quantisation. So much so that we now i8 quantise ours
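A rough sketch of what i8 quantisation of a weight vector involves, mainly to show where clipping enters. The scale is a free parameter here (64 in the usage note is an arbitrary example, not any engine's actual value) and `quantise_i8` is a hypothetical helper:

```rust
/// Quantise float weights to i8 with a fixed scale, rounding to nearest and
/// saturating at the i8 range. Returns the quantised weights together with
/// the number of weights that had to be clipped.
fn quantise_i8(weights: &[f32], scale: f32) -> (Vec<i8>, usize) {
    let mut clipped = 0usize;
    let quantised: Vec<i8> = weights
        .iter()
        .map(|&w| {
            let v = (w * scale).round();
            if v < i8::MIN as f32 || v > i8::MAX as f32 {
                clipped += 1;
            }
            v.clamp(i8::MIN as f32, i8::MAX as f32) as i8
        })
        .collect();
    (quantised, clipped)
}
```

With a scale of 64, any weight outside roughly ±2 saturates, so counting clipped weights is a cheap first check on whether a net tolerates the narrower type.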

stray reef Sep 26, 2025, 6:12 PM

#

no i mean with gain it's now probably neutral to master at LTC

#

was expecting quantisation to scale linearly

naive comet Sep 27, 2025, 1:20 AM

#

stray reef no i mean with gain it's now probably neutral to master at LTC

holy fuck

#

vltc gainer then?

rocky vigil Sep 27, 2025, 6:07 AM

#

surprisingly 8192 works over 3072

#

in Monty at least

#

i guess the future for cpu mcts is just in big net

stray reef Sep 27, 2025, 8:25 AM

#

naive comet vltc gainer then?

that's an option, but i first want to test if my factoriser impl is worth any elo

stray reef Sep 27, 2025, 12:19 PM

#

rip i have a bug in the training script. guess i'll test LTC again then

rocky vigil Sep 27, 2025, 12:58 PM

#

oof

stray reef Sep 27, 2025, 1:37 PM

#

Factorisation is now fully working. We'll have the results of stage 1 tomorrow

desert tree Sep 27, 2025, 1:37 PM

#

🙏

stray reef Sep 27, 2025, 1:39 PM

#

(I am factorising similarly to small threat inputs, except I also encode if the threatened/protected piece is of higher value)

#

it's still a ton of features. potentially i'll have to cut it down more

rocky vigil Sep 27, 2025, 1:43 PM

#

Ah

#

Yeah it’s difficult to factor

#

It’s a subset of PP essentially

#

So the factorings are pretty much also just factorings of PP

stray reef Sep 27, 2025, 1:48 PM

#

it would be possible (but more complicated probably) to use what chef tried in vine recently as a factoriser, e.g. [colored_piece][sq][sq_attacked][sq_defended]

#

aka. 768x4
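One possible flattening of that `[colored_piece][sq][sq_attacked][sq_defended]` layout, to show where the 768x4 comes from. The ordering of the dimensions is an assumption for illustration and not taken from vine:

```rust
const COLORED_PIECES: usize = 12;
const SQUARES: usize = 64;
/// 12 * 64 * 2 * 2 = 3072, i.e. 768 * 4.
const FACTORISER_SIZE: usize = COLORED_PIECES * SQUARES * 2 * 2;

/// Flatten (piece, square, attacked?, defended?) into a single index.
fn factoriser_index(colored_piece: usize, sq: usize, attacked: bool, defended: bool) -> usize {
    debug_assert!(colored_piece < COLORED_PIECES && sq < SQUARES);
    ((colored_piece * SQUARES + sq) * 2 + attacked as usize) * 2 + defended as usize
}
```

Every piece on the board activates exactly one of its four slots, so this factoriser has the same number of active features as plain 768.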

desert tree Sep 27, 2025, 1:54 PM

#

fwiw it seems to train pretty quickly

#

going from 600->800SBs is completely neutral

#

for our 1024hl net

twilit oriole Sep 27, 2025, 1:56 PM

#

Depends on amount of data

desert tree Sep 27, 2025, 1:57 PM

#

right yeah

#

with not that much data you dont need that many SBs
dont mind me forgetting basic stuff about training nets

stray reef Sep 27, 2025, 2:08 PM

#

#1220867251763286207 message this would be potentially a big issue with that scheme though (though practically probably not)

rocky vigil Sep 27, 2025, 3:01 PM

#

New LTC not looking terribly hot rn oof

twilit oriole Sep 27, 2025, 3:02 PM

#

Looks fine to me. There is still scaling the L1 and adding more data left

rocky vigil Sep 27, 2025, 3:02 PM

#

Yeah maybe neutral LTC was optimistic

#

btw

#

I got threat inputs branch of nnue PyTorch up

#

On my fork

#

In case anyone with gpu wants to try and see if it works

#

I largely copied the existing impl and just changed the function calls etc. to match the library

#

So here’s hoping nothing goes terribly wrong

rocky vigil Sep 27, 2025, 4:37 PM

#

notwithstanding the errors with nnue-pytorch, training seems to be quite fast
