UE Threat Inputs for AB | Stockfish | Page 5

rocky vigil Sep 27, 2025, 4:39 PM

#

around half the speed of sf master net?

#

cannot read the numbers

stray reef Sep 27, 2025, 4:39 PM

#

with my current arch (including the factoriser) training speeds are basically identical to my current master net

rocky vigil Sep 27, 2025, 4:40 PM

#

rocky vigil cannot read the numbers

there is also a possibility that the numbers are different due to vondele running debug builds

#

at least one of the numbers was ~equal to sf master, one was 1/2 sf master, and one was 1/4

stray reef Sep 27, 2025, 4:41 PM

#

yeah. no idea how to interpret the 80-90it/s either, but from what i've seen in the past it seems pretty good

violet badger Sep 27, 2025, 4:46 PM

#

same speed as master net (l1=3072) training.

#

so fairly straightforward to train.

#

certainly if we train just 1 or 2 stages.

#

actually even a bit faster to train.

stray reef Sep 27, 2025, 5:01 PM

#

Elo   | -3.77 +- 2.98 (95%)
SPRT  | 40.0+0.40s Threads=1 Hash=64MB
LLR   | -2.25 (-2.25, 2.89) [0.00, 2.50]
Games | N: 11898 W: 2877 L: 3006 D: 6015
Penta | [6, 1394, 3281, 1259, 9]

https://furybench.com/test/3100/
LTC vs main. slowly getting there. i'm hopeful that a factoriser is all that's needed now

rocky vigil Sep 27, 2025, 5:02 PM

#

mm

#

+2 LTC from previous but it doesn't say much

#

considering error bars

stray reef Sep 28, 2025, 9:13 AM

#

First factoriser not looking good (loss also sucked)

--------------------------------------------------
Results of ThreatsFactorised vs Threats (20000 nodes, 1t, 16MB, UHO_4060_v2.epd):
Elo: -13.87 +/- 9.37, nElo: -21.63 +/- 14.58
LOS: 0.18 %, DrawRatio: 43.30 %, PairsRatio: 0.81
Games: 2180, Wins: 602, Losses: 689, Draws: 889, Points: 1046.5 (48.00 %)
Ptnml(0-2): [58, 284, 472, 239, 37], WL/DD Ratio: 1.58
--------------------------------------------------
Results of ThreatsFactorised vs Threats (5+0.05, 1t, 16MB, UHO_4060_v2.epd):
Elo: -13.70 +/- 11.09, nElo: -26.13 +/- 21.12
LOS: 0.76 %, DrawRatio: 48.46 %, PairsRatio: 0.75
Games: 1040, Wins: 245, Losses: 286, Draws: 509, Points: 499.5 (48.03 %)
Ptnml(0-2): [5, 148, 252, 113, 2], WL/DD Ratio: 1.03
--------------------------------------------------

I mean i've never experimented with factoriser schemes, there is the possibility that smth is still bugged ofc even though i double checked the things i could think of, but it also doesn't look bad enough for it to be bugged
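
For reference, the Elo figures in result blocks like the ones above follow from the match score via the logistic Elo formula. A minimal sketch, assuming plain win/loss/draw counts; the name `elo_from_wdl` is made up for illustration, and it does not compute error bars or nElo:

```cpp
#include <cassert>
#include <cmath>

// Logistic Elo difference implied by a match score.
// score = (wins + draws / 2) / games, elo = -400 * log10(1 / score - 1).
// Hypothetical helper for illustration only: no error bars, no nElo.
double elo_from_wdl(int wins, int losses, int draws) {
    const double games = wins + losses + draws;
    assert(games > 0);
    const double score = (wins + 0.5 * draws) / games;
    assert(score > 0.0 && score < 1.0);  // the formula diverges at 0 % and 100 %
    return -400.0 * std::log10(1.0 / score - 1.0);
}
```

With the 20000-node result above (602 wins, 689 losses, 889 draws) this gives roughly -13.87, which matches the reported Elo.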

#

i'll try coding up 768x4 next, when i have time

rocky vigil Sep 28, 2025, 9:14 AM

#

yeah that looks cooked rip

stray reef Sep 28, 2025, 2:17 PM

#

i tried to describe the information encoded in various threat schemes, in the hope of getting some collective opinion on what a factoriser might need most.

large threat inputs: [src][src_pc][src_pc_col][dest][dest_pc][dest_pc_rel_col]
small threat inputs: [src][src_pc][src_pc_col][dest][dest_pc_rel_col] -> leave out attacked piece type
what i tried:        [src][src_pc][src_pc_col][dest][dest_pc_worth_more_than_src_pc][dest_pc_rel_col]
768x4:               [dest][dest_pc][dest_pc_col][dest_attacked][dest_defended]
alternative idea 1:  [src_pc][src_pc_col][dest][dest_pc][dest_pc_rel_col] -> leave out source square
alternative idea 2:  [src][src_pc][src_pc_col][dest_pc][dest_pc_rel_col] -> leave out destination square

i'm actually thinking alternative idea 1 might be best, the source square should not be super important for the factoriser. but i wanna hear some opinions
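
To make "alternative idea 1" concrete, here is a rough sketch of a dense index over [src_pc][src_pc_col][dest][dest_pc][dest_pc_rel_col]. Everything in it is an assumption for illustration: the function name, the 0-based piece/colour/square numbering, and reading "rel_col" as "attacked piece has a different colour than the attacker". It is not any engine's actual indexing and it does not prune impossible combinations.

```cpp
#include <cassert>

// Hypothetical dense layout for "alternative idea 1" (source square dropped):
// [src_pc][src_pc_col][dest][dest_pc][dest_pc_rel_col].
// Piece types 0..5 (pawn..king), colours 0..1, squares 0..63.
constexpr int NUM_PIECE_TYPES = 6;
constexpr int NUM_COLORS      = 2;
constexpr int NUM_SQUARES     = 64;

// Upper bound on the feature count of this dense layout.
constexpr int FACTORISER_FEATURES =
    NUM_PIECE_TYPES * NUM_COLORS * NUM_SQUARES * NUM_PIECE_TYPES * NUM_COLORS;

inline int factoriser_index(int srcPc, int srcCol, int dest, int destPc, int destCol) {
    assert(srcPc >= 0 && srcPc < NUM_PIECE_TYPES);
    assert(destPc >= 0 && destPc < NUM_PIECE_TYPES);
    assert(dest >= 0 && dest < NUM_SQUARES);
    // Assumed meaning of "rel_col": 0 = same colour as the attacker, 1 = opposite.
    const int relCol = (srcCol != destCol) ? 1 : 0;
    return (((srcPc * NUM_COLORS + srcCol) * NUM_SQUARES + dest) * NUM_PIECE_TYPES + destPc)
               * NUM_COLORS + relCol;
}
```

Because the source square is not an argument, every source square of an otherwise identical threat lands on the same factoriser weight. The dense bound here is 9216 features; a real encoding would likely be smaller once impossible combinations are removed.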

rocky vigil Sep 28, 2025, 4:10 PM

#

i think leaving out src square seems reasonable yea

twilit oriole Sep 28, 2025, 7:29 PM

#

Why not use small threat inputs as the factoriser?

#

Because it is known to not be terrible even as standalone

rocky vigil Sep 28, 2025, 7:37 PM

#

is that not what yoshie just tried

#

idk

twilit oriole Sep 28, 2025, 7:38 PM

#

Oh I see lol

#

It is basically

stray reef Sep 28, 2025, 8:53 PM

#

trying without encoding the source square now. for pawns, since source/destination are so closely tied, i encode source file+threat direction, but not rank. 6824 features

#

eta 17-22h from now, depends on when i'm home

stray reef Sep 29, 2025, 5:01 PM

#

Still not great

--------------------------------------------------
Results of ThreatsFactorised vs Threats (20000 nodes, 1t, 16MB, UHO_4060_v2.epd):
Elo: -7.04 +/- 6.13, nElo: -11.26 +/- 9.79
LOS: 1.21 %, DrawRatio: 45.16 %, PairsRatio: 0.86
Games: 4836, Wins: 1329, Losses: 1427, Draws: 2080, Points: 2369.0 (48.99 %)
Ptnml(0-2): [95, 617, 1092, 519, 95], WL/DD Ratio: 1.31
--------------------------------------------------

I don't know, maybe it's not a thing that can be factorised well, at least in the ways i've tried so far? I.e. the weights of each "bucket" are too different, what i'm doing seems to be doing more harm than good

twilit oriole Sep 30, 2025, 1:33 AM

#

Yeah

rocky vigil Sep 30, 2025, 12:51 PM

#

well then

#

the debugging starts again

rocky vigil Sep 30, 2025, 1:50 PM

#

ucinewgame position startpos eval (x2) gives two wildly different results

#

this is so bad

#

surprise surprise doing a no-ue inference hack on sf nnue vector code by treating the biases as accumulator caches breaks

#

because the biases themselves get updated

#

ok well this looks more like chess

#

idk how good this chess is

twilit oriole Sep 30, 2025, 2:09 PM

#

This is the early checkpoint right

rocky vigil Sep 30, 2025, 2:22 PM

#

yes

#

...      Frolic (stable) playing White: 0 - 46 - 4  [0.040] 50
...      Frolic (stable) playing Black: 0 - 48 - 2  [0.020] 50
...      White vs Black: 48 - 46 - 6  [0.510] 100
Elo difference: -603.9 +/- 183.4, LOS: 0.0 %, DrawRatio: 6.0 %
SPRT: llr 0 (0.0%), lbound -inf, ubound inf
100 of 100 games finished.

well seeing it can still destroy Frolic (~3080 CCRL blitz) at stc without ue

#

i think there shouldn't be any major issues with training/inference at this stage

#

...      Stockfish TI-experimental playing White: 6 - 29 - 15  [0.270] 50
...      Stockfish TI-experimental playing Black: 6 - 34 - 10  [0.220] 50
...      White vs Black: 40 - 35 - 25  [0.525] 100
Elo difference: -195.5 +/- 65.7, LOS: 0.0 %, DrawRatio: 25.0 %
SPRT: llr 0 (0.0%), lbound -inf, ubound inf
100 of 100 games finished.

10k nodes

#

idk how much the rest of training is worth

#

what should be the plan

#

start a full training run and compare fixed nodes?

#

https://github.com/sscg13/Stockfish/tree/threat-inputs-rebase

#

200 superbatches was it?

#

i honestly have no idea how undertrained that is

#

besides "very"

rocky vigil Sep 30, 2025, 2:31 PM

#

rocky vigil start a full training run and compare fixed nodes?

@twilit oriole @stray reef opinions?

twilit oriole Sep 30, 2025, 2:42 PM

#

well i think you should get rid of king buckets for the baseline lol

#

then we can compare to plenty results more easily

#

Plenty has a L1 512 TI vs L1 1536 regular, SF would be L1 1024 TI vs L1 3072. So fixed nodes should be very similar

stray reef Sep 30, 2025, 2:45 PM

#

hm -190 sounds almost like something's broken honestly, or the end LR is still extremely high

for my training setup there is no way any stage can be so bad, assuming a reasonable LR schedule

#

plenty L1 is 1792 btw

#

#nnue-dev message
given this, the elo diff seems fine

violet badger Sep 30, 2025, 2:56 PM

#

stray reef https://discord.com/channels/435943710472011776/718853716266188890/1422596876817...

or just different levels of optimism 😉

#

Anyway, worthwhile training something stronger.

rocky vigil Sep 30, 2025, 2:56 PM

#

Yeah still cannot guarantee everything is perfectly fine

#

But at least this is a lower bound

violet badger Sep 30, 2025, 2:57 PM

#

right, but it is likely not outrageously wrong, which is good enough to put some more resources on this.

#

do you have some correlation plot, e.g. TI vs master net evals in a scatter plot?

rocky vigil Sep 30, 2025, 2:58 PM

#

Ah I can make that later if you tell me how

violet badger Sep 30, 2025, 2:58 PM

#

just take a random source of fens (e.g. a binpack), and evaluate once 1000 fens with your net and once with master net, and plot x,y..
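
Besides eyeballing the scatter plot, the same two eval lists can be reduced to a single correlation number. A minimal sketch, assuming the evals for both nets have already been collected into two equally long vectors (the helper name `pearson` is made up, and extracting the evals from the engine is not shown):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation between two equally long eval lists, e.g. evals of the
// TI net and of the master net on the same FENs.
// Hypothetical helper; undefined if either list is constant (zero variance).
double pearson(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const double n = static_cast<double>(x.size());

    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= n;
    meanY /= n;

    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double dx = x[i] - meanX;
        const double dy = y[i] - meanY;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    return sxy / std::sqrt(sxx * syy);
}
```

A value far from 1 on a random sample of positions would hint at an inference or indexing mismatch rather than a merely undertrained net, though the scatter plot itself is more informative about where the two nets disagree.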

rocky vigil Sep 30, 2025, 3:00 PM

#

ok

stray reef Sep 30, 2025, 3:08 PM

#

Btw @twilit oriole do you have any data on how much data and how many SBs/epochs a threat input net of a certain L1 size needs?

#

i'm wondering if mine is massively undertrained (not only wrt data, but also SBs)

twilit oriole Sep 30, 2025, 3:09 PM

#

hm not really. we are using 12k SBs and 160B positions for an L1 8192

#

and that seems slightly undertrained but not by much

#

though mcts might have higher data requirements

stray reef Sep 30, 2025, 3:10 PM

#

how many SBs would you do for L1 512, given enough data (whatever that may be)

twilit oriole Sep 30, 2025, 3:12 PM

#

difficult to say because you can nearly always squeeze a few more elo out

#

probably something like 1k minimum, 2k to be sure

stray reef Sep 30, 2025, 3:22 PM

#

hm ok

rocky vigil Sep 30, 2025, 3:57 PM

#

...      Stockfish TI-experimental playing White: 19 - 141 - 90  [0.256] 250
...      Stockfish TI-experimental playing Black: 18 - 165 - 67  [0.206] 250
...      White vs Black: 184 - 159 - 157  [0.525] 500
Elo difference: -208.9 +/- 27.1, LOS: 0.0 %, DrawRatio: 31.4 %
SPRT: llr 0 (0.0%), lbound -inf, ubound inf
500 of 500 games finished.

20k nodes but idt there's really much more substantial things to learn atm

violet badger Sep 30, 2025, 4:10 PM

#

I think the important check is to see if the inference is consistent with the trainer...

#

https://github.com/official-stockfish/nnue-pytorch/blob/master/cross_check_eval.py

#

(though the script might need verifying it still works)

rocky vigil Oct 1, 2025, 4:46 AM

#

violet badger https://github.com/official-stockfish/nnue-pytorch/blob/master/cross_check_eval....

https://github.com/official-stockfish/nnue-pytorch/blob/master/cross_check_eval.py#L151 seems old

#

actually my question is why isn't there a command that just returns the unnormalized eval

rocky vigil Oct 1, 2025, 9:14 AM

#

btw @twilit oriole sparsity on the threat net L1 -> L2 seems trashed

#

have you measured this before

twilit oriole Oct 1, 2025, 9:15 AM

#

wdym trashed

#

i found threat nets compress much better for us which would lead me to believe the opposite

rocky vigil Oct 1, 2025, 9:17 AM

#

combined zeros here seems much lower

#

than in the halfka

#

(master arch at this checkpoint has like 78 which is double the amount)

#

oh right L1 issue

rocky vigil Oct 1, 2025, 9:43 AM

#

@violet badger I'm measuring a large fixed nodes loss between the first checkpoint of the full run (nn-42b0b08a207a.nnue) and the net trained from the short run (nn-cc78fa7e0258.nnue) despite a lower validation loss (0.00405 vs 0.00425), is there a meaningful difference between the two in the first stage besides training time?

...      Stockfish TI-experimental playing White: 24 - 45 - 31  [0.395] 100
...      Stockfish TI-experimental playing Black: 20 - 48 - 32  [0.360] 100
...      White vs Black: 72 - 65 - 63  [0.517] 200
Elo difference: -86.9 +/- 40.7, LOS: 0.0 %, DrawRatio: 31.5 %
SPRT: llr 0 (0.0%), lbound -inf, ubound inf
200 of 200 games finished.

violet badger Oct 1, 2025, 10:08 AM

#

full first stage (nn-42b....) should be better than the previous (nn-cc78...), there is no difference except increasing the training from 200 to 800 epochs. In this sense nn-42b can now also be compared to similarly trained nets of the master arch (which are roughly -50 Elo compared to master fully trained).

#

do I understand your measurement as showing it is worse?

rocky vigil Oct 1, 2025, 10:10 AM

#

yes

#

I'll look into both impls again

violet badger Oct 1, 2025, 10:12 AM

#

yeah, I think this needs some checking from the implementation point of view.

#

note on the loss during training, we adjust lambda (mix between eval and game outcome) during training (if lambda start and lambda end is not the same), so the loss doesn't mean the same at the same epoch if the max_epoch is different.
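
A small sketch of why the loss at a given epoch depends on `max_epoch`. It assumes lambda is interpolated linearly from its start value to its end value over the run; the function name and the example values below are illustrative assumptions, not taken from the trainer:

```cpp
#include <cassert>

// Lambda (mix between eval and game outcome) at a given epoch, assuming a
// linear ramp from startLambda at epoch 0 to endLambda at maxEpoch.
// Hypothetical helper; the real trainer's schedule may differ in detail.
double lambda_at(double startLambda, double endLambda, int epoch, int maxEpoch) {
    assert(maxEpoch > 0 && epoch >= 0 && epoch <= maxEpoch);
    return startLambda + (endLambda - startLambda) * (static_cast<double>(epoch) / maxEpoch);
}
```

With made-up values start = 1.0 and end = 0.75, epoch 200 of a 200-epoch run uses lambda 0.75, while epoch 200 of an 800-epoch run uses 0.9375, so the two losses measure different targets. Only at the final epoch of each run do both use the same lambda.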

rocky vigil Oct 1, 2025, 10:14 AM

#

yeah i mean the 0.00405 vs 0.00425 comparison is from final epoch from both

#

but yeah something is strange

violet badger Oct 1, 2025, 10:15 AM

#

possible the final epoch can indeed be compared.

#

still strange.

#

even if painful, I think the thing to do right now is to ensure trainer and SF have the same inference result.

rocky vigil Oct 1, 2025, 10:20 AM

#

do all of the nnue-pytorch functions really need a gpu to run

#

i'll try to enlist a friend's help if that is the case

violet badger Oct 1, 2025, 10:21 AM

#

most likely, at least I don't think non-gpu runs are still supported. It would add a new dimension to testing..

rocky vigil Oct 1, 2025, 1:18 PM

#

violet badger even if painful, I think the thing to do right now is to ensure trainer and SF h...

ah shoot

(lldb) (int) 625
(lldb) print Eval::NNUE::Features::debug_threat_index(Stockfish::Color::BLACK, Stockfish::Piece::W_PAWN, Stockfish::Square::SQ_C5, Stockfish::Square::SQ_D6, Stockfish::Piece::B_KNIGHT, Stockfish::Square::SQ_E1)
(lldb) (int) 40049

#

i should have guessed something was sketchy

#

in stockfish piece enum color is msb, in nnue-pytorch it is lsb...
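
To spell out the mismatch: Stockfish packs a piece as `(color << 3) | pieceType`, so colour is the high bit. The trainer-side layout sketched below, with colour as the lowest bit and index `2 * (pieceType - 1) + color`, is an assumption for illustration, as is the function name:

```cpp
#include <cassert>

// Stockfish piece encoding: (color << 3) | pieceType with pieceType 1..6,
// i.e. W_PAWN = 1 ... W_KING = 6 and B_PAWN = 9 ... B_KING = 14.
// Assumed colour-as-LSB layout: index = 2 * (pieceType - 1) + color, 0..11.
int colour_lsb_index(int sfPiece) {
    const int color = sfPiece >> 3;  // 0 = white, 1 = black
    const int type  = sfPiece & 7;   // 1 = pawn ... 6 = king
    assert(color <= 1 && type >= 1 && type <= 6);
    return 2 * (type - 1) + color;
}
```

Walking the pieces in Stockfish order (all white pieces, then all black) visits the colour-LSB indices as 0, 2, 4, 6, 8, 10, 1, 3, ..., so any per-piece offset table accumulated in one order will not line up with one accumulated in the other.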

rocky vigil Oct 1, 2025, 1:45 PM

#

ok well it turns out changing the threat indexing does not affect bench at all

#

um

#

something is highly wrong in my inference then

rocky vigil Oct 1, 2025, 2:03 PM

#

void init_threat_offsets() {
    int pieceoffset = 0;
    for (int c = WHITE; c <= BLACK; c++) {
        for (int pt = PAWN; pt <= KING; pt++) {
            Piece piece = make_piece(Color(c), PieceType(pt));
            threatoffsets[piece][65] = pieceoffset;
            int squareoffset = 0;
            for (int from = SQ_A1; from <= SQ_H8; from++) {
                threatoffsets[piece][from] = squareoffset;
                if (pt != PAWN) {
                    Bitboard attacks = attacks_bb(PieceType(pt), Square(from), 0ULL);
                    squareoffset += popcount(attacks);
                }
                else if (from >= SQ_A2 && from <= SQ_H7) {
                    Bitboard attacks = (piece < 8) ? pawn_attacks_bb<WHITE>(square_bb(Square(from)))
                                                   : pawn_attacks_bb<BLACK>(square_bb(Square(from)));
                    squareoffset += popcount(attacks);
                }
            }
            threatoffsets[piece][64] = squareoffset;
            pieceoffset += numvalidtargets[piece]*squareoffset;
        }
    }
}
no matter how I swap the order of the top for loops (either way), I get the same bench

#

idk what is going wrong...

rocky vigil Oct 1, 2025, 5:00 PM

#

I legitimately do not know how changing the threat indexing does not affect bench at all

rocky vigil Oct 1, 2025, 7:25 PM

#

the battle begins again

#

nvm this is it packing two ints and interpreting it as a u64

rocky vigil Oct 1, 2025, 7:57 PM

#

more inclined to believe the issue is in the trainer now

#

so this is basically just a L1=1024 halfka net

#

no wonder it's -200 to master

#

doesn't explain how the 800 sb one is worse than the 200 sb one at fixed nodes

#

oh well

rocky vigil Oct 1, 2025, 8:01 PM

#

rocky vigil

@violet badger it looks like something is wrong right now so there isn't much point in continuing the run

#

I'll have to take a look into trainer again

violet badger Oct 1, 2025, 8:15 PM

#

okay, just let me know if there are fixes to the trainer to test out and we can restart.

rocky vigil Oct 1, 2025, 8:26 PM

#

yeah it's hard to work with nnue-pytorch w/o a gpu but hopefully my friend can help in the next few days

rocky vigil Oct 3, 2025, 11:08 AM

#

@violet badger is it safe to rebase against master

stray reef Oct 3, 2025, 11:22 AM

#

added 300M 5ksn-adversarial positions to the last training stage, it's probably passing LTC which is awesome
https://furybench.com/test/3149/
https://furybench.com/test/3155/

especially since including 5ksn-adversarial did not pass LTC in master, and it's super quick to generate compared to 20ksn-adversarial

rocky vigil Oct 3, 2025, 11:23 AM

#

oh nice

stray reef Oct 3, 2025, 1:41 PM

#

@twilit oriole threat inputs don't allow duplicate encoding of the same interaction, e.g. two queens attacking each other. did you ever measure the elo of this?

#

i realised there's quite a few unused features due to this (only 73360 are used)

rocky vigil Oct 3, 2025, 1:43 PM

#

i think it's possible to change the encoding itself

#

to reduce some of the stuff like that

#

but it's annoying because you then have to treat it separately

#

not like indexing is the bottleneck anyways

#

it takes like < 1% of runtime

stray reef Oct 3, 2025, 1:44 PM

#

rocky vigil it takes like < 1% of runtime

well actually, this loop which calculates feature indices takes 12% of the entire runtime atm
https://github.com/Yoshie2000/PlentyChess/blob/threat-inputs/src/nnue.cpp#L249-L272

and it's already sped up by using a precomputed index lookup table (around 2.4MB)

twilit oriole Oct 3, 2025, 1:45 PM

#

the elo was not measured no. you need to be careful about unused features, sometimes it is an illusion due to rare underpromos that would for example allow u to have two own bishops of same square complex etc

rocky vigil Oct 3, 2025, 1:46 PM

#

huh that's strange

#

actually are you sure the lookup table is the right play here

stray reef Oct 3, 2025, 1:46 PM

#

it was faster than the usual calculation

rocky vigil Oct 3, 2025, 1:46 PM

#

hmmm

stray reef Oct 3, 2025, 1:46 PM

#

though i'm 100% sure there must be a different encoding to make this faster

#

and also to figure out if a feature is unused or not

rocky vigil Oct 3, 2025, 1:47 PM

#

back when diss ran profile the actual indexing portion was only 1% or so of runtime and generating the threats was like 20% over both sides

#

idk maybe stuff changes

stray reef Oct 3, 2025, 1:47 PM

#

well maybe i did smth really stupid but i didn't really get very far with profiling

rocky vigil Oct 3, 2025, 1:48 PM

#

do you actually have a profile of latest version

stray reef Oct 3, 2025, 1:48 PM

#

the time taken in this loop is roughly 1/3 unpacking DirtyThreat and calculating relative squares, 1/3 table lookup, 1/3 adding into the arrays

rocky vigil Oct 3, 2025, 1:48 PM

#

I can't do it bc windows sucks

violet badger Oct 3, 2025, 1:48 PM

#

rocky vigil @violet badger is it safe to rebase against master

always?

rocky vigil Oct 3, 2025, 1:48 PM

#

ok cool

naive comet Oct 3, 2025, 1:49 PM

#

yoshie have you tried to split into 2 DirtyThreat lists, one with add and one with subtract, to remove branching in the loop? i think it could be a minor speedup

stray reef Oct 3, 2025, 1:50 PM

#

i tried it in combination with smth else, can try it standalone as well
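
A rough sketch of the two-list idea, using simplified stand-in types rather than the engine's real structs. `DirtyThreatLite`, `SplitThreats`, `split_threats` and `apply_threats` are all hypothetical names, and the feature index is assumed to be already computed:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for a dirty threat: a precomputed feature index plus
// the add/remove flag. Hypothetical type, not the engine's actual struct.
struct DirtyThreatLite {
    std::uint32_t featureIndex;
    bool add;
};

struct SplitThreats {
    std::vector<std::uint32_t> added;
    std::vector<std::uint32_t> removed;
};

// Separate the entries once, so the hot update loops need no per-entry flag test.
SplitThreats split_threats(const std::vector<DirtyThreatLite>& dirty) {
    SplitThreats out;
    for (const DirtyThreatLite& t : dirty)
        (t.add ? out.added : out.removed).push_back(t.featureIndex);
    return out;
}

// Accumulator update over the two lists; weights are laid out [feature][l1].
void apply_threats(std::vector<std::int16_t>& acc,
                   const std::vector<std::int16_t>& weights,
                   const SplitThreats& s) {
    const std::size_t l1 = acc.size();
    for (std::uint32_t idx : s.added)
        for (std::size_t i = 0; i < l1; ++i)
            acc[i] = static_cast<std::int16_t>(acc[i] + weights[idx * l1 + i]);
    for (std::uint32_t idx : s.removed)
        for (std::size_t i = 0; i < l1; ++i)
            acc[i] = static_cast<std::int16_t>(acc[i] - weights[idx * l1 + i]);
}
```

In a real engine the split would more likely happen when a threat is recorded (push into one of two lists) rather than in a separate pass. Whether this is actually faster has to be measured.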

rocky vigil Oct 3, 2025, 1:50 PM

#

stray reef the time taken in this loop is roughly 1/3 unpacking DirtyThreat and calculating...

yeah i don't really see how to speed this up rn

stray reef Oct 3, 2025, 1:51 PM

#

this loop basically takes more time than all of addsub

#

it's crazy

#

i think going back and forth between indexing the table and the dirty threat lists is awful for the cache, especially if there's like 10 threat updates to process

#

though i've not managed to find a way to improve it yet

rocky vigil Oct 3, 2025, 1:52 PM

#

also random idea maybe don't use max capacity 128 indexlists

#

for add/remove

#

like 32 should do just fine

stray reef Oct 3, 2025, 1:53 PM

#

tried that, was not a speedup

rocky vigil Oct 3, 2025, 1:53 PM

#

oh well

#

yeah I tried once not to like create entirely new lists every time but that screwed with multithreading

stray reef Oct 3, 2025, 1:53 PM

#

@naive comet maybe you have some idea on how to improve the cache situation? to not jump back and forth between dirtyThreats and the lookup table?

rocky vigil Oct 3, 2025, 1:54 PM

#

how big is dirtythreats

stray reef Oct 3, 2025, 1:55 PM

#

struct DirtyThreat {
  Piece piece;
  Piece attackedPiece;
  Square square;
  Square attackedSquare;
  Color pieceColor;
  Color attackedColor;
  bool add;
};

struct Accumulator {
  alignas(ALIGNMENT) int16_t threatState[2][L1_SIZE];
  alignas(ALIGNMENT) int16_t pieceState[2][L1_SIZE];

  DirtyPiece dirtyPieces[4];
  int numDirtyPieces;
  DirtyThreat dirtyThreats[256];
  int numDirtyThreats;

  KingBucketInfo kingBucketInfo[2];
  Board* board;
};

lmao the 256 can definitely be made smaller

but it's not like that's an issue here, we're staying in the same accumulator

twilit oriole Oct 3, 2025, 1:55 PM

#

something else to try is measure threat activity per index over a long search. i think ultra rare threats could be combined

stray reef Oct 3, 2025, 1:56 PM

#

oh god

twilit oriole Oct 3, 2025, 1:56 PM

#

like the threats that only activate in underpromo situations etc

#

i expect the distribution has an extreme skew in general

rocky vigil Oct 3, 2025, 1:57 PM

#

yeah i mean it looks small

#

idk about cache but i wouldn't see how it's a big issue

#

if anything the lookup table looks much larger of an issue

#

but if you measured that it gains over using less

stray reef Oct 3, 2025, 1:58 PM

#

the lookup table is ofc way bigger than theoretically necessary

rocky vigil Oct 3, 2025, 1:58 PM

#

then idk either

stray reef Oct 3, 2025, 1:59 PM

#

but doing the calculations to reduce size (e.g. compressing the [64][64]) is more expensive apparently

naive comet Oct 3, 2025, 1:59 PM

#

stray reef ```c++ struct DirtyThreat { Piece piece; Piece attackedPiece; Square squar...

cant attackedColor/pieceColor be inferred from attackedSquare/square?

stray reef Oct 3, 2025, 1:59 PM

#

yes, that would work

rocky vigil Oct 3, 2025, 2:00 PM

#

if you're willing to do a bunch of mailbox lookups you only need the two squares

stray reef Oct 3, 2025, 2:01 PM

#

i don't have colored pieces so it'd have to be bitboard lookups but yeah

rocky vigil Oct 3, 2025, 2:01 PM

#

oh interesting

formal smelt Oct 3, 2025, 2:09 PM

#

stray reef @twilit oriole threat inputs don't allow duplicate encoding of the same i...

we just ditched them early on, you can see your example here https://github.com/official-monty/Monty/blob/master/src/networks/value/threats.rs#L197
just free space saving

#

i wouldn't expect it to make a notable difference in the resulting net, though the training would be slightly different

stray reef Oct 3, 2025, 8:07 PM

#

I tried a bunch more stuff to optimise the index calculation. Even tried unpacking the network like this

struct NetworkData {
  alignas(ALIGNMENT) int16_t inputWeightsPawn[ThreatInputs::LookupSizes::PAWN * L1_SIZE];
  alignas(ALIGNMENT) int16_t inputWeightsKnight[ThreatInputs::LookupSizes::KNIGHT * L1_SIZE];
  alignas(ALIGNMENT) int16_t inputWeightsBishop[ThreatInputs::LookupSizes::BISHOP * L1_SIZE];
  alignas(ALIGNMENT) int16_t inputWeightsRook[ThreatInputs::LookupSizes::ROOK * L1_SIZE];
  alignas(ALIGNMENT) int16_t inputWeightsQueen[ThreatInputs::LookupSizes::QUEEN * L1_SIZE];
  alignas(ALIGNMENT) int16_t inputWeightsKing[ThreatInputs::LookupSizes::KING * L1_SIZE];
  alignas(ALIGNMENT) int16_t inputWeightsPsq[768 * KING_BUCKETS * L1_SIZE];
  alignas(ALIGNMENT) int16_t inputBiases[L1_SIZE];
  alignas(ALIGNMENT) int8_t  l1Weights[OUTPUT_BUCKETS][L1_SIZE * L2_SIZE];
  alignas(ALIGNMENT) float   l1Biases[OUTPUT_BUCKETS][L2_SIZE];
  alignas(ALIGNMENT) float   l2Weights[OUTPUT_BUCKETS][2 * L2_SIZE * L3_SIZE];
  alignas(ALIGNMENT) float   l2Biases[OUTPUT_BUCKETS][L3_SIZE];
  alignas(ALIGNMENT) float   l3Weights[OUTPUT_BUCKETS][L3_SIZE + 2 * L2_SIZE];
  alignas(ALIGNMENT) float   l3Biases[OUTPUT_BUCKETS];
};

where the threat feature weights for each attacking piece are encoded as [64][64][6][2][2]. was equally fast. ofc there would be way too much unused space but i was hoping to at least achieve faster calculation, cache pressure was roughly similar still

#

i think i'll give up on speedups for now, and just generate some more data

twilit oriole Oct 4, 2025, 2:32 AM

#

Hm. Something else to try is have the L1 for piece square inputs be larger than that of the threat inputs

rocky vigil Oct 4, 2025, 3:08 AM

#

How are you inferencing that then

twilit oriole Oct 4, 2025, 3:27 AM

#

?

#

In the usual way. Just stop early in the L1 for the threat inputs
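
One way to read "stop early": piece-square weight columns span the whole accumulator, while threat weight columns only span its first part. A minimal sketch under that assumption; the function name and the sizes in the usage note are illustrative:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Add one feature's weight column into the accumulator, touching only the
// first `width` neurons. Piece-square features would pass width == acc.size(),
// threat features a smaller width. Hypothetical helper for illustration.
void add_feature(std::vector<std::int16_t>& acc,
                 const std::vector<std::int16_t>& column,
                 std::size_t width) {
    assert(width <= acc.size() && width <= column.size());
    for (std::size_t i = 0; i < width; ++i)
        acc[i] = static_cast<std::int16_t>(acc[i] + column[i]);
}
```

For example, with a 4-wide accumulator, a piece-square column of four 1s and a threat column of two 5s gives {6, 6, 1, 1}: the last two neurons only ever see piece-square input. The trainer has to mirror the same shape, which is the part that needs extra work.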

rocky vigil Oct 4, 2025, 3:28 AM

#

ah I see

#

asymmetric like that requires more extensive trainer modifications and stuff

twilit oriole Oct 4, 2025, 3:31 AM

#

In bullet should be easy

stray reef Oct 4, 2025, 7:48 AM

#

twilit oriole Hm. Something else to try is have the L1 for piece square inputs be larger than ...

honestly I think right now increasing L1 would easily pass LTC. 384 -> 512 passed STC with 4 elo or so, 640 should definitely be doable

rocky vigil Oct 4, 2025, 7:49 AM

#

yeah the issue rn is probably more data

#

than intrinsic scaling of the arch

violet badger Oct 4, 2025, 7:51 AM

#

have 160B positions on offer for the price of $0.0

twilit oriole Oct 4, 2025, 8:04 AM

#

stray reef honestly I think right now increasing L1 would easily pass LTC. 384 -> 512 passe...

Yeah I know. It wasn't suggested as an alternative to that

rocky vigil Oct 5, 2025, 4:33 PM

#

what was the difference in 123rrr4 btw

#

this is cool bc it should hopefully mean ltc is neutral now

#

so very close

stray reef Oct 5, 2025, 7:05 PM

#

Yeah wanted to post about this. 0123rrr4 is the last stage with 600M more positions (5ksn adversarial) compared to 0123rrr. Gained 2 elo at STC + LTC

The game plan is generate 600M more positions while I'm on holiday, and then train a 640 L1

rocky vigil Oct 5, 2025, 7:32 PM

#

nice it's looking very promising

#

hopefully you are rewarded for all of the effort soon enough

rocky vigil Oct 12, 2025, 3:51 AM

#

@violet badger we discovered an error in the threat offsets initializer not being run. That should be resolved now, so the threat features should actually train

#

same branch: https://github.com/sscg13/nnue-pytorch/tree/threat-inputs

#

let's try a short test run first, and I'll verify the fix works

prime mica Oct 12, 2025, 5:31 AM

#

super exciting

violet badger Oct 12, 2025, 6:48 AM

#

rocky vigil @violet badger we discovered an error in the threat offsets initializer n...

OK, will try to set that up today. Need to recall where we did our first experiment 😉

violet badger Oct 12, 2025, 7:09 AM

#

@rocky vigil do you happen to have a repo + sha of an SF that can use your net already? If I have it, I should be able to add this to the training pipeline already. Not urgent.

#

ok, think I found it threat-inputs-rebase last commit.

violet badger Oct 12, 2025, 8:20 AM

#

violet badger OK, will try to set that up today. Need to recall where we did our first experim...

humming https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5137461961076608/2926829081096545/-/pipelines/2094796471

prime mica Oct 12, 2025, 8:30 AM

#

violet badger humming https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/513746196107660...

ooh cool

#

how long is it expected to take?

violet badger Oct 12, 2025, 8:32 AM

#

short test only, 1h for 'a bit of a net'

#

Full training schedule would be about 4 days

prime mica Oct 12, 2025, 8:32 AM

#

gotcha

#

do you happen to know what hardware is being used

violet badger Oct 12, 2025, 8:32 AM

#

let me think..

prime mica Oct 12, 2025, 8:33 AM

#

(wondering how doable it is to experiment at home)

violet badger Oct 12, 2025, 8:33 AM

#

Needs experimenting, depends on your GPU.

#

that is on a H100 equivalent.

prime mica Oct 12, 2025, 8:33 AM

#

fancy schmancy

violet badger Oct 12, 2025, 8:34 AM

#

But I don't think this is way faster than some fancy home GPU.

stray reef Oct 12, 2025, 10:49 AM

#

https://github.com/Yoshie2000/PlentyChess/pull/400

GitHub

Update default network to 0126rrr by Yoshie2000 · Pull Request #40...

This PR introduces efficiently updated threat inputs with PSQ king buckets,
aka. (79856+768x12 -> 640)x2 -> (16 -> 32 -> 1)x8 instead of the previous (768x12 -> 1...

#

It is merged

violet badger Oct 12, 2025, 10:57 AM

#

while threat inputs in SF won its first games against master...
[129, 722, 283, 29, 0]

#

@rocky vigil https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5137461961076608/2926829081096545/-/jobs/11687793252/artifacts/browse/step_e06216ffc4a2/ is a net for download. Still young, 100 epochs only. Match here https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5137461961076608/2926829081096545/-/jobs/11687793253 (not fixed nodes)

#

Elo: -150.87 +/- 7.78, nElo: -309.70 +/- 14.12

#

So, time to further increase epochs.

rocky vigil Oct 12, 2025, 11:20 AM

#

violet badger while threat inputs in SF won its first games against master... `[129, 722, 283...

Ah I need to re-index the threats in inference as well

#

Lemme do that quickly

#

And update against master as well

#

But it’s already looking better

violet badger Oct 12, 2025, 11:21 AM

#

oh, that's going to make a difference, but sure.

rocky vigil Oct 12, 2025, 11:21 AM

#

Gimme a bit to sanity check run through lldb etc.

violet badger Oct 12, 2025, 11:21 AM

#

sure...

#

the net won't run away

rocky vigil Oct 12, 2025, 11:23 AM

#

Would have done this yesterday if I knew it would’ve been a very fast response

violet badger Oct 12, 2025, 11:23 AM

#

it is exciting, so got bumped in priority 😉

rocky vigil Oct 12, 2025, 11:25 AM

#

Wait actually the current inference already seems to use the right indexing

#

Ah it was always the bullet indexing

#

Still lemme sanity check

rocky vigil Oct 12, 2025, 11:30 AM

#

violet badger @rocky vigil https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5...

oh wait this isn't fixed nodes, that's insane, it's like 1/3-1/2 the speed of master since it's basic non-ue inference

violet badger Oct 12, 2025, 11:31 AM

#

so that sounds promising..

#

it is still a very early net as well, I wouldn't expect a master net to be better than -100 Elo at this point.

rocky vigil Oct 12, 2025, 11:35 AM

#

this is much better and looks proper

rocky vigil Oct 12, 2025, 11:43 AM

#

violet badger it is still a very early net as well, I wouldn't expect a master net to be bette...

quick sanity 20k nodes (on 8moves, balanced book)

...      Stockfish TI-experimental playing White: 27 - 25 - 48  [0.510] 100
...      Stockfish TI-experimental playing Black: 17 - 36 - 47  [0.405] 100
...      White vs Black: 63 - 42 - 95  [0.552] 200
Elo difference: -29.6 +/- 35.0, LOS: 4.9 %, DrawRatio: 47.5 %
SPRT: llr 0 (0.0%), lbound -inf, ubound inf
200 of 200 games finished.
so unless some recent search development is extremely good at low node counts, the net is already quite good

violet badger Oct 12, 2025, 11:44 AM

#

no magic search recently.

#

that's really rather strong already.

#

should definitely gain > 30 Elo from training.

#

There should be an updated net in like 12h or so, that should be equivalent to master -50Elo.

rocky vigil Oct 12, 2025, 11:45 AM

#

the estimate of -20% speed with optimization still seems accurate

#

but am hopeful 50 elo more from the full training can be gotten

#

especially now that the threats seem to actually work properly

violet badger Oct 12, 2025, 11:46 AM

#

pretty certain that 50Elo is still quite easy with training..

#

unless these nets train much faster

rocky vigil Oct 12, 2025, 11:46 AM

#

i wouldn't expect it, due to parameter count

#

monty used a similar training schedule of 3000 * (100M pos) I think at L1=3072 or so, and plentychess was 1200 * (100M pos) at L1=512

naive comet Oct 12, 2025, 11:48 AM

#

@stray reef do you have lofty's resource regarding incremental threat tracking?

violet badger Oct 12, 2025, 11:49 AM

#

so, the quick practical conclusion is that the inference code is fine for testing right now. No need for me to change things urgently.

rocky vigil Oct 12, 2025, 11:52 AM

#

yeah

#

switching to nnue-pytorch is good on my end

#

as I can just use the real inference code and it just "works"

violet badger Oct 12, 2025, 11:55 AM

#

not sure I fully followed that remark, but yes, SF inference code is working, though will need the speedup work that we kind of know how to start.

#

If nnue-pytorch is working fine, we'll have a next net in about 12h

#

And could have a fully trained net in 3-4d

rocky vigil Oct 12, 2025, 11:59 AM

#

violet badger not sure I fully followed that remark, but yes, SF inference code is working, th...

ah basically I don't have to hack in the later layers of inference also

#

needed to write the entire inference from scratch last attempt with bullet

violet badger Oct 12, 2025, 12:00 PM

#

I understand now... one day would still be nice to have a bullet compatible setup, but that's a different story.

stray reef Oct 12, 2025, 12:11 PM

#

naive comet @stray reef do you have lofty's resource regarding incremental threat ...

Not sure if there is a dedicated resource other than yukari code

#

though it is using bitlists instead of bitboards

#

Stefan Pohl is going to do some tests with the new net as well, against the latest release (net being the only diff to latest release). Will be interesting to have those results as well

twilit oriole Oct 12, 2025, 12:28 PM

#

rocky vigil monty used a similar training schedule of 3000 * (100M pos) I think at L1=3072 o...

it was 4k SB at L1 3072

rocky vigil Oct 12, 2025, 12:28 PM

#

ah

#

fair enough

naive comet Oct 12, 2025, 12:35 PM

#

stray reef Not sure if there is a dedicated resource other than yukari code

oh, I guess I should refer to your code then?

#

how are the (expanded) threat inputs indexed btw

#

I know your current input set is just that but squished right?

stray reef Oct 12, 2025, 12:38 PM

#

naive comet how are the (expanded) threat inputs indexed btw

are you referring to my attempt to simplify indexing for faster index calculations?

#

the current indexing setup stems from an old montytrain branch

naive comet Oct 12, 2025, 12:42 PM

#

stray reef the current indexing setup stems from an old montytrain branch

okay I should probably upread this thread

stray reef Oct 12, 2025, 12:42 PM

#

naive comet oh, I guess I should refer to your code then?

https://github.com/Yoshie2000/PlentyChess/blob/main/src/board.cpp#L404-L565 this is what i have rn

naive comet Oct 12, 2025, 12:42 PM

#

thank you

sharp sail Oct 12, 2025, 1:38 PM

#

rocky vigil i wouldn't expect it, due to parameter count

For other NN applications I had the experience that often (maybe counterintuitively?) large NNs train faster initially

rocky vigil Oct 12, 2025, 1:39 PM

#

huh

#

i actually did hear smth like this from leela

sharp sail Oct 12, 2025, 1:39 PM

#

The way I explained it for myself was because with each step you update more parameters than for a small NN

rocky vigil Oct 12, 2025, 1:39 PM

#

like about the large nets being faster initially but much slower to squeeze out maximum performance from

sharp sail Oct 12, 2025, 1:39 PM

#

So it has more potential to learn in a single step

#

But without systematic analysis I'll be careful to make a definite claim, it could also just be that the hyperparameters were optimized for large NNs

frosty imp Oct 12, 2025, 7:38 PM

#

@rocky vigil https://pastebin.com/q5T0zfFE

📎 message.txt

rocky vigil Oct 12, 2025, 7:38 PM

#

ok

#

sample 0 looks correct by manual inspection

#

i mean the fact that it's so close fixed nodes means it hopefully works

sharp sail Oct 12, 2025, 7:58 PM

#

how close?

#

one of the reasons why NNs are so hard to debug is because even when they're buggy, they often perform pretty well

rocky vigil Oct 12, 2025, 8:00 PM

#

sharp sail how close?

30 +- 30

twilit oriole Oct 12, 2025, 8:05 PM

#

What L1 size

#

@rocky vigil

rocky vigil Oct 12, 2025, 8:09 PM

#

1024

#

well that was 100 sb

twilit oriole Oct 12, 2025, 8:09 PM

#

Interesting. And that's like -20% speed?

rocky vigil Oct 12, 2025, 8:09 PM

#

should be according to plenty data

twilit oriole Oct 12, 2025, 8:09 PM

#

The plenty measurement didn't have pairwise?

#

I would have thought it's less than 20% slowdown

rocky vigil Oct 12, 2025, 8:11 PM

#

ahhh

#

mm hmm

twilit oriole Oct 12, 2025, 8:15 PM

#

I think anything above 30 fixed nodes should be passing easily at SF VVLTC for around 15% slowdown

rocky vigil Oct 12, 2025, 8:15 PM

#

rocky vigil 30 +- 30

i should clarify this is distance to master

#

so we still need ~60 more ish

violet badger Oct 12, 2025, 8:16 PM

#

should be quite straightforward to measure nps?

#

no need to speculate what it is right now?

twilit oriole Oct 12, 2025, 8:16 PM

#

It isn't. Very position dependent

rocky vigil Oct 12, 2025, 8:16 PM

#

right now the inference is not intended to optimize nps

violet badger Oct 12, 2025, 8:16 PM

#

speedtest works?

#

you just get a number

rocky vigil Oct 12, 2025, 8:16 PM

#

it is intended to optimize for correctness

violet badger Oct 12, 2025, 8:16 PM

#

sure

twilit oriole Oct 12, 2025, 8:16 PM

#

Not really. The elo dependence of speed is dependent on position

rocky vigil Oct 12, 2025, 8:17 PM

#

non-ue on my laptop is like 1/3 - 1/2 the speed of master

violet badger Oct 12, 2025, 8:17 PM

#

right so that's a number

rocky vigil Oct 12, 2025, 8:17 PM

#

but i think my laptop is not representative

#

do stuff wrong and it'll send processes between the P / E cores etc.

#

average intel laptop experience

violet badger Oct 12, 2025, 8:17 PM

#

let me measure

twilit oriole Oct 12, 2025, 8:18 PM

#

rocky vigil so we still need ~60 more ish

Where do u get that number

rocky vigil Oct 12, 2025, 8:18 PM

#

the target is +30 (fixed nodes), and the 100 SB one was -30 +- 30

violet badger Oct 12, 2025, 8:18 PM

#

at STC the difference is 150 Elo

twilit oriole Oct 12, 2025, 8:18 PM

#

Lol

rocky vigil Oct 12, 2025, 8:18 PM

#

and yeah -150 elo or so seems reasonable

#

for being 2x slower rn

twilit oriole Oct 12, 2025, 8:19 PM

#

Well I don't think we "need" 60 Elo kek. We need a measurement with lower error bars lol

rocky vigil Oct 12, 2025, 8:19 PM

#

this is true

rocky vigil Oct 12, 2025, 8:19 PM

#

twilit oriole Well I don't think we "need" 60 Elo kek. We need a measurement with lower error ...

you can run this locally if you have hardware for it, just pull my branch

#

threat-inputs-rebase

twilit oriole Oct 12, 2025, 8:20 PM

#

Where's the net

rocky vigil Oct 12, 2025, 8:20 PM

#

violet badger <@693549181838819338> https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5...

up here

twilit oriole Oct 12, 2025, 8:20 PM

#

And I just set evalfile?

rocky vigil Oct 12, 2025, 8:20 PM

#

my branch should name this net as default

#

so it should compile with it yea

violet badger Oct 12, 2025, 8:22 PM

#

592817 nps vs 1084640nps , so 54%

#

(quick test via bench)

twilit oriole Oct 12, 2025, 8:23 PM

#

How many SBs are there in total

rocky vigil Oct 12, 2025, 8:23 PM

#

the full run should have 800 * num stages

twilit oriole Oct 12, 2025, 8:24 PM

#

Oh early days then. Might as well wait till at least first stage concludes

rocky vigil Oct 12, 2025, 8:24 PM

#

i think at some point we should have the first stage checkpoint

#

yeah

#

800 SB

twilit oriole Oct 12, 2025, 8:24 PM

#

I assume there are some numbers on how close it is after first stage?

#

In a regular master run

violet badger Oct 12, 2025, 8:24 PM

#

normal net would be -50Elo

#

First stage should be ready in a couple of hours.

rocky vigil Oct 12, 2025, 8:25 PM

#

theoretically we surpassed -50 elo with 1/8 of the first stage so hopefully the good stuff continues 🙏

violet badger Oct 12, 2025, 8:25 PM

#

I think it looks promising indeed.

#

56% of speed is a lot of Elo STC.

#

(consistent with your fixed nodes number and my STC number)

#

I think people should start looking at a faster inference now, full trained net will be there before the end of the week.

rocky vigil Oct 12, 2025, 8:28 PM

#

yeah I'll start working with yoshie and let's see how we should approach the incremental threat tracking

violet badger Oct 12, 2025, 8:29 PM

#

Pretty sure we'll get some more people to look at this as it makes progress.

prime mica Oct 12, 2025, 8:33 PM

#

exciting

#

once there's something testable I'll take a look into improving NPS

violet badger Oct 12, 2025, 8:33 PM

#

there is

rocky vigil Oct 12, 2025, 8:33 PM

#

ah yeah your upstream optimizations have also made it here (:

prime mica Oct 12, 2025, 8:34 PM

#

lol

rocky vigil Oct 12, 2025, 8:34 PM

#

what we need to do next for improving NPS is like set up the foundation basically

#

our UE framework, etc.

violet badger Oct 12, 2025, 8:34 PM

#

https://github.com/sscg13/Stockfish/commits/threat-inputs-rebase/

rocky vigil Oct 12, 2025, 8:34 PM

#

and after we do that it's minor optimizations go go

violet badger Oct 12, 2025, 8:34 PM

#

I agree..

prime mica Oct 12, 2025, 8:35 PM

#

for sure

violet badger Oct 12, 2025, 8:35 PM

#

though some pondering can go in parallel 😉

stray reef Oct 12, 2025, 8:36 PM

#

bestmove sleep ponder speedup_ideas

prime mica Oct 12, 2025, 8:36 PM

#

I have a strange plot atm to fuse FC0 with add/sub

violet badger Oct 12, 2025, 8:36 PM

#

bestmove do dishes

prime mica Oct 12, 2025, 8:36 PM

#

also to fuse consecutive add/subs together

#

will try to keep it well-abstracted tho

rocky vigil Oct 12, 2025, 8:36 PM

#

ooh interesting

#

working fusing would be quite good

#

since average threat update has multiple add/sub

prime mica Oct 12, 2025, 8:37 PM

#

ye

#

my hope™ is that if add/sub is really memory bandwidth limited, then we should be able to do useful work (like the dot products) at the same time

#

but there are complications ofc
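
The fusing idea can be sketched roughly as follows: instead of one pass over the accumulator per added or removed feature, all of a move's adds and subs are applied in a single pass, so each accumulator element is loaded and stored once. This is a scalar illustration with hypothetical names (`apply_fused`, a tiny `kWidth`); it is not the SF code, which uses SIMD and tiling.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kWidth = 8;  // toy accumulator width (a real L1 is far larger)
using Row = std::array<std::int16_t, kWidth>;

// Apply every added and removed feature row in one pass over the accumulator.
void apply_fused(Row& acc,
                 const std::vector<Row>& weights,
                 const std::vector<std::size_t>& added,
                 const std::vector<std::size_t>& removed) {
    for (std::size_t i = 0; i < kWidth; ++i) {
        std::int32_t v = acc[i];  // widen so the running sum does not wrap in int16
        for (std::size_t f : added)   v += weights[f][i];
        for (std::size_t f : removed) v -= weights[f][i];
        acc[i] = static_cast<std::int16_t>(v);  // single store per element
    }
}
```

Whether this helps in practice depends on whether add/sub really is memory bandwidth limited, which is the open question in the messages above.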

rocky vigil Oct 12, 2025, 8:37 PM

#

rocky vigil our UE framework, etc.

the other foundational thing is to set up dual accumulator, which I have been procrastinating on

stray reef Oct 12, 2025, 8:37 PM

#

can probably do a lot by fusing threat updates if done right

rocky vigil Oct 12, 2025, 8:38 PM

#

can't have good ue without dual accumulator, as otherwise every king move is suddenly gonna be 4x as expensive

#

so i guess that might be the priority

stray reef Oct 12, 2025, 8:39 PM

#

i've been trying to come up with something similar to finny tables, that fuses threat updates on a per move basis (for frequent moves), but no good idea yet

stray reef Oct 12, 2025, 8:39 PM

#

rocky vigil can't have good ue without dual accumulator, as otherwise every king move is sud...

the good news is, full refreshes from mirroring changing are negligible

rocky vigil Oct 12, 2025, 8:39 PM

#

yeah i expected as much

prime mica Oct 12, 2025, 8:43 PM

#

rocky vigil the other foundational thing is to set up dual accumulator, which I have been pr...

elaborate?

#

what is a "dual accumulator"

rocky vigil Oct 12, 2025, 8:43 PM

#

should track the contribution from threat features and psq features separately

prime mica Oct 12, 2025, 8:43 PM

#

ohh

#

smort

rocky vigil Oct 12, 2025, 8:43 PM

#

bc like, the refresh patterns are different

#

psq needs a full refresh every king move

#

but threats only need a full refresh when the king crosses d/e (due to horizontal mirroring)
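
The differing refresh patterns can be sketched as below. Assumptions for illustration only: squares are numbered 0..63 with file = square % 8, the psq half refreshes on every king move (as stated above; a real implementation may be finer grained), and the threat half refreshes only when the king changes board halves. `RefreshNeeded` and `refresh_after_king_move` are hypothetical names.

```cpp
#include <cassert>

struct RefreshNeeded {
    bool psq;      // piece-square half of the accumulator
    bool threats;  // threat half of the accumulator
};

// Files a-d (0..3) and e-h (4..7) are the two halves used for horizontal mirroring.
constexpr bool on_e_to_h_half(int square) { return (square % 8) >= 4; }

// Decide which halves need a full refresh after this side's king moves.
constexpr RefreshNeeded refresh_after_king_move(int from, int to) {
    return RefreshNeeded{
        true,                                       // psq: every king move
        on_e_to_h_half(from) != on_e_to_h_half(to)  // threats: only if mirroring flips
    };
}
```

Keeping the two halves in separate accumulators is what lets the cheap case (king stays on its half) avoid recomputing the threat part.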

prime mica Oct 12, 2025, 8:44 PM

#

interesting

frosty imp Oct 13, 2025, 12:22 AM

#

what's an up-to-date threat inputs branch/net

rocky vigil Oct 13, 2025, 12:36 AM

#

https://github.com/sscg13/Stockfish/tree/threat-inputs-rebase

frosty imp Oct 13, 2025, 12:36 AM

#

hmm the net is not on fishtest?

rocky vigil Oct 13, 2025, 12:36 AM

#

frosty imp what's an up-to-date threat inputs branch/net

wait for https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5137461961076608/2926829081096545/-/jobs/11688159305 to finish

#

stage 1

#

in ~ a few hours

#

hopefully that'll be equal fixed nodes to master

#

at least

frosty imp Oct 13, 2025, 12:38 AM

#

is there a net I can use to just get it running

rocky vigil Oct 13, 2025, 12:38 AM

#

violet badger <@693549181838819338> https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5...

yep

#

this is also the one named in my branch

frosty imp Oct 13, 2025, 12:38 AM

#

ah cool

#

uploaded it to fishtest btw

rocky vigil Oct 13, 2025, 12:51 AM

#

oh lol

frosty imp Oct 13, 2025, 1:34 AM

#

do threat inputs apply to psqt?

rocky vigil Oct 13, 2025, 1:51 AM

#

yes

#

I think they do

#

why not

prime mica Oct 13, 2025, 1:54 AM

#

noob question, why are there both piece square table and positional factors

frosty imp Oct 13, 2025, 1:54 AM

#

I see

prime mica Oct 13, 2025, 1:54 AM

#

like why not just the latter

frosty imp Oct 13, 2025, 1:55 AM

#

prime mica noob question, why are there both piece square table and positional factors

well it gains

#

(theoretically)

#

but practically it gains to use the difference between psqt and positional as information

lofty cedar Oct 13, 2025, 1:55 AM

#

It seems that capturing simpler features first makes it much easier for the rest of the net to focus on the nonlinear ones.

prime mica Oct 13, 2025, 1:56 AM

#

frosty imp but practically it gains to use the difference between psqt and positional as in...

is this like

#

what is this difference intuitively

#

how sharp the position is?

frosty imp Oct 13, 2025, 1:56 AM

#

some kind of complexity measurement

#

yeah

prime mica Oct 13, 2025, 1:56 AM

#

interesting

frosty imp Oct 13, 2025, 1:57 AM

#

rocky vigil why not

misread the code

lofty cedar Oct 13, 2025, 1:57 AM

#

Well, for some reason... the psqt and the positional factors are 125/128 and 131/128... but actually, they were trained on both being 1.

#

And somehow it gained.

frosty imp Oct 13, 2025, 1:57 AM

#

also are we planning to use psqt biases?

lofty cedar Oct 13, 2025, 1:57 AM

#

I think the 125, 131 are tuned...

rocky vigil Oct 13, 2025, 1:58 AM

#

frosty imp also are we planning to use psqt biases?

no i just set it to 0 to make inference easier for me

rocky vigil Oct 13, 2025, 3:23 AM

#

@twilit oriole stage 1 net is at https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5137461961076608/2926829081096545/-/jobs/11688159305/artifacts/file/step_574f3061fd9e/nn-3e22bf1f564d.nnue, a brief test indicates it is at least on par with master at fixed (20k) nodes

#

```
...      Stockfish TI-experimental playing White: 33 - 12 - 55  [0.605] 100
...      Stockfish TI-experimental playing Black: 19 - 30 - 51  [0.445] 100
...      White vs Black: 63 - 31 - 106  [0.580] 200
Elo difference: 17.4 +/- 33.1, LOS: 84.9 %, DrawRatio: 53.0 %
SPRT: llr 0 (0.0%), lbound -inf, ubound inf
200 of 200 games finished.
```

#

now pushed a bench to my branch

#

corresponding stc:

```
Elo: -116.83 +/- 6.88, nElo: -237.30 +/- 12.98
LOS: 0.00 %, DrawRatio: 30.60 %, PairsRatio: 0.08
Games: 2752, Wins: 300, Losses: 1192, Draws: 1260, Points: 930.0 (33.79 %)
Ptnml(0-2): [81, 802, 421, 72, 0], WL/DD Ratio: 1.18
LLR: -2.95 (-100.0%) (-2.94, 2.94) [-101.00, -99.00]
```

rocky vigil Oct 13, 2025, 3:47 AM

#

rocky vigil ```Score of Stockfish TI-experimental vs Stockfish 10/07/25: 52 - 42 - 106 [0.52...

update:

Score of Stockfish TI-experimental vs Stockfish 10/07/25: 255 - 251 - 494  [0.502]
...      Stockfish TI-experimental playing White: 148 - 99 - 253  [0.549] 500
...      Stockfish TI-experimental playing Black: 107 - 152 - 241  [0.455] 500
...      White vs Black: 300 - 206 - 494  [0.547] 1000
Elo difference: 1.4 +/- 15.3, LOS: 57.1 %, DrawRatio: 49.4 %
SPRT: llr 0 (0.0%), lbound -inf, ubound inf
1000 of 1000 games finished.

stray reef Oct 13, 2025, 3:51 AM

#

prime mica noob question, why are there both piece square table and positional factors

there are cases where a piece is neither threatened nor attacked - the net still needs to know about it

#

(you may be talking about the integrated psq of the sf arch, not threats, in that case nvm)

rocky vigil Oct 13, 2025, 3:53 AM

#

oh btw yoshie

#

we should probably also concurrently start working on setting up the ue

#

actually lemme start by figuring out how to do dual accumulator

stray reef Oct 13, 2025, 3:55 AM

#

alright, i can start with incremental threat tracking today

rocky vigil Oct 13, 2025, 3:57 AM

#

like add on to my branch?

#

that would be welcome yeah

#

how much do you estimate you'd have to overhaul sf stuff

rocky vigil Oct 13, 2025, 3:58 AM

#

rocky vigil how much do you estimate you'd have to overhaul sf stuff

this was the main concern i had when trying to think of this

#

ostensibly you need to add stuff to the position structure etc.

stray reef Oct 13, 2025, 4:04 AM

#

rocky vigil how much do you estimate you'd have to overhaul sf stuff

not sure how complex the sf position structure is, hopefully less than you think

rocky vigil Oct 13, 2025, 4:05 AM

#

well good luck with it

#

i sleep soon

#

i don't know if the rest of the training stages are happening but that would also be interesting to see

rocky vigil Oct 13, 2025, 4:15 AM

#

rocky vigil actually lemme start by figuring out how to do dual accumulator

The simplest way I see for this is to have two accumulator(stack) classes one for threats and one for psq

#

But this is a lot of code duplication

frosty imp Oct 13, 2025, 4:18 AM

#

i'm doing dual acc right now

#

it's done except for whatever bug in psqt

rocky vigil Oct 13, 2025, 4:19 AM

#

Ohhhh

#

Very cool

frosty imp Oct 13, 2025, 4:41 AM

#

@stray reef https://github.com/xu-shawn/Stockfish/tree/threat_inputs

#

cc @rocky vigil

#

updated branch with dual acc. Just FYI I refactored some stuff with the input features so it's probably best to write incremental threats on top of this

rocky vigil Oct 13, 2025, 4:43 AM

#

Ok cool

#

There is also a new net (see above) just to note

#

Bench looks right

#

For the older net

violet badger Oct 13, 2025, 4:53 AM

#

nice, that worked well, so roughly 30 Elo progress and parity at fixed nodes.

#

With some luck adding the other training stages adds another 30+ Elo. So I'll start those soon

frosty imp Oct 13, 2025, 5:10 AM

#

I’m wondering if some pairwise multiplication-ish architecture is possible with threat inputs

#

Since dual accumulators is already a thing

violet badger Oct 13, 2025, 5:28 AM

#

so, final 4 stages added here https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5137461961076608/2926829081096545/-/pipelines/2095621268

rocky vigil Oct 13, 2025, 1:35 PM

#

frosty imp updated branch with dual acc. Just FYI I refactored some stuff with the input fe...

Wait I thought if you only wanted to update threats from scratch I have a different function append_active_threats for this

#

Lemme look more carefully

desert tree Oct 13, 2025, 1:36 PM

#

violet badger so, final 4 stages added here https://gitlab.com/cscs-ci/ci-testing/webhook-ci/m...

seems to have crashed

rocky vigil Oct 13, 2025, 1:36 PM

#

Ah you updated it to exclude the psq parts

#

Fair enough

#

Does it achieve any speedup

#

Now that the halfkav2hm part is being ue’d normally

rocky vigil Oct 13, 2025, 1:38 PM

#

desert tree seems to have crashed

Unexpected EOF

#

Very strange

rocky vigil Oct 13, 2025, 3:27 PM

#

frosty imp updated branch with dual acc. Just FYI I refactored some stuff with the input fe...

```
Stockfish dev-20251012-536051bf by the Stockfish developers (see AUTHORS file)
info string Using 1 thread
Warmup position 3/3
Position 258/258
===========================
Version                    : Stockfish dev-20251012-536051bf
Compiled by                : g++ (GNUC) 15.1.0 on MinGW64
Compilation architecture   : x86-64-bmi2
Compilation settings       : 64bit BMI2 AVX2 SSE41 SSSE3 SSE2 POPCNT
Compiler __VERSION__ macro : 15.1.0
Large pages                : no
User invocation            : speedtest 1
Filled invocation          : speedtest 1 128 150
Available processors       : 0-15
Thread count               : 1
Thread binding             : none
TT size [MiB]              : 128
Hash max, avg [per mille]  :
    single search          : 43, 25
    single game            : 732, 453
Total nodes searched       : 122156946
Total search time [s]      : 153.585
Nodes/second               : 795370
```

```
./stockfish speedtest 1
Stockfish dev-20251012-3a5c355e by the Stockfish developers (see AUTHORS file)
info string Using 1 thread
Warmup position 3/3
Position 258/258
===========================
Version                    : Stockfish dev-20251012-3a5c355e
Compiled by                : g++ (GNUC) 15.1.0 on MinGW64
Compilation architecture   : x86-64-bmi2
Compilation settings       : 64bit BMI2 AVX2 SSE41 SSSE3 SSE2 POPCNT
Compiler __VERSION__ macro : 15.1.0
Large pages                : no
User invocation            : speedtest 1
Filled invocation          : speedtest 1 128 150
Available processors       : 0-15
Thread count               : 1
Thread binding             : none
TT size [MiB]              : 128
Hash max, avg [per mille]  :
    single search          : 47, 29
    single game            : 798, 525
Total nodes searched       : 137345907
Total search time [s]      : 153.564
Nodes/second               : 894388
```

rocky vigil Oct 13, 2025, 4:08 PM

#

nice initial speedup

#

so now that takes care of the psq part

#

so we can focus on incremental threats

violet badger Oct 13, 2025, 7:24 PM

#

rocky vigil Unexpected EOF

restarted, some network issue can cause that (somewhere between the gitlab runner reading the output and the actual calculation).

rocky vigil Oct 13, 2025, 7:24 PM

#

huh

#

is the progress lost or no

violet badger Oct 13, 2025, 7:24 PM

#

no worries.

#

relatively transparent.

#

so, looks like we already made progress with the inference code... nice!

rocky vigil Oct 13, 2025, 7:28 PM

#

stray reef not sure how complex the sf position structure is, hopefully less than you think

how is this going? i can attempt to help if you want

stray reef Oct 13, 2025, 8:00 PM

#

Some other stuff got in the way, should get somewhere tomorrow

rocky vigil Oct 13, 2025, 8:03 PM

#

ah fair

rocky vigil Oct 14, 2025, 3:00 AM

#

stage 2, (nn-a878500a97a8.nnue), 8moves_v3.epd, 20k nodes

```
...      Stockfish TI-experimental playing White: 151 - 72 - 277  [0.579] 500
...      Stockfish TI-experimental playing Black: 104 - 116 - 280  [0.488] 500
...      White vs Black: 267 - 176 - 557  [0.545] 1000
Elo difference: 23.3 +/- 14.3, LOS: 99.9 %, DrawRatio: 55.7 %
SPRT: llr 0 (0.0%), lbound -inf, ubound inf
1000 of 1000 games finished.
```

#

🙏

prisma hatchBOT Oct 14, 2025, 7:20 AM

#

2b2r2/p7/r1p1p3/P1p1P3/2P3R1/1P3kP1/2KB4/8 w - - 0 1

frosty imp Oct 14, 2025, 7:20 AM

#

threat net solves this while master can't 👀

#

updated to latest net btw https://github.com/xu-shawn/Stockfish/tree/threat_inputs

#

significant static eval diff for 8/p6b/r1p1p3/P1p1P3/2P2P2/1P6/3Bk3/2K5 w - - 15 10

stray reef Oct 14, 2025, 7:37 AM

#

alright i got something written up, getting to the debugging part now

#

(just incremental threat tracking, no UE yet)

stray reef Oct 14, 2025, 8:18 AM

#

https://github.com/xu-shawn/Stockfish/pull/9 ig i'll PR it to shawns branch for now

#

gonna start working on UE now (though i might get stuck in SF inference hell there, we'll see)

rocky vigil Oct 14, 2025, 8:21 AM

#

ah

#

wait i think the net changed in the meanwhile

#

to stage 2 net

stray reef Oct 14, 2025, 8:22 AM

#

yeah i rebased

rocky vigil Oct 14, 2025, 8:23 AM

#

ah yeah i guess the bench

#

but whatever

stray reef Oct 14, 2025, 8:24 AM

#

ah forgot to update the bench in the PR, but the commit has the right bench

rocky vigil Oct 14, 2025, 8:24 AM

#

oh lol

#

very cool!

#

getting much farther than the previous attempt half a year ago

stray reef Oct 14, 2025, 8:25 AM

#

it's definitely not a good way to just do what the branch currently has:

std::vector<AccumulatorState> accumulators;
std::vector<AccumulatorState> threat_accumulators;

since AccumulatorState has a dirty piece, but now also needs a dirty threats list, which we don't want to duplicate

#

nnue_accumulator.h/.cpp looks awful to work with lmao

rocky vigil Oct 14, 2025, 8:26 AM

#

yeah we probably want to distinguish the two

#

but it's extra effort

desert tree Oct 14, 2025, 8:27 AM

#

frosty imp threat net solves this while master can't 👀

fwiw master does solve this after a while

#

just not quickly

rocky vigil Oct 14, 2025, 8:32 AM

#

stray reef nnue_accumulator.h/.cpp looks awful to work with lmao

yeah it like fits halfkav2hm very well but is very hard to extend

rocky vigil Oct 14, 2025, 8:40 AM

#

rocky vigil yeah we probably want to distinguish the two

i think the easiest way to do this is probably to distinguish AccumulatorState with ThreatAccumulatorState

#

so maybe if that is done everything will still work nicely

stray reef Oct 14, 2025, 8:43 AM

#

I think I won't produce anything reasonable here. Adding more abstraction is going to make this code even worse, making what's there fit is ugly, I'd want to simplify it if anything

rocky vigil Oct 14, 2025, 8:48 AM

#

that would also be nice if it can be done in a good way

#

how would you want to simplify?

#

we should probably also get @frosty imp's opinion since he probably understands this code the best

stray reef Oct 14, 2025, 8:58 AM

#

rocky vigil how would you want to simplify?

i have no idea :P

candid ivy Oct 14, 2025, 9:18 AM

#

If i understand correctly the "problem" is that the AccumulatorState

    Accumulator<TransformedFeatureDimensionsBig>   accumulatorBig;
    Accumulator<TransformedFeatureDimensionsSmall> accumulatorSmall;
    DirtyPiece

has this but it actually only needs one accumulator? and no dirty pieces?

stray reef Oct 14, 2025, 9:19 AM

#

are we removing smallnet support?

rocky vigil Oct 14, 2025, 9:22 AM

#

we could

#

i think shawn wanted to keep it in case it was still useful

#

but right now bool use_smallnet is just false

stray reef Oct 14, 2025, 9:24 AM

#

imo this is a maintainer decision

#

it makes no sense to remove it now if we need to re-implement it in 2 weeks

rocky vigil Oct 14, 2025, 9:26 AM

#

candid ivy If i understand correctly the "problem" is that the AccumulatorState ``` Acc...

if we still want to support smallnet, then ideally we split between PsqAccumulatorState and ThreatAccumulatorState, where PsqAccumulatorState has smallnet, bignet, dirtypiece while ThreatAccumulatorState has bignet and dirtythreats
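
A rough shape for that split, purely as a sketch: every type here is a simplified stand-in (the real `Accumulator`, `DirtyPiece`, and feature dimensions in SF are more involved), and `DirtyThreats` / `ThreatDelta` are hypothetical names for whatever the incremental threat tracking produces.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in sizes; not the real TransformedFeatureDimensions values.
constexpr std::size_t kBig = 16, kSmall = 4;

template<std::size_t N>
struct Accumulator {
    std::array<std::int16_t, N> values{};
    bool computed = false;
};

struct DirtyPiece   { int from = -1, to = -1; };         // simplified stand-in
struct ThreatDelta  { std::uint32_t index; bool added; };
struct DirtyThreats { std::vector<ThreatDelta> list; };  // hypothetical

// psq half: both nets, driven by the moved piece.
struct PsqAccumulatorState {
    Accumulator<kBig>   accumulatorBig;
    Accumulator<kSmall> accumulatorSmall;
    DirtyPiece          dirtyPiece;
};

// Threat half: big net only, driven by the list of changed threats.
struct ThreatAccumulatorState {
    Accumulator<kBig> accumulatorBig;
    DirtyThreats      dirtyThreats;
};
```

The trade-off raised in the thread still applies: two plain structs are easy to read but duplicate the stack handling, while a single state templated on the dirty type avoids the duplication at the cost of more templates.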

candid ivy Oct 14, 2025, 9:29 AM

#

i'd be fine with removing it if the threat inputs itself is strong enough to compensate the loss obviously

stray reef Oct 14, 2025, 9:35 AM

#

We should be able to remove it then

violet badger Oct 14, 2025, 10:17 AM

#

I think if we remove it it will reappear... threat net doesn't solve what smallnet provides (i.e. speed at decided positions)

rocky vigil Oct 14, 2025, 10:18 AM

#

it's probably worth testing later

#

if it turns out to be a big gain many small things can be masked underneath it

rocky vigil Oct 14, 2025, 2:55 PM

#

@frosty imp would you mind setting a low throughput stc vs master as well

#

my guess is around the range of -50 to -40 elo

frosty imp Oct 14, 2025, 3:03 PM

#

stray reef it's definitely not a good way to just do what the branch currently has: ```c++ ...

can you pass the dirty type through a template?

rocky vigil Oct 14, 2025, 3:04 PM

#

would it really be better to template it

#

instead of just having two separate ones

rocky vigil Oct 14, 2025, 3:04 PM

#

rocky vigil instead of just having two separate ones

i personally feel this would be more clear

frosty imp Oct 14, 2025, 3:05 PM

#

eh sounds like a lot of code duplication

#

that would add templates with accumulatorStack operations anyway

frosty imp Oct 14, 2025, 3:12 PM

#

stray reef nnue_accumulator.h/.cpp looks awful to work with lmao

I would say add a variable length version of AccumulatorUpdateContext::apply(IndexList added, removed)

prime mica Oct 14, 2025, 3:12 PM

#

stray reef nnue_accumulator.h/.cpp looks awful to work with lmao

ngl this file has caused me great pain over the last few days haha

#

honestly what would be ideal to me is some sort of simple DSL to describe the network layout

#

and a Python (or whatever) script to generate nice C++ code

#

that way you don't have to futz around with template metaprogramming

#

it'd also make performance improvements easier by allowing layers to be fused together

violet badger Oct 14, 2025, 5:35 PM

#

stage 3 arrived .. https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5137461961076608/2926829081096545/-/jobs/11701503559/artifacts/browse/step_6d0eccfc51a2/

rocky vigil Oct 14, 2025, 5:45 PM

#

nice

#

i think stage 4 should be the big gain (according to master results) but we'll see

violet badger Oct 14, 2025, 5:47 PM

#

I think that's a bit unpredictable, I've seen it jump or not at that point or earlier.

rocky vigil Oct 14, 2025, 5:48 PM

#

anyways this is probably beyond what I can run reasonably fast locally fixed nodes so I'll just put a reduced throughput stc up on fishtest vs stage 2

violet badger Oct 14, 2025, 5:49 PM

#

at the end of the full training run there will be accurate results on the testing, so patience will also get us there.

#

i.e. will be clear which net to pick

rocky vigil Oct 14, 2025, 5:50 PM

#

fair

frosty imp Oct 14, 2025, 5:52 PM

#

https://tests.stockfishchess.org/tests/view/68ee8da528e6d77fcff9fd86

rocky vigil Oct 14, 2025, 5:52 PM

#

oh nice

violet badger Oct 14, 2025, 6:08 PM

#

shawn impatient 😉

#

now, I'm much more curious to see the inference speedup patch being tested like that... seems like this was another good improvement though.

green moat Oct 14, 2025, 6:15 PM

#

@violet badger
Did you check if removing those duplicated lines actually improves "master" nets?
#nnue-dev message

violet badger Oct 14, 2025, 6:16 PM

#

green moat @violet badger Did you check if removing those duplicated lines actually...

#nnue-dev message ... wrong thread to ask though.

green moat Oct 14, 2025, 6:16 PM

#

violet badger https://discord.com/channels/435943710472011776/718853716266188890/1427675740484...

ok, sorry

frosty imp Oct 14, 2025, 6:38 PM

#

violet badger now, I'm much more curious to see the inference speedup patch being tested like ...

the dual accumulator patch or the incremental threats patch

violet badger Oct 14, 2025, 6:38 PM

#

all steps needed to get to full speed inference 😉

#

but I meant the net test you did (with good improvement)

#

if I'm not mistaken that suggests another 10+ Elo from stage 2 to stage 3?

twilit oriole Oct 14, 2025, 9:31 PM

#

You will want to try doubling length of each stage after this run completes. Convergence time goes up a lot because some threats are very rare

#

Also u can ditch small net, try later disabling threats for decided positions (will need a new training run as well obviously)

#

It should do a similar thing with benefit of that regular part of the accumulator always being up to date if it switches back to regular eval

frosty imp Oct 14, 2025, 11:35 PM

#

incremental threats done

#

debug time nohope

frosty imp Oct 15, 2025, 12:06 AM

#

@stray reef have you debugged the incremental threats calculation? my bench isn't matching and I'm not sure where the problem is

rocky vigil Oct 15, 2025, 12:23 AM

#

Do you have a branch

#

I’ll also try looking through it

rocky vigil Oct 15, 2025, 12:24 AM

#

frosty imp debug time nohope

The pain begins

frosty imp Oct 15, 2025, 12:24 AM

#

rocky vigil Do you have a branch

https://github.com/xu-shawn/Stockfish/tree/threat_incremental_updates

#

suspecting something is wrong when capturing a piece

rocky vigil Oct 15, 2025, 12:32 AM

#

@frosty imp https://github.com/xu-shawn/Stockfish/blob/threat_incremental_updates/src/nnue/features/full_threats.cpp#L213 uh

#

this is not true actually

#

needs to be refreshed when mirroring changes

frosty imp Oct 15, 2025, 12:32 AM

#

oh crap yeah

rocky vigil Oct 15, 2025, 12:35 AM

#

https://github.com/xu-shawn/Stockfish/blob/threat_incremental_updates/src/nnue/features/full_threats.cpp#L197 you should also guard against this being Dimensions, because that indicates deduplication

#

i.e. smth like

let index = make_index(...)
if (index < Dimensions) { append(index) }

#

never really figured out a better way to handle deduplication

#

afaik plentychess does same thing

rocky vigil Oct 15, 2025, 12:37 AM

#

frosty imp oh crap yeah

the psqdifftype is actually more useful for this as you might guess

frosty imp Oct 15, 2025, 12:39 AM

#

rocky vigil <https://github.com/xu-shawn/Stockfish/blob/threat_incremental_updates/src/nnue/...

wait wdym

rocky vigil Oct 15, 2025, 12:39 AM

#

https://github.com/xu-shawn/Stockfish/blob/threat_incremental_updates/src/nnue/features/full_threats.cpp#L85

#

in short some threats imply the existence of the corresponding ones in the opposite direction

#

i.e. rook attacking queen implies queen attacking rook

#

so in that case we filter so that only one of the two is active
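The guard and the deduplication described above can be sketched roughly as follows. This is only an illustration, not the code in `full_threats.cpp`: the `Threat` struct, the `mutual` flag, the tie-break rule (keep the direction whose attacker square is lower), the index layout, and the value of `Dimensions` are all assumptions made up for the example. Only the idea of returning `Dimensions` as a "deduplicated" sentinel and skipping it on append comes from the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical feature-space size: 64 attacker squares x 64 target squares.
constexpr std::size_t Dimensions = 64 * 64;

// Hypothetical, simplified threat record.
struct Threat {
    int  attackerSq;  // 0..63
    int  targetSq;    // 0..63
    bool mutual;      // true if the target also attacks the attacker
};

// Returns a feature index, or Dimensions as a sentinel meaning
// "deduplicated, do not activate this feature".
std::size_t make_index(const Threat& t) {
    assert(t.attackerSq >= 0 && t.attackerSq < 64);
    assert(t.targetSq >= 0 && t.targetSq < 64);
    // Assumed tie-break: of a mutual pair, keep only the direction
    // whose attacker square is the lower one.
    if (t.mutual && t.attackerSq > t.targetSq)
        return Dimensions;
    return static_cast<std::size_t>(t.attackerSq) * 64
         + static_cast<std::size_t>(t.targetSq);
}

// Collects active feature indices, skipping deduplicated threats.
std::vector<std::size_t> active_indices(const std::vector<Threat>& threats) {
    std::vector<std::size_t> active;
    for (const Threat& t : threats) {
        const std::size_t index = make_index(t);
        if (index < Dimensions)  // the guard: the sentinel is never appended
            active.push_back(index);
    }
    return active;
}
```

With a mutual pair given in both directions plus one one-way threat, only two indices end up active, so the accumulator never sees an out-of-range index.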

frosty imp Oct 15, 2025, 12:40 AM

#

ah I see

rocky vigil Oct 15, 2025, 12:45 AM

#

yeah besides this most of the failure points would come from the incremental threat calculation

#

but yoshie claims he tested this thoroughly against from scratch

#

so I'm hoping it just works after these fixes

frosty imp Oct 15, 2025, 12:53 AM

#

let's go

#

bench matches

rocky vigil Oct 15, 2025, 12:53 AM

#

wait !!!!

#

let

#

's go?

#

how fast lol

frosty imp Oct 15, 2025, 12:54 AM

#

around 20%

rocky vigil Oct 15, 2025, 12:54 AM

#

ah

#

wait vs previous?

frosty imp Oct 15, 2025, 12:55 AM

#

yeah

rocky vigil Oct 15, 2025, 12:55 AM

#

hmm

frosty imp Oct 15, 2025, 12:55 AM

#

vs threat tracking but no incr update

rocky vigil Oct 15, 2025, 12:55 AM

#

ok ok i see

#

so that moves -52 to -15 or so?

#

this gonna be a close one at stc

#

but should scale

frosty imp Oct 15, 2025, 12:56 AM

#

let's see

#

https://github.com/xu-shawn/Stockfish/tree/threat_inputs

#

merged

rocky vigil Oct 15, 2025, 12:57 AM

#

nice

frosty imp Oct 15, 2025, 12:58 AM

#

https://tests.stockfishchess.org/tests/view/68eef18c28e6d77fcff9fe6e

#

stc

rocky vigil Oct 15, 2025, 1:04 AM

#

oh nice most of the compilation warnings also died

#

the small things xd

frosty imp Oct 15, 2025, 1:11 AM

#

oh huh no difference on speedtest

rocky vigil Oct 15, 2025, 1:15 AM

#

Stockfish dev-20251014-895f63de by the Stockfish developers (see AUTHORS file)
info string Using 1 thread
Warmup position 3/3
Position 258/258
===========================
Version                    : Stockfish dev-20251014-895f63de
Compiled by                : g++ (GNUC) 15.1.0 on MinGW64
Compilation architecture   : x86-64-avxvnni
Compilation settings       : 64bit VNNI BMI2 AVX2 SSE41 SSSE3 SSE2 POPCNT
Compiler __VERSION__ macro : 15.1.0
Large pages                : no
User invocation            : speedtest 1
Filled invocation          : speedtest 1 128 150
Available processors       : 0-15
Thread count               : 1
Thread binding             : none
TT size [MiB]              : 128
Hash max, avg [per mille]  :
    single search          : 56, 31
    single game            : 821, 583
Total nodes searched       : 141977379
Total search time [s]      : 153.54
Nodes/second               : 924693

./stockfish speedtest 1
Stockfish dev-20251014-75edbee0 by the Stockfish developers (see AUTHORS file)
info string Using 1 thread
Warmup position 3/3
Position 258/258
===========================
Version                    : Stockfish dev-20251014-75edbee0
Compiled by                : g++ (GNUC) 15.1.0 on MinGW64
Compilation architecture   : x86-64-avxvnni
Compilation settings       : 64bit VNNI BMI2 AVX2 SSE41 SSSE3 SSE2 POPCNT
Compiler __VERSION__ macro : 15.1.0
Large pages                : no
User invocation            : speedtest 1
Filled invocation          : speedtest 1 128 150
Available processors       : 0-15
Thread count               : 1
Thread binding             : none
TT size [MiB]              : 128
Hash max, avg [per mille]  :
    single search          : 78, 40
    single game            : 914, 712
Total nodes searched       : 190559772
Total search time [s]      : 153.52
Nodes/second               : 1241270

#

-25% maybe?
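As a quick check of that estimate, the two Nodes/second figures above can be compared directly. A minimal sketch: the function name is made up for illustration, and it assumes the first run (924693 nps) is the build being measured against the second (1241270 nps).

```cpp
#include <cassert>

// Relative speed change of `patchNps` versus `baseNps`, as a fraction
// (for example, -0.25 means the patch is 25% slower than the base).
double relative_speed_change(double patchNps, double baseNps) {
    assert(baseNps > 0.0);
    return patchNps / baseNps - 1.0;
}
```

`relative_speed_change(924693.0, 1241270.0)` comes out at about -0.255, so "-25%" is a fair reading of these two runs, noise aside.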

frosty imp Oct 15, 2025, 1:16 AM

#

rip

rocky vigil Oct 15, 2025, 1:22 AM

#

@stray reef can you get similar numbers for plentychess L1=1024?

frosty imp Oct 15, 2025, 1:35 AM

#

~~https://tests.stockfishchess.org/tests/view/68eefa6228e6d77fcff9fe7f~~ https://tests.stockfishchess.org/tests/view/68eefb0828e6d77fcff9fe86 speed test here

frosty imp Oct 15, 2025, 2:26 AM

#

rocky vigil Oct 15, 2025, 2:31 AM

#

this isn't too bad

#

15% overhead

twilit oriole Oct 15, 2025, 2:32 AM

#

Well. It's -10 at STC ofc it's not too bad kek

#

Finish training + SPSA is already enough to just pass at higher TCs ig

frosty imp Oct 15, 2025, 2:33 AM

#

really hoping we won't need net SPSA

#

search SPSA maybe though

twilit oriole Oct 15, 2025, 2:33 AM

#

Well even if you don't "need" it is just a large gain eventually

#

8 Elo is a lot

rocky vigil Oct 15, 2025, 2:35 AM

#

finish training already might be enough

#

assuming LTC scales by +5 or so

#

am curious how https://tests.stockfishchess.org/tests/live_elo/68eee5fd28e6d77fcff9fe4f will affect the speeds though

frosty imp Oct 15, 2025, 2:36 AM

#

still SSS Kappa

rocky vigil Oct 15, 2025, 2:37 AM

#

actually that one probably benefits us as well

#

since it's related to overhead of finny tables

rocky vigil Oct 15, 2025, 3:29 AM

#

violet badger stage 3 arrived .. https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5137...

and the websocket thing happened in stage 4 again

frosty imp Oct 15, 2025, 3:32 AM

#

oof

rocky vigil Oct 15, 2025, 3:33 AM

#

also this is so confusing

#

-52 elo

#

speedup 10 elo

#

net 6 elo

#

= -14 elo?

#

it does not add

#

btw does not fusing make it faster

#

yoshie also said that fusing the threat updates never worked well

#

oh fishtest

#

ig we'll see

frosty imp Oct 15, 2025, 3:40 AM

#

speedtest looks the same, so I put it on fishtest

rocky vigil Oct 15, 2025, 3:40 AM

#

rocky vigil it does not add

something is strange here

frosty imp Oct 15, 2025, 3:44 AM

#

ig just error bars

rocky vigil Oct 15, 2025, 3:44 AM

#

hmm

#

really thought the speedup would be more though

#

like 20 elo

#

or even 30

rocky vigil Oct 15, 2025, 4:11 AM

#

twilit oriole You will want to try doubling length of each stage after this run completes. Con...

i also like did not add factorizer to psq part for this run, that might also play a role

#

the next run should also have this once I figure out how to do it

#

@stray reef seems plentychess with 640 is typically 30-40% faster than current branch (based on manual inspection of nps in two LTC games), is this reasonable numbers?

violet badger Oct 15, 2025, 4:38 AM

#

also removal of smallnet, let's say 10 Elo .. or even more in this case.

stray reef Oct 15, 2025, 4:40 AM

#

frosty imp @stray reef have you debugged the incremental threats calculation? my ...

huh? you mean the bench of the PR with no further changes?
yes i did debug it, everything was 100% identical

stray reef Oct 15, 2025, 4:41 AM

#

rocky vigil @stray reef can you get similar numbers for plentychess L1=1024?

Later today, sure

violet badger Oct 15, 2025, 4:41 AM

#

rocky vigil and the websocket thing happened in stage 4 again

"fixed"

rocky vigil Oct 15, 2025, 4:41 AM

#

stray reef huh? you mean the bench of the PR with no further changes? yes i did debug it, e...

btw we fixed it lmao

#

everything was indeed good

#

it was not in the threat calculation

#

there are some tests up on fishtest rn

#

around -16 stc

stray reef Oct 15, 2025, 4:42 AM

#

rocky vigil @stray reef seems plentychess with 640 is typically 30-40% faster than...

30-40% faster than 1024... hm
might be reasonable, if closer to the 30 side ig

rocky vigil Oct 15, 2025, 4:42 AM

#

is where we are at

stray reef Oct 15, 2025, 4:42 AM

#

rocky vigil around -16 stc

damn nice

rocky vigil Oct 15, 2025, 4:42 AM

#

prayer for scaling

#

i think we wait until stage 5 for that

#

since i strongly suspect lack of factorizer + threat inputs itself means it benefits more from more stages

violet badger Oct 15, 2025, 5:22 AM

#

am I correct that this is the current branch to be used in testing https://github.com/xu-shawn/Stockfish/commits/threat_inputs/

#

b7f553ee8b28a4abace6c1056dceb1d69169873a

frosty imp Oct 15, 2025, 5:27 AM

#

yeah

rocky vigil Oct 15, 2025, 5:51 AM

#

stray reef damn nice

bruh it dropped to -20 i hate error bars

lofty cedar Oct 15, 2025, 7:03 AM

#

To think that some obscure monty led to more than a thousand nontrivial LOCs, the rewrite of Stockfish training infrastructure, etc... that could finally be gaining.

naive comet Oct 15, 2025, 7:06 AM

#

"some obscure monty"?

lofty cedar Oct 15, 2025, 7:10 AM

#

~50 stars on github vs 14000.

naive comet Oct 15, 2025, 7:11 AM

#

it's not obscure in the chess engine sphere

#

idk otherwise you could call practically any other engine obscure

lofty cedar Oct 15, 2025, 7:13 AM

#

Ethereal is about 400. Even stormphrax is like 100.
Koivisto is 150.

And given that github stars are already skewed toward programmers who are familiar with the chess engines, the popularity of monty compared to Stockfish in the wider chess world is probably even lesser.

#

But let's look at a more objective metric: TCEC. Monty isn't even in TCEC.

naive comet Oct 15, 2025, 7:16 AM

#

Monty should've been in tcec if not for some small issues

#

anyways that's besides the point

#

@frosty imp I might try some speed stuff later

#

do I speedtest or start a test on fishtest?

rocky vigil Oct 15, 2025, 7:17 AM

#

speedtest probably fine

#

am very curious how uh

#

ue only managed to gain like 5% speed

#

or smth

frosty imp Oct 15, 2025, 7:19 AM

#

naive comet do I speedtest or start a test on fishtest?

both are good ig

naive comet Oct 15, 2025, 7:20 AM

#

rocky vigil ue only managed to gain like 5% speed

prolly cuz of Finny being cracked

lofty cedar Oct 15, 2025, 7:21 AM

#

naive comet Monty should've been in tcec if not for some small issues

Though it shows one thing...

A chess engine doesn't even have to be even remotely close to the strength to be able to improve another engine.

#

Which is wild.

stray reef Oct 15, 2025, 7:22 AM

#

i think you underestimate monty in many ways, including strength

lofty cedar Oct 15, 2025, 7:23 AM

#

Isn't Monty like 700 elo behind?

rocky vigil Oct 15, 2025, 7:23 AM

#

naive comet prolly cuz of Finny being cracked

no but like prior to this it was literally compute every threat and add it up

#

i guess lazy eval probably screws around with the threats

violet badger Oct 15, 2025, 7:27 AM

#

lofty cedar Though it shows one thing... A chess engine doesn't even have to be even remot...

hasn't led to an improvement yet ... you're often a bit too speculative, let's stay close to the facts..

rocky vigil Oct 15, 2025, 7:27 AM

#

sad

violet badger Oct 15, 2025, 7:28 AM

#

This is ongoing work.... let's not forget that something similar was tried years ago by sopel, and at that time it didn't gain either.

#

things have changed, not the least the amount of data available, improved trainer, etc etc... so worthwhile trying again.

#

as usual a lot of work has to come together to replace sota stuff..

rocky vigil Oct 15, 2025, 7:29 AM

#

2 more hours until stage 4 or so i presume?

violet badger Oct 15, 2025, 7:29 AM

#

something like that.

stray reef Oct 15, 2025, 7:30 AM

#

lofty cedar Isn't Monty like 700 elo behind?

monty dev is like 3500 or 3600 afaik

#

and under tcec conditions a lot better than whatever ccrl or so would show

lofty cedar Oct 15, 2025, 7:30 AM

#

stray reef monty dev is like 3500 or 3600 afaik

What? I see... the info might be outdated.

rocky vigil Oct 15, 2025, 7:31 AM

#

yeah tcec conditions a lot better

#

since gigantic net reduces contention

#

~~let's also not forget that PlentyChess is also #1 at ccrl 40/15 rn~~

violet badger Oct 15, 2025, 7:32 AM

#

rocky vigil yeah tcec conditions a lot better

measurements where?

rocky vigil Oct 15, 2025, 7:33 AM

#

e.g. things like https://tests.montychess.org/tests/view/68d5e6bd56f229dd4390f2b4 compared with https://tests.montychess.org/tests/view/68d5e99756f229dd4390f2b8

#

i suppose like

#

200+2 5thread

#

is similar to CCRL Blitz 8CPU

lofty cedar Oct 15, 2025, 7:34 AM

#

violet badger hasn't led to an improvement yet ... you're often a bit too speculative, let's ...

I said it could... not that it did...

#

Though I often use the back-of-the-envelope calculation that if the fixed-node Elo gain is more than two thirds of the Elo loss from the slowdown, it should gain at LTC.
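That rule of thumb is easy to write down. The sketch below only encodes the heuristic as stated in the message; the function name is made up, and the two-thirds threshold is the poster's own estimate, not a measured constant.

```cpp
#include <cassert>

// Rule of thumb from the message above: a patch that is stronger at fixed
// nodes but slower is expected to gain at LTC if the fixed-node gain exceeds
// two thirds of the Elo lost to the slowdown. Both arguments are in Elo;
// slowdownLossElo is given as a positive number.
bool likely_gains_at_ltc(double fixedNodeGainElo, double slowdownLossElo) {
    assert(slowdownLossElo >= 0.0);
    return fixedNodeGainElo > (2.0 / 3.0) * slowdownLossElo;
}
```

With purely hypothetical numbers, a 25 Elo fixed-node gain against a 30 Elo slowdown would pass the check (25 > 20), while a 15 Elo gain against the same slowdown would not.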

naive comet Oct 15, 2025, 7:36 AM

#

lofty cedar Though it shows one thing... A chess engine doesn't even have to be even remot...

Yeah, especially since the current top engine in the world PlentyChess has already adopted this.

frosty imp Oct 15, 2025, 7:37 AM

#

guys stop distracting cj from coming up with bangers Kappa

lofty cedar Oct 15, 2025, 7:37 AM

#

Oh... welp... I guess yeah...

Though Stockfish is often a bit more conservative in adopting ideas than in other engines because it often has to be done well to gain.

stray reef Oct 15, 2025, 7:38 AM

#

I understand, I did not do it well in plenty Kappa

lofty cedar Oct 15, 2025, 7:38 AM

#

Oh, not that... I meant that in Stockfish, since the baseline is higher, it's much harder to gain with new ideas.

stray reef Oct 15, 2025, 7:39 AM

#

just joking ofc

frosty imp Oct 15, 2025, 7:39 AM

#

@naive comet here's the profile if you haven't seen #1336647760388034610 message

stray reef Oct 15, 2025, 7:39 AM

#

sf is a much bigger entity

frosty imp Oct 15, 2025, 7:39 AM

#

maybe there's opportunities in incremental threat tracking? idk

#

the refresh scheme might also be improvable

rocky vigil Oct 15, 2025, 7:46 AM

#

well threat specific stuff

#

would be in tracking i think

#

or like the actual accumulator updates

#

like we should see

#

if backwards updates are still worth it for threats

#

considering that refreshing from scratch is not as heavy as expected

frosty imp Oct 15, 2025, 7:46 AM

#

rocky vigil if backwards updates are still worth it for threats

seems so when I measured with speedtest

rocky vigil Oct 15, 2025, 7:47 AM

#

huh

#

strange

#

why are full refreshes so op

#

then

#

or like

#

how is it possible to come so close

#

with literally most basic strat

frosty imp Oct 15, 2025, 7:47 AM

#

how huge is the diff usually

#

compared to a full refresh

rocky vigil Oct 15, 2025, 7:48 AM

#

on average is what, 8 or so?

#

compared to a full refresh probably being like at least 20

frosty imp Oct 15, 2025, 7:50 AM

#

well the percentage reduction from full refresh to incremental isn't as good as halfkav2

#

ig maybe that's where the problem is

#

could be interesting to try alternative update schemes based on that

naive comet Oct 15, 2025, 7:51 AM

#

frosty imp

okay I'll think about it

rocky vigil Oct 15, 2025, 8:08 AM

#

frosty imp well the percentage reduction from full refresh to incremental isn't as good as ...

am curious how it's only 1/2 as slow in standard psq when mathematically ue is like 20% the work of full refresh

lofty cedar Oct 15, 2025, 8:17 AM

#

Maybe it takes a lot of work to compute what needs to get updated?

#

You can try byteboard technology if it helps.

stray reef Oct 15, 2025, 8:20 AM

#

Yeah that's something worth investigating... lots of simd stuff possible

naive comet Oct 15, 2025, 8:53 AM

#

how do I clone Shawn's branch and only that branch?

#

nvm I got it thanks to my friend chatgpt

rocky vigil Oct 15, 2025, 9:04 AM

#

yeah smth like

#

set shawnxu as a remote

#

and then pull from it

desert tree Oct 15, 2025, 9:14 AM

#

stray reef monty dev is like 3500 or 3600 afaik

3600 or so, and they’re focusing on strength under TCEC conditions afaik

rocky vigil Oct 15, 2025, 9:28 AM

#

violet badger "fixed"

is there a potential reason why this might trigger at a much higher rate with the threat inputs?

#

since it has happened again

violet badger Oct 15, 2025, 9:32 AM

#

no independent of what runs in CI, really just somehow timeout or dropped connection somewhere, needs some more robust polling mechanism in the CI infrastructure. Not our concern right now, just restart and wait a bit.

rocky vigil Oct 15, 2025, 9:33 AM

#

ah i see

violet badger Oct 15, 2025, 9:50 AM

#

so stage 4 finished https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5137461961076608/2926829081096545/-/jobs/11722794988/artifacts/browse/step_bedc9e9b73fd/

#

for restart, I also updated the SF used in the final testing, so we'll get info on all steps with the current best inference in 24h or so.

#

step 5 should be running now.

#

I'm not expecting these training steps to gain miraculously, but we'll see.

rocky vigil Oct 15, 2025, 9:53 AM

#

hmm

#

should still hopefully be decent gains at least

violet badger Oct 15, 2025, 9:55 AM

#

fingers crossed, we'll see.

stray reef Oct 15, 2025, 12:30 PM

#

rocky vigil @stray reef can you get similar numbers for plentychess L1=1024?

Trained a 1SB L1=1024 net quickly.

Since plenty does not have a speedtest command, the most comparable thing I think I can easily do is a single-threaded d=20 bench, since plenty uses the same bench positions as SF.

2100060 nps ao3 for Stockfish latest dev
1486695 nps ao3 for Plenty
1553660 nps ao3 for shawns TI branch

looking pretty good i'd say :P

#

i think it is possible that a fully trained net is a bit sparser, so maybe the "real" number for plenty would be a bit higher. but SF speeds are looking nice

naive comet Oct 15, 2025, 1:39 PM

#

@frosty imp @rocky vigil small speedup https://github.com/cj5716/Stockfish/tree/threat_inputs_3

#

1014074 vs 907784 but idk my hardware is noisy

#

I used speedtest btw

#

what should I do? pr to your branch?

#

also ideally I'd need someone with stable hardware to test this maybe

#

https://github.com/cj5716/Stockfish/commit/0aff9b1845d16bdc17972d454661fd57dee865fe vs https://github.com/xu-shawn/Stockfish/commit/b7f553ee8b28a4abace6c1056dceb1d69169873a

violet badger Oct 15, 2025, 1:50 PM

#

since when would 10% be small 😉

naive comet Oct 15, 2025, 1:50 PM

#

violet badger since when would 10% be small 😉

I don't trust that its 10% to be honest cuz i was typing in word during the speedtest

#

I can rerun without and see I guess

violet badger Oct 15, 2025, 1:50 PM

#

like at 1200 words per minute 😉

#

Anyway, I think PR to the branch of shawn, and he can integrate.. ?

#

Can be tested on fishtest, but I think this is not essential for speedups right now.

naive comet Oct 15, 2025, 1:54 PM

#

https://github.com/xu-shawn/Stockfish/pull/11 whoo

naive comet Oct 15, 2025, 2:18 PM

#

I bring good news: I might have found a further speedup

#

will wait and see

naive comet Oct 15, 2025, 2:37 PM

#

ehh its within a %ish

#

but from an empirical pov it saves 1 instr

frosty imp Oct 15, 2025, 2:59 PM

#

failed again? https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5137461961076608/2926829081096545/-/jobs/11719829390

rocky vigil Oct 15, 2025, 3:04 PM

#

frosty imp failed again? <https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/51374619...

Oh that was already resolved

frosty imp Oct 15, 2025, 3:05 PM

#

I see the new pipeline now

#

new net https://tests.stockfishchess.org/tests/view/68efb98928e6d77fcff9ffd9

#

Version                    : Stockfish dev-20251014-b7f553ee
Compiled by                : g++ (GNUC) 14.2.0 on Linux
Compilation architecture   : x86-64-avxvnni
Compilation settings       : 64bit VNNI BMI2 AVX2 SSE41 SSSE3 SSE2 POPCNT
Compiler __VERSION__ macro : 14.2.0
Large pages                : yes
User invocation            : speedtest 1
Filled invocation          : speedtest 1 128 150
Available processors       : 0-3
Thread count               : 1
Thread binding             : none
TT size [MiB]              : 128
Hash max, avg [per mille]  : 
    single search          : 41, 22
    single game            : 668, 439
Total nodes searched       : 97642350
Total search time [s]      : 153.564
Nodes/second               : 635841

Version                    : Stockfish dev-20251015-40e85beb
Compiled by                : g++ (GNUC) 14.2.0 on Linux
Compilation architecture   : x86-64-avxvnni
Compilation settings       : 64bit VNNI BMI2 AVX2 SSE41 SSSE3 SSE2 POPCNT
Compiler __VERSION__ macro : 14.2.0
Large pages                : yes
User invocation            : speedtest 1
Filled invocation          : speedtest 1 128 150
Available processors       : 0-3
Thread count               : 1
Thread binding             : none
TT size [MiB]              : 128
Hash max, avg [per mille]  : 
    single search          : 42, 23
    single game            : 685, 459
Total nodes searched       : 100965370
Total search time [s]      : 153.551
Nodes/second               : 657536

local speedtest on cj speedup

naive comet Oct 15, 2025, 3:23 PM

#

OK, so closer to 3% than 10%

#

oh I understand all the noise now

#

it was using all the threads for speedtest

#

I dropped speedtest down to single thread

#

should be more accurate now

rocky vigil Oct 15, 2025, 4:19 PM

#

Stockfish dev-20251014-895f63de by the Stockfish developers (see AUTHORS file)
info string Using 1 thread
Warmup position 3/3
Position 258/258
===========================
Version                    : Stockfish dev-20251014-895f63de
Compiled by                : g++ (GNUC) 15.1.0 on MinGW64
Compilation architecture   : x86-64-avxvnni
Compilation settings       : 64bit VNNI BMI2 AVX2 SSE41 SSSE3 SSE2 POPCNT
Compiler __VERSION__ macro : 15.1.0
Large pages                : no
User invocation            : speedtest 1
Filled invocation          : speedtest 1 128 150
Available processors       : 0-15
Thread count               : 1
Thread binding             : none
TT size [MiB]              : 128
Hash max, avg [per mille]  :
    single search          : 58, 31
    single game            : 832, 590
Total nodes searched       : 143169711
Total search time [s]      : 153.545
Nodes/second               : 932428

./stockfish speedtest 1
Stockfish dev-20251015-40e85beb by the Stockfish developers (see AUTHORS file)
info string Using 1 thread
Warmup position 3/3
Position 258/258
===========================
Version                    : Stockfish dev-20251015-40e85beb
Compiled by                : g++ (GNUC) 15.1.0 on MinGW64
Compilation architecture   : x86-64-avxvnni
Compilation settings       : 64bit VNNI BMI2 AVX2 SSE41 SSSE3 SSE2 POPCNT
Compiler __VERSION__ macro : 15.1.0
Large pages                : no
User invocation            : speedtest 1
Filled invocation          : speedtest 1 128 150
Available processors       : 0-15
Thread count               : 1
Thread binding             : none
TT size [MiB]              : 128
Hash max, avg [per mille]  :
    single search          : 52, 31
    single game            : 820, 582
Total nodes searched       : 142798016
Total search time [s]      : 153.549
Nodes/second               : 929983

yeah neutral on my laptop

#

but my laptop might be noisy

#

I hope this ran on the P cores the entire time

violet badger Oct 15, 2025, 4:21 PM

#

I think you can handle that with priority https://github.com/official-stockfish/Stockfish/issues/6213

rocky vigil Oct 15, 2025, 4:22 PM

#

yeah 900k+ suggests it used P core at least a majority of the time

twilit oriole Oct 15, 2025, 4:33 PM

#

just do fishtest test?

violet badger Oct 15, 2025, 4:40 PM

#

so, what's the speed-ups we could reasonably still expect?

#

(relative to shawn's nn-598188c9a702.nnue branch, which has most of it already).

frosty imp Oct 15, 2025, 4:44 PM

#

https://tests.stockfishchess.org/tests/view/68efb5d028e6d77fcff9ffd0

#

maybe we need @prime mica's expertise 😛

violet badger Oct 15, 2025, 4:45 PM

#

he'll make master faster faster than branch 😉

#

anyway, doing a quick test of your 598188c9a702 branch against master..

#

seems like we need another 30Elo or so..

rocky vigil Oct 15, 2025, 4:47 PM

#

violet badger so, what's the speed-ups we could reasonably still expect?

Depends on if smallnet is counted as a speedup, probably

candid ivy Oct 15, 2025, 4:47 PM

#

Maybe we can do a ralph wiggum approach with this

rocky vigil Oct 15, 2025, 4:48 PM

#

How much does smallnet speed up master?

violet badger Oct 15, 2025, 4:50 PM

#

I think that could be 5-10 Elo, but I don't know the exact number.

frosty imp Oct 15, 2025, 4:50 PM

#

is it possible to start a smallnet run now?

violet badger Oct 15, 2025, 4:51 PM

#

you mean a threat smallnet?

frosty imp Oct 15, 2025, 4:51 PM

#

yeah

violet badger Oct 15, 2025, 4:51 PM

#

sure.

#

like 128 ?

frosty imp Oct 15, 2025, 4:51 PM

#

lgtm

#

https://tests.stockfishchess.org/tests/view/68efd0a928e6d77fcff9fff4

#

threat vs master 10k games

violet badger Oct 15, 2025, 4:51 PM

#

have it

#

--------------------------------------------------
Results of master vs patch (10+0.1, 1t, 16MB, UHO_Lichess_4852_v1.epd):
Elo: 28.66 +/- 3.69, nElo: 53.68 +/- 6.88
LOS: 100.00 %, DrawRatio: 47.71 %, PairsRatio: 1.86
Games: 9806, Wins: 3046, Losses: 2239, Draws: 4521, Points: 5306.5 (54.11 %)
Ptnml(0-2): [39, 859, 2339, 1588, 78], WL/DD Ratio: 1.26
--------------------------------------------------
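For reference, the headline numbers in that block can be reproduced from the pentanomial counts alone. The sketch below assumes the `Ptnml(0-2)` buckets are game pairs scoring 0, 0.5, 1, 1.5 and 2 points, and that the reported Elo follows the standard logistic formula; the function names are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Pentanomial counts: number of game pairs scoring 0, 0.5, 1, 1.5, 2 points.
using Ptnml = std::array<long long, 5>;

// Total points scored across all pairs.
double points(const Ptnml& p) {
    return 0.5 * p[1] + 1.0 * p[2] + 1.5 * p[3] + 2.0 * p[4];
}

// Total number of games (two per pair).
long long games(const Ptnml& p) {
    return 2 * (p[0] + p[1] + p[2] + p[3] + p[4]);
}

// Logistic Elo difference implied by a score fraction in (0, 1).
double elo_from_score(double score) {
    assert(score > 0.0 && score < 1.0);
    return -400.0 * std::log10(1.0 / score - 1.0);
}
```

Feeding in `[39, 859, 2339, 1588, 78]` gives 9806 games and 5306.5 points, and `elo_from_score(5306.5 / 9806)` is about 28.66, which matches the reported figures.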

rocky vigil Oct 15, 2025, 4:52 PM

#

Ouch

#

Worse than fishtest

violet badger Oct 15, 2025, 4:52 PM

#

we don't have it on fishtest right?

rocky vigil Oct 15, 2025, 4:53 PM

#

frosty imp yeah

Wouldn’t a regular smallnet be better

frosty imp Oct 15, 2025, 4:53 PM

#

dunno

rocky vigil Oct 15, 2025, 4:53 PM

#

violet badger we don't have it on fishtest right?

We had -20 +- smth on fishtest

#

From the sprt

frosty imp Oct 15, 2025, 4:53 PM

#

https://tests.stockfishchess.org/tests/view/68eef18c28e6d77fcff9fe6e

#

nn-bf4519f857f4.nnue net

violet badger Oct 15, 2025, 4:53 PM

#

I see.

#

might depend quite a bit on HW.

#

(i.e. different memory architecture and so on).

#

Though arguably, ranges still overlap more or less.

rocky vigil Oct 15, 2025, 4:55 PM

#

Fair

rocky vigil Oct 15, 2025, 4:55 PM

#

frosty imp dunno

I mean I thought smallnet purpose was speed

#

So it would be better to have it not be threats

violet badger Oct 15, 2025, 4:56 PM

#

we might be mixing conversations, but yeah, if possible regular small net would be faster, unless there is something sharable between the two.

rocky vigil Oct 15, 2025, 4:57 PM

#

I think we could get the existing small net to work first and give it a try

violet badger Oct 15, 2025, 4:57 PM

#

I think that's probably better

rocky vigil Oct 15, 2025, 4:57 PM

#

Maybe some template bool use_threats or whatever

violet badger Oct 15, 2025, 4:59 PM

#

I will do a check at larger TC and more threads, just to have a reference.

rocky vigil Oct 15, 2025, 5:00 PM

#

That would be good

#

Yeah

rocky vigil Oct 15, 2025, 5:51 PM

#

Will start working on smallnet in ~2 hours

violet badger Oct 15, 2025, 5:56 PM

#

The standard or the threats one? I did start a threats net optimization at 128 as well, just one stage, so probably ready in like 8h or so.

rocky vigil Oct 15, 2025, 5:57 PM

#

Standard

twilit oriole Oct 15, 2025, 5:57 PM

#

yeah dont use threats for small net

rocky vigil Oct 15, 2025, 5:58 PM

#

It shouldn’t be too hard

#

At most more template bool use_threats nonsense

frosty imp Oct 15, 2025, 5:58 PM

#

eh you need more than that

rocky vigil Oct 15, 2025, 5:58 PM

#

Huh

frosty imp Oct 15, 2025, 5:58 PM

#

because threat accumulators are a class field

rocky vigil Oct 15, 2025, 5:59 PM

#

How I was gonna hack it in was just keep threat accumulators for smallnet but never touch them

frosty imp Oct 15, 2025, 5:59 PM

#

I guess maybe for a temporary hack

rocky vigil Oct 15, 2025, 5:59 PM

#

Yeah

frosty imp Oct 15, 2025, 6:00 PM

#

then you can check with constexpr bool UseThreats = Dimensions != TransformedFeatureDimensionsSmall

rocky vigil Oct 15, 2025, 6:01 PM

#

Ohhh indeed

frosty imp Oct 15, 2025, 6:01 PM

#

don't even need templates
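A rough sketch of that idea, assuming the transformer is already parameterised on its dimension count so no extra `bool` template parameter is needed. The struct name, the helper function, and the two size constants are placeholders for illustration, not the branch's actual code. The condition is written with `!=` here on the assumption that threats should be enabled for the big net only.

```cpp
#include <cstddef>

// Placeholder sizes for illustration only; the real constants live in the
// NNUE architecture headers and may differ.
constexpr std::size_t TransformedFeatureDimensionsBig   = 1024;
constexpr std::size_t TransformedFeatureDimensionsSmall = 128;

template<std::size_t Dimensions>
struct FeatureTransformerSketch {
    // Derived from the existing template parameter at compile time.
    static constexpr bool UseThreats =
        Dimensions != TransformedFeatureDimensionsSmall;

    // Number of accumulators an update would touch per perspective:
    // the piece-square one, plus the threat one when enabled.
    static constexpr int accumulators_touched() {
        if constexpr (UseThreats)
            return 2;  // piece-square + threat accumulator
        else
            return 1;  // the threat accumulator exists but is never touched
    }
};
```

Because the branch is resolved with `if constexpr` (C++17), the small-net instantiation contains no threat-update code at all, which fits the "keep the field but never touch it" hack discussed above.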

rocky vigil Oct 15, 2025, 6:01 PM

#

This works

#

Btw if I call eval

#

Will it use smallnet when applicable

#

Or always big net?

frosty imp Oct 15, 2025, 6:04 PM

#

bignet always it seems

rocky vigil Oct 15, 2025, 6:05 PM

#

bruh

#

If I give it like KQQk

#

Will that default to smallnet then

#

In a real search

#

Like 8/8/8/3k4/8/8/6K1/6QQ b - - 0 1 for instance

frosty imp Oct 15, 2025, 6:08 PM

#

well just disable bignet in evaluate.cpp

rocky vigil Oct 15, 2025, 6:08 PM

#

Oh ok

#

Replace use smallnet with true

#

Sure

frosty imp Oct 15, 2025, 6:09 PM

#

also remove the re-eval

rocky vigil Oct 15, 2025, 6:09 PM

#

mm

#

Well won’t be back for another hour and a bit

#

But yeah

rocky vigil Oct 15, 2025, 6:23 PM

#

violet badger I will do a check at larger TC and more threads, just to have a reference.

What is the tc/thread count of this test?

violet badger Oct 15, 2025, 6:24 PM

#

60+0.6, 288t, 16000MB, UHO_Lichess_4852_v1.epd

rocky vigil Oct 15, 2025, 6:27 PM

#

Crazy

violet badger Oct 15, 2025, 6:28 PM

#

funny, 11 drawn game pairs in a row for now.

rocky vigil Oct 15, 2025, 6:28 PM

#

Can that even get a meaningful sample size

#

I was expecting LTC smp at most

violet badger Oct 15, 2025, 6:28 PM

#

you were the one talking about TCEC style dev 😉

rocky vigil Oct 15, 2025, 6:28 PM

#

hehe

violet badger Oct 15, 2025, 6:31 PM

#

but I must say that if it doesn't gain at LTC I would have quite strong reservations...

twilit oriole Oct 15, 2025, 6:33 PM

#

Can you do an updated fixed nodes with smaller error bars

rocky vigil Oct 15, 2025, 6:34 PM

#

twilit oriole Can you do an updated fixed nodes with smaller error bars

Do you have hardware available

twilit oriole Oct 15, 2025, 6:34 PM

#

Yes but vondele already has it set up lol

rocky vigil Oct 15, 2025, 6:35 PM

#

Oh

rocky vigil Oct 15, 2025, 6:37 PM

#

violet badger but I must say that if it doesn't gain at LTC I would have quite strong reservat...

Yeah I think LTC smp is our target

#

LTC if possible

violet badger Oct 15, 2025, 7:07 PM

#

when you set back the concurrency but forget to set back the hash ...

rocky vigil Oct 15, 2025, 7:16 PM

#

@frosty imp which branch is preferable for me to test smallnet against

frosty imp Oct 15, 2025, 7:16 PM

#

just the threat_inputs branch

rocky vigil Oct 15, 2025, 7:17 PM

#

ok

#

so including the cj commit?

frosty imp Oct 15, 2025, 7:17 PM

#

yep

rocky vigil Oct 15, 2025, 7:17 PM

#

and stage 4 net or still stage 3

frosty imp Oct 15, 2025, 7:17 PM

#

uh let's keep stage 3 net

rocky vigil Oct 15, 2025, 7:17 PM

#

ok

#

prayer for stage 5 😔

#

stage 4 disappointing

violet badger Oct 15, 2025, 7:19 PM

#

--------------------------------------------------
Results of master vs patch (20000 nodes, 1t, 16MB, UHO_Lichess_4852_v1.epd):
Elo: -29.60 +/- 2.28, nElo: -44.34 +/- 3.40
LOS: 0.00 %, DrawRatio: 41.82 %, PairsRatio: 0.64
Games: 40000, Wins: 10775, Losses: 14175, Draws: 15050, Points: 18300.0 (45.75 %)
Ptnml(0-2): [1551, 5531, 8363, 3877, 678], WL/DD Ratio: 1.96
--------------------------------------------------

#

so net is not bad per se, but speed matters
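
For readers who want to check the headline numbers in the result block above, here is a minimal sketch that recomputes them from the Ptnml counts alone. The function name `pentanomial_stats` is hypothetical, and the nElo formula (per-game score margin divided by the per-pair standard deviation times sqrt(2), scaled by 800/ln 10) is an assumption about how the match tool normalises, chosen because it reproduces the printed value.

```python
import math


def pentanomial_stats(ptnml, z=1.96):
    """ptnml: counts of game pairs scoring 0, 0.5, 1, 1.5 and 2 points."""
    pairs = sum(ptnml)
    # Per-game score of each pair outcome: 0, 0.25, 0.5, 0.75, 1.
    scores = [i / 4 for i in range(5)]
    mean = sum(s * n for s, n in zip(scores, ptnml)) / pairs
    var = sum(n * (s - mean) ** 2 for s, n in zip(scores, ptnml)) / pairs
    stderr = math.sqrt(var / pairs)

    def elo(p):
        # Logistic Elo difference implied by an expected score p.
        return -400 * math.log10(1 / p - 1)

    return {
        "score": mean,
        "elo": elo(mean),
        # Half-width of the z-sigma interval, mapped through the Elo curve.
        "elo_err": (elo(mean + z * stderr) - elo(mean - z * stderr)) / 2,
        # Assumed normalisation, see the note above the block.
        "nelo": (mean - 0.5) / (math.sqrt(2 * var)) * 800 / math.log(10),
        "draw_ratio": ptnml[2] / pairs,
        "pairs_ratio": (ptnml[3] + ptnml[4]) / (ptnml[0] + ptnml[1]),
    }


if __name__ == "__main__":
    stats = pentanomial_stats([1551, 5531, 8363, 3877, 678])
    for key, value in stats.items():
        print(f"{key}: {value:.4f}")
```

Fed the counts from the block, this should land on roughly the same Elo, error bar, nElo, DrawRatio and PairsRatio that the tool printed, from master's point of view.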

rocky vigil Oct 15, 2025, 7:28 PM

#

violet badger ``` -------------------------------------------------- Results of master vs patc...

@twilit oriole

rocky vigil Oct 15, 2025, 7:33 PM

#

frosty imp just the threat_inputs branch

btw is 3231282 the correct bench for master smallnet only?

twilit oriole Oct 15, 2025, 7:33 PM

#

Cool

rocky vigil Oct 15, 2025, 7:42 PM

#

actually @frosty imp how is the threat input branch able to read the smallnet without dying

violet badger Oct 15, 2025, 7:42 PM

#

doesn't read it?

#

at least there are warnings related to that..

#

(compile time warnings that is)

#

meanwhile, master and patch battling it out at scale, and deciding to break the UHO_Lichess_4852_v1.epd while doing so.

rocky vigil Oct 15, 2025, 7:48 PM

#

i kind of want a pgn of the games

#

if this is the 288 thread ltc

violet badger Oct 15, 2025, 7:48 PM

#

not saved I'm afraid

rocky vigil Oct 15, 2025, 7:48 PM

#

aww

#

shame

violet badger Oct 15, 2025, 7:48 PM

#

nah.

rocky vigil Oct 15, 2025, 7:48 PM

#

violet badger meanwhile, master and patch battling it out at scale, and deciding to break the ...

is it still all drawn pairs

violet badger Oct 15, 2025, 7:48 PM

#

Ptnml(0-2): [0, 1, 18, 1, 0]

#

on the UHO book.

rocky vigil Oct 15, 2025, 7:49 PM

#

ok

violet badger Oct 15, 2025, 7:49 PM

#

that's pretty insane IMO.

rocky vigil Oct 15, 2025, 7:49 PM

#

wait one each

#

insane

violet badger Oct 15, 2025, 7:49 PM

#

but 90% drawn game pairs

#

like by construction the book is aiming for 50% of those.
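
The 90% figure follows directly from the Ptnml line quoted above. A tiny sketch of that arithmetic, with the 50% book target taken from the chat rather than from any book documentation, and `drawn_pair_fraction` being a hypothetical helper name:

```python
def drawn_pair_fraction(ptnml):
    """Fraction of game pairs that ended 1-1 (the middle Ptnml bucket)."""
    return ptnml[2] / sum(ptnml)


ptnml = [0, 1, 18, 1, 0]   # counts quoted in the chat
book_target = 0.50         # drawn-pair share the book aims for, per the chat

fraction = drawn_pair_fraction(ptnml)
print(f"drawn pairs: {fraction:.0%}, book target: {book_target:.0%}")
```

With only 20 pairs the sample is small, so the gap to the target is indicative rather than conclusive.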

rocky vigil Oct 15, 2025, 7:59 PM

#

rocky vigil actually @frosty imp how is the threat input branch able to read the s...

it seems if I just do it, it reads bignet only and dies

violet badger Oct 15, 2025, 8:00 PM

#

I'd assume the format of the data structures in memory is changed?

frosty imp Oct 15, 2025, 8:06 PM

#

rocky vigil actually @frosty imp how is the threat input branch able to read the s...

it doesn't

#

I commented out the smallnet read + verification

rocky vigil Oct 15, 2025, 8:06 PM

#

so.

#

ok

#

checks out

#

shouldn't be too hard

#

bignet works fine

#

it isn't too bad

#

just frankenstein master and threat input code together

violet badger Oct 15, 2025, 8:09 PM

#

git checkout -b frankenstein ?

rocky vigil Oct 15, 2025, 8:20 PM

#

  * frame #0: 0x00007fff7fd5b212 msvcrt.dll`memcpy + 146
    frame #1: 0x00007ff74b8d5216 stockfish.exe`Stockfish::Eval::NNUE::AccumulatorCaches::Cache<128u>::Entry::clear(this=0x0000015a4ed26ac0, biases=0x0000000000000000)
    frame #2: 0x00007ff74b8d52cf stockfish.exe`void Stockfish::Eval::NNUE::AccumulatorCaches::Cache<128u>::clear<Stockfish::Eval::NNUE::Network<Stockfish::Eval::NNUE::NetworkArchitecture<128u, 15, 32>, Stockfish::Eval::NNUE::FeatureTransformer<128u>>>(this=0x0000015a4ed26ac0, network=0x0000015a40626108)
    frame #3: 0x00007ff74b8d536a stockfish.exe`void Stockfish::Eval::NNUE::AccumulatorCaches::clear<Stockfish::Eval::NNUE::Networks>(this=0x0000015a4ece2ac0, networks=0x0000015a40626090)
    frame #4: 0x00007ff74b8d53a0 stockfish.exe`Stockfish::Eval::NNUE::AccumulatorCaches::AccumulatorCaches<Stockfish::Eval::NNUE::Networks>(this=0x0000015a4ece2ac0, networks=0x0000015a40626090)

we love to see it

#

how are the smallnet biases null pointer

frosty imp Oct 15, 2025, 8:23 PM

#

branch?

rocky vigil Oct 15, 2025, 8:24 PM

#

oh lemme push

#

https://github.com/sscg13/Stockfish/tree/use_smallnet

frosty imp Oct 15, 2025, 8:30 PM

#

hmm try uncomment networks->small.verify

#

might be hash verification issues

rocky vigil Oct 15, 2025, 8:31 PM

#

oh

rocky vigil Oct 15, 2025, 8:32 PM

#

frosty imp hmm try uncomment networks->small.verify

thonk

#

vscode no autosave

rocky vigil Oct 15, 2025, 8:33 PM

#

rocky vigil btw is 3231282 the correct bench for master smallnet only?

3231282 now

#

gg?

#

bruh this smallnet is 1.8M vs 2.3M in master

frosty imp Oct 15, 2025, 8:36 PM

#

threat tracking maybe?

rocky vigil Oct 15, 2025, 8:36 PM

#

real weakness SHOWEN

frosty imp Oct 15, 2025, 8:36 PM

#

try benchmark

rocky vigil Oct 15, 2025, 8:36 PM

#

frosty imp threat tracking maybe?

oh wait

#

right

#

forgot it still did that

#

oh threat tracking pretty fast ngl

frosty imp Oct 15, 2025, 8:37 PM

#

frosty imp

~7% runtime with big threat net

rocky vigil Oct 15, 2025, 8:41 PM

#

Stockfish dev-20251015-4c91a5c9 by the Stockfish developers (see AUTHORS file)
info string Using 1 thread
Warmup position 3/3
Position 258/258
===========================
Version                    : Stockfish dev-20251015-4c91a5c9
Compiled by                : g++ (GNUC) 15.1.0 on MinGW64
Compilation architecture   : x86-64-avxvnni
Compilation settings       : 64bit VNNI BMI2 AVX2 SSE41 SSSE3 SSE2 POPCNT
Compiler __VERSION__ macro : 15.1.0
Large pages                : no
User invocation            : speedtest 1
Filled invocation          : speedtest 1 128 150
Available processors       : 0-15
Thread count               : 1
Thread binding             : none
TT size [MiB]              : 128
Hash max, avg [per mille]  :
    single search          : 56, 32
    single game            : 852, 602
Total nodes searched       : 150141109
Total search time [s]      : 153.543
Nodes/second               : 977844

#

(with smallnet)

rocky vigil Oct 15, 2025, 8:42 PM

#

rocky vigil ```./stockfish-old speedtest 1 Stockfish dev-20251014-895f63de by the Stockfish ...

maybe 5% faster

#

we'll see how much elo this is

prime mica Oct 15, 2025, 8:43 PM

#

frosty imp

update_piece_threats and append_changed_indices can definitely be optimized to a tiny fraction of the runtime

#

unless I'm misunderstanding what threats are

rocky vigil Oct 15, 2025, 8:44 PM

#

aprs

#

shawn

#

approve my test

violet badger Oct 15, 2025, 8:55 PM

#

just what was needed

rocky vigil Oct 15, 2025, 9:02 PM

#

what the sprt gods grant they taketh away

rocky vigil Oct 15, 2025, 10:23 PM

#

prime mica update_piece_threats and append_changed_indices can definitely be optimized to a...

append_changed_indices mostly calculates threat indices, idk if there is a faster way for this

naive comet Oct 16, 2025, 12:44 AM

#

prime mica update_piece_threats and append_changed_indices can definitely be optimized to a...

that benchmark is old (before my speedup)

rocky vigil Oct 16, 2025, 2:06 AM

#

Ouch

#

No big improvement from stage 5 either

rocky vigil Oct 16, 2025, 2:23 AM

#

@twilit oriole expecting that at fishtest conditions rn at STC without any major breakthroughs we can get it to -15 +- 5 or so
