FSRS Megathread | Anki | Page 4

tepid spoke Mar 8, 2025, 12:52 AM

#

do not optimize it

bold terrace Mar 8, 2025, 12:52 AM

#

Don't know that one, I'll take a look

#

Right now I really like my own deck

tepid spoke Mar 8, 2025, 12:53 AM

#

https://ankiweb.net/shared/info/911122782

#

it's a fully integrated grammar course in an Anki-Deck

bold terrace Mar 8, 2025, 12:53 AM

#

Ah I see 🙂

#

I did a lot of Bunpro for grammar

#

but I stopped after reaching N1, because most points are more vocabulary than grammar now

tepid spoke Mar 8, 2025, 12:54 AM

#

There is a surprising amount of overlap between "it's a grammar point" and "it's a vocab"

#

Like, is stuff like によって a vocab or a grammar point?

bold terrace Mar 8, 2025, 12:55 AM

#

Yeah that one I can understand

#

but at some point it was really like "Put いきなり to mean it's sudden"

#

And a lot of things I was training didn't really occur in any material I checked

#

に加えて、を込めて... feels more like vocab to me

cursive badge Mar 8, 2025, 12:57 AM

#

*shakes fist in general direction of Japan*

#

I find myself sometimes going too fast and getting meanings wrong. Then I take a harder look, remember the reading and go "of course!"

tepid spoke Mar 8, 2025, 1:03 AM

#

I really wish WK would export their warning list via the API...

cursive badge Mar 8, 2025, 1:04 AM

#

I can understand though. They don't want to make it too easy to steal their secret sauce.

#

It would be really interesting if we could have access to all their review records. Then you could start doing some Kanji-specific SRS optimisation.

tepid spoke Mar 8, 2025, 1:06 AM

#

It's literally just a list of "yes this is right but not what we're asking for", "this is a common typo" and other cases where they give you a second chance when entering them

#

I implemented that to a limited degree myself

#

where if you type a correct reading of a Kanji, but not the one WK asks for, it will flash the input yellow and you can reconsider

cursive badge Mar 8, 2025, 1:08 AM

#

tepid spoke It's literally just a list of "yes this is right but not what we're asking for",...

But they had to pay people to curate that data. We are quite lucky that they give us access to structured data at all.

tepid spoke Mar 8, 2025, 1:08 AM

#

Well, to use the API, you require a paid account

cursive badge Mar 8, 2025, 1:09 AM

#

I know. I have made my own script that downloads the data and generates notes/cards.

tepid spoke Mar 8, 2025, 1:10 AM

#

They actually took down the WaniKani deck from AnkiWeb

#

which is fair, it's literally piracy imo

cursive badge Mar 8, 2025, 1:11 AM

#

I got a lifetime sub. So I feel happy downloading the data.

tepid spoke Mar 8, 2025, 1:11 AM

#

Writing an Anki-AddOn that syncs it for that one user is fair game

#

but then sharing that deck publicly is not

cursive badge Mar 8, 2025, 1:11 AM

#

I keep meaning to improve my cards, but I got it good enough and keep getting distracted by shiny new projects.

tepid spoke Mar 8, 2025, 1:12 AM

#

I just took the old WK3 decks templates and slightly tweaked them

#

https://ankiweb.net/shared/info/391275087 if you want to see how it looks

cursive badge Mar 8, 2025, 1:16 AM

#

I wish their SRS was not so bad. With their dataset you could probably get into fancy domain-specific SRS stuff like Math Academy. Instead they just do fixed intervals 😦

tepid spoke Mar 8, 2025, 1:17 AM

#

Yeah, but it'll be very hard to analyse

#

given they treat meaning+reading as separate but not separate things

polar maple Mar 8, 2025, 2:47 AM

#

if you can systematically differentiate these then you can just make more presets

tepid spoke Mar 8, 2025, 3:20 AM

#

I wouldn't know how to possibly do that

#

I'd have to manually go through over 18000 cards and classify them

quasi shadow Mar 8, 2025, 4:23 AM

#

Why not press easy on them?

bold terrace Mar 8, 2025, 8:45 AM

#

https://media.discordapp.net/attachments/1347522619460812810/1347852526271205427/CleanShot_2025-03-08_at_09.41.262x.png?ex=67cd54fc&is=67cc037c&hm=01119607ef3afe7fcc014a675f8110ce3cb1bfca8298bf6842f27a8f87c21f5b&=&format=webp&quality=lossless&width=1595&height=1164

#

Interestingly, if you zoom in enough on the 100% difficulty spike, you get something that looks like a normal distribution

#

This was [90%,100%] with 100 steps

hasty fractal Mar 8, 2025, 10:47 AM

#

can we have tooltips (help text like in deck options) for stats page? I think it'll be helpful.

#

especially for the new fsrs related stats which are somewhat complex imo

#

expertium did bring this up before but nothing transpired after it

#

for now, the writings can be just copy pasted from the manual

unique salmon Mar 8, 2025, 11:03 AM

#

@quasi shadow

hasty fractal Mar 8, 2025, 11:06 AM

#

bruh...

#

materialists be like, "gravity is also physical, it's made of 'virtual' particles like gravitons"

#

one day they will say consciousness is made up of virtual particles 🤣

unique salmon Mar 8, 2025, 11:09 AM

#

sorata, I mean this in the nicest way possible: stick to crunching numbers. No philosophy.

hasty fractal Mar 8, 2025, 11:09 AM

#

bro u should meditate. crunching numbers has destroyed your psyche.

unique salmon Mar 8, 2025, 11:25 AM

#

Since D is the hot topic these days, I decided to get back to trying to improve D
First, I tried a very simple approach:
```python
def surprise_f(self, r: Tensor, binary_rating: Tensor) -> Tensor:
    # Clamp R away from 0 and 1 so the log stays finite.
    r = r.clamp(0.0001, 0.9999)
    surprise = -torch.log(1 - torch.abs(r - binary_rating))
    return surprise.clamp(0, 100)

def next_d(self, old_d: Tensor, r: Tensor, rating: Tensor) -> Tensor:
    # Again (1) counts as a fail; Hard/Good/Easy (2-4) count as a pass.
    binary_rating = torch.where(rating > 1, torch.ones_like(rating), torch.zeros_like(rating))
    delta_d = -self.w[6] * (rating - 3) * self.surprise_f(r, binary_rating)
    new_d = old_d + self.linear_damping(delta_d, old_d)
    new_d = self.mean_reversion(self.init_d(4), new_d)
    return new_d
```

Here we multiply delta_d by a surprise factor=-ln(1-abs(R - grade)), where grade is binary. The bigger the difference between R (prediction) and binary grade (reality), the bigger the surprise factor.
As you can see in the image, it didn't help. Next I'll try completely re-defining D.
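To get a feel for the surprise factor -ln(1-abs(R - grade)), here is a minimal scalar sketch using plain `math` instead of torch. The function name `surprise` is made up for illustration and is not part of the FSRS code; it only mirrors the formula and the clamp on R from the snippet above.

```python
import math

def surprise(r: float, passed: bool) -> float:
    """Scalar version of the surprise factor -ln(1 - |R - grade|)."""
    # Same clamp on R as in the snippet above, so the log stays finite.
    r = min(max(r, 0.0001), 0.9999)
    grade = 1.0 if passed else 0.0
    return -math.log(1.0 - abs(r - grade))

for r in (0.7, 0.9, 0.99):
    print(f"R={r:.2f}  pass: {surprise(r, True):.3f}  fail: {surprise(r, False):.3f}")
```

A pass at R=0.9 gives a factor of about 0.105, while a fail at R=0.9 gives about 2.303, so unexpected outcomes move D far more than expected ones.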

robust hill Mar 8, 2025, 12:05 PM

#

#

couldnt find in manual

#

what is the dotted lining supposed to represent

#

and whys it there

bold terrace Mar 8, 2025, 12:05 PM

#

desired retention

unique salmon Mar 8, 2025, 12:05 PM

#

Yep

robust hill Mar 8, 2025, 12:05 PM

#

no the one going down

#

sorry

unique salmon Mar 8, 2025, 12:05 PM

#

Ah, just a projection

#

Into the future

robust hill Mar 8, 2025, 12:06 PM

#

i see

#

#

no lapses in this card, 93% desired retention

#

#

but these are its intervals

#

surely this cannot be right

tepid spoke Mar 8, 2025, 12:14 PM

#

quasi shadow Why not press `easy` on them?

Cause it's exceptionally rare that I'm so confident about an answer that I'd give it an "Easy".

#

It happens sometimes, but only for some few very basic words and kanji that I'm 100% confident I won't forget in the 3+ years hitting Easy will push it into the future.

bold terrace Mar 8, 2025, 12:15 PM

#

Same same

#

So many cases where you think "Oh, this one I'll press easy" and then you realize you got it wrong for some reason 😂

#

#

Also, the 0.5-1s it takes to think "Was it easy ?" is like ~10-20% of my avg review time (~5s)

#

So yeah, Easy is really more for cards I already knew well before Anki

naive dome Mar 8, 2025, 12:20 PM

#

bold terrace Also, the few .5-.1s to think "Was it easy ?" is like ~10-20% of my avg review t...

that's why you should use a binary grading system of pass/fail, it's way less cognitive load

tepid spoke Mar 8, 2025, 12:21 PM

#

I don't see how pass/fail would work

#

I could do without the Easy button, but a lot of cards are Hard

#

and if the Hard button was gone, I'd probably press Again on them instead of passing them

bold terrace Mar 8, 2025, 12:22 PM

#

naive dome that's why you should use a binary grading system of pass/fail, it's way less co...

Also, if "Hard" means "Took me longer than expected", that's the kind of thing that with AI you can detect without the user having to press "Hard"

tepid spoke Mar 8, 2025, 12:22 PM

#

The answer time is absolutely useless to judge anything by

#

Unless you install surveillance cameras around the PC, and feed that into some predictive network, there can be so many other reasons a card took long to answer...

#

There's also plenty of other reasons that something was Hard, other than time

bold terrace Mar 8, 2025, 12:23 PM

#

tepid spoke The answer time is absolutely useless to judge anything by

It is !

#

And I'm the kind of guy to alt tab

robust hill Mar 8, 2025, 12:24 PM

#

real ones know 1234 = goated

tepid spoke Mar 8, 2025, 12:24 PM

#

Anything that felt hard

robust hill Mar 8, 2025, 12:24 PM

#

tepid spoke Mar 8, 2025, 12:24 PM

#

Can be that it took me a lot of thinking to piece it back together from the mnemonics, or having almost confused it with a very loosely similar word

#

Really anything that makes me feel like this wasn't "Good"

#

Another example would be getting the meaning right, but thinking the most uncommon nuance of the vocab

#

In that case I often bury the card and then rate it Hard the next day if I get it right then, or fail it if I still don't get the nuance

bold terrace Mar 8, 2025, 12:26 PM

#

That's also the beauty with neural networks, you give them all those values, if there is a trend, it will train to recognize it, if not, it won't

robust hill Mar 8, 2025, 12:26 PM

#

what if the neural network goes rogue

#

and just fucks up my deck so i fail my exams

#

yea checkmate

bold terrace Mar 8, 2025, 12:26 PM

#

Well then don't use Anki in case it fucks up your collection, back to paper

robust hill Mar 8, 2025, 12:27 PM

#

i dont trust paper..

bold terrace Mar 8, 2025, 12:27 PM

#

NNs don't just go rogue, they minimize a goal function

tepid spoke Mar 8, 2025, 12:27 PM

#

Anki so far behaves very predictably. Adding some neural network to do "something" would make it more or less a random unpredictable rollercoaster

bold terrace Mar 8, 2025, 12:27 PM

#

if the goal function is difference between prediction and outcome, if the difference is minimal, it can't be "THAT" bad

tepid spoke Mar 8, 2025, 12:27 PM

#

Neural Networks have a tendency to be incredibly volatile and hard/impossible to understand and debug. So no thanks.

#

It COULD be THAT bad

naive dome Mar 8, 2025, 12:28 PM

#

tepid spoke Another example would be getting the meaning right, but thinking the most uncomm...

I'm assuming you're studying Japanese, in the case of vocab cards you would just:

look at the kanji
recall the reading (if you miss the reading, you need to fail the card)
recall picture or definition (but if you have a strong gut feeling when seeing it, you should immediately pass the card)

tepid spoke Mar 8, 2025, 12:28 PM

#

The NN could conclude that it can get your desired retention to the set value by just showing you the same 10 cards every day, and the others never.

unique salmon Mar 8, 2025, 12:28 PM

#

@polar maple I hear NN slander 🤣

bold terrace Mar 8, 2025, 12:29 PM

#

tepid spoke The NN could conclude that it can get your desired retention to the set value by...

If you design the goal function as the daily desired retention, yes, but then it's your goal function that is wrong

#

If your goal function is to reduce the distance by reviews, it's OK

#

BTW

#

FSRS is not NN but it's how it's working right now

#

It minimizes a cost function

#

Sooo, stop using FSRS

#

And do your own mental gymnastics to evaluate what you think is the best interval

tepid spoke Mar 8, 2025, 12:30 PM

#

naive dome I'm assuming you're studying Japanese, in the case of vocab cards you would just...

Meaning and Reading are separate cards

bold terrace Mar 8, 2025, 12:30 PM

#

But the big advantage in both, is that FSRS/NN are aimed to REDUCE your cognitive load

tepid spoke Mar 8, 2025, 12:30 PM

#

It's WaniKani

#

And the why is pretty simple, cause they're different things to learn.

#

I'm doing WaniKani, just in Anki.

bold terrace Mar 8, 2025, 12:31 PM

#

Why not use WK in WK?

tepid spoke Mar 8, 2025, 12:31 PM

#

Cause their website and SRS suck

#

It's not that slow, I think you can finish on their site in 1-1.5 years if you always stay on top of the reviews

#

But that's quite an intense workload

#

It'll have taken me ~2.5 years now when I'm done in mid-April

#

"done" in the sense of no more new or young cards

robust hill Mar 8, 2025, 12:34 PM

#

what if we cant trust our own memories

#

what should we do

unique salmon Mar 8, 2025, 12:37 PM

#

robust hill what should we do

Live with the monks somewhere in the mountains

cursive badge Mar 8, 2025, 12:39 PM

#

bold terrace It is !

Mine goes up again at the end

tepid spoke Mar 8, 2025, 12:40 PM

#

WK has 6609 Vocabs and 2080 Kanji. Though the Vocab are primarily a reinforcement-tool for the Kanji, and learning them themselves is more a bonus.

robust hill Mar 8, 2025, 12:41 PM

#

cursive badge Mine goes up again at the end

what in the hard usageee

#

😭

#

i only press hard around 4% of the time

cursive badge Mar 8, 2025, 12:41 PM

#

robust hill what in the hard usageee

I know I'm weird, but it's working. 🤷‍♂️

bold terrace Mar 8, 2025, 12:42 PM

#

cursive badge Mine goes up again at the end

Which also would be fine with NN since it has a lot of parameters to bend the predictions 🙂

tepid spoke Mar 8, 2025, 12:43 PM

#

I could randomly throw them into some deck, but I highly doubt I could learn them nearly as well by "just" doing that

#

While with the WK method, I'm pretty confident about the vast majority of the Kanji

#

WK just does an excellent job of giving you and reinforcing tools to be able to recognize even an "aged" Kanji, and it works exceptionally well

#

And I simply don't consider language learning any kind of rush or race. I can already read the vast majority of stuff, and am fairly confident about it.

#

So why would I go hard on trying to hyper-optimize it?

#

I don't see what's "in isolation" about it

#

The Vocab are somewhat, but WK very clearly says that it uses the Vocab to give you context for the Kanji

#

It's then your job to find context for the Vocab :D

#

That seems like it'd be horribly overloaded

#

way too much stuff on one card

bold terrace Mar 8, 2025, 12:52 PM

#

People tend to forget that others were learning languages perfectly well before SRS apps existed

#

It's a complementary tool

#

The appeal is having your little sandbox with little graphs

tepid spoke Mar 8, 2025, 12:53 PM

#

Learning Kanji without some kind of SRS system seems borderline impossible to me

#

I think it's what Japanese kids in school have been using for ages, just manually

bold terrace Mar 8, 2025, 12:55 PM

#

Believing it's not possible is often the first step toward making it not possible

#

For example, when I started learning English I had no internet, and no SRS knowledge or anything

#

I just looked up words in a paper dictionary

#

took me ages

#

but got it eventually

#

So yeah, in my case I see Anki as a very nice supplement

#

but not like some kind of requirement

#

But guess what ... We think "Internet + Easy Tools to learn/review", so can only mean better learning right ?
Except now you also have "Constantly getting notified for random anonymous people talking to you online, getting "recommendation" for a new video, switching core decks every 6 months"

#

I'm also guilty of it: right now I still have 200 reviews to do today that should take me ~20min, but guess what, I'm losing that time here, discussing the "optimal way of doing it"

#

So no offense, it's also self-criticism

cursive badge Mar 8, 2025, 1:02 PM

#

bold terrace I'm also culprit of it but right now I have still 200 reviews to do today that s...

We need to make an addon that locks you out of discord until you have done all your reviews ;p

bold terrace Mar 8, 2025, 1:02 PM

#

cursive badge We need to make an addon that locks you out of discord until you have done all y...

I guess ;D

#

Point is: NOTHING beats hard work and true effort. But we always play pretend by trying to "optimize"

#

but I guess I'm off topic now

bold terrace Mar 8, 2025, 1:39 PM

#

I was wondering, could anyone explain how to read a "B-W Matrix"?

I searched a bit online but I'm not really sure I found the right info

#

It's under the "Memorised" graph

cosmic hedge Mar 8, 2025, 1:40 PM

#

bold terrace It's under the "Memorised" graph

hover over the cells

bold terrace Mar 8, 2025, 1:40 PM

#

I do I do

#

For example :
"Predicted 71.81, Actual 66 (Prediction at 71.81 ?) compared to a total of 84 prediction ?"

#

So it predicted 71.81 less than it should have ?

#

What does it describe ?

cosmic hedge Mar 8, 2025, 1:41 PM

#

its for all the reviews that are done with that stability and difficulty

#

hold on i'll find the thing i copied XD

#

#1282005522513530952 message

bold terrace Mar 8, 2025, 1:42 PM

#

Ah, x-axis is difficulty and y-axis stability

cosmic hedge Mar 8, 2025, 1:42 PM

#

it means that fsrs is underestimating how well you know cards with that difficulty and stability by 13%

#

oh yeah i should really label that XD

bold terrace Mar 8, 2025, 1:43 PM

#

yeaaaah with the axis explained now it makes more sense 😄

#

So yeah, for example in my case, for Stability around 7d and Difficulty at 90%, it is overestimating by 7%

cosmic hedge Mar 8, 2025, 1:44 PM

#

yeah

bold terrace Mar 8, 2025, 1:44 PM

#

@unique salmon if we can determine that, even if it has only a very low impact on global RMSE, why not mitigate that by a malus for those card retention ?

#

Ok global RMSE won't change much

#

But it's not going to hurt to do a ~2-3% malus on that. Even if we overestimate it for a few reviews, worst case scenario that -6.9 will just go up until the "reverse mitigation" kicks in

#

I mean, I'm all for crunching numbers, but RMSE is just one part of the story in this case, no ?

unique salmon Mar 8, 2025, 1:47 PM

#

I'm not sure what would be a good way to do that. And right now I want Jarrett to test something else: https://github.com/open-spaced-repetition/srs-benchmark/issues/186

GitHub

A new idea for decay as a function of D · Issue #186 · open-spaced-...

#1282005522513530952 message Right now we estimate S0 like this: def loss(stability): y_pred = self.forgetting_curve(delta_t, stability) l...

bold terrace Mar 8, 2025, 1:49 PM

#

Sure no worries

#

But do you more or less agree that, if RMSE itself can't be reduced that much anymore, having some compensation techniques for specific problematic classes would be a nice way forward?

unique salmon Mar 8, 2025, 1:50 PM

#

I mean, we can't just add or subtract stuff from the forgetting curve, that would cause all sorts of issues

bold terrace Mar 8, 2025, 1:50 PM

#

I see

#

And I agree

#

I was thinking more like "Post Processing" techniques

unique salmon Mar 8, 2025, 1:51 PM

#

We should aim to improve FSRS formulas

bold terrace Mar 8, 2025, 1:51 PM

#

Sure, if it's doable, I'm all for it 🙂

unique salmon Mar 8, 2025, 1:51 PM

#

Or just make a neural net that is far more accurate than FSRS ¯_(ツ)_/¯

bold terrace Mar 8, 2025, 1:51 PM

#

That might also explain why though

unique salmon Mar 8, 2025, 1:51 PM

#

Technically, we have one already

bold terrace Mar 8, 2025, 1:52 PM

#

With enough parameters all those problematic classes of cards could be customized

#

But a one-shot pre-training might not be enough then

#

Who knows if someone else might have a different class of problematic cards

#

but I think @polar maple said it would be possible to modify the weight incrementally with new reviews

unique salmon Mar 8, 2025, 1:53 PM

#

bold terrace But a one-shot pre-training might not be enough then

We'll wait for Alex's RWKV. It seems like it really does Just Work

unique salmon Mar 8, 2025, 1:53 PM

#

unique salmon We'll wait for Alex's RWKV. It seems like it really does Just Work

As in - pretraining only beats the hell out of FSRS

bold terrace Mar 8, 2025, 1:53 PM

#

Exciting stuff

#

If you search guinea pig, you know where I am

#

In the mean time ...

#

"deck:Japan::1. Vocabulary" prop:s>5 prop:s<7 prop:d>0.9 prop:r<.90 -is:due

My good old Filtered Deck will go "brrrrrrr" 😄

#

Poor man's "AI"

unique salmon Mar 8, 2025, 2:07 PM

#

bold terrace Poor man's "AI"

This is like a whole new level of tweaking...

#

Holy tweaking Batman

bold terrace Mar 8, 2025, 2:18 PM

#

Yeaaah half my reviews are coming from those

#

#

That's also a bit why sometimes I say I don't think FSRS should be the only one to "find solution"

#

I mean, Anki could have different services: a prediction service, which would be FSRS or RWKV

#

On top of those 2, you could then have some anomaly detection service, card interference service ...

#

So instead of having an ever-growing equation for FSRS, or having nothing left to be able to interact with RWKV, you could extend certain capabilities

cosmic hedge Mar 8, 2025, 2:20 PM

#

bold terrace I mean, Anki could have different services, a prediction services which would be...

you're free to use sm2 🤷‍♂️

bold terrace Mar 8, 2025, 2:21 PM

#

Yes ! But SM2 / FSRS still have really different paradigms

#

SM2 doesn't really predict for example

#

So you need to have a very clear separation of responsibilities between those different capabilities

cosmic hedge Mar 8, 2025, 2:22 PM

#

also i'd like to clarify that the b-w matrix is for reviews, not cards, so if your cards have changed stabilities or difficulties since that review then the search won't really find all those cards

#

i mean it still works well enough

bold terrace Mar 8, 2025, 2:22 PM

#

Gotcha

#

But I think it's fine because then, I'm assuming that the difference for S=7 D=9 is high enough that potentially the current one might have issues

unique salmon Mar 8, 2025, 2:23 PM

#

bold terrace On top of those 2, you could then have some anomaly detection service, card inte...

Well, we have the load balancer/fuzz and Easy Days

#

So yeah, we could add extra stuff on top

#

I have an idea for leech detection, but that requires storing DR at each review in Card Info

#

Actually, no, not DR. It would require storing R at each review

bold terrace Mar 8, 2025, 2:27 PM

#

Wouldn't R be always more or less equal to the DR ?

unique salmon Mar 8, 2025, 2:27 PM

#

Not always and not exactly

bold terrace Mar 8, 2025, 2:27 PM

#

The DR I can understand, you want to check compared to an expected baseline

#

But maybe you should spell out your idea instead of us trying to guess what it is 😄

unique salmon Mar 8, 2025, 2:28 PM

#

Uh, no, you can't

#

Not without some really weird conversion mechanism

#

That's what I meant by "weird conversion mechanism"

unique salmon Mar 8, 2025, 2:30 PM

#

bold terrace But maybe you should explicit your idea instead of us trying to guess what it is...

I'm not 100% sure myself. I want to use a series of reviews to calculate the probability that a card would be failed k times out of n total reviews, each with its own R_i probability of recall. But I don't know how to do this exactly

bold terrace Mar 8, 2025, 2:30 PM

#

I think I get the idea, and I think it's also a nice addition

unique salmon Mar 8, 2025, 2:30 PM

#

I'll have to Google/try to do the math myself

bold terrace Mar 8, 2025, 2:31 PM

#

If you're doing reviews each time with R=80% but you get them wrong 90% of the time, it's a bit strange

unique salmon Mar 8, 2025, 2:31 PM

#

Yeah

bold terrace Mar 8, 2025, 2:31 PM

#

It might also be more reactive than my previous idea of looking at the full history

unique salmon Mar 8, 2025, 2:31 PM

#

Btw, this could also be used to find anti-leeches: cards that are so easy you almost wonder why you are even reviewing them

bold terrace Mar 8, 2025, 2:32 PM

#

I mean, if you fail 3 times in a row at 90%, you can react to that more quickly than by checking whether those 3 fails happened within a span of 30 reviews

#

I mean, if you have 10% chance of getting something wrong, getting it 3-times in a row is a 0.1% chance

unique salmon Mar 8, 2025, 2:34 PM

#

Yes, but I want to extend that to fails that didn't happen in a row

#

And all had different p(recall)

#

That gets complicated

bold terrace Mar 8, 2025, 2:34 PM

#

Yes

#

Not good enough at stats to remember how to do that by heart haha

#

probabilities*

unique salmon Mar 8, 2025, 2:35 PM

#

https://en.wikipedia.org/wiki/Bernoulli_trial

Bernoulli trial

In the theory of probability and statistics, a Bernoulli trial (or binomial trial) is a random experiment with exactly two possible outcomes, "success" and "failure", in which the probability of success is the same every time the experiment is conducted. It is named after Jacob Bernoulli, a 17th-century Swiss mathematician, who analyzed them in ...

#

But it's for fixed p

#

Where every trial has the same probability of success

#

Oh, here we go
https://en.wikipedia.org/wiki/Poisson_binomial_distribution

Poisson binomial distribution

In probability theory and statistics, the Poisson binomial distribution is the discrete probability distribution of a sum of independent Bernoulli trials that are not necessarily identically distributed. The concept is named after Siméon Denis Poisson.
In other words, it is the probability distribution of the number of successes in a collectio...

unique salmon Mar 8, 2025, 3:08 PM

#

Alright, I found a package that does this
https://github.com/tsakim/poibin
I am not even going to try to understand the math with complex numbers, but the usage is actually fairly simple. You just give it a list of probabilities for each trial and the number of successes, and then you can calculate the probability of a given number of successes.
Example:
p = np.asarray([0.9, 0.85, 0.95, 0.92, 0.87])
n_succ = 2
This gives me a p-value of 0.836%. So if a card has been reviewed 5 times with these probabilities (note that the order doesn’t matter) there is a 0.836% chance that 2 or fewer reviews will be successful.
@polar maple @quasi shadow Here's the code and an example of usage

EDIT: see an updated example #1282005522513530952 message

📎 poibin.py 📎 poisson_binom.py

GitHub

GitHub - tsakim/poibin: Poisson Binomial Probability Distribution f...

Poisson Binomial Probability Distribution for Python - tsakim/poibin
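The figure quoted in the example above can be checked without the package. This is a minimal sketch that uses a direct dynamic-programming recursion rather than the package's own method; `poisson_binomial_pmf` and `prob_at_most` are names made up here for illustration, not poibin's API.

```python
from typing import Sequence

def poisson_binomial_pmf(probs: Sequence[float]) -> list[float]:
    """pmf[k] = probability of exactly k successes among independent trials."""
    pmf = [1.0]  # with zero trials, 0 successes is certain
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial fails
            nxt[k + 1] += mass * p       # this trial succeeds
        pmf = nxt
    return pmf

def prob_at_most(probs: Sequence[float], n_succ: int) -> float:
    """Probability of n_succ or fewer successes."""
    return sum(poisson_binomial_pmf(probs)[: n_succ + 1])

p = [0.9, 0.85, 0.95, 0.92, 0.87]
print(f"{prob_at_most(p, 2):.3%}")  # about 0.836%, matching the figure quoted above
```

The recursion is quadratic in the number of reviews, which should be fine for the few dozen reviews a card typically has.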

#

So now we can make an automated leech detector

#

https://tenor.com/view/yes-yes-sir-yayy-kataman-gif-12260883688244422951

Tenor

#

(as long as we figure out how to port this to Rust)

#

(and if Anki stores R at the time of each review in Card Info, otherwise we can't do this. Actually, maybe we could do the same trick of re-calculating R that we do for the forgetting curve)

bold terrace Mar 8, 2025, 3:29 PM

#

unique salmon (and if Anki stores R at the time of each review in Card Info, otherwise we can'...

Yeah we could recompute and store it retroactively if necessary I think

unique salmon Mar 8, 2025, 3:29 PM

#

https://forums.ankiweb.net/t/automated-leech-detection/56887

Anki Forums

Automated leech detection

Given the card’s history, we can either store or re-calculate the probability of recall predicted by FSRS, and then use the Poisson binomial distribution to calculate the probability of a given number of successes. I am not even going to try to understand the math with complex numbers, but the usage is actually fairly simple. You just give...

bold terrace Mar 8, 2025, 3:30 PM

#

Good job though

unique salmon Mar 8, 2025, 3:48 PM

#

There is another issue, actually: with this function the p-value is always 0 if the number of successes is 0

#

So if the number of successes is 0, we have to do something else

#

I guess realistically it doesn't matter since cases where a user has never pressed Hard/Good/Easy would be extraordinarily rare.

bold terrace Mar 8, 2025, 3:59 PM

#

If it's 0 successes, normally it would not even be leaving the learning phase, right ?

#

Not if "Set Due Date" is used though

#

Is it not a bug of the function/lib though ? If you have 1 fail 0 success at 80% DR, it's strange it's considered as a leech

unique salmon Mar 8, 2025, 4:00 PM

#

Same-day reviews don't count

unique salmon Mar 8, 2025, 4:01 PM

#

bold terrace Is it not a bug of the function/lib though ? If you have 1 fail 0 success at 80%...

I think it's less of a bug and more just a limitation of the formula

bold terrace Mar 8, 2025, 4:02 PM

#

There's always the "lazy" way: only consider cards with >= N days of reviews

#

like N=3

unique salmon Mar 8, 2025, 4:02 PM

#

You mean N reviews?

bold terrace Mar 8, 2025, 4:02 PM

#

Just wanted to be sure we exclude the same day review

#

but yeah

unique salmon Mar 8, 2025, 4:04 PM

#

There is an issue with that. For example, if you have 3 reviews at 70% p(recall), the probability of failing all 3 is 2.7%, not low enough. At 90% it would be sufficient, at 70% - nope

bold terrace Mar 8, 2025, 4:06 PM

#

Isn't it what you would also expect from that algo ?

#

I mean, failing 3 times if it's lower DR is more expected than failing it 3 times with higher DR

#

And at least here you know exactly why, there's a clear formula

unique salmon Mar 8, 2025, 4:07 PM

#

I mean that applying this formula after a fixed number of reviews doesn't work equally well for all cards

#

And all DRs

#

What is sufficient to identify a leech at DR=90% is not sufficient at DR=70%
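The point about a fixed number of reviews can be checked with quick arithmetic. A minimal sketch, assuming every review has the same p(recall) and reviews are independent; `prob_all_failed` is a hypothetical helper name.

```python
def prob_all_failed(r: float, n: int) -> float:
    """Probability of failing n independent reviews that each have p(recall) = r."""
    return (1.0 - r) ** n

for r in (0.7, 0.8, 0.9):
    print(f"p(recall)={r:.0%}: failing 3 of 3 has probability {prob_all_failed(r, 3):.2%}")
```

At 70% the three fails still have a 2.7% chance of being plain bad luck, while at 90% the same history has only a 0.1% chance, so the same N gives very different evidence.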

bold terrace Mar 8, 2025, 4:09 PM

#

aaah sure

#

But I mean the idea of the >= N is just to handle the edge case

#

if the N is too low for a leech to even be possible, it's not that much of an issue

#

in general in programming you make sure your edge cases are treated (non-zero denominator, non-negative sqrt argument...) but then if a value is not possible with certain other parameters, you don't necessarily overcomplicate the code (unless you can really change the algorithmic complexity, but in this case it's not really worth it)

unique salmon Mar 8, 2025, 4:22 PM

#

Ok, hold on, something doesn't add up here, I'm investigating it

#

I'm running simulations to confirm that the function works, and it's off

#

Or, rather, the simulations do match the output of the function, but for the other number of successes...

#

Ok, I have no idea why the p-value in this function is calculated the way it is, I'll have to actually use my brain to figure out how to get what I need

#

Ok, so the way they calculate the p-value is weird plus there is the whole "exactly n successes" vs "n successes or fewer successes" thing. Ok, I got it all figured out now

#

@polar maple @quasi shadow Here's an updated usage example

Also, I asked Claude 3.7 Thinking to re-write it in Rust and remove the calculation of p-values (I calculate them from the PMF) and CDF, leaving only PMF. Idk if it's any good, but so far Claude 3.7 Thinking has been really freaking good, at least for Python.
https://drive.google.com/file/d/10QaOXwyh8F58wRTlGizOUc0VOaEOqBIy/view?usp=sharing

Google Docs

PoiBin.rar

#

📎 poisson_binom2.py

unique salmon Mar 8, 2025, 5:04 PM

#

unique salmon Alright, I found a package that does this https://github.com/tsakim/poibin I am ...

See this

#

Also, I was initially thinking of using 0.1% as a cutoff, but it seems like that's too conservative, let's use 1%

unique salmon Mar 8, 2025, 5:37 PM

#

Oh, and let's limit it to N reviews >=2, in other words, if a card has only been reviewed once, let's not tag it as a leech

#

But yeah, we can identify leeches at very high DR with merely 2 reviews, that's cool as hell!
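Put together, the rule discussed so far (1% cutoff, at least 2 reviews) could look roughly like this. It is only a sketch under stated assumptions: `is_leech` and `prob_at_most` are hypothetical names, the R values are assumed to be available per review, same-day reviews are assumed to be filtered out beforehand, and the tail probability is computed with a plain recursion rather than with poibin.

```python
from typing import Sequence

def prob_at_most(probs: Sequence[float], n_succ: int) -> float:
    """Poisson binomial probability of n_succ or fewer successes."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # review failed
            nxt[k + 1] += mass * p       # review passed
        pmf = nxt
    return sum(pmf[: n_succ + 1])

def is_leech(recall_probs: Sequence[float], passed: Sequence[bool],
             cutoff: float = 0.01, min_reviews: int = 2) -> bool:
    """Flag a card whose number of passes is implausibly low given predicted R."""
    if len(recall_probs) != len(passed):
        raise ValueError("need one outcome per predicted R")
    if len(recall_probs) < min_reviews:
        return False  # a single review is never enough to tag a leech
    return prob_at_most(recall_probs, sum(passed)) < cutoff

print(is_leech([0.95, 0.95], [False, False]))            # two fails at R=95%
print(is_leech([0.7, 0.7, 0.7], [False, False, False]))  # three fails at R=70%
```

With these inputs the first card is flagged (0.25% chance of two fails) and the second is not (2.7% chance of three fails).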

bold terrace Mar 8, 2025, 5:37 PM

#

I agree it's really nice

#

Even if we can't really schedule it differently, at least we have something to flag those more precisely

#

The whole "Flag leeches after X lapses" was not correct for FSRS, now at least there is a nice alternative

#

At least, someone has to implement it haha

#

You take commissions @ashen light ? 😄

unique salmon Mar 8, 2025, 5:52 PM

#

Also, it seems like identifying anti-leeches may not be viable. With 10 reviews at 90% R, there is a 34.9% chance of every single review being successful. Even with 10 reviews at 70% R, there is still a 2.8% chance of all of them being successful.
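
The two figures above are simply the product of the per-review recall probabilities. A minimal sketch, assuming every review has the same predicted R and that reviews are independent (the helper name `prob_all_successes` is made up for illustration):

```python
import math

def prob_all_successes(recall_probs):
    """Probability that every review succeeds, assuming independent reviews."""
    return math.prod(recall_probs)

# 10 reviews at R = 90% and at R = 70%
print(f"{prob_all_successes([0.9] * 10):.1%}")
print(f"{prob_all_successes([0.7] * 10):.1%}")
```

Because a perfect streak is this likely even for an ordinary card, a 1% cutoff could not single out "anti-leeches" at these review counts.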

#

Then again, it's also arguably less important

#

Now the big question is: do we want a "Recalculate leeches" button if automatic leech detection is enabled? 🤔
Since changing FSRS parameters will change R, which in turn can change whether the card counts as a leech or not

bold terrace Mar 8, 2025, 6:51 PM

#

unique salmon Then again, it's also arguably less important

Yes I would expect those cards to grow stability very quickly

bold terrace Mar 8, 2025, 6:52 PM

#

unique salmon Now the big question is: do we want a "Recalculate leeches" button if automatic ...

Hmm if R is stored at-review, and we don't touch it anymore, then I guess normally it should not be that necessary ?

unique salmon Mar 8, 2025, 6:53 PM

#

bold terrace Hmm if R is stored at-review, and we don't touch it anymore, then I guess normal...

Right now it's not stored, but recalculated (for the forgetting curve)

bold terrace Mar 8, 2025, 6:54 PM

#

In your proof-of-concept, do you consider the whole review history or only a certain number of reviews? In both cases, I guess a card marked as "leech" could in theory leave that state with new good reviews?

unique salmon Mar 8, 2025, 6:54 PM

#

And recalculating would make it more accurate, so that's another reason to recalculate leeches too

unique salmon Mar 8, 2025, 6:54 PM

#

bold terrace In your proof-of-concept, you consider all the history of the reviews or only a ...

It could. Also, we can limit the number of recent reviews used for the calculation

#

Say, last 32 or last 64 reviews

#

Not that it would matter for most cards

bold terrace Mar 8, 2025, 6:56 PM

#

Yeah, I was thinking: say a card with 10 reviews is counted as a leech because there were 9 fails. Then the user starts to review it "more normally", with a success rate of around 90% (the DR). I guess, with the formula, the p-value would slowly increase until it goes above the threshold

#

But of course, if something is a leech with 200 reviews, I guess the user would have to review it a lot before it leaves the leech state (which is logical, since it was a leech for so long)

#

So limiting the window is maybe not a game changer
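
The recovery scenario described here can be checked numerically. This is only a sketch under stated assumptions: every review is given the same predicted R of 0.9, the whole history is used (no window), the cutoff is 1%, and all function names are hypothetical.

```python
def poisson_binomial_pmf(probs):
    """PMF of the number of successes over independent trials (O(n^2) DP)."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for j, mass in enumerate(pmf):
            nxt[j] += mass * (1.0 - p)   # this review fails
            nxt[j + 1] += mass * p       # this review succeeds
        pmf = nxt
    return pmf

def leech_p_value(probs, successes):
    """P(X <= successes) under the predicted recall probabilities."""
    return sum(poisson_binomial_pmf(probs)[: successes + 1])

def successes_needed_to_recover(n_reviews, n_successes, r=0.9, cutoff=0.01):
    """Count consecutive successful reviews until the p-value reaches the cutoff."""
    extra = 0
    while leech_p_value([r] * (n_reviews + extra), n_successes + extra) < cutoff:
        extra += 1
    return extra

# A card with 10 reviews, 9 of them failed
print(successes_needed_to_recover(10, 1))
```

Passing a longer history to `successes_needed_to_recover` shows how the number of good reviews needed grows with the size of the failure record.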

unique salmon Mar 8, 2025, 7:18 PM

#

We should also probably add a rule that a change in the leech/not a leech status should occur no more frequently than once per 3 reviews, in case some cards are very close to the threshold all the time

#

So if a card has been tagged as a leech, it needs at least 3 more reviews before it can be untagged
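
One way this rule could look, as a rough sketch rather than anything from the Anki codebase: keep a per-card counter of reviews since the last status change and ignore threshold crossings until it reaches 3. The `LeechState` class and `update_leech_state` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class LeechState:
    is_leech: bool = False
    reviews_since_change: int = 3  # lets the very first change happen immediately

def update_leech_state(state, p_value, cutoff=0.01, min_gap=3):
    """Update the tag after one review, changing it at most once per min_gap reviews."""
    state.reviews_since_change += 1
    wants_leech = p_value < cutoff
    if wants_leech != state.is_leech and state.reviews_since_change >= min_gap:
        state.is_leech = wants_leech
        state.reviews_since_change = 0
    return state.is_leech

# A card hovering around the 1% threshold does not flip on every review
state = LeechState()
for p in [0.005, 0.02, 0.02, 0.02, 0.005]:
    print(update_leech_state(state, p))
```

The annoying part in practice is that the counter has to be stored somewhere per card, which this sketch glosses over.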

#

Which will likely be annoying to implement

hasty fractal Mar 8, 2025, 7:34 PM

#

unique salmon We should also probably add a rule that a change in the leech/not a leech status...

have u tested the idea itself?

unique salmon Mar 8, 2025, 7:45 PM

#

hasty fractal have u tested the idea itself?

Lol, no
https://forums.ankiweb.net/t/automated-leech-detection/56887
I'm just waiting for Jarrett, ain't no way I'm writing Rust code

Anki Forums

Automated leech detection

Given the card’s history, we can either store or re-calculate the probability of recall predicted by FSRS, and then use the Poisson binomial distribution to calculate the probability of a given number of successes. I am not even going to try to understand the math with complex numbers, but the usage is actually fairly simple. You just give...

polar maple Mar 8, 2025, 8:31 PM

#

unique salmon Alright, I found a package that does this https://github.com/tsakim/poibin I am ...

wdym complex numbers

#

you can do it with a couple for loops

#

unless you are crunching large inputs and want to use FFT but we don't need it for anki

unique salmon Mar 8, 2025, 8:32 PM

#

polar maple unless you are crunching large inputs and want to use FFT but we don't need it f...

The package uses FFT

#

And yeah, for small n you can calculate the probabilities exactly, but I don't want to mess with it

#

And we do want it to be fast for large n, for cards with a lot of reviews

polar maple Mar 8, 2025, 8:45 PM

#

unique salmon And we **do** want it to be fast for large n, for cards with a lot of reviews

for FFT to start to benefit i would expect that you need at least n = 10^4 or so, which just isn't happening

#

theres a simple O(n^2) way to compute this that can be written in like 5 lines of code

unique salmon Mar 8, 2025, 8:47 PM

#

polar maple for FFT to start to benefit i would expect that you need at least n = 10^4 or wh...

Here's a new example file

For n<=6 combinatorics are faster (thank Claude 3.7, lol, it has been saving my ass so many times). For n>6 FFT is faster

📎 poisson_binom2.py

#

n=20, 10 successes

polar maple Mar 8, 2025, 8:50 PM

#

the only conclusion to draw is that claude didn't write good code

unique salmon Mar 8, 2025, 8:50 PM

#

Lol

#

Maybe

polar maple Mar 8, 2025, 8:50 PM

#

it's not a maybe lol, this is simple computing

#

FFT has a larger constant overhead

#

6^2 can be done in tight for loops very quickly

unique salmon Mar 8, 2025, 8:55 PM

#

Ok, I asked it to speed it up
Now combinatorics are faster for around n=40, then FFT is faster

📎 poisson_binom2.py

#

Tbf, both are way under 1 second for n=64

#

polar maple Mar 8, 2025, 9:05 PM

#

unique salmon

try this


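import numpy as np

def poisson_binomial_pmf(p, k=None):
    # k is unused; the full PMF for 0..n successes is returned
    p = np.asarray(p)
    n = len(p)
    pmf = np.zeros(n + 1)   # pmf[j] = P(j successes) over the trials seen so far
    pmf[0] = 1
    npmf = np.zeros(n + 1)  # scratch buffer for the next iteration
    for i in range(n):
        npmf[:] = 0
        for j in range(n):
            npmf[j] += pmf[j] * (1 - p[i])    # trial i fails
        for j in range(1, n + 1):
            npmf[j] += pmf[j - 1] * p[i]      # trial i succeeds
        pmf, npmf = npmf, pmf
    return pmf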

unique salmon Mar 8, 2025, 9:07 PM

#

polar maple try this ```python def poisson_binomial_pmf(p, k=None): p = np.asarray(p) ...

polar maple Mar 8, 2025, 9:09 PM

#

lol what your computer isn't 1000x slower than mine

unique salmon Mar 8, 2025, 9:09 PM

#

?

#

Here's the code

📎 poisson_binom2.py

#

Meanwhile I tried to completely redefine D in terms of R minus binary grade. I won't go into the details because it sucks anyway, even if I let it run for 10 epochs instead of 5 like normally
So now I will work on implementing decay based on D...or at least try to

polar maple Mar 8, 2025, 9:34 PM

#

unique salmon

I wrote the same version of my code in C++ because Python loops are slow. When I read the updated Claude code, it is pretty much equivalent to my code but does more of the work in plain Python rather than NumPy. I was considering doing a similar thing, but luckily I read the Claude code first. The C++ performance is more in line with what I expect with such small and tight iteration. For the C++ I took the average of n = 100 to n = 200. Doing n = 40 alone took 2.9 microseconds.

milli: 0.01059
micro: 10.58614

#

Rust would prob get similar performance as C++ here

#

so FFT not needed

ashen light Mar 8, 2025, 9:43 PM

#

bold terrace You take commissions @ashen light ? 😄

surely you can do it yourself 🍃

#

can you commission me to do a study on the difference between truly forgetting and "was it A or B?"

unique salmon Mar 8, 2025, 10:00 PM

#

ashen light surely you can do it yourself 🍃

There are, like, 2 people on this planet who can implement a leech tagger in Anki. One of them is you, the other one is Jarrett 😅

ashen light Mar 8, 2025, 10:07 PM

#

wait actually?

#

surely it isn't that hard

#

I mean I showed up like a handful of months ago and did shit, it's not like I did anything particularly difficult

unique salmon Mar 8, 2025, 10:11 PM

#

ashen light wait actually?

Unless there is some random Rust enjoyer who never comments but always reads everything in this channel/on the forums and is currently reading this...yes, actually

ashen light Mar 8, 2025, 10:13 PM

#

I mean surely someone here can just do it

#

like honestly, I would not say I have particularly deep knowledge

#

I mean I showed up one day not touching anki in a decade and made a PR, its actually not that big a thing. someone else can just do something similar

#

even you could 🍃

unique salmon Mar 8, 2025, 10:14 PM

#

I don't know Rust, man

ashen light Mar 8, 2025, 10:14 PM

#

its not hard

unique salmon Mar 8, 2025, 10:15 PM

#

I wouldn't even be able to make a PR in Python

ashen light Mar 8, 2025, 10:15 PM

#

you know python

unique salmon Mar 8, 2025, 10:15 PM

#

Apps have their own App Python

#

It's like legalese

#

But for apps

#

Same goes for any other language tbh

ashen light Mar 8, 2025, 10:15 PM

#

https://ankiweb.net/shared/info/1541471942

#

here this is what you need

#

I mean, syntax stuff is the least interesting part of languages, just fix what the compiler complains about

unique salmon Mar 8, 2025, 10:16 PM

#

https://cdn.discordapp.com/attachments/774628346415939585/1237753564219179048/will_smith_funny_distorted.gif

ashen light Mar 8, 2025, 10:17 PM

#

actually, do you even use anki

#

I have seen no evidence that you do

unique salmon Mar 8, 2025, 10:17 PM

#

ashen light I have seen no evidence that you do

ashen light Mar 8, 2025, 10:17 PM

#

lame

#

here I was hoping to out you as a fraud

#

rip

unique salmon Mar 8, 2025, 10:17 PM

#

Lol

ashen light Mar 8, 2025, 10:18 PM

#

me and another unnamed person were speculating on your actual usage

#

🍃

#

anyway

#

go hit that rust deck

#

then become a valued anki contributor

#

so is your background just in math (or stats) then?

unique salmon Mar 8, 2025, 10:31 PM

#

If by "background" you mean "reading articles on the Internet and watching YouTube", then yes

ashen light Mar 8, 2025, 10:31 PM

#

oh if thats your background then you're fully equipped to read/watch the internet but with rust instead of math 🍃

cursive badge Mar 9, 2025, 1:39 AM

#

unique salmon But yeah, we can identify leeches at very high DR with merely 2 reviews, that's ...

Wouldn't that be far too aggressive?
It seems to me that this method only works if you have a good model of the probability of recalling the card.
At only two reviews FSRS has not been given much of a chance to tweak D and S so its prediction might not be great.
You might get a lot of false positives for cards where D₀ and S₀ are not a good fit.

hasty fractal Mar 9, 2025, 6:43 AM

#

I think we can see how many false positives we get in the 20k dataset

#

comparing what the method gives us with a few first reviews vs with all the reviews

unique salmon Mar 9, 2025, 11:00 AM

#

cursive badge Wouldn't that be far too aggressive? It seems to me that this method only works ...

We can limit it to start only after the 3rd/4th/nth review

unique salmon Mar 9, 2025, 11:01 AM

#

hasty fractal I think we can see how many false positives we get in the 20k dataset

I'm not sure how you would do that without user feedback. How would you know whether the user considers this card a leech or not?

#

Well, considering that neither Jarrett nor Jake seem to be interested in implementing it anyway, I'm not sure if there is a reason to discuss it

bold terrace Mar 9, 2025, 11:09 AM

#

ashen light surely it isn't that hard

The algo itself probably isn't, but I'm already dreading building Anki from source locally on my Mac 😂

#

But it's never too late to try

#

Also, the complex thing about jumping into a codebase like this is knowing where best to put the logic, and being careful about different entry points you might not have expected

#

But unfortunately it's not something you learn until you break that software 😂

quasi shadow Mar 9, 2025, 11:47 AM

#

unique salmon Well, considering that neither Jarrett nor Jake seem to be interested in impleme...

I haven't taken a look. The messages flooded me again.😂

unique salmon Mar 9, 2025, 12:34 PM

#

quasi shadow I haven't taken a look. The messages flooded me again.😂

Hopefully, you'll take a look at the forum post I made once you have some time

hasty fractal Mar 9, 2025, 12:47 PM

#

my first post on forums was about using difficulty for leech detection

#

well that didn't work out and I didn't expect this to progress any further

#

a year later, seem like things will move forward

#

maybe it'll take a few years to get there

unique salmon Mar 9, 2025, 1:32 PM

#

hasty fractal maybe it'll take a few years to get there

It'll take either two weeks or an infinite number of years, nothing in between 🤣
Seriously though, it's just math + doing the tagging. I don't see any major roadblocks. The only issue is whether there are people who are willing to implement it. I assume Dae will have no objections.
So it's either:

There are people who want to implement it -> it gets done by the next Anki release
Or:
Nobody wants to implement it -> the idea dies in obscurity

#

It's like Easy Days - it could have been implemented literally years ago, it's just that it relies on one guy, Jake, to do the work

#

It's not like there were any new developments that suddenly made Easy Days possible

#

If we had a clone of Jake, he could've implemented Easy Days, like, 5 years ago 🤣

hasty fractal Mar 9, 2025, 1:39 PM

#

unique salmon It'll take either two weeks or an infinite number years, nothing inbetween 🤣 S...

well you're too hopeful if you expect it to easily work out. I imagine it'll take some amount of polishing to be actually good.

#

but yeah agree with u on other points.

unique salmon Mar 9, 2025, 1:43 PM

#

hasty fractal well you're too hopeful if you expect it to easily work out. I imagine it'll tak...

Not really

1) Use 1% as the cutoff for the leech tagger.
2) Use the leech detector only if there are at least 3 reviews, to avoid early false positives.
3) Make it so that a card's status as leech/not leech can only change once every 3 reviews, to avoid "zig-zagging" where it's a leech after one review, then not a leech after the next review, then a leech again, then not a leech again. Such cases would be rare, but we should still consider them.
4) Add an "Automatic leech detection" button.

The only part that is debatable is re-calculating leeches. Should it be a separate button? Should it be combined with "Optimize", so that leeches are automatically recalculated when FSRS parameters change?

hasty fractal Mar 9, 2025, 1:45 PM

#

unique salmon Not really 1) Use 1% as the cutoff for the leech tagger. 2) Use the leech detect...

well, the metric itself might need some polishing is what I was trying to say.

#

but ofc no use debating over this

#

imo leeches should be recalculated yeah

#

and no more options please

unique salmon Mar 9, 2025, 1:46 PM

#

hasty fractal well, the metric itself might need some polishing is what I was trying to say.

There's not a whole lot of polishing to do, it's just a bunch of math. The only "polishing" that I can think of is the choice of the cutoff %

#

@polar maple I tested it again and FFT is somehow slower even for n=20000 🤔

📎 poisson_binom2.py

#

📎 poibin.py

#

💀

#

Ok, I genuinely don't understand what this guy was cooking

#

FFT uses an enormous amount of RAM for n>10000

#

It's just strictly worse than combinatorics

cursive badge Mar 9, 2025, 2:56 PM

#

If I did the maths right the "Direct Convolution" algorithm should only need ~800KB of memory for n=50,000.
Pretty good memory efficiency compared to 74.5GB 😂
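
The ~800KB estimate follows from simple arithmetic, assuming the direct approach keeps two buffers of n + 1 64-bit floats and nothing else of significant size. The helper below is only an illustration of that arithmetic:

```python
def dp_memory_bytes(n, arrays=2, bytes_per_float=8):
    """Memory for the DP buffers: `arrays` float64 arrays of length n + 1."""
    return arrays * (n + 1) * bytes_per_float

print(dp_memory_bytes(50_000))            # two buffers
print(dp_memory_bytes(50_000, arrays=1))  # a single buffer updated in place
```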

polar maple Mar 9, 2025, 5:28 PM

#

unique salmon @polar maple I tested it again and FFT is somehow **slower** even for n...

tried fft on c++

unique salmon Mar 9, 2025, 5:29 PM

#

polar maple tried fft on c++

Now if you could do it in Rust... ankieyes

polar maple Mar 9, 2025, 5:29 PM

#

nah no card has this many reviews

unique salmon Mar 9, 2025, 5:30 PM

#

No, I mean just implement it at all

polar maple Mar 9, 2025, 5:30 PM

#

cursive badge If I did the maths right the "Direct Convolution" algorithm should only need ~80...

it turns out that we can write it with only 1 array so it would take 400 KB, even better

polar maple Mar 9, 2025, 5:31 PM

#

unique salmon No, I mean just implement it at all

i got the same excuse as you, i don't know how to write rust

unique salmon Mar 9, 2025, 5:31 PM

#

dang

ashen light Mar 9, 2025, 5:44 PM

#

all this talk of me writing a feature and I still don't even know what the feature actually is

ashen light Mar 9, 2025, 5:45 PM

#

bold terrace Also complex thing with jumping into codebase like this, is to know where to bes...

I literally just showed up and picked a spot for the load balancer

unique salmon Mar 9, 2025, 5:48 PM

#

ashen light all this talk of me writing a feature and I still don't even know what the featu...

1) Take all probabilities of recall over the card's history
2) Plug them into The Mathematizer 9000
3) It returns a bunch of probabilities for every possible outcome, like 0 successes, 1 success, 2 successes...n successes, where n is the number of reviews aka length of the array with probabilities, without the first review
4) Check how many successes (k) the card actually has
5) Sum the first k+1 probabilities (0 through k successes) to find p(successes<=k)
6) If it's <1%, tag the card as a leech

Basically, we check how likely it is that a card would be successfully reviewed k times (or less than k, it's a "less than or equal to" kind of situation) out of n total reviews, given an array of probabilities from FSRS
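
The steps above can be sketched end to end in a few lines. This is a rough illustration, not the proposed Anki implementation: the function names are hypothetical, the recall probabilities are made-up inputs, and the PMF is computed with the plain O(n^2) recurrence rather than FFT.

```python
def poisson_binomial_pmf(probs):
    """Return [P(0 successes), ..., P(n successes)] for independent trials."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for j, mass in enumerate(pmf):
            nxt[j] += mass * (1.0 - p)   # this review fails
            nxt[j + 1] += mass * p       # this review succeeds
        pmf = nxt
    return pmf

def is_leech(recall_probs, successes, cutoff=0.01):
    """Tag as leech if P(X <= successes) is below the cutoff."""
    pmf = poisson_binomial_pmf(recall_probs)
    p_value = sum(pmf[: successes + 1])  # outcomes 0..successes inclusive
    return p_value < cutoff, p_value

# Hypothetical card: 4 reviews with these predicted R values
probs = [0.95, 0.90, 0.90, 0.85]
print(is_leech(probs, successes=0))  # failed every review
print(is_leech(probs, successes=3))  # failed only once
```

Note the slice `pmf[: successes + 1]`: it covers k + 1 entries, which is what makes the test "k successes or fewer" rather than "fewer than k".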

#

Plus extra rules to avoid the card going from "leech" to "not a leech" too often and early false positives

bold terrace Mar 9, 2025, 6:01 PM

#

ashen light I literally just showed up and picked as spot for the load balancer

1PM I was like "Let's try to build Anki locally", 1:01PM wife was : "Let's replace the washing machine". That's my excuse of letting my development anxiety win today

sonic forge Mar 9, 2025, 6:10 PM

#

@ashen light, is there any reason why load_balancer is required for QueueBuilder and CardQueues, i.e. why it is load_balancer: LoadBalancer and not load_balancer: Option<LoadBalancer>?
In other words, can it be refactored to be an Option?
Because at the moment, even with LB disabled, Anki still runs code that is only required for LB functionality:

Computing LoadBalancer::new for QueueBuilder::new: https://github.com/ankitects/anki/blob/63c2a09ef6760890c03be4bd83f613c03c512d1f/rslib/src/scheduler/queue/builder/mod.rs#L149-L158
add_card: https://github.com/ankitects/anki/blob/63c2a09ef6760890c03be4bd83f613c03c512d1f/rslib/src/scheduler/answering/mod.rs#L352-L364
remove_card: https://github.com/ankitects/anki/blob/63c2a09ef6760890c03be4bd83f613c03c512d1f/rslib/src/scheduler/queue/undo.rs#L42-L51

cursive badge Mar 9, 2025, 6:22 PM

#

polar maple i got the same excuse as you, i don't know how to write rust

If you already have a background in something like C++ it is not too hard to learn Rust.

For reference here is what I got when I had a little go last night:

pub fn poisson_binomial_pmf(probabilities: &[f64]) -> Vec<f64> {
    let n = probabilities.len();

    let mut prev = vec![0.0; n + 1];
    let mut curr = vec![0.0; n + 1];

    prev[0] = 1.0;

    for i in 1..=n {
        let p = probabilities[i - 1];

        curr[0] = prev[0] * (1.0 - p);

        for j in 1..=i {
            curr[j] = (prev[j] * (1.0 - p)) + (prev[j - 1] * p);
        }

        std::mem::swap(&mut prev, &mut curr);
    }

    prev
}

N.B. I don't really follow exactly what is happening in this algorithm, so I may have messed it up a little. It seems to be giving sensible results though.

ashen light Mar 9, 2025, 6:25 PM

#

@sonic forge ...I could have sworn that it was an Option<LoadBalancer> exactly because of that toggle

unique salmon Mar 9, 2025, 6:25 PM

#

cursive badge If you already have a background in something like C++ it is not too hard to lea...

Yeah, but then we also need to do all the other stuff, not just pure math

#

Like tagging cards

ashen light Mar 9, 2025, 6:26 PM

#

oh wait it did need it, it probably could be optional if you want to make a pr for it

cursive badge Mar 9, 2025, 6:27 PM

#

unique salmon Yeah, but then we also need to do all the other stuff, not just pure math

Yep, I looked at that bit and decided it didn't look fun enough to drop my current project.

ashen light Mar 9, 2025, 6:30 PM

#

also I'll look into the leech stuff later today I've only half-read this backlog and yuki's question was an easier answer

polar maple Mar 9, 2025, 6:34 PM

#

cursive badge If you already have a background in something like C++ it is not too hard to lea...

hey if Expertium can't do it then I can't either lol
to reduce the space complexity you can just maintain the current value in a register and update the array in-place. a[i] affects a[i+1] in the next iteration which is why we just store the unchanged value in a register first
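
A minimal sketch of that in-place idea, written in Python for readability (the "register" is just a local variable here, and the function name is hypothetical):

```python
def poisson_binomial_pmf_inplace(probs):
    """Poisson binomial PMF using a single array of length n + 1."""
    n = len(probs)
    pmf = [0.0] * (n + 1)
    pmf[0] = 1.0
    for i, p in enumerate(probs):
        carry = 0.0  # old value of pmf[j - 1], saved before it was overwritten
        for j in range(i + 2):  # after i + 1 trials only entries 0..i+1 can be nonzero
            old = pmf[j]
            pmf[j] = old * (1.0 - p) + carry * p
            carry = old
    return pmf

print(poisson_binomial_pmf_inplace([0.5, 0.5]))
```

An alternative with the same memory footprint is to walk `j` downwards, which avoids the saved value entirely because `pmf[j - 1]` has not been updated yet when `pmf[j]` is written.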

unique salmon Mar 9, 2025, 6:36 PM

#

polar maple hey if Expertium can't do it then I can't either lol to reduce the space comple...

Yeah, but I don't know C++ either 🤣

polar maple Mar 9, 2025, 6:39 PM

#

unique salmon Yeah, but I don't know C++ either 🤣

honestly though rust is similar enough to something like python that you can just pick it up, i hear good things about the package manager as well

#

it's only hard learning a new language when it's a completely different paradigm, like going from python to haskell

cursive badge Mar 9, 2025, 6:40 PM

#

Cargo ❤️

#

I'm kind of amazed how bad Python has been in comparison for so long. It has been getting a lot better in recent years. I'm really liking UV.

cursive badge Mar 9, 2025, 6:55 PM

#

polar maple it's only hard learning a new language when it's a completely different paradigm...

Have you ever had a go at constraint programming? That's quite a fun weird one. Not something that you can use for everything though.

polar maple Mar 9, 2025, 7:14 PM

#

cursive badge Have you ever had a go at constraint programming? That's quite a fun weird one. ...

I haven't. The syntax on wikipedia seems similar to something you can achieve on haskell with the List monad and do notation

#

I often used Haskell to verify my combinatorics homework for this reason, nice syntax

ashen light Mar 9, 2025, 7:22 PM

#

unique salmon It's like Easy Days - it could have been implemented literally years ago, it's j...

I like how I'm getting credit now for work I didn't do

#

Jarrett did easy days cause I totally ghosted

#

also why is everyone here afraid of rust

#

its literally the easiest language because it doesn't let you do stupid shit

cursive badge Mar 9, 2025, 7:25 PM

#

polar maple I haven't. The syntax on wikipedia seems similar to something you can achieve on...

I used https://www.minizinc.org/ . It's quite interesting because you have to focus more on defining what a good solution looks like instead of how to get the solution.

MiniZinc

MiniZinc is a free and open-source constraint modeling language.

hasty fractal Mar 9, 2025, 7:26 PM

#

ashen light jarret did easy days cause I totally ghosted

do u plan to do anything again 🥺

#

u brought us some good stuff (LB)

unique salmon Mar 9, 2025, 7:29 PM

#

ashen light I like how I'm getting credit now for work I didn't do

I guess it was Load Balancer then. My point still stands: Easy Days and/or Load Balancer could've been implemented years ago if someone had enough expertise and enthusiasm

ashen light Mar 9, 2025, 7:30 PM

#

I've made a handful of small PRs

#

maybe yall should learn rust so you can leverage your own enthusiasm instead of praying someone randomly shows up and does it

hasty fractal Mar 9, 2025, 7:33 PM

#

we can write a wiki post on the forums titled "Cool Ideas to Implement: Needs Dev"

#

new dev comes here and we link it to them. then we sit down and just pray 🙏

ashen light Mar 9, 2025, 7:33 PM

#

at that point it would literally be easier to just do these things yourselves

#

here: I'll coach someone doing this leech thing

#

and like

#

> 2) Plug them into The Mathematizer 9000
someone pls write spec for mathematizer 9000

#

and don't just say "its like the mathematizer 6000 but with more features"

unique salmon Mar 9, 2025, 7:34 PM

#

ashen light > 2) Plug them into The Mathematizer 9000 someone pls write spec for mathematize...

https://drive.google.com/file/d/10QaOXwyh8F58wRTlGizOUc0VOaEOqBIy/view?usp=sharing

#

Written by Claude 3.7

#

Python version:
import numpy as np

def fast_poisson_binomial_pmf(p):
    """
    Calculate the exact PMF of the Poisson Binomial distribution using
    dynamic programming and vectorized NumPy operations.

    Parameters:
    -----------
    p : array-like
        Array of success probabilities for each Bernoulli trial

    Returns:
    --------
    numpy array of PMF values for k=0,1,...,len(p)
    """
    p = np.asarray(p, dtype=np.float64)
    n = len(p)

    # Validate input
    if not np.all((0 <= p) & (p <= 1)):
        raise ValueError("All probabilities must be between 0 and 1")

    # Handle trivial cases
    if n == 0:
        return np.array([1.0])

    # Initialize the PMF - we'll use a dynamic programming approach
    # pmf[j] will represent P(X = j) after considering the first i trials
    pmf = np.zeros(n + 1, dtype=np.float64)
    pmf[0] = 1.0  # Base case: probability of 0 successes with 0 trials is 1

    # Process each probability one at a time
    for prob in p:
        # For each new Bernoulli trial, we update the entire PMF
        # A shifted copy is built first, so no value is overwritten before it is used
        # The key insight: P(X=k after adding new trial) =
        #   P(X=k with no success in new trial) + P(X=k-1 with success in new trial)

        # Calculate the effect of this probability on the entire PMF at once
        # This is where the vectorization happens
        pmf_shifted = np.zeros_like(pmf)
        pmf_shifted[1:] = pmf[:-1] * prob  # Probability of success for this trial

        # Update PMF by combining the two possibilities
        pmf = pmf * (1 - prob) + pmf_shifted  # No success + success for this trial

    return pmf

#

pmf_exact = fast_poisson_binomial_pmf(p_succ)
p_value_exact = sum(pmf_exact[0:n_succ + 1])
n_succ is how many successes there were in reality

ashen light Mar 9, 2025, 7:36 PM

#

cool can you turn that into rust for me

unique salmon Mar 9, 2025, 7:36 PM

#

ashen light cool can you turn that into rust for me

See the Google link

hasty fractal Mar 9, 2025, 7:36 PM

#

why not just ask claude expertium

unique salmon Mar 9, 2025, 7:37 PM

#

I did

#

Hence the link

#

But I can't ask it do the PR

hasty fractal Mar 9, 2025, 7:37 PM

#

woah, let's hope C3.8 gets that feature for us

#

btw, can I ask what happened with the hyperoptimise thingy?

ashen light Mar 9, 2025, 7:38 PM

#

@unique salmon prove ai isn't garbage and get an entire PR written only using ai

#

bet you can't

unique salmon Mar 9, 2025, 7:38 PM

#

hasty fractal btw, can I ask what happened with the hyperoptimise thingy?

You mean preset grouping?

#

@spring adder

hasty fractal Mar 9, 2025, 7:38 PM

#

ye that

#

hyperoptimise is a better name actually

unique salmon Mar 9, 2025, 7:38 PM

#

ashen light @unique salmon prove ai isn't garbage and get an entire PR written only u...

Give me 5 years 🤣
Then Github will support LLMs making PRs natively

hasty fractal Mar 9, 2025, 7:39 PM

#

cuz imo in the ideal future presets should be separated from params

unique salmon Mar 9, 2025, 7:39 PM

#

hasty fractal cuz imo in the ideal future presets should be separated from params

wat

#

how

#

The whole point was to group decks into presets optimally

#

Not to do some...uhhhh...idk

#

Idk what you want

hasty fractal Mar 9, 2025, 7:39 PM

#

unique salmon wat

do u want 30 presets for your collection?

unique salmon Mar 9, 2025, 7:40 PM

#

You mean presets? Why not

hasty fractal Mar 9, 2025, 7:40 PM

#

params aren't the

unique salmon Mar 9, 2025, 7:40 PM

#

How do you separate parameters from presets?

hasty fractal Mar 9, 2025, 7:40 PM

#

only thing u change

hasty fractal Mar 9, 2025, 7:40 PM

#

unique salmon How do you separate parameters from presets?

I don't know how to do it on a code level ofc.

unique salmon Mar 9, 2025, 7:41 PM

#

Forget about code level. Conceptually, how?

hasty fractal Mar 9, 2025, 7:41 PM

#

you'll have hyperoptimisation. you don't need to see the behind the scenes.

#

params will be optimised by one button for all decks.

#

or maybe invent "general-presets" and "param-presets" and make everything more confusing.

unique salmon Mar 9, 2025, 7:43 PM

#

So you just want to have per-deck parameters, except decks are grouped, except those groups aren't presets?
That's...a very strange wish

hasty fractal Mar 9, 2025, 7:43 PM

#

lmao true

#

the problem is sometimes I'm trying to change my sort order and now I have to go through 40 fuckin presets all because I was trying to make my scheduling optimal.

ashen light Mar 9, 2025, 7:44 PM

#

unique salmon Give me 5 years 🤣 Then Github will support LLMs making PRs natively

whats so hard about making a PR

unique salmon Mar 9, 2025, 7:45 PM

#

ashen light whats so hard about making a PR

The fact that I don't even know the basics of Rust

ashen light Mar 9, 2025, 7:45 PM

#

I linked that anki deck

#

I guess the general thought is you seem to have a lot of stuff you'd like in anki and yet won't do the thing that'll let you actually do those things

#

relying on me is unreliable!

#

I'll do a thing then disappear for months

unique salmon Mar 9, 2025, 7:48 PM

#

I also rely on Jarrett. Two people = more robustness 🤣

ashen light Mar 9, 2025, 7:48 PM

#

like the only reason lb happened is cause I REALLY wanted it

#

I'm just saying, it's not as hard as you might think

spring adder Mar 9, 2025, 7:57 PM

#

hasty fractal btw, can I ask what happened with the hyperoptimise thingy?

The results of what I was attempting didn't look super promising, and I got bored.

There's definitely a benefit to grouping and splitting presets, and that process could probably be mostly automatic, but I don't really intend to look back into it.

cursive badge Mar 9, 2025, 8:00 PM

#

I didn't dig too deep, but it looks like the first annoying thing would be that you don't have access to the revlog at the point where Anki currently marks leeches. You would have to work your way back until you find somewhere with access to the revlog and refactor everything in-between.

bold terrace Mar 9, 2025, 8:43 PM

#

ashen light like the only reason lb happened is cause I REALLY wanted it

Do you have to convince Dae or as long as it's backward compatible and togglable (like off by default) it's all good ? 🤔

unique salmon Mar 9, 2025, 8:46 PM

#

bold terrace Do you have to convince Dae or as long as it's backward compatible and togglable...

You mean the leech detector? As long as it's togglable, Dae won't object, I think

bold terrace Mar 9, 2025, 8:48 PM

#

Oh I was thinking about some kind of being able to configure some triggers based on S/D/R... to trigger "Due" state

#

Could be fun with the B-W matrix showing you which class of cards (based on S/D) is over/underestimated by FSRS

#

The leech I guess is not that much disruptive

#

What I'm describing is somewhat close to Filtered Deck, but it could be then plugged dynamically to things like the B-W matrix

#

So instead of scheduling based on R, it could schedule based on R/S/D based on those past observation

#

Typically, the LB and the Easy Days would be part of this "Scheduling Post-Processing"

#

In fact, we can even argue it's not "Post-Processing" but plain and simple Scheduling

#

The R < DR might just be another rule in this set of rules

#

Hmm not really in fact, LB/Easy Days are per nature "Post Processing"

unique salmon Mar 9, 2025, 9:03 PM

#

bold terrace The R < DR might just be another rule in this set of rules

It would break both LB and Easy Days
Well, not "break" per se, but make them much worse at doing their job

#

We should allow some small deviation from desired retention

bold terrace Mar 9, 2025, 9:03 PM

#

But having that split in place could help having more information about an Initial Schedule and the Post-Processed one (because sometimes, you don't know if you get 5d because it's 3d + 1d LB + 1d Easy Days, and you get R=50% instead of 90% ....)

bold terrace Mar 9, 2025, 9:04 PM

#

unique salmon It would break both LB and Easy Days Well, not "break" per se, but make them muc...

Yeah LB/ED I was wrong to consider them as the same as the Scheduling

#

You only know how to LB/ED once you already solved the scheduling aspect

unique salmon Mar 9, 2025, 9:05 PM

#

Maybe we could allow a deviation of 25% in terms of odds or something
https://github.com/open-spaced-repetition/fsrs4anki-helper/issues/419#issuecomment-2359076992

GitHub

[Feature Request] Reschedule cards only if the new interval is diff...

Is your feature request related to a problem? Please describe. This is a continuation of the Discord discussion. Many users avoid rescheduling because it greatly affects their workload even if para...
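
A minimal sketch of what "a deviation of 25% in terms of odds" could look like. The reading used here (scale the odds of DR by 0.75 and 1.25, then convert back to probabilities) is an assumption, and the linked issue may define it differently. The function names are hypothetical.

```python
def odds(p: float) -> float:
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


def from_odds(o: float) -> float:
    """Convert odds back into a probability."""
    return o / (1.0 + o)


def allowed_r_range(desired_retention: float, deviation: float = 0.25) -> tuple[float, float]:
    """Range of R whose odds stay within +/- `deviation` of the odds of DR.

    Assumption: the deviation multiplies the odds by (1 - deviation)
    and (1 + deviation).
    """
    base = odds(desired_retention)
    return from_odds(base * (1.0 - deviation)), from_odds(base * (1.0 + deviation))


low, high = allowed_r_range(0.90)
print(f"DR=90%: leave the interval alone while {low:.3f} <= R <= {high:.3f}")
```

Working in odds rather than raw percentage points makes the tolerated band narrower near 100% and wider near 50%, which is presumably why odds were suggested.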

bold terrace Mar 9, 2025, 9:09 PM

#

IMO the threshold for rescheduling should be based on "How low would my Target R be if I don't reschedule it now?"

#

I investigated this a bit since I do reschedule a lot

polar maple Mar 9, 2025, 9:10 PM

#

@unique salmon what exactly is the calculation to find leeches for your idea? is it to find cards in the bottom 1% in terms of total failures?

bold terrace Mar 9, 2025, 9:10 PM

#

And in general, it's a lot of big intervals, like 6 months becoming 3 months, but in fact the new Target R would be ~70% instead of 80%, since the stability is very, very high in the first place

bold terrace Mar 9, 2025, 9:10 PM

#

polar maple @unique salmon what exactly is the calculation to find leeches for your i...

IMO, having a leech detection more suited than the number of Lapse

#

Having N lapses is not really a measure of a leech in FSRS

unique salmon Mar 9, 2025, 9:10 PM

#

polar maple @unique salmon what exactly is the calculation to find leeches for your i...

Pretty much, yes

unique salmon Mar 9, 2025, 9:10 PM

#

unique salmon `pmf_exact = fast_poisson_binomial_pmf(p_succ) p_value_exact = sum(pmf_exact[0:n...

See this

#

We add up the probabilities of 0, 1, 2...k successes, where k is the real number of successes

#

Which gives us the probability of failing this card n-k times or even more times
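
A minimal sketch of the calculation described above, assuming each review is an independent trial whose success probability is the predicted R at the time of that review. This is not the code quoted in the thread; `poisson_binomial_pmf` and `leech_p_value` are hypothetical names, and the review history is made up.

```python
def poisson_binomial_pmf(p_succ: list[float]) -> list[float]:
    """PMF of the number of successes among independent reviews.

    p_succ[i] is the predicted R at the time of review i.
    The returned pmf[j] is the probability of exactly j successes.
    """
    pmf = [1.0]  # zero reviews: zero successes with certainty
    for p in p_succ:
        nxt = [0.0] * (len(pmf) + 1)
        for j, mass in enumerate(pmf):
            nxt[j] += mass * (1.0 - p)  # this review fails
            nxt[j + 1] += mass * p      # this review succeeds
        pmf = nxt
    return pmf


def leech_p_value(p_succ: list[float], n_success: int) -> float:
    """P(successes <= n_success): the chance of doing this badly or worse."""
    return sum(poisson_binomial_pmf(p_succ)[: n_success + 1])


# Hypothetical history: predicted R at each of 5 reviews, 2 real successes.
r_at_review = [0.90, 0.85, 0.95, 0.90, 0.80]
p_value = leech_p_value(r_at_review, n_success=2)
print(f"p-value = {p_value:.5f}, leech at a 1% threshold: {p_value < 0.01}")
```

The dynamic-programming loop is quadratic in the number of reviews, which is fine for a history capped at a few dozen reviews.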

polar maple Mar 9, 2025, 9:12 PM

#

ok just a small concern, this would probably flag more than 1% of cards since the rarity of cards behaves as some sort of random walk and cards can fall below the 1% threshold (and come back over it) over time, so this idea requires some more investigation first

unique salmon Mar 9, 2025, 9:13 PM

#

polar maple ok just a small concern, this would probably flag more than 1% of cards since th...

Make it so that at least 3 reviews (without the first review) are required
Allow a change in the leech status only once per 3 reviews

#

So if a card has been tagged as a leech, it cannot be un-leeched for the next 2 reviews
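
The two rules above (a minimum number of reviews, and a lock on status changes) can be sketched as a small state update. This is an illustration only: `LeechState` and `update_leech_state` are hypothetical names, and the p-value is assumed to come from the probability-based detector.

```python
from dataclasses import dataclass


@dataclass
class LeechState:
    is_leech: bool = False
    reviews_since_change: int = 3  # start "unlocked"


def update_leech_state(state: LeechState, n_reviews: int, p_value: float,
                       threshold: float = 0.01, min_reviews: int = 3,
                       cooldown: int = 3) -> LeechState:
    """Apply both rules after a review.

    n_reviews counts reviews excluding the first one.
    """
    since = state.reviews_since_change + 1
    if n_reviews < min_reviews or since < cooldown:
        # Too little data, or the status changed too recently: keep it as is.
        return LeechState(state.is_leech, since)
    wants_leech = p_value < threshold
    if wants_leech != state.is_leech:
        return LeechState(wants_leech, 0)  # status flips, lock restarts
    return LeechState(state.is_leech, since)


state = LeechState()
flags = []
for n, p in enumerate([0.20, 0.04, 0.005, 0.03, 0.04, 0.05], start=1):
    state = update_leech_state(state, n_reviews=n, p_value=p)
    flags.append(state.is_leech)
print(flags)
```

In this made-up sequence the card is tagged on the third review and stays tagged for the next two reviews even though its p-value is back above the threshold; it is only un-leeched on the third review after tagging.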

#

Oh, and yes, we would need to code the un-leeching part from zero

#

Right now Anki can automatically tag cards as leeches, but not automatically remove the tag

polar maple Mar 9, 2025, 9:14 PM

#

i mean it requires some proper investigation in terms of the memory model. Suppose that D doesn't exist in FSRS, then you would actually expect every single card to eventually become a leech at some point in their lifetime, but i'm not sure if this is the behaviour that you want

#

and now let's reintroduce D. Make the assumption that D is computed solely based on the first few reviews. Then on the 10th review and on, an easy card can very easily become a leech since it rolls the same dice as the high difficulty cards

#

the DR formula doesn't include D or anything

unique salmon Mar 9, 2025, 9:17 PM

#

I'm really not sure what you're trying to say

polar maple Mar 9, 2025, 9:17 PM

#

just a retention based formula is not enough to find leeches

unique salmon Mar 9, 2025, 9:18 PM

#

We're not using DR though, we're using R at the time of the review

polar maple Mar 9, 2025, 9:19 PM

#

Picture this, you have a tree that models card histories. Going right corresponds to a pass, going left corresponds to a fail. So suppose we sampled the nodes at 4, 5, 6, 7 for fail fail, fail pass, pass fail, pass pass. Now also suppose that 4 < 5 < 6 < 7 in terms of card easyness. This is reasonable in terms of the review history, 6 and 7 were passed on the first review, 4 and 5 failed. But your method would treat 4 and 6 as having the same rarity

#

(just suppose that each decision point is 50%)

#

so D must be used as part of the formula, not just R

#

or just use D only? technically it has the right interpretation

unique salmon Mar 9, 2025, 9:23 PM

#

How would you use D if the detector is based on probabilities?

#

D is not a probability

unique salmon Mar 9, 2025, 9:24 PM

#

polar maple Picture this, you have a tree that models card histories. Going right correspond...

But your method would treat 4 and 6 as having the same rarity
No? The detector is based on the entire history of the card (or the last 64 reviews, whatever)
2-4 = 2 fails
3-6 = 1 fail

#

2-4 -> left-left -> fail-fail
3-6 -> right-left -> pass-fail

polar maple Mar 9, 2025, 9:25 PM

#

unique salmon > But your method would treat 4 and 6 as having the same rarity No? The detector...

4 = fail fail
5 = fail pass
6 = pass fail
7 = pass pass

in this example we suppose that these correspond to 4 separate cards

#

ah yeah i miscounted but yeah 5 and 6 have the same rarity here

#

but all you need is to add another layer to the binary tree to make even weirder results

#

the point of this exercise is to show that counting failures does not preserve the order of the elements
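
A small sketch of the four-node example, assuming heap-style numbering with the root at 1, left child `2 * n` (fail) and right child `2 * n + 1` (pass). That numbering is an assumption chosen because it reproduces the 4 to 7 labels listed above; `path_of` is a hypothetical helper.

```python
def path_of(node: int) -> list[str]:
    """Decode a heap-style node index (root = 1) into a review history."""
    bits = bin(node)[3:]  # drop the "0b" prefix and the leading 1 (the root)
    return ["pass" if b == "1" else "fail" for b in bits]


fail_counts = {}
for node in (4, 5, 6, 7):
    history = path_of(node)
    fail_counts[node] = history.count("fail")
    print(node, history, "fails:", fail_counts[node])
```

Nodes 5 and 6 both end up with one failure, so a pure failure count cannot tell fail-pass from pass-fail, even though the example orders them 5 < 6 in easiness.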

#

#

in this one, counting failures suggests that the review history of the red line is not as bad as the blue line

#

blue has 2 failures, red has just 1

polar maple Mar 9, 2025, 9:29 PM

#

unique salmon D is not a probability

one simple thing to try is to find the distribution of D and just count the bottom 1% as lapses

unique salmon Mar 9, 2025, 9:30 PM

#

polar maple one simple thing to try is to find the distribution of D and just count the bott...

You mean leeches?

#

Meh, then we're back to just counting without taking the probability of recall into account

#

Since D doesn't depend on R

#

Finding cards with the highest D is so strongly correlated with counting Agains it might as well be the same thing, up to a constant

polar maple Mar 9, 2025, 9:37 PM

#

unique salmon Meh, then we're back to just counting without taking the probability of recall i...

sounds like a separate problem for FSRS if it cannot fine tune D based on R

polar maple Mar 9, 2025, 9:37 PM

#

unique salmon Finding cards with the highest D is so strongly correlated with counting Agains ...

that's what I'd expect but this way also saves easy cards from becoming leeches with 100% probability in the long run

unique salmon Mar 9, 2025, 9:38 PM

#

polar maple that's what I'd expect but this way also saves easy cards from becoming leeches ...

I still don't get why this would happen

#

Easy cards won't have a small enough number of successful reviews to get tagged

polar maple Mar 9, 2025, 9:41 PM

#

sure but that relies on humans not using anki long enough. and remember this is just a worst case example that shows that the method is wrong. how else could it go wrong? what reason do we have to believe that it is even reasonable to use? that's why you should investigate more

unique salmon Mar 9, 2025, 9:42 PM

#

polar maple sure but that relies on humans not using anki long enough. and remember this is ...

sure but that relies on humans not using anki long enough.
?

#

If you mean that a card can have an unlucky streak just by accident, sure, but as long as the rest of the review history is normal, the number of successes will still be high enough for it to not get tagged

#

I mean, I guess it's theoretically possible for a normal card to fail 64 times in a row, but I bet that will never happen

polar maple Mar 9, 2025, 9:44 PM

#

unique salmon If you mean that a card can have an unlucky streak just by accident, sure, but a...

just look at the binary tree example. it shows that easy cards can easily get tagged with high probability

unique salmon Mar 9, 2025, 9:46 PM

#

https://media.discordapp.net/attachments/907746802902634518/961979114477223936/Apu_scratching_brain_1.gif

#

Ok, I'll do a graph with probabilities later

polar maple Mar 9, 2025, 9:50 PM

#

you should use the fsrs simulator or something

#

but idk what metric you would go for to count proper leeches other than just the bottom 1% of D lol

unique salmon Mar 9, 2025, 10:01 PM

#

Alright, assume a perfect scheduler that always schedules a card at exactly R=90%. Suppose we did 2 reviews.
The card always has a 90% chance to go "right" and a 10% chance to go "left". So in the end there are 4 possible outcomes:

Left-left: 1% chance
Left-right: 9% chance
Right-left: 9% chance
Right-right: 81% chance

Explain what's wrong
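
The four outcomes above can be reproduced with a short enumeration, assuming a constant 90% chance of a pass at every review. The last two lines also compute the tail probabilities the detector would compare against its threshold.

```python
from itertools import product

R = 0.90  # assumed constant probability of a pass ("right") at every review

outcomes = {}
for history in product(("left", "right"), repeat=2):
    prob = 1.0
    for step in history:
        prob *= R if step == "right" else 1.0 - R
    outcomes["-".join(history)] = prob

for name, prob in outcomes.items():
    print(f"{name}: {prob:.0%}")

# Tail probabilities by number of successes.
p_zero_successes = outcomes["left-left"]
p_at_most_one = p_zero_successes + outcomes["left-right"] + outcomes["right-left"]
```

Under this setup, left-right and right-left are indistinguishable to the detector: both have one success, and the probability of one success or fewer is 19% either way.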

#

@polar maple

polar maple Mar 9, 2025, 10:08 PM

#

polar maple

apply the same logic to the bigger binary tree here and you would wrongly find that the blue line is more of a leech than the red line

unique salmon Mar 9, 2025, 10:24 PM

#

polar maple apply the same logic to the bigger binary tree here and you would wrongly find t...

I still don't see the problem

#

Card 11 (or card in the state 11, whatever) is more of a leech than card 10 because 10 has two successes and 11 has one success

#

Why would this be wrong?

polar maple Mar 9, 2025, 10:28 PM

#

unique salmon Card 11 (or card in the state 11, whatever) is more of a leech than card 10 becaus...

here we make the assumption that cards from left to right are in decreasing difficulty; this makes sense when you examine individual reviews, cards that failed the first review are all of a harder difficulty than all cards that passed the first review

#

while this isn't a correct assumption this example shows that your idea isn't correct as-is

#

also since card 11 passed the first review, you would also expect the intervals that it uses to perhaps be longer than the ones in card 10

#

this is definitely true in the case of FSRS

unique salmon Mar 9, 2025, 10:30 PM

#

I still don't see the problem

#

Genuinely

polar maple Mar 9, 2025, 10:31 PM

#

i guess it boils down to this

#

give me evidence that your idea would work well

#

don't expect me to disprove it

#

it isn't mathematically correct or anything

#

so at least show that it works well empirically

unique salmon Mar 9, 2025, 10:34 PM

#

Well, the current approach in Anki is based on just counting Agains. This doesn't take into account the fact that pressing Again when R is high is a pretty different situation from pressing Again when R is low. The former is surprising, the latter is not. So this method would be more precise because it takes the probability of recall into account. Of course, if FSRS sucks at predicting probabilities, this will suck as well.
As for how many cards will be tagged as leeches, we can use some threshold, like 1%. If the user has a large number of leeches, in reality more than 1% will be tagged as leeches. The more leeches - more precisely, cards for which FSRS consistently overestimates R - the more cards will be tagged as leeches, more than 1%.
Whether this results in a satisfactory user experience is something that we won't know until we implement it.

polar maple Mar 9, 2025, 10:35 PM

#

replacing a bad method with another bad method isn't satisfactory especially when we have no reason to believe that this new method is any good

unique salmon Mar 9, 2025, 10:36 PM

#

I literally just explained why it's better - because it takes into account the probability of recall

polar maple Mar 9, 2025, 10:37 PM

#

then here's another one: take the bottom 1% of D. Why is yours better than mine?

unique salmon Mar 9, 2025, 10:37 PM

#

Failing a card 3 times at 99%, 99%, 99% is clearly worse than failing a card 3 times at 70%, 70%, 70%
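
To put numbers on that comparison, here is a short sketch assuming the three reviews are independent. `prob_all_fail` is a hypothetical helper.

```python
def prob_all_fail(r_values: list[float]) -> float:
    """Probability of failing every review, given predicted R at each one."""
    prob = 1.0
    for r in r_values:
        prob *= 1.0 - r
    return prob


p_high_r = prob_all_fail([0.99, 0.99, 0.99])  # about one in a million
p_low_r = prob_all_fail([0.70, 0.70, 0.70])   # about 2.7%
print(p_high_r, p_low_r)
```

A plain lapse counter sees three Agains in both cases, while the probabilities differ by more than four orders of magnitude.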

unique salmon Mar 9, 2025, 10:38 PM

#

polar maple then here's another one: take the bottom 1% of D. Why is yours better than mine?

Because D is not a probability, and is not directly related to it. So we're back to counting Agains, just in a roundabout way

polar maple Mar 9, 2025, 10:39 PM

#

unique salmon Because D is not a probability, and is not directly related to it. So we're back...

i don't see why D has to be a probability

polar maple Mar 9, 2025, 10:39 PM

#

unique salmon Failing a card 3 times at 99%, 99%, 99% is clearly worse than failing a card 3 t...

i have never disagreed about this, but imo pass pass pass fail fail fail is not worse than fail fail fail pass pass pass especially when you look at the intervals that would be involved with these cards

bold terrace Mar 9, 2025, 10:39 PM

#

polar maple one simple thing to try is to find the distribution of D and just count the bott...

This is just another interpretation of what a leech can be.
In this sentence, it shows you expect a leech to be the hardest card.
Expertium sees a leech as the state a card is in when multiple reviews start to diverge far from what would be normal if R~=DR.

In your case, "leech = difficult"; in Expertium's case, "leech = off the predictions"

#

The most difficult card, with R=DR most of the time, would be a leech for you, not for Expertium

#

My opinion is, the current leech definition is just worse than any of those 2 interpretations

polar maple Mar 9, 2025, 10:40 PM

#

yeah, i dont see why leech = difficult is not the goal here. we want to identify cards that would take too much effort to learn

bold terrace Mar 9, 2025, 10:41 PM

#

Because given enough time, lapsing N times is just normal with FSRS

polar maple Mar 9, 2025, 10:41 PM

#

and leech = off predictions can easily happen for easy cards by just random luck as i have demonstrated in my examples

bold terrace Mar 9, 2025, 10:41 PM

#

Personally I think historically, there was always a difference between a leech and a card with high difficulty, so I think it's intuitively a different case

#

It can be hard, but your predicted R might be matched

unique salmon Mar 9, 2025, 10:42 PM

#

bold terrace The most difficult card, with R=DR most of the time, would be a leech for you, n...

Yep, that's the crux

bold terrace Mar 9, 2025, 10:42 PM

#

It's also interesting to know, what cards can't be matched correctly to R

#

It's 2 different questions

#

This is why Philosophy is sometimes useful haha

unique salmon Mar 9, 2025, 10:42 PM

#

If a card is insanely difficult subjectively, but gets successfully recalled roughly as often as we expect it, then it's not a leech under my definition

bold terrace Mar 9, 2025, 10:42 PM

#

So maybe we could have different leech detectors 😄

#

Anki will look like a boeing cockpit but it's all fine

#

it's fun

polar maple Mar 9, 2025, 10:43 PM

#

unique salmon If a card is insanely difficult subjectively, but gets successfully recalled rou...

even in FSRS, D is not this 'subjective difficulty'

bold terrace Mar 9, 2025, 10:43 PM

#

However, I'd argue D can't be used. The "neutral" point for D, will be higher for lower DR

#

D really only works for similar DR

#

Also I noticed D has like multiple classes with multiple normal distributions

#

https://media.discordapp.net/attachments/1347522619460812810/1347852615702282341/CleanShot_2025-03-08_at_09.44.252x.png?ex=67cf4f51&is=67cdfdd1&hm=82f7bdf35d620c83bb88111df26940abca1bc5cd287142d978f98507c95fb171&=&format=webp&quality=lossless&width=3500&height=2376

#

https://media.discordapp.net/attachments/1347522619460812810/1347852526271205427/CleanShot_2025-03-08_at_09.41.262x.png?ex=67cf4f3c&is=67cdfdbc&hm=477709a69e060e544b022836061e84b67813fbff63f3394b5fb59f9898f9d2df&=&format=webp&quality=lossless&width=3328&height=2428

#

https://media.discordapp.net/attachments/1347522619460812810/1347852526942158848/CleanShot_2025-03-08_at_09.42.442x.png?ex=67cf4f3c&is=67cdfdbc&hm=49ac7818a13696c845d84d6427476086f3d3398557bfa72521380241374c7a88&=&format=webp&quality=lossless&width=3312&height=2132

#

You see this

#

you think "It's like an exponential"

#

but no

#

It's different curves

#

like different clusters or difficulty

#

So "leeches" could be the rightmost part of each curve

unique salmon Mar 9, 2025, 10:45 PM

#

bold terrace Also I noticed D has like multiple classes with multiple normal distributions

That's because it gets updated by a (roughly) fixed number that depends (mostly) on the grade

polar maple Mar 9, 2025, 10:45 PM

#

bold terrace https://media.discordapp.net/attachments/1347522619460812810/1347852526271205427...

do not confuse issues with FSRS with this, we now know that FSRS is not actually a very good prediction model when it gets beaten by a simple moving average

bold terrace Mar 9, 2025, 10:46 PM

#

polar maple do not confuse issues with FSRS with this, we now know that FSRS is not actually...

This is FSRS "D" concept

polar maple Mar 9, 2025, 10:46 PM

#

bold terrace This is just another interpretation of what a leech can be. In this sentence, it...

if 100 people flip a fair coin for long enough, eventually all of them will have the lowest (bottom 1%) count of total tails

unique salmon Mar 9, 2025, 10:46 PM

#

polar maple if 100 people flip a fair coin for long enough, eventually all of them will have...

But we can then un-leech the card later

#

The tag can be removed as new reviews come in

#

Btw, this is another advantage over the current method, where the road to leeches is one-way 🤣

polar maple Mar 9, 2025, 10:47 PM

#

unique salmon But we can then un-leech the card later

i would rather not be wrongly notified of a leech

bold terrace Mar 9, 2025, 10:47 PM

#

polar maple if 100 people flip a fair coin for long enough, eventually all of them will have...

Well, he'll get an alarm "You got 1% unlucky here !", he'll check it, he'll move past

#

Also, with the "max D" solution, it will also happen

#

he got unlucky, pressed Again too many times in a row -> max D

polar maple Mar 9, 2025, 10:47 PM

#

bold terrace Well, he'll get an alarm "You got 1% unlucky here !", he'll check it, he'll move...

then what is the point of leech detection if the detections are of no value?

#

is it just another false positive?

#

how can i trust it?

bold terrace Mar 9, 2025, 10:48 PM

#

Human interpretation and feedback

#

The whole review chain is not made for machines

unique salmon Mar 9, 2025, 10:48 PM

#

We can choose a threshold such that there will be very few false positives

bold terrace Mar 9, 2025, 10:48 PM

#

It's made for users, who will have a chance to check which cards underperformed and assess the reasons themselves

unique salmon Mar 9, 2025, 10:48 PM

#

1% or 0.1% or 0.01% or whatever

bold terrace Mar 9, 2025, 10:48 PM

#

Also, we're not flipping a coin here

polar maple Mar 9, 2025, 10:48 PM

#

unique salmon We can choose a threshold such that there will be very few false positives

exactly, my first point i brought in is that a 1% threshold will flag more than 1% of cards

bold terrace Mar 9, 2025, 10:48 PM

#

We're assessing memory

#

Just because R=90% doesn't mean the memory TRULY has a 90% chance of getting the value

#

WE estimate it to be 90%

#

if he gets it 3 times in a row wrong, it's not just bad luck

#

It's bad memory

#

So the interpretation is totally different than a coin flip

#

We think he will get it at 90% with 60d stability? Nope
30d? Nope
5d? Nope

It's way more than being unlucky

#

There IS a reason for this sudden loss of memory

#

it's not just a flipped coin

unique salmon Mar 9, 2025, 10:50 PM

#

polar maple exactly, my first point i brought in is that a 1% threshold will flag more than ...

But that's not a criticism of the method itself

polar maple Mar 9, 2025, 10:51 PM

#

unique salmon But that's not a criticism of the method itself

it kind of is, i want a statistical test or something instead

bold terrace Mar 9, 2025, 10:51 PM

#

polar maple exactly, my first point i brought in is that a 1% threshold will flag more than ...

This could be configurable also

unique salmon Mar 9, 2025, 10:51 PM

#

We can make the threshold 0.2% if that helps you sleep better at night

#

Or 0.1%, whatever

bold terrace Mar 9, 2025, 10:51 PM

#

Finding, by bisection, the percentage that will flag 1% haha

#

Just joking

polar maple Mar 9, 2025, 10:52 PM

#

bold terrace We think he will get it at 90% with 60d stability? Nope 30d? Nope 5d? Nope It's...

i guess i disagree, memory can be random, if a card had 3 passes that brought it to 60d and now 3 failures brought it to 5d, it is probably an easy card but there is some interference somewhere, whereas a card that struggled at fail fail fail pass pass pass and is now at 5d as well, will probably grow much slower in the future and also encounter more failures

bold terrace Mar 9, 2025, 10:53 PM

#

polar maple i guess i disagree, memory can be random, if a card had 3 passes that brought it...

Sure, Interference could aggregate with Leech in this algorithm

#

Unfortunately interferences are a bit difficult to find out

#

Maybe another term than "leech" could be better for sure

#

But the current "leech" (lapse >= N) would have to disappear then

#

So it's not really a criticism of Expertium's proposal, but a criticism of Anki's own choice of using Leech as a concept

polar maple Mar 9, 2025, 10:54 PM

#

also i still don't understand why R != DR is even the goal lol, you can predict the distribution of R assuming that FSRS is a good prediction model, but how would you even interpret the bottom 1%? Is it just bad luck or something else? Whereas a high D even in FSRS has a more direct interpretation: these cards will have their intervals grow slower, so they are probably harder

bold terrace Mar 9, 2025, 10:54 PM

#

One could even argue "Leech" could just mean "It leeches your workload for very low returns", computing something like "Utility*Stability/Reviews"

polar maple Mar 9, 2025, 10:55 PM

#

bold terrace One could even argue "Leech" could just mean "It leeches your workload for very ...

hmmmmmm i wonder if D does this...

#

🤔

bold terrace Mar 9, 2025, 10:55 PM

#

Unfortunately no

polar maple Mar 9, 2025, 10:55 PM

#

yes it does, high D cards have their stabilities grow slower

bold terrace Mar 9, 2025, 10:55 PM

#

"Utility"

#

A low stability card bumping your workload could have high value

unique salmon Mar 9, 2025, 10:55 PM

#

How would you define utility?

bold terrace Mar 9, 2025, 10:56 PM

#

You're nitpicking the concepts you find useful or not

bold terrace Mar 9, 2025, 10:56 PM

#

unique salmon How would you define utility?

You can't really, thus why I think "Leech" in that interpretation would be useless

#

You'd just proxy a vaguely defined term "leech" with another "utility"

#

"Hard" cards and "Out-of-distribution" card could be better name than "Leeches"

#

But to me, "Leech" as it is right now is even worse, it's just useless

#

(The Lapse > N)

#

For SM2 it can make sense though

#

Anki having to maintain SM2+FSRS requires some flexibility in terms of interpretation if you want to keep the same UI and options

#

If "Leech" needs to be merged with "Out-of-distribution results", it's fine by me

polar maple Mar 9, 2025, 10:59 PM

#

@unique salmon what is your interpretation of cards that would be in the bottom 1%?

#

certain interpretations might even lead you to develop a formula for FSRS

unique salmon Mar 9, 2025, 10:59 PM

#

Alright, how about a really dumb compromise - find cards within the bottom 5% D AND with <5% p(successes<=k), where k is the current number of successes
In other words, find cards that are leeches according to both methods 😅
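
A minimal sketch of that compromise. Two assumptions are made here: "bottom 5% D" is read as the hardest 5% of the preset (highest D), and each card already has a p-value P(successes <= k) from the probability-based detector. `combined_leeches` and the card data are hypothetical, and both thresholds are parameters.

```python
def combined_leeches(cards: dict[str, tuple[float, float]],
                     d_fraction: float = 0.05,
                     p_threshold: float = 0.05) -> list[str]:
    """Return the ids of cards flagged by both rules.

    cards maps a card id to (difficulty, p_value).
    Assumption: the D rule selects the hardest `d_fraction` of cards.
    """
    n_hardest = max(1, int(len(cards) * d_fraction))
    by_difficulty = sorted(cards, key=lambda cid: cards[cid][0], reverse=True)
    hardest = set(by_difficulty[:n_hardest])
    return sorted(cid for cid in hardest if cards[cid][1] < p_threshold)


# Made-up preset of 40 cards with increasing difficulty and unsurprising histories.
cards = {f"c{i}": (1.0 + 0.2 * i, 0.50) for i in range(40)}
cards["c39"] = (8.8, 0.01)   # hard and surprising: flagged by both rules
cards["c38"] = (8.6, 0.30)   # hard, but failures are in line with predictions
cards["c0"] = (1.0, 0.001)   # surprising, but not among the hardest
result = combined_leeches(cards)
print(result)
```

Requiring both conditions can only shrink the flagged set, so it trades recall for fewer false positives under either definition of a leech.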

#

Please don't nitpick the thresholds, btw

polar maple Mar 9, 2025, 11:01 PM

#

also, this idea could pretty much add a new dimension to the usual DSR models, if the running history likelihood is actually important, surely you could add it to DSR and get DSR + H or something and improve FSRS?

unique salmon Mar 9, 2025, 11:01 PM

#

polar maple @unique salmon what is your interpretation of cards that would be in the ...

Cards for which the forgetting curve/FSRS formulas just don't work

bold terrace Mar 9, 2025, 11:01 PM

#

D is not really comparable with different DR though

polar maple Mar 9, 2025, 11:02 PM

#

unique salmon Cards for which the forgetting curve/FSRS formulas just don't work

ok then i will stop talking, this is not the definition of a leech to me, leech is a hard card that i keep on forgetting

#

"Leeches are cards that you keep forgetting. Because they require so many reviews, they take up a lot more of your time, compared to other cards."

#

seems reasonable

bold terrace Mar 9, 2025, 11:04 PM

#

With Lower DR you get Higher D, so it's not really working well unfortunately

#

Higher "Neutral" D let's call it like that

#

D is not influenced by DR

#

So the more you fail, the more the "balance" goes close to 100%

#

so DR=60% just by nature will have higher D than DR=90%

unique salmon Mar 9, 2025, 11:05 PM

#

Yeah, but Alex wants to look at cards with relatively high D, relative to other cards from the same preset

bold terrace Mar 9, 2025, 11:05 PM

#

Blue is 1 fail 1 good

#

Red is 3 Good 1 Fail

#

the balance point will be higher for blue

bold terrace Mar 9, 2025, 11:06 PM

#

unique salmon Yeah, but Alex wants to look at cards with *relatively* high D, relative to othe...

hmm ok

#

I'm looking at my top D now

#

Top 1.4%, I have this :

#

1 fail, 2 hard

#

Now I look at my top performer, D=82%

#

#

2 fails in fewer reviews

#

Not that convinced about D

#

But yeah, basically it compounds over multiple lapse

#

You can easily fail 3 times in a row and still be considered "easier" than a card that fails from time to time

#

IMO in those cases, the probability-based detection is better

#

Since D will get higher and higher at each lapse, the "High D = leech" comes back to "The more you lapse, the more it's a leech"

#

Which is stupid

#

You keep asking us to justify the probability-based detection

#

But I start to feel you should start to justify it a bit more :/

polar maple Mar 9, 2025, 11:13 PM

#

bold terrace But I start to feel you should start to justify it a bit more :/

wait a minute, i'm not the one trying to add a new feature here

bold terrace Mar 9, 2025, 11:13 PM

#

In a perfect world with a perfect D, it might make sense, but it's not the D we have

bold terrace Mar 9, 2025, 11:13 PM

#

polar maple wait a minute, i'm not the one trying to add a new feature here

Fixing a broken feature with something at least a bit useful*

#

Also, don't really have to justify it to you

polar maple Mar 9, 2025, 11:13 PM

#

bold terrace Top 1.4%, I have this :

for these two examples, isn't the second picture of lower difficulty? i mean visually it seems that it reached 22 days stability much faster than the first picture, the first picture seems to indeed be more difficult

bold terrace Mar 9, 2025, 11:14 PM

#

polar maple for these two examples, isn't the second picture of lower difficulty? i mean vis...

It did because by nature, Difficulty starts way lower initially

#

So it got to 22d stability "just because"

#

(In my optimization though)

polar maple Mar 9, 2025, 11:15 PM

#

bold terrace It did because by nature, Difficulty starts way lower initially

yes, because the learning steps indicate that it was an easier card or something like that

bold terrace Mar 9, 2025, 11:15 PM

#

(Maybe others start high?)

polar maple Mar 9, 2025, 11:15 PM

#

tbh i'm very confused about your example, it does not paint a bad picture about difficulty at all

#

@unique salmon maybe you can explain it?

unique salmon Mar 9, 2025, 11:16 PM

#

polar maple @unique salmon maybe you can explain it?

Idk what to say, other than "look at the FSRS formulas"

polar maple Mar 9, 2025, 11:16 PM

#

i mean, why would this example in particular be an argument against difficulty

#

those histories seem to be modelled by D well

bold terrace Mar 9, 2025, 11:16 PM

#

polar maple tbh i'm very confused about your example, it does not paint a bad picture about ...

The thing is, the more you lapse, the higher the D will become, the lower the stability will be.

Problem is, my top 1.4% most difficult card is not per se a very difficult one, it's just one that lived long enough to have many lapses

polar maple Mar 9, 2025, 11:17 PM

#

i'm sure you can find other examples that paint D in a bad light but these examples just aren't it

bold terrace Mar 9, 2025, 11:17 PM

#

But having lapses, is perfectly healthy

#

FSRS is predicting an 80% success rate for me; over time, having 10-15 lapses is just perfectly normal

#

but those will get incredibly high D

#

Compared to card I might fail more, but in sooner lapses

unique salmon Mar 9, 2025, 11:18 PM

#

Yeah, since "reversion to the mean" (as Jarrett calls it) is very weak for most users, it takes literally thousands of "Good"s to undo one "Again"

bold terrace Mar 9, 2025, 11:19 PM

#

Going to sleep though, but you get the idea

unique salmon Mar 9, 2025, 11:19 PM

#

unique salmon Yeah, since "reversion to the mean" (as Jarrett calls it) is very weak for most ...

So D becomes just "an Again counter"

polar maple Mar 9, 2025, 11:19 PM

#

unique salmon Cards for which the forgetting curve/FSRS formulas just don't work

as Sound suggested if we continue with this idea we would need to rename 'leech' since it isn't what most people expect anymore

polar maple Mar 9, 2025, 11:20 PM

#

bold terrace The thing is, the more you lapse, the higher the D will become, the lower the st...

this is the definition of leech that most people expect right now

#

i really struggle to see the problem

unique salmon Mar 9, 2025, 11:22 PM

#

polar maple as Sound suggested if we continue with this idea we would need to rename 'leech'...

Do we need to, though? I mean, with my method a "leech" is still a very hard card, broadly speaking

#

"A card that you fail more often than expected"

unique salmon Mar 9, 2025, 11:23 PM

#

polar maple ok then i will stop talking, this is not the definition of a leech to me, leech ...

"A card that you fail more often than expected" and "A hard card that you keep on forgetting" seem like the same thing with different wording, no?

#

The only situation we disagree on is if a card feels difficult subjectively, yet the number of lapses and successes is in line with what is theoretically expected

polar maple Mar 9, 2025, 11:25 PM

#

unique salmon "A card that you fail more often than expected" and "A hard card that you keep o...

wait a moment, you defined your way out of this situation just earlier when i suggested if you can use the knowledge of history likelihood to improve FSRS formulas

#

if the history likelihood actually matters then you can improve FSRS right?

#

otherwise it doesn't matter and its just a useless metric

unique salmon Mar 9, 2025, 11:26 PM

#

You mean using the history of most recent cards, not just this specific card?

#

Idk how we would use that in FSRS

polar maple Mar 9, 2025, 11:26 PM

#

i mean this specific card

unique salmon Mar 9, 2025, 11:26 PM

#

Uhhh...then I'm not sure what you mean

polar maple Mar 9, 2025, 11:26 PM

#

if the probability of this card's reviews matters then by all means incorporate it into FSRS

unique salmon Mar 9, 2025, 11:26 PM

#

Do you mean something where the order matters?

#

Because in my current method fail - pass - fail is treated the same way as fail - fail - pass

polar maple Mar 9, 2025, 11:27 PM

#

So the idea of leech detection is that we find some signal that suggests that future reviews of this card will be difficult in some manner. But this metric, if it is insightful, should be able to be added as a formula into FSRS to improve predictions

#

so one way to show if this is actually a useful signal, the likelihood of the review history, is to see if you can find any formulas that use this value

#

if you can, then you have found an improvement to FSRS. If you cannot then the metric is not insightful

unique salmon Mar 9, 2025, 11:28 PM

#

Ah, ok. So you want me to try to incorporate this into the formulas themselves. Interesting. I'll think about it.

#

Idk how the hell I'm going to do the math, though

#

Like, with torch

polar maple Mar 9, 2025, 11:29 PM

#

i'd guess you need to do some plotting and then make some guesses

unique salmon Mar 9, 2025, 11:29 PM

#

And also this means that we would have to store every R value in the memory state, Jarrett is not going to be happy about that

polar maple Mar 9, 2025, 11:30 PM

#

unique salmon Like, with torch

hmmm. Compute DSR with FSRS, add this historical likelihood thing as H, have a nn print out a forgetting curve from these 4 values

#

try this without H as well after

#

to compare

unique salmon Mar 9, 2025, 11:30 PM

#

No, I mean, Poisson binomial stuff

#

But with torch arrays and whatnot

polar maple Mar 9, 2025, 11:32 PM

#

ask claude to update the numpy code to do it in parallel with another dimension

#

then ask it to convert it into torch

unique salmon Mar 9, 2025, 11:32 PM

#

Actually, the more I think about it, the more it seems like a nightmare. Doing Poisson binomial PMF stuff and storing every value of R...man...

polar maple Mar 9, 2025, 11:32 PM

#

actually if its just to find the likelihood of the history you don't need poisson binomial pmf, you just multiply all the probabilities together

#

you don't need the bottom 1% or anything like that in this case

#

you just need the exact probability, which is easily computed by just multiplication
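
A minimal sketch of the "just multiply all the probabilities together" idea, assuming the predicted R at each review is available as a plain list. The function name and the sample numbers are hypothetical, not taken from FSRS code. Logs are summed instead of multiplying directly so that long histories don't underflow.

```python
import math

def history_log_likelihood(predicted_r, passed):
    """Log-probability of one exact pass/fail sequence.

    predicted_r: predicted retrievability at each review (0 < R < 1)
    passed:      True if the review was a pass, False if it was a lapse
    """
    total = 0.0
    for r, ok in zip(predicted_r, passed):
        p = r if ok else 1.0 - r  # probability of what actually happened
        total += math.log(p)
    return total

# Hypothetical history: pass at R=0.90, fail at R=0.85, pass at R=0.80
rs = [0.90, 0.85, 0.80]
outcomes = [True, False, True]
likelihood = math.exp(history_log_likelihood(rs, outcomes))
print(likelihood)  # 0.9 * 0.15 * 0.8
```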

unique salmon Mar 9, 2025, 11:33 PM

#

No, I need bottom 1%

polar maple Mar 9, 2025, 11:33 PM

#

ok sure, but it will prob make a nn have a harder time for the DSR + H idea

unique salmon Mar 9, 2025, 11:34 PM

#

Ain't no way I'm making an nn compatible with the benchmarking code, mate

#

Ain't no way

polar maple Mar 9, 2025, 11:35 PM

#

polar maple actually if its just to find the likelihood of the history you don't need poisso...

nvm about this, the raw likelihood is useless, we care more about relative likelihoods so yeah poisson binomial pmf it is
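
For the relative-likelihood version, here is a rough sketch of a Poisson binomial PMF in plain Python, assuming the reviews are independent and each has its own predicted R. The function names and the 1% cut-off usage are illustrative assumptions, not the actual leech-detection code discussed here. Only the number of passes enters the tail probability, which is why fail - pass - fail and fail - fail - pass come out the same.

```python
def poisson_binomial_pmf(probs):
    """pmf[k] = P(exactly k passes) for independent reviews with pass probabilities `probs`."""
    pmf = [1.0]  # before any review: 0 passes with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this review is a lapse
            nxt[k + 1] += mass * p      # this review is a pass
        pmf = nxt
    return pmf

def lower_tail(probs, observed_passes):
    """P(number of passes <= observed_passes): how unusually bad the history is."""
    return sum(poisson_binomial_pmf(probs)[: observed_passes + 1])

# Hypothetical card: four reviews predicted at R=0.9 each, only one pass
tail = lower_tail([0.9] * 4, 1)
is_unusually_bad = tail < 0.01  # "bottom 1%" style cut-off (assumed threshold)
print(tail, is_unusually_bad)
```

The dynamic-programming loop is O(n^2) in the number of reviews, and it needs every past R value, which is the storage concern raised above.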

unique salmon Mar 9, 2025, 11:38 PM

#

Ok, screw it, I highly doubt I will be able to implement it. You can ask Jarrett

polar maple Mar 9, 2025, 11:39 PM

#

Unfortunate. But if this historical likelihood has rich information then such a nn should get significant performance boost from it so it should be investigated

#

otherwise the leech idea isn't promising

unique salmon Mar 9, 2025, 11:40 PM

#

You can incorporate it into something like an LSTM as an input feature

#

And see if it helps

#

Sorry if I'm not being helpful here

unique salmon Mar 9, 2025, 11:42 PM

#

unique salmon You can incorporate it into something like an LSTM as an input feature

This sounds pretty interesting, actually. It would make it self-reflective, in a way. Like, "ok, I see that my own predictions are off, I need to account for this fact"

#

With FSRS I can do a simplified version - just a moving average of abs(R - binary grade)

#

And then see if I can turn that into some sort of multiplier or something

unique salmon Mar 9, 2025, 11:45 PM

#

unique salmon With FSRS I can do a simplified version - just a moving average of abs(R - binar...

Or maybe -ln(1-abs(R - binary grade))

#

I've actually tried incorporating this into the update of D as an extra multiplier, but it didn't do anything good

#

Maybe with some more parameters and with using it for S instead of D it could be useful

#

Maybe it needs to be its own variable

#

I mean, instead of just a modifier for D

#

Well, at least a moving average of abs(R - binary grade) is workable, I can do it, unlike the PMF and all that stuff
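
A small sketch of the moving-average idea, assuming an exponential moving average (the smoothing factor `alpha` and the function name are made up for illustration). One thing worth noting: -ln(1 - abs(R - binary grade)) is exactly the per-review log loss, since it equals -ln(R) on a pass and -ln(1 - R) on a fail.

```python
import math

def surprise_ema(predicted_r, passed, alpha=0.3, use_log=False):
    """Exponential moving average of the per-review prediction error.

    use_log=False: error is abs(R - grade), with grade 1 for pass and 0 for fail
    use_log=True:  error is -ln(1 - abs(R - grade)), i.e. the per-review log loss
    """
    ema = None
    for r, ok in zip(predicted_r, passed):
        err = abs(r - (1.0 if ok else 0.0))
        if use_log:
            err = -math.log(1.0 - err)
        # first review initialises the average, later ones are blended in
        ema = err if ema is None else alpha * err + (1.0 - alpha) * ema
    return ema

# Hypothetical card: predicted R=0.9 twice, one pass then one lapse
print(surprise_ema([0.9, 0.9], [True, False], alpha=0.5))
print(surprise_ema([0.9, 0.9], [True, False], alpha=0.5, use_log=True))
```

Unlike the PMF approach, this only needs one extra number per card in the memory state.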

polar maple Mar 9, 2025, 11:52 PM

#

unique salmon You can incorporate it into something like an LSTM as an input feature

a problem about this specifically is that LSTM could compute something similar internally even without it being given explicitly, that's why a nn would need only the 4 values DSRH, and no other input about the history of that card

#

since LSTM is given the full history of the card it has the same information required to compute H

#

(lmk if you want to call it something else btw, this historical likelihood thing)

quasi shadow Mar 10, 2025, 2:44 AM

#

🤣 This thread will become research references for spaced repetition.

ashen light Mar 10, 2025, 2:46 AM

#

too bad discord is where information goes to die

quasi shadow Mar 10, 2025, 7:19 AM

#

ashen light too bad discord is where information goes to die

I'm asking my friend to develop a daily summary bot for our thread😂

#

He has developed a bot for telegram: https://github.com/asukaminato0721/telegram-summary-bot

GitHub

GitHub - asukaminato0721/telegram-summary-bot: Summarize group chat...

Summarize group chat with AI, LLM && query group chat, FREE to deploy your own, support img, link meta info, reply to, supports Chinese-language search. - asukaminato0721/telegram-summary-bot

ashen light Mar 10, 2025, 7:55 AM

#

can't wait to see how it generalizes my chat activity

#

"jake continues to refuse to help"

cosmic hedge Mar 10, 2025, 8:49 AM

#

ashen light can't wait to see how it generalizes my chat activity

what do you think about jake as a person

#

gemini only let me paste half the chat XD

hasty fractal Mar 10, 2025, 12:28 PM

#

it missed the sarcastic jake

ashen light Mar 10, 2025, 4:52 PM

#

thats the only jake there actually is

unique salmon Mar 10, 2025, 10:06 PM

#

I made a flowchart so that I don't have to type out the same thing all the time 🤣
Thoughts?

robust hill Mar 10, 2025, 11:12 PM

#

what if

#

fsrs gives intervals that are too medium

bold terrace Mar 10, 2025, 11:12 PM

#

TBF, I'd just put in the bottom "Or just wait to have more reviews before optimizing like crazy"

#

Also, first step would be "Is your Retention around your Desired Retention (~10% ballpark)". Yes -> Intervals are OK

#

"Do you have fewer than 10K reviews" -> Review more

#

"Do you change your DR all the time" -> Stop

#

When people say "There is NO way this interval makes sense", they tend to forget that FSRS didn't come up with that interval on its own, it just read your history and that's what it saw

unique salmon Mar 11, 2025, 12:03 AM

#

bold terrace Also, first stpe would be "Is your Retention around your Desired Retention (~10%...

10% is reasonable if your DR is 70%, but not 95%, for example

#

So it's hard to say how much deviation is ok

robust hill Mar 11, 2025, 2:34 PM

#

how bad is it that when

#

i optimize with FSRShelper addon, and it gives me more cards to do, then i do it, but if it reduces the cards i have to do, then i undo the optimize

#

😭

#

i always feel sketched having less to do idk why

bold terrace Mar 11, 2025, 2:38 PM

#

unique salmon 10% is reasonable if your DR is 70%, but not 95%, for example

I mean, with Stability and LB considered, a 95% DR could unfortunately translate into a 70-80% R, so your average retention on those days might be way lower

#

The higher the stability, the less of a problem it will be, but with low stability, for example if you suddenly added a lot of new cards/day and ~50% of your cards have stability <1d, you'll have a bigger difference

#

It's not a bug or a problem in itself, since it will correct itself with higher stability and a bigger deck, but it's still something to keep in mind when differences occur

robust hill Mar 11, 2025, 2:43 PM

#

i have average card stability of 18 days with 93% dr on a deck from october 1 with avg of 8ish new cards a day avg difficulty 77%

bold terrace Mar 11, 2025, 2:44 PM

#

As you can see, even if my DR is 84%, my Target R for many cards can be around 71-80%, sometimes just because they have very low stability, sometimes because the LB or rescheduler pushed them a bit too far

robust hill Mar 11, 2025, 2:44 PM

#

with only 8 cards being after 30 days out of 900

unique salmon Mar 11, 2025, 2:44 PM

#

I wonder if we should tweak LB a bit. Here's the formula
Maybe we should make it schedule cards earlier more aggressively by using the square of the interval length (or something like that) in the weight

bold terrace Mar 11, 2025, 2:44 PM

#

robust hill i have average card stability of 18 days with 93% dr on a deck from october 1 wi...

And your daily retention is at how much ? (The Actual one) ? And your RMSE ?

robust hill Mar 11, 2025, 2:45 PM

#

havent finished today yet

#

log loss .3368 rmse 2.52

bold terrace Mar 11, 2025, 2:45 PM

#

So it's quite nice !

#

And how many reviews per day more or less ?

#

200 ?

robust hill Mar 11, 2025, 2:46 PM

#

bold terrace Mar 11, 2025, 2:46 PM

#

Ok !

robust hill Mar 11, 2025, 2:47 PM

#

robust hill

this is on my end

bold terrace Mar 11, 2025, 2:47 PM

#

To be honest your situation is quite good

robust hill Mar 11, 2025, 2:47 PM

#

some day i would not do new cards, some days id do a lot

#

well thats good atleast

bold terrace Mar 11, 2025, 2:47 PM

#

There's symptoms that will show you if something is not right

#

That's why I wanted to have the average stability so badly: it's a good indicator of whether too many cards are being added every day, so that stability can't be built

#

I think keeping a high DR is also a smart move

#

I made the mistake of lowering it over time to be able to add more new cards/day, and I really shouldn't have

robust hill Mar 11, 2025, 2:48 PM

#

u should see the leech deck

#

#

bold terrace Mar 11, 2025, 2:49 PM

#

So after X lapse you put them in a leech deck, do you do something specific with them ?

robust hill Mar 11, 2025, 2:49 PM

#

robust hill

last month was when i put them into their own leech deck and made it optimize with their own deck options, currently went from 13 rmse to 11 now

bold terrace Mar 11, 2025, 2:49 PM

#

Different DR ?

robust hill Mar 11, 2025, 2:49 PM

#

same dr but optimize the parameters against leeches only

#

its working pretty well

bold terrace Mar 11, 2025, 2:49 PM

#

Never thought of it but that can be quite good actually

robust hill Mar 11, 2025, 2:49 PM

#

i talked about it here

#

and someone gave me an idea

bold terrace Mar 11, 2025, 2:49 PM

#

It dropped your RMSE for both ?

robust hill Mar 11, 2025, 2:50 PM

#

i dont remember

#

ive uploaded screenshots before so one sec

#

i guess i dont have it

bold terrace Mar 11, 2025, 2:51 PM

#

It's OK, I'll experiment and search a bit

robust hill Mar 11, 2025, 2:51 PM

#

but as far as i remember yes

bold terrace Mar 11, 2025, 2:52 PM

#

What I do is I do some Filtered Decks to manipulate when some cards are due

#

but it still pollutes my parameters, potentially

robust hill Mar 11, 2025, 2:52 PM

#

in the normal deck it was like 3% and being stubborn to stay around there, then when i separated the leeches out
Leech deck was like 14% ish
after a month, the normal deck is at like 2.5% right now, and leech deck is currently at

#

10.77%

#

but there are not so many reviews

#

only 600 in the past month

#

compared to the main deck which is like 6000

bold terrace Mar 11, 2025, 2:53 PM

#

Yeah it's also difficult to compare because potentially, maybe the RMSE with the old parameters on the non-leech cards would still have been lower than 3% (if the leeches were the ones messing with the RMSE, while not necessarily being optimized on)

#

But at least now you have a "proof" that for non-leech cards, your RMSE is quite low

robust hill Mar 11, 2025, 2:54 PM

#

let me check one of my language learning decks

#

.3170, 2.85% including leeches

#

after optimization
.3282, 2.34% excluding leeches

#

i do not have a leech deck for the language learning deck, probably i should tho

bold terrace Mar 11, 2025, 2:55 PM

#

Would be fun @unique salmon some kind of "Multi-class FSRS", but I guess we're reinventing neural networks here

Cluster cards by difficulty rating
Create different parameters and optimize those for the different difficulty classes

robust hill Mar 11, 2025, 2:55 PM

#

but some cards are tagged leech for a strange reason

#

0 lapses, 7 reviews

bold terrace Mar 11, 2025, 2:55 PM

#

By zooming in on my difficulty graph (and increasing granularity), I noticed how there are a lot of smaller normal distributions of difficulties :

robust hill Mar 11, 2025, 2:55 PM

#

robust hill but some cards are tagged leech for a strange reason

actually i think i know why and its my own fault

#

because after 4 months of working TL -> NL i made a note type to make that deck NL -> TL

#

and so i guess it copied the tags

robust hill Mar 11, 2025, 2:57 PM

#

bold terrace By zooming on my difficulty graph (and increasing granularity), I noticed how th...

how can i do this

bold terrace Mar 11, 2025, 3:00 PM

#

There's no easy way, I have a difficulty viewer branch of the addon and I'm tweaking in the code directly for now

#

https://github.com/JSchoreels/Anki-Search-Stats-Extended/tree/feature/difficulty_viewer

#

I think I can always find a local build with that view

#

but it's on another user session so I'll take a look later to upload it or to improve it so it can go in the main branch

#

but for now it's just personal stuff

robust hill Mar 11, 2025, 3:01 PM

#

okay no worries

#

probably mine will be like that

#

would you prefer me to send u the deck

#

im genuinely curious what mine would look like

#

if i export with include scheduling information + deck presets, does it keep statistics ? it should right

bold terrace Mar 11, 2025, 3:03 PM

#

I think it will be easier if I just send you a local build when I have it 🙂

robust hill Mar 11, 2025, 3:03 PM

#

haha okay

bold terrace Mar 11, 2025, 3:03 PM

#

I'll check on my private session later

robust hill Mar 11, 2025, 3:03 PM

#

no worries

bold terrace Mar 11, 2025, 3:03 PM

#

I'm on another one right now

robust hill Mar 11, 2025, 3:03 PM

#

take your time

#

i keep procrastinating my studies anyways 😭

sonic forge Mar 11, 2025, 3:39 PM

#

unique salmon I wonder if we should tweak LB a bit. Here's the formula Maybe we should make it...

It seems like a good idea!

unique salmon Mar 11, 2025, 3:40 PM

#

@quasi shadow https://github.com/open-spaced-repetition/load-balance-simulator
Is this code up to date? I mean, have there been any changes to Anki's code that are not reflected here?

unique salmon Mar 11, 2025, 4:55 PM

#

https://github.com/ankitects/anki/commit/69e699dc134419112956209a67cb0d62380d27cd
There was this change, but as far as I can tell it doesn't touch the load balancer itself, only Easy Days. So I assume the code from the repo above is up-to-date

GitHub

Fix easy days causing load balancer to disproportionately schedule ...

…aduates to the furthest day (#3643)

don't do easy days calculation if all days are the same ease

ashen light Mar 11, 2025, 5:23 PM

#

the initial easy days impl was trying a bit too hard to force the graph into a certain shape and that just sorta lessened that effect

#

theres some extra multipliers in the logic for siblings and easy days but yeah the code in the comment above the lb is still correct

#

https://github.com/ankitects/anki/blob/main/rslib/src/scheduler/states/load_balancer.rs#L371-L378

#

anyway re: lb biasing further to earlier days, is it necessary?

#

it already (if in a vacuum and days have the same amount of cards scheduled) will prioritize an earlier day. cards due naturally sort of gravitate to a 1/x curve. are the specific numbers of this not working properly? is it not 1/xing optimally?

unique salmon Mar 11, 2025, 5:42 PM

#

ashen light it already (if in a vacuum and days have the same amount of cards scheduled) wil...

According to Yuki and Sound, no

ashen light Mar 11, 2025, 5:42 PM

#

but really my actual question: given how it already will prioritize an earlier day, how would this cause problems the original fuzzer would not

#

call me when they double-blind some tests, yuki already had a mental bias against it before it even was in anki. sound has real numbers at least 🍃

#

but my point point point is: can someone create a measure that can be tested or at least have a sample size of more than two people?

unique salmon Mar 11, 2025, 5:48 PM

#

Nonetheless, I'll run simulations to see if I can both reduce volatility AND bring the average retention closer to the desired value

ashen light Mar 11, 2025, 5:48 PM

#

oh for sure

sonic forge Mar 11, 2025, 7:53 PM

#

The thing is that because (1 / (cards_due))**2 is squared, it has a huge impact on the weight and it "outshines" the (1 / target_interval)
The current implementation only prioritizes earlier days if the earlier day and the further day have the same card count (or nearly the same), so that the (1 / (cards_due))**2 value is the same.
It is obvious that the (1 / (cards_due)) and (1 / target_interval) variables need to be raised to the same power to accomplish fair LB

#

So yes, the point is that these two variables need to be in the same power.

ashen light Mar 11, 2025, 8:02 PM

#

so I think theres a bit of a misunderstanding? in (1/cards_due)**2 the ^2 makes it smaller, not larger? 1/2 * 1/2 = 1/4

#

though the numbers are the real numbers, perhaps normalizing those numbers would be better

#

either way, "priorities earlier days if earlier day and further day have the same card count (or near the same)" yes most due graphs look like this and so it should end up being no different than the normal fuzzing routines in the long term

bold terrace Mar 11, 2025, 8:32 PM

#

ashen light but really my actual question: given how it already will prioritize an earlier d...

Sorry if I was not clear about it, but yeah, the fuzzer would have the same problem

sonic forge Mar 11, 2025, 8:35 PM

#

ashen light either way, "priorities earlier days if earlier day and further day have the sam...

But what about the case when user's due graph looks like decreasing exponent (y = 1000/x, x>= 5, for example)?
Further days have smaller card count.
I am a little confused, because I messed up the calculations. It seems like the current formula weight = (1 / (cards_due))**2 * (1 / target_interval) already prioritizes the target_interval - because cards_due is squared, it has less impact than target_interval, right?
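
To make the back-and-forth about the exponents concrete, here is a small sketch of the weight formula as quoted in this thread, weight = (1 / cards_due)**2 * (1 / target_interval), normalised over the candidate days. The function name, the exponent parameters and the sample numbers are assumptions for illustration. The real load balancer also applies sibling and easy-days multipliers, which are left out, and cards_due is assumed to be at least 1 here. After normalisation only the ratios between days matter, and squaring a term amplifies its ratios, so the squared cards_due term has more pull on the choice, not less.

```python
def lb_weights(candidates, due_exp=2.0, ivl_exp=1.0):
    """Normalised weights for candidate days.

    candidates: list of (target_interval_days, cards_due_on_that_day)
    due_exp / ivl_exp: exponents on the two terms (2 and 1 in the quoted formula)
    """
    raw = [(1.0 / due) ** due_exp * (1.0 / ivl) ** ivl_exp for ivl, due in candidates]
    total = sum(raw)
    return [w / total for w in raw]

# Same load on every day: the earliest day gets the largest weight
print(lb_weights([(4, 100), (5, 100), (6, 100)]))

# Load decreasing with distance (a 1/x-like due graph): the furthest day wins
print(lb_weights([(4, 120), (5, 100), (6, 80)]))

# "Same power" variant suggested above: both terms squared
print(lb_weights([(4, 120), (5, 100), (6, 80)], due_exp=2.0, ivl_exp=2.0))
```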

bold terrace Mar 11, 2025, 8:37 PM

#

So basically my point personally is just : If I have a card with low stability, I'd prefer to not take an extra hit with LB/Fuzz

Typically in this example, my DR was 84% when I did it, the Target R will be 79% on March 14th, but it will already drop below my DR tomorrow (since it's 85% today)

#

It's thus a bit silly, because one of my beloved Filtered Decks marks cards with R<DR as due ("deck:Japan::1. Vocabulary" prop:r<0.844 -is:due)

#

(Yeaaah I also use more than 2 decimals lol)

#

But doing so, my "Future Target R" graph is just perfect

#

Without it, my average Target R would be, everyday, ~5% lower than my DR

#

Is it a big deal ? A bit, look how my weekly Retention is way more stable than before

#

Before doing so, I would have to do some mental gymnastic thinking "If I want to remember 80% of the words when I see it, should I put 90% DR ? 85% DR ?"

#

Now I'm always in the ball park of DR+/-RMSE

#

Which is way more motivating than wondering if today will be a bad day or not haha

ashen light Mar 11, 2025, 8:40 PM

#

bold terrace Sorry if I was not clear about it, but yeah, the fuzzer would have the same prob...

you probably were but I just forgot

bold terrace Mar 11, 2025, 8:43 PM

#

Now to the question "Isn't it a ~1/x", in theory with a regular rhythm of new card/day, it should yes ! But of course, if you stop adding new cards, you'll get a flatter curve, and if you suddenly add more, it will be more aggressive.

I might be wet dreaming, but I think the best way to know what would be the "ideal" curve for a user, would be to base it on his "Review Intervals" curve

ashen light Mar 11, 2025, 8:44 PM

#

sonic forge But what about the case when user's due graph looks like decreasing exponent (`y...

"Further days have smaller card count." also typically further days are less susceptible to having retention issues by being off a day. most of the issues @bold terrace brought up initially were about fuzzing at short intervals

bold terrace Mar 11, 2025, 8:47 PM

#

ashen light "Further days have smaller card count." also typically further days are less sus...

Yes, example :
I reviewed it 2025/01/17, Target R will be 80% on 2025/05/29 with stability 1.8 month.
If I reschedule it, the target R will be 85.54% on 2025/04/13.

Basically, there's 1.5 months of "room" between an 80% and an 85% R, which is more than fine

#

Very low target R happens when, well, a 1d stability card takes a +1d increment just by passing through the Fuzzer/LB
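
The numbers in this example can be reproduced with the FSRS-4.5/FSRS-5 forgetting curve, R(t, S) = (1 + 19/81 * t/S)^(-0.5). This is a sketch under two assumptions: that this fixed-decay curve is the one in use, and that "1.8 month" means roughly 55 days of stability (inferred from the quoted dates, not stated in the chat).

```python
from datetime import date

DECAY = -0.5
FACTOR = 19 / 81  # chosen so that R = 0.9 when elapsed time equals stability

def retrievability(elapsed_days, stability_days):
    """Predicted probability of recall after `elapsed_days` for a given stability."""
    return (1 + FACTOR * elapsed_days / stability_days) ** DECAY

last_review = date(2025, 1, 17)
stability = 55.0  # assumed value for the "1.8 month" card

for due in (date(2025, 4, 13), date(2025, 5, 29)):
    t = (due - last_review).days
    print(due, t, f"{retrievability(t, stability):.2%}")

# The low-stability case: a card with S = 1 day pushed from 1 day to 2 days
print(f"{retrievability(1, 1.0):.1%} -> {retrievability(2, 1.0):.1%}")
```

The long-interval card loses about five points of R over six weeks, while the 1-day card loses more than seven points from a single extra day, which is the asymmetry being described.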

sonic forge Mar 11, 2025, 8:49 PM

#

ashen light "Further days have smaller card count." also typically further days are less sus...

I'm also talking about short intervals. I tweaked my weights, so graduating interval becomes 3 days. It is important that card interval would become two or three days and not four or five.

bold terrace Mar 11, 2025, 8:49 PM

#

Also, even without any LB/Fuzz (since my Filtered Deck overwrites the scheduling), I still have a somewhat constant amount of reviews every day (the spike is just a change of DR and I did the backlog); in 2 days, the curve went back to the previous baseline

#

(As you can see, half my reviews are through Filtered Decks now)

ashen light Mar 11, 2025, 8:51 PM

#

sonic forge I'm also talking about short intervals. I tweaked my weights, so graduating inte...

thats just the fuzzer doing fuzzer things independent of the loadbalancer, and yeah it might fuzz short intervals a bit too hard and there was some discussion before about caring about stability when doing this stuff but I don't think anything came of it?

#

and like, at the first 5 days, the lb is very weighted towards earlier days

unique salmon Mar 11, 2025, 8:52 PM

#

ashen light thats just the fuzzer doing fuzzer things independent of the loadbalancer, and y...

Oh, right, we were talking about making fuzz based on S rather than interval lengths

ashen light Mar 11, 2025, 8:52 PM

#

I think for intervals under like a week it could be preferable!

#

I'd help but

ashen light Mar 11, 2025, 8:53 PM

#

ashen light "jake continues to refuse to help"

unique salmon Mar 11, 2025, 8:56 PM

#

Actually, wait, wouldn't that make the problem worse at high DR?
At DR>90% S>ivl, so the intervals would have more fuzz, not less
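
The S-versus-interval relationship follows from inverting the forgetting curve. A quick sketch, assuming the FSRS-4.5/FSRS-5 curve with fixed decay (the function name is made up for illustration):

```python
DECAY = -0.5
FACTOR = 19 / 81

def interval_for(stability_days, desired_retention):
    """Days until predicted R drops to `desired_retention` (before rounding or fuzz)."""
    return stability_days / FACTOR * (desired_retention ** (1 / DECAY) - 1)

for dr in (0.70, 0.90, 0.95):
    print(dr, round(interval_for(10.0, dr), 2))
```

With S = 10 days the interval equals S at DR = 90%, is shorter than S above 90%, and longer below it, so fuzz scaled by S would indeed be larger than fuzz scaled by the interval at high DR.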

bold terrace Mar 11, 2025, 8:57 PM

#

What about ...

#

Making ...

#

NO fuss 😄 ?

#

I googled a bit about how to disable it, it seems it's like something sacred in Anki

#

But seriously, let people turn it off

#

😂

#

Especially now with FSRS where we SEE the R impacted

#

With SM2 I guess people could make wild assumptions without having anything to rely on

#

"You have low stability ? YOU are the problem"

#

But now it's clear that a +1d fuzz at early stage of memorizing something is not that great

cosmic hedge Mar 11, 2025, 9:00 PM

#

I tried programming %correct into the rust simulator quickly and ran it with and without the fuzz turned on

#

idk if this helps? 😂

bold terrace Mar 11, 2025, 9:02 PM

#

cosmic hedge idk if this helps? 😂

With 0 New/day I think you're removing the main problematic point lol

ashen light Mar 11, 2025, 9:02 PM

#

I mean anki has always had fuzz

bold terrace Mar 11, 2025, 9:02 PM

#

I see you have put 80 review/day, so I'd suggest do the same with ~10 new/day

ashen light Mar 11, 2025, 9:02 PM

#

there was like a brief period of time where it didn't because anki was being rewritten and it wasn't added in yet

bold terrace Mar 11, 2025, 9:02 PM

#

ashen light I mean anki has always had fuzz

Yeah but it's software, "soft" meaning there is nothing sacred here

#

Just because something was always there doesn't mean it has to stay

ashen light Mar 11, 2025, 9:03 PM

#

I mean at this point dae is very opposed to any option unless it really pulls its weight

#

and I don't think this toggle does

#

¯_(ツ)_/¯

#

you're free to make your own build with no fuzzing though

sonic forge Mar 11, 2025, 9:04 PM

#

bold terrace But now it's clear that a +1d fuzz at early stage of memorizing something is not...

Yeah, that's the point. Cards in short term stage (<=10 days) should be placed in the exact interval as FSRS predicted

ashen light Mar 11, 2025, 9:04 PM

#

its pretty easy to remove

unique salmon Mar 11, 2025, 9:04 PM

#

bold terrace But now it's clear that a +1d fuzz at early stage of memorizing something is not...

https://github.com/ankitects/anki/blob/9b5da546be49f37c8d6c286e09c86074b2f0c278/rslib/src/scheduler/states/fuzz.rs#L16
static FUZZ_RANGES: [FuzzRange; 3] = [
    FuzzRange { start: 2.5, end: 7.0, factor: 0.15 },
    FuzzRange { start: 7.0, end: 20.0, factor: 0.1 },
    FuzzRange { start: 20.0, end: f32::MAX, factor: 0.05 },
];

As far as I can tell, fuzz isn't applied to intervals <2.5 (before rounding, I assume)
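
A rough Python sketch of how a table like this could turn into a fuzz amount. Only the three ranges and the 2.5 cut-off come from the snippet above. The accumulation rule (a base of 1 day plus each range's factor times the part of the interval that falls inside it) is an assumption about the surrounding function and should be checked against fuzz.rs before relying on it.

```python
# (start, end, factor) copied from the FUZZ_RANGES table quoted above
FUZZ_RANGES = [
    (2.5, 7.0, 0.15),
    (7.0, 20.0, 0.10),
    (20.0, float("inf"), 0.05),
]

def fuzz_delta(interval):
    """Assumed half-width of the fuzz window in days for an unrounded interval."""
    if interval < 2.5:
        return 0.0  # no fuzz below the first range
    delta = 1.0  # assumed base amount
    for start, end, factor in FUZZ_RANGES:
        # portion of the interval that lies inside this range
        delta += factor * max(min(interval, end) - start, 0.0)
    return delta

for ivl in (2.0, 3.0, 10.0, 100.0):
    print(ivl, round(fuzz_delta(ivl), 3))
```

Under these assumptions a 3-day interval already gets roughly plus or minus one day, which would be consistent with the complaint that a 3-day graduating interval can land on day 2 or day 4.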