#FSRS Megathread

bold terrace
#

You can check the full discussion we had here #1282005522513530952 message

unique salmon
#

I remember that there was weird unintuitive stuff with compounding, but it should only affect intervals >=2.5, no?

bold terrace
#

And here #1282005522513530952 message

#

ah !

#

Yes sorry

#

If interval >= 2.5

#

soooo stability 1 with DR=79% it will match

#

Or stability 1.1 with DR=80%

ashen light
#

@bold terrace here's some motivation to get anki to build on your system, change that 2.5 to like 7 and see what happens

bold terrace
#

Yeah Yeah

#

I think I'll also keep a fork just in case

ashen light
#

and then you can do the leech thing everyone wants but I still don't even understand

bold terrace
#

You know

#

Recently I realized in 30min I can do ~300 reviews

#

So with this discussion I could have done 300 reviews extra today lol

#

Soooo unfortunately I had to remind myself that Anki is a means to learn the stuff I want and not the end goal lol

unique salmon
ashen light
#

or you coulda been tricked into discussing here

bold terrace
#

But yeah sure I'll check to build it and try a few things

#

For now my focus is still to have the average median per day a bit better than what I did

ashen light
#

so its just an alternate (or replacement?) way to tag something a leech?

ashen light
#

is this really that large a feature that you're trying to trick me into doing work?

unique salmon
ashen light
#

oh untagging is interesting

unique salmon
bold terrace
#

The problem with the current "leech" feature is that it's not bringing much to the discussion ... Saying that something is a leech because you failed it 8 times is stupid when the whole point of FSRS is to "fail at an expected rate"

#

So people with lower stability WILL have a huge amount of leeches with that simplistic rule

ashen light
#

yall should really learn how to do these things on your own

#

if I'm a bottleneck in any capacity things are just doomed

bold terrace
#

Yeah to be honest @unique salmon if you can do Python I don't see why you couldn't code other things too lol

#

Also, you're pretty much the full day in discord, so just cut 1h and learn rust 😄

#

We'll even call you Jarrett_2

bold terrace
#

Jarrett in a few months will go full proprietary route

#

We need our Jarrett2

#

Expertium the savior 😄

ashen light
#

oh hes goin full sellout?

unique salmon
#

idk, this is the first time I'm hearing about it

bold terrace
#

Just joking lol but he said his company was building some FSRS app or something

#

But I'm joking

#

I'm sure he'll keep contributing ❤️

unique salmon
#

Ah, yeah, he said his company is making an Anki-like app

ashen light
#

well expertium aka jarret2 it looks like you gotta step up

polar maple
#

imagine the endless possibilities for Anki if Expertium learns Rust

ashen light
#

I know right? golden age of anki

unique salmon
#

Lol

hasty fractal
#

expertium learns rust

yea I can't imagine that

quasi shadow
#

It's out of date. Please see the easy day simulator.

#

Rust is not very difficult. But you need to learn git, IDE, terminal before it...

#

In short, become a programmer/developer!

#

😭 The number of notifications on GitHub exceeds 30 again.

wet plume
#

rust requires a lot of practice. Once you write a bunch of code. You get used to the borrow checker.

#

At the start I found it so difficult to get my head around. Now I don't think about it. I think it's like wa and ga in japanese.

quasi shadow
#

Yeah. And the checker helped me overcome many bad habits about coding which formed when I wrote Python.

quasi shadow
#

In this video, I discuss Anki’s latest updates (25.02, 24.11, 24.06, 24.04, and 23.12), including a complete overview of the best settings to use with the new FSRS-5. I also cover new features, such as the option to select Easy Days to reduce your study load, the CMRR, the new statistics, and the benefits of the FSRS Helper add-on.

#

It's coming soon!

robust hill
#

when tag deck options coming

robust hill
#

i have a question with fsrs helper add on optimization again

#

parameters were optimal and i rescheduled with fsrs helper addon

#

and finished all cards, then i tried to reschedule again and no cards were added

#

this was yesterday

#

and today, i have reviews to do and i click reschedule and this time it changes the # of reviews and gave me some more

#

why did it give me some more with the optimization of today

quasi shadow
#

I wonder if you have more reviews due tomorrow and the helper tried to apply load balancing in this case.

#

For example, if you finished 100 reps today and have 110 cards due tomorrow, the helper may reschedule 5 cards to today from tomorrow.

robust hill
#

before after

quasi shadow
#

The number of cards due tomorrow decreases (141 -> 130).

#

How many cards did you review today?

robust hill
#

0

quasi shadow
#

🤔 some weird things happened.

quasi shadow
#

I want to reproduce this problem.

robust hill
#

my collection is huge and my internet is very very slow

quasi shadow
robust hill
#

okay i sent it 👍

unique salmon
#

Oh, wait, I'm dumb. I can just set all weights to 1

unique salmon
#

Alright, now we wait for 12-13 hours for a crapton of simulations to run

bold terrace
#

@cosmic hedge I improved the graph to have the median / avg based on the true values and not the bins (So instead of having a bunch of 16d stability and then a jump to 17d, you have a more incremental progress)

cosmic hedge
bold terrace
#

The most unintuitive piece of code lol

#

I think Copilot gave me that one 😂

#

Problem with the new one is that basically I build the full list for each day of each stability

#

I check my process manager but the RAM doesn't seem to explode

#

so I think it's fine

cosmic hedge
#

if that makes sense?

#

but "so long as it works" 😂

bold terrace
#

I saw there are some algo to compute the median with N/2 space for each array

#

And I don't know when the revlog is built, if we have "all things for one day" before going to the next one

#

(to be able to discard the previous day each time)

cosmic hedge
#

and instead return the median and mean for each day

bold terrace
#

aah got it yes

#

But I'm checking and I don't see much of a bump in memory usage

cosmic hedge
#

yeah honestly i'd probably prefer just returning the array

bold terrace
#

Opening stats make the memory go from 250MB to 450MB but then it doesn't grow much more when I trigger Memorised graph

bold terrace
cosmic hedge
#

before -> after
this change basically doubles memory usage of the entire addon but its only 50mb so does it really matter?

bold terrace
#

Nice !

#

How do you do those snapshot 🙂 ? I don't know how

cosmic hedge
#

oh i use the dev console

bold terrace
#

You open anki in chrome or something ?

cursive badge
#

My SxR heatmap would never have been finished without being able to use the Inspector. The number of times the cells all went to something like x=-3000 because I did my maths wrong 😂

cosmic hedge
unique salmon
#

I mean, it is a statistical test. It's like scipy.stats.binomtest with alternative='less', but using a generalized distribution so that we can do math when the probabilities aren't all equal. In fact, you can check that when all probabilities are the same, binomtest gives the same result
(I have no clue why it's so slow though 🤣)
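
A minimal sketch of the claim above, using only the standard library: the one-sided ("less") p-value from a generalized (Poisson binomial) distribution should match the ordinary binomial lower tail when every trial has the same probability. The function names and the 8-review example are made up for illustration; this is not the add-on's code.

```python
from math import comb


def poisson_binomial_pmf(probabilities):
    """Exact PMF of the success count for independent trials with differing p."""
    pmf = [1.0]  # zero trials: 0 successes with probability 1
    for p in probabilities:
        new_pmf = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new_pmf[k] += (1 - p) * mass   # this trial fails
            new_pmf[k + 1] += p * mass     # this trial succeeds
        pmf = new_pmf
    return pmf


def lower_tail(pmf, successes):
    """P(X <= successes), i.e. the one-sided 'less' p-value."""
    return sum(pmf[: successes + 1])


def binomial_lower_tail(n, p, successes):
    """Same tail probability from the closed-form binomial PMF."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(successes + 1))


# Hypothetical card: 8 reviews, all predicted at R = 0.85, 5 successes.
general = lower_tail(poisson_binomial_pmf([0.85] * 8), 5)
classic = binomial_lower_tail(8, 0.85, 5)
print(abs(general - classic) < 1e-12)
```

With unequal probabilities only the first function applies, which is the reason for using the generalized distribution at all.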

polar maple
#

a basic correction is that where n is the number of reviews, only flag if it is under the threshold 1% / 2^n

unique salmon
unique salmon
polar maple
#

that's sort of the point, if you don't have a correction you might have many many cards that are wrongly tagged as leeches

#

and i don't want many false positives

polar maple
ashen light
#

is the current leech system informative?

polar maple
#

not really

unique salmon
#

And I'm pretty sure that's not how you correct the p-value threshold anyway

polar maple
#

if we knew in advance that the number of reviews that a card will have will be n, then we can use 1% / n or something

#

but we don't know it, so at the very least we can bound the total threshold by a finite number: 1% / 1 + 1% / 2 + 1% / 4 + .... = 2%

#

that's where the 2^n comes from

ashen light
#

this argument is fine and all but apparently I'm the one actually doing the work so that means I get to decide all the numbers right?

#

🍃

polar maple
#

not wrong

ashen light
#

on a more serious note, @unique salmon given your previous examples how would these proposed numbers look in that situation

#

I can barely count to 5 so I need it explained

unique salmon
unique salmon
ashen light
#

don't worry my mind is a jumbled mess right now

polar maple
ashen light
#

I don't think I do either

#

maybe: you're complaining about his numbers being too whatever, how would that actually affect this

#

makes stuff never be flagged, but how many WOULD it take to flag something?

#

I feel like I am spouting nonsense I don't know whats up with my brain right now

unique salmon
#

If you want an example, here
5 reviews, each was done at 90% probability of recall (predicted by FSRS). 2 successes, 3 fails. p-value would be 0.856%, meaning that there is a 0.856% chance that this card would have this many or fewer successes (aka not Again), given these probabilities.
The first review and same-day reviews are unused, btw
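
The 0.856% figure can be reproduced by hand. Since all five reviews share the same predicted recall probability, the plain binomial formula is enough. This is just the arithmetic for this one example, not the detector itself.

```python
from math import comb

n, p, successes = 5, 0.9, 2

# P(X = k) for k = 0, 1, 2 successes out of 5 reviews at 90% predicted recall
terms = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(successes + 1)]
p_value = sum(terms)  # P(X <= 2)

print(f"{p_value:.3%}")  # 0.856%
```

The three terms are roughly 0.00001, 0.00045 and 0.00810, so almost all of the p-value comes from the "exactly 2 successes" case.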

unique salmon
polar maple
#

but sure, can try / n

#

since 2^n is about as arbitrary

unique salmon
polar maple
#

yeah it will still overestimate though

#

but you still have to give proper reasons why we cannot use D

#

remember that from FSRS's perspective, even with a card with a low history likelihood, its probability from then on at that point should still approximately hit the DR if FSRS is modelling well

#

that is, the history likelihood should have no bearing as to the future R

unique salmon
polar maple
#

unless you are suggesting that probabilities helps with memory modelling, in which case, introduce it to FSRS already, prove that it helps!

#

D already predicts well, we know that from FSRS's performance

unique salmon
polar maple
#

unless you admit that historical likelihood doesn't predict anything at all

#

in which case, this leech idea is pointless

unique salmon
#

And post-lapse S

polar maple
#

btw, it is also technically in D as well in terms of the transition probability

unique salmon
#

Anyway, here's a good correction: https://en.wikipedia.org/wiki/Šidák_correction
I can look into modern papers to see if I can find something better

In statistics, the Šidák correction, or Dunn–Šidák correction, is a method used to counteract the problem of multiple comparisons. It is a simple method to control the family-wise error rate. When all null hypotheses are true, the method provides familywise error control that is exact for tests that are stochastically independent, conservative ...
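
For reference, a small sketch of what the Šidák correction computes, next to the simpler Bonferroni threshold. Both assume the number of tests `m` is known in advance, which is exactly what is not known for a card's future reviews; the function names are illustrative.

```python
def sidak_threshold(alpha, m):
    """Per-test threshold keeping the family-wise error rate at alpha for m independent tests."""
    return 1 - (1 - alpha) ** (1 / m)


def bonferroni_threshold(alpha, m):
    """Simpler, slightly more conservative per-test threshold."""
    return alpha / m


for m in (1, 5, 20):
    print(m, sidak_threshold(0.01, m), bonferroni_threshold(0.01, m))
```

The two thresholds are nearly identical for small alpha, so the choice between them matters less than how `m` is handled.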

polar maple
#

remember a core hypothesis of FSRS: the entire memory state can be summarized as 3 variables: D, S, R

#

to say that historical likelihood is used for leech detection is to say that this is not true

polar maple
#

make it true

unique salmon
unique salmon
#

Btw, I bet Jarrett wouldn't be happy about 2

polar maple
#

wouldn't 2) be an argument against your entire leech idea in the first place?

unique salmon
#

No? It won't be a part of FSRS itself, just an extra thing

#

Like the forgetting curve

#

Which requires recalculating all R values

polar maple
#

for 2) then consider it as an infrastructure investment, idk

#

if you truly believe that historical likelihood is such a strong predictor, just use it in FSRS already

unique salmon
polar maple
#

the intuition behind the 1 + 1/2 + 1/4 + ... version is that we might actually be able to interpret it as each card has ~2% chance of being tagged as a leech in its lifetime

#

but idk for sure

unique salmon
#

Ok, I have an idea
c=pi^2/6
threshold=alpha/(c*(n^2))

Basically, we use the fact that the sum of 1/n^2 converges to pi^2/6.
This way we ensure that the sum of all thresholds across the card's entire life approaches alpha as n increases.
Call it Expertium's correction 🤣
This still leaves the question: what's a good choice of alpha? 1%? 2%? 5%?

#

I guess we need some empirical data on how many leeches people have, and I mean subjective leeches

#

Now we just need to find people who are willing to go over all their cards and tell us how many of them they consider to be leeches

#

Actually, it should be threshold=alpha/(c*((n-2)^2)), if we plan to start using the detector only after there are at least 3 reviews to avoid early false positives, excluding same-day reviews and the first review. So we turn the detector on when n=3.
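
A minimal sketch of that threshold schedule, assuming `n` counts qualifying reviews (first review and same-day reviews excluded) and the detector switches on at n=3. The function name and the choice to return 0.0 while the detector is off are assumptions for illustration.

```python
from math import pi

C = pi**2 / 6  # sum of 1/k^2 over k = 1, 2, 3, ...


def leech_threshold(n, alpha=0.01):
    """p-value threshold at the n-th qualifying review: alpha / (C * (n - 2)^2)."""
    if n < 3:
        return 0.0  # detector is off, so nothing can fall below the threshold
    return alpha / (C * (n - 2) ** 2)


# The thresholds summed over a long card lifetime stay just below alpha.
total = sum(leech_threshold(n) for n in range(3, 100_003))
print(leech_threshold(3), leech_threshold(10), total)
```

The first active review gets about 61% of alpha, and the budget shrinks quadratically from there.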

polar maple
#

we can use a smaller exponent like 1.1 for 1/1.1^n as long as we divide out the sum as well

unique salmon
#

I like this one though :C

#

Feels more elegant

polar maple
#

yeah pi^2/6 is cool

#

1/n^2 seems to have a lot of weight at the start, less in the middle, and more at high reviews (n > 80) compared to 1/1.1^n

#

high weight at the start can be beneficial to detect leeches early
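
A quick way to check that comparison is to normalize both schedules so each sums to 1 over n = 1, 2, 3, ... and look at how the budget is spread. This is only a sketch; the function names are made up.

```python
from math import pi


def inverse_square_weight(n):
    """Share of the total budget at review n for the 1/n^2 schedule."""
    return (6 / pi**2) / n**2


def geometric_weight(n, base=1.1):
    """Share of the total budget at review n for the 1/base^n schedule.

    The sum of base**-n over n >= 1 is 1 / (base - 1), hence the factor.
    """
    return (base - 1) * base**-n


for n in (1, 2, 5, 20, 80, 100):
    print(n, inverse_square_weight(n), geometric_weight(n))
```

Under this normalization the 1/n^2 schedule is larger for the first couple of reviews and again at high n, while the geometric one dominates in between.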

unique salmon
#

#general message
Here comes the first data point

robust hill
#

i hope 5% is a low number

#

or else im cooked

unique salmon
#

I'd say about 20% of my kanji cards are leeches, but that's because there are very similar kanji that are hard to tell apart, so I might be an outlier

unique salmon
#

#general message
So far the numbers are 1.5%, 5% and 20%

#

We really need a survey. The problem is that not a lot of people will bother counting

unique salmon
#

@polar maple there is a recursive way to calculate the PMF
```python
def update_poisson_binomial_pmf(current_pmf, p):
    """
    Update the PMF of Poisson binomial distribution when a new Bernoulli trial with
    probability p is observed.

    Args:
        current_pmf: List containing current PMF values
        p: Success probability of the new trial

    Returns:
        List containing updated PMF
    """
    n = len(current_pmf)
    new_pmf = [0] * (n + 1)

    # Update PMF using dynamic programming (working backwards to avoid overwriting)
    for k in range(n - 1, -1, -1):
        # No success case
        new_pmf[k] += (1 - p) * current_pmf[k]
        # Success case
        new_pmf[k + 1] += p * current_pmf[k]

    return new_pmf


def calculate_poisson_binomial_pmf(probabilities):
    """
    Calculate the PMF of Poisson binomial distribution for a list of success probabilities.

    Args:
        probabilities: List of success probabilities for each Bernoulli trial

    Returns:
        List containing the PMF
    """
    # Start with PMF for 0 trials: 100% probability of 0 successes
    pmf = [1.0]

    # Process each trial one at a time
    for p in probabilities:
        pmf = update_poisson_binomial_pmf(pmf, p)

    return pmf
```

But it requires storing the entire PMF, so it's still not ideal if we want to try to integrate it directly into FSRS

#

We either have to store all values of R or the PMF

#

Claude also made a normal approximation formula (with some correction for skewness) with O(1) time and space complexity, but it's...just look

#

It's ass

polar maple
#

we can compute and cache the value so it shouldnt be a big issue

unique salmon
#

I don't think you understand what the problem is. Right now FSRS only stores one last value of D and one last value of S. It doesn't store any n-2 or n-3 or whatever values

#

So unless we want to rewrite a whoooole loooot of code (and in Anki itself too), we'd better only store one last value

polar maple
#

it shouldn't be too hard right? we already have a full history of the cards including the S and D at each point in time, we just add another value to it

#

then we do the little O(n^2) compute to get the exact pmf after every review, so we avoid having to store the entire pmf as well

unique salmon
#

Man, screw it. Let's just implement the leech detector outside of FSRS

polar maple
#

ah yeah i was talking about if implementing it for anki, oops

polar maple
#

so it should be simple to compute the pmf on the spot

unique salmon
#

Yeah, but not in Anki

#

In Anki we can't get the full history at review time

polar maple
#

can't you see it in the card info section?

#

at the very least it shows D, but it shouldn't be difficult to add other information as well

unique salmon
#

Idk, I guess "historical reasons"

#

Maybe Dae wasn't planning that far ahead, for the future where this is needed

ashen light
#

I mean if you need some data structure in a specific part of the code, just pass it through the 20 functions till it gets where it needs to be

#

I'd say the only "historical reason" is that the code hasn't needed it there thus far?

quasi shadow
#

In my view, we should build a dataset to train the leech detector...

#

The argument is not very helpful here.

#

Without data, we cannot validate the idea.

cosmic hedge
#

I know this is ignoring the linked comment too but I liked the idea in the title 😅

robust hill
#

yes

#

this would be good

#

finally my idea of the leech fixer 9000 is occurring

unique salmon
#

#general message
Another data point
So far we have 0.7%, 1.5%, 5% and 20%

unique salmon
#

Since you would need many thousands of manually labeled cards, good luck collecting that

quasi shadow
#

Maybe you can ask someone else... good luck.

unique salmon
unique salmon
#

Then I guess it's over. Jarrett and Jake don't want to implement my idea with a leech detector based on FSRS probabilities, and training a machine learning based detector requires data that we don't have and won't be able to obtain unless we get like 100 insanely dedicated volunteers or something

severe storm
#

advanced leech detection is quite high effort, low impact though compared to other things

cursive badge
unique salmon
#

If you care about researching in this direction, I suggest removing the first review and same-day reviews, and using alpha of 1% or 2% with my correction
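
Putting the suggestion together, a sketch of the whole check might look like the following. It assumes the input already excludes the first review and same-day reviews, that each entry is a pair of (predicted R, recalled or not), and that a card is flagged when its p-value falls below the corrected threshold. All names are hypothetical and this is not the Helper add-on's implementation.

```python
from math import pi


def lower_tail_p_value(recall_probs, successes):
    """P(at most `successes` recalls) under the predicted recall probabilities."""
    pmf = [1.0]
    for p in recall_probs:
        new_pmf = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new_pmf[k] += (1 - p) * mass
            new_pmf[k + 1] += p * mass
        pmf = new_pmf
    return sum(pmf[: successes + 1])


def is_leech(reviews, alpha=0.01):
    """reviews: list of (predicted_r, recalled) for qualifying reviews only."""
    n = len(reviews)
    if n < 3:
        return False  # detector stays off until there are 3 qualifying reviews
    probs = [r for r, _ in reviews]
    successes = sum(1 for _, recalled in reviews if recalled)
    threshold = alpha / ((pi**2 / 6) * (n - 2) ** 2)
    return lower_tail_p_value(probs, successes) < threshold


# 2 of 5 recalled at R=0.9: p is about 0.856%, above the n=5 threshold
mild = [(0.9, True), (0.9, True), (0.9, False), (0.9, False), (0.9, False)]
# 1 of 6 recalled at R=0.9: p is far smaller and falls below the n=6 threshold
severe = [(0.9, True)] + [(0.9, False)] * 5
print(is_leech(mild), is_leech(severe))
```

With alpha=1% the earlier 5-review example is not flagged, which shows how conservative the corrected threshold is.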

cursive badge
#

I might have another look later, but cannot make any promises.

unique salmon
ashen light
unique salmon
bold terrace
#

"CMRR" ... "Optimal"...

#

This is what drives me crazy with how much bad faith there is

#

When you point out CMRR issues, people say "It's why it's the minimum recommended, it's not an optimal"

#

But then when no one points out the issues

#

"It's the optimal DR"

#

Incoming waves of people trusting the CMRR and putting their DR at 70% with a true retention at 50-60% because FSRS was trained on their 80-90%

bold terrace
#

This me

bold terrace
#

Isn't it beautiful how card stability improves slowly but surely with time ?

#

The declining phase was when I was doing 40 new cards per day

#

This one is without adding any new cards per day

#

It shows how stability improves faster at first, then more slowly with more repetitions

#

And with a balanced number of new/day, of ~8-10/day

#

I'm wondering if Higher DR means faster Stability Gains

#

With default parameters and DR=90%, 3 goods will give you Stability 34 after 49d (cumulativeInterval)

#

With DR=96%, 6 goods will give you Stability 61 after 49d (cumulativeInterval)

#

So in that simple setup, you do twice the amount of reviews in the same lapse of time, but you get +79% stability (61 vs 34)

unique salmon
bold terrace
#

I see

#

When you say S is growing faster for lower DR

#

You mean faster "by review" ? Or "faster for the same interval of time"

unique salmon
#

S depends on R at the time of the review. Lower R = higher increase in S
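
To illustrate "lower R = higher increase in S", here is a sketch of the success-case stability update in the form I understand the published FSRS-4.5/FSRS-5 formulas to use, with the Hard/Easy multipliers left out. The three parameter values are invented for illustration and are not default or fitted weights.

```python
from math import exp

# Hypothetical values standing in for the three relevant FSRS parameters.
W8, W9, W10 = 1.5, 0.1, 1.0


def stability_after_success(s, d, r):
    """New stability after a successful review at retrievability r (sketch).

    s: current stability in days, d: difficulty (1..10), r: retrievability (0..1)
    """
    growth = exp(W8) * (11 - d) * s ** (-W9) * (exp(W10 * (1 - r)) - 1)
    return s * (1 + growth)


for r in (0.95, 0.9, 0.8, 0.7):
    print(r, stability_after_success(10, 5, r))
```

The `exp(W10 * (1 - r)) - 1` factor is what makes a review at lower R pay off more per review, at the cost of waiting longer for it.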

#

So "per review"

bold terrace
#

Sure but also longer to wait before having that increase

#

But probably workload-wise more optimal

#

I guess I'll create a new Filtered Deck 😂 prop:s > 14d prop:r < .95 or something

#

1240 cards match it, ouch

#

(Over 3600)

robust hill
#

so what if we do

#

high DR in the beginning then low DR

#

🔥

unique salmon
#

That's pretty much what SSP-MMC does

robust hill
#

who is ssp mmc

#

is he the goat

unique salmon
#

It may be more time-efficient by a tiny margin, compared to a good fixed value of desired retention

unique salmon
robust hill
#

i see

#

well at this point

#

arent we so far along that any gains in accuracy are like really low

#

like it could be more effective to get proper sleep & nutrition etc

unique salmon
#

According to simulations it's like 5% more time efficient over the course of 10 years, compared to a fixed value of desired retention

robust hill
#

oh

unique salmon
#

Yeah, the gains are low

#

There aren't any low-hanging fruit left

robust hill
#

gg

unique salmon
#

Right now I'm running simulations to see if we can improve load balance (spoiler: only a tiny bit), but that's like TWEAKING tweaking
Like, tweaking the tweaks of tweaks
Like, twenty layers deep into tweaking

#

Love me some 0.000001% improvements

hasty fractal
#

we need sleep tracking in Anki

#

that's the next paradigm shift

#

repeating whatever SM does

polar maple
#

but theres still hope, as jarrett said before SSP-MMC does not actually aim to minimize knowledge per minute

#

so there could still be some improvement somewhere

polar maple
#

a very basic strategy is to never/rarely review difficult cards and only study cards with an initial rating of 'easy'

#

if you do this you can get a knowledge/minute of 1300 or so

#

but this is obviously not a good learning strategy

unique salmon
#

You mean "maximize knowledge per minute"?

#

Anyway, it does minimize time spent on reviews, right?

robust hill
#

anki gambling

unique salmon
#

Unless I am somehow wrong even about that

polar maple
#

i've heard before that SM will purposefully schedule difficult cards longer so you don't get bogged down in reviews by them

#

this strategy i just outlined where you only study cards with initial rating 'easy' is like that idea but put on its max

#

basically there would be a spectrum of this idea where perhaps with proper leech detection, we purposefully drop the bottom 2% in order to increase knowledge/minute by a lot

unique salmon
#

That sounds terrible for maintaining retention at DR

polar maple
#

ssp-mmc doesn't aim to achieve a target DR so its not relevant

unique salmon
#

Fair enough, I guess

bold terrace
#

IMO to know if something is better we should also wonder if the measure of better is good enough

#

The more I put importance on Stability and the more sad I feel about Knowledge not taking it into account

#

I was wondering if D should not also be taken into account though

#

For similar S and R you'd like a lower D, no ?

sacred seal
#

Hello. (Still learning English)
Approximately two weeks ago I started using FSRS with default settings. After 1100 reviews I clicked the "optimize" button and then something weird happened. New intervals are HUGE. For example, after I clicked the "good" button on a NEW card, the first interval is 1.5 months. And huge intervals are also happening to the old cards I learned before. I don't really know what to do. Now I'm learning with default FSRS parameters.

#

Desired retention is 0.90

#

Wait... I don't really know how the hell I did this but it seems good. I just deleted all parameters, leaving a blank space. Then some parameters appeared so I clicked "optimize" and now it works well. XD nevermind.

bold terrace
#

Yes, sometimes 1000 reviews can be quite few if, for example, you rarely rate Good first and, when you do, you never fail those cards afterwards

#

But if you have the same pattern again check what is the history of those cards if they are all good that might be it

#

Sometimes the best is to let them have those huge intervals and review them again later, if you still get them right then you're just underestimating yourself

#

If you fail them a lot FSRS will adjust

#

1.5m is quite peanuts compared to the language learning process

#

And if really you want the shortest possible interval just crank your DR to 95-99

robust hill
#

"still learning english" they say

#

your english is impeccable

#

😭

sacred seal
sacred seal
sacred seal
bold terrace
#

So let's say FSRS observed that when you directly rate a new card "Good", in general, you never fail it 2 months later ... it will probably recommend you wait 2 months before reviewing it again

#

On the other hand, if you fail almost all of them after 2 months, it will probably recommend less, like 2w, 4w

sacred seal
#

sounds logical

bold terrace
#

But the problem is with 1000 reviews, you might have only 5% of your total reviews "in that case", so FSRS sometimes takes a long time to learn those edge cases

#

Also, if you never had such long intervals with SM2, the FSRS optimizer is a bit "blind" in that area so it's trying to make an educated guess, but it might be off

sacred seal
#

I'm learning for two weeks

bold terrace
#

Yeah personally I'd advise keeping FSRS enabled, and it will learn by itself over the next weeks/months 🙂 Hit "Optimize" as much as you want, and check the "Evaluate" button output (What does it say right now ? What's the "logloss" and "RMSE" when you press "Evaluate" ?)

bold terrace
#

You see, the lower, the better. log loss can be lower by nature by having a higher Desired Retention, and RMSE can be lower by nature by having more reviews

#

RMSE can be more or less interpreted as "how much in terms of % FSRS will be wrong based on your history"

#

So in my case, ~90DR might very well lead to ~87-93% actual retention depending on the case

sacred seal
#

Thank you for everything I think I should go with these parameters and optimize from time to time

bold terrace
#

No worries, feel free to ask if you have other doubts/questions ! Glad to help

unique salmon
#

@polar maple @ashen light @hasty fractal @bold terrace @sonic forge @quasi shadow just pinging everyone who may care, lol. It's Load Balancing time!

Alright, I've set up the optimization loop (using a Bayesian optimizer) to optimize these powers that are used in the load balancer's weight formula:
(1 / np.power(r, due_power)) * (1 / np.power(delta_t, interval_power))
due_power and interval_power are parameters to be fine-tuned. The range for both is from 0.5 (square root) to 3 (cube).
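
As a sketch of how such a weight formula could be used: the code below assumes `r` is the number of reviews already due on a candidate day (at least 1) and `delta_t` is the candidate interval in days, and it normalizes the weights into probabilities for picking a day. Neither the meaning of the inputs nor the normalization is spelled out in this message, so treat both as assumptions. Plain `**` stands in for `np.power`.

```python
def load_balance_weights(candidates, due_power=2.0, interval_power=1.0):
    """candidates: list of (r, delta_t) pairs, one per candidate day.

    Returns one probability per candidate; busier and more distant days get less weight.
    """
    raw = [
        (1 / r**due_power) * (1 / delta_t**interval_power)
        for r, delta_t in candidates
    ]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical fuzz range of 9..11 days with 120, 80 and 100 reviews already due.
weights = load_balance_weights([(120, 9), (80, 10), (100, 11)])
print(weights)
```

Raising `due_power` makes the balancer avoid busy days more aggressively; raising `interval_power` biases it toward the shorter candidate intervals.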

We have two optimization objectives here: average absolute difference between true retention and desired retention, which I called avg_abs_ret_diff; and volatility, which is a measure of how much workload varies day-by-day. Example: if you had 120 due cards today and 100 due cards yesterday, volatility=20%.
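
One way to turn the volatility example into code, assuming the per-day relative changes are averaged over the whole simulation (the message only gives the single-day case) and that days following a zero-due day are skipped:

```python
def volatility(due_counts):
    """Mean absolute day-over-day relative change in the number of due cards."""
    changes = [
        abs(today - yesterday) / yesterday
        for yesterday, today in zip(due_counts, due_counts[1:])
        if yesterday > 0  # assumption: skip days with nothing due the day before
    ]
    return sum(changes) / len(changes) if changes else 0.0


print(volatility([100, 120]))           # the example above: 20%
print(volatility([100, 120, 90, 95]))   # a longer, made-up history
```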

We want to minimize both. However, when minimizing two different objectives, you often run into a situation where you cannot make one better without making the other one worse. The set of such trade-off points is called a Pareto frontier.
So instead of getting one set of parameters as a result, we will get a bunch of Pareto-optimal (can't-improve-A-without-making-B-worse) sets of parameters.
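
A minimal sketch of what "Pareto-optimal" means in code, for two objectives that are both minimized. The points are made up and are not the simulation results.

```python
def pareto_front(points):
    """Keep the (a, b) points that no other point beats on both objectives."""
    front = []
    for i, (a, b) in enumerate(points):
        dominated = any(
            (a2 <= a and b2 <= b) and (a2 < a or b2 < b)
            for j, (a2, b2) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((a, b))
    return front


# (avg_abs_ret_diff in %, volatility) for five hypothetical parameter sets
candidates = [(1.0, 0.20), (1.2, 0.12), (0.9, 0.15), (1.1, 0.11), (1.3, 0.11)]
front = pareto_front(candidates)
print(front)
```

A point is dropped only if some other point is at least as good on both objectives and strictly better on one.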

Simulation parameters
maximumInterval = 36500
new_cards_limits = 10
review_limits = 9999
max_time_limits = 10000 (IIRC this is in seconds)
learn_days = 100
deck_size = 1000
sample_size = 5
retentions = [0.7, 0.8, 0.85, 0.9, 0.95, 0.97, 0.99]
For each value of desired retention, the simulation runs sample_size times, for a total of 7*5=35 simulations per set of parameters. This is then repeated for 100 different sets of parameters. The same seeds are used across all retentions, for the sake of consistency.

Here are baseline averages and their 95% confidence intervals:

#

Fuzz (no LB)
avg_abs_ret_diff=1.06%±0.18%, volatility=0.170±0.026

Current double-weighted LB: due_power = 2 and interval_power = 1
avg_abs_ret_diff=1.16%±0.16%, volatility=0.115±0.014

Current double-weighted LB (predicted by the Bayesian model)
avg_abs_ret_diff=1.06%, volatility=0.117

The first one is a "raw" experimental result, the second one is given by the Bayesian model after it has processed 100 experimental results. And by "experimental" I mean "simulated".
Notice how much volatility is reduced compared to random fuzz!

I wanted to add a table with Pareto-optimal values, but decided that it's better as a graph.
Also, I tested removing
{ "start": 2.5, "end": 7.0, "factor": 0.15, },

from FUZZ_RANGES entirely, so that intervals <7 aren't fuzzed.

I have attached a visualization of the Pareto frontier. Well, two frontiers.

As you can see, the frontier with the current fuzz ranges and one with that first range removed are nearly identical. I wonder if it's because the simulation duration is 100 days, and if the effects would be noticeable if it was, say, 15 days.
The "utopia point" is a hypothetical point where both objectives are minimized. It's not actually obtainable. What we can obtain instead is a "knee point" - the point on the frontier closest to the utopia point.
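
A sketch of picking a knee point, assuming both objectives are first rescaled to [0, 1] over the frontier so that neither dominates the distance just because of its units. The rescaling is an assumption; the message does not say how distance to the utopia point was measured, and the frontier below is made up.

```python
from math import hypot


def knee_point(front):
    """Return the frontier point closest to the utopia point (min of each objective)."""
    a_values = [a for a, _ in front]
    b_values = [b for _, b in front]
    a_min, b_min = min(a_values), min(b_values)
    a_span = (max(a_values) - a_min) or 1.0  # avoid dividing by zero on a flat axis
    b_span = (max(b_values) - b_min) or 1.0

    def distance(point):
        a, b = point
        return hypot((a - a_min) / a_span, (b - b_min) / b_span)

    return min(front, key=distance)


# Hypothetical frontier of (avg_abs_ret_diff in %, volatility) pairs
frontier = [(0.8, 0.20), (0.9, 0.13), (1.2, 0.11)]
knee = knee_point(frontier)
print(knee)
```

Here the utopia point would be (0.8, 0.11), which no single parameter set achieves; the middle point is the closest compromise.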

So what's the best course of action given all of this? I suppose we can modify the powers. We can use due_power=2.150 and interval_power=3.000 (knee point), which would give us 0.88% abs. diff. in retention and 0.115 volatility, which is better than the current implementation. I doubt anyone would notice a difference, but even more so I doubt that it would make anything worse, so I guess why not.

hasty fractal
#

Not related to the message above me but I think one thing people are missing is that leech should not depend on what has already happened but what is going to happen to a card.

For example, if a metric starts at low and goes at a high after some time that's not a good metric. So volatility of the metric itself is something that matters. I know I've repeated this a few times before but that was a long long time ago.

unique salmon
hasty fractal
#

more like, by looking at "history", if u can't say the future (which is the case if the metric is really volatile) then it's not working

#

we can look at history, but I'm saying the goal is in the future so that matters

unique salmon
#

We'll see how my leech detector behaves if Jarrett manages to implement it in the Helper add-on. Worst case, we can just add a rule that a card cannot change its status as leech/not a leech after every review, only once per 2 reviews or per 3. Though I don't think we will actually need that, especially with the correction I described here: #1282005522513530952 message

bold terrace
#

You get a review wrong, but you didn't necessarily notice it's the third time in a row with a DR of 95% ... Well, at least that detection would be useful

cursive badge
ashen light
#

yeah shove those numbers in go make a pr

#

you can do it

unique salmon
#

Alternative:

```python
import numpy as np


def fast_poisson_binomial_pmf(p):
    """
    Calculate the exact PMF of the Poisson Binomial distribution using
    dynamic programming and vectorized NumPy operations.

    Parameters:
    -----------
    p : array-like
        Array of success probabilities for each Bernoulli trial

    Returns:
    --------
    numpy array of PMF values for k=0,1,...,len(p)
    """
    p = np.asarray(p, dtype=np.float64)
    n = len(p)

    # Validate input
    if not np.all((0 <= p) & (p <= 1)):
        raise ValueError("All probabilities must be between 0 and 1")

    # Handle trivial cases
    if n == 0:
        return np.array([1.0])

    # Initialize the PMF - we'll use a dynamic programming approach
    # pmf[j] will represent P(X = j) after considering the first i trials
    pmf = np.zeros(n + 1, dtype=np.float64)
    pmf[0] = 1.0  # Base case: probability of 0 successes with 0 trials is 1

    # Process each probability one at a time
    for prob in p:
        # For each new Bernoulli trial, we update the entire PMF.
        # The key insight: P(X=k after adding new trial) =
        #   P(X=k with no success in new trial) + P(X=k-1 with success in new trial)

        # Calculate the effect of this probability on the entire PMF at once
        # This is where the vectorization happens
        pmf_shifted = np.zeros_like(pmf)
        pmf_shifted[1:] = pmf[:-1] * prob  # Probability of success for this trial

        # Update PMF by combining the two possibilities
        pmf = pmf * (1 - prob) + pmf_shifted  # No success + success for this trial

    return pmf
```

And then just

```
pmf = fast_poisson_binomial_pmf(trial_probabilities)
p = sum(pmf[0:trial_success_count + 1])
```
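
To get a feel for what that left-tail sum means, here is a minimal self-contained sketch. The probabilities and success count are made-up numbers, and `left_tail_p_value` is a hypothetical helper name, not something taken from any existing code.

```python
import numpy as np


def poisson_binomial_pmf(probs):
    """Exact PMF of the number of successes, built by repeated convolution."""
    pmf = np.array([1.0])
    for prob in probs:
        # Each trial convolves the running PMF with [P(fail), P(success)].
        pmf = np.convolve(pmf, [1.0 - prob, prob])
    return pmf


def left_tail_p_value(probs, successes):
    """P(X <= successes): how surprising it is to have done this badly."""
    pmf = poisson_binomial_pmf(probs)
    return float(pmf[: successes + 1].sum())


# Made-up example: five reviews, each predicted at R = 0.9, only two passed.
p_value = left_tail_p_value([0.9] * 5, successes=2)
print(f"p = {p_value:.5f}")  # prints p = 0.00856
```

A p-value that small says "FSRS expected far more passes than this", which is the signal the detector keys on.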
ashen light
#

protip: use ``` instead of `

unique salmon
#

Ok, nvm, just committed again and it worked, lol

robust hill
#

who was this targeted at

unique salmon
#

100% accurate, can confirm

ashen light
#

@unique salmon your post text references the 7day test stuff, which kinda distracts from the issue

unique salmon
#

It's relevant if we want to know whether removing that range is worth it or no

ashen light
#

also leaking secret discord r&d

#

yeah but thats an entirely different topic

unique salmon
#

It's related to tuning the load balancer though

ashen light
#

basically: this pr has a bunch of information that has nothing to do with the pr

#

whats gonna happen is dae is gonna ask about it

#

and this merge will be delayed for weeks over discussing things that have zero relevance to what we want merged right now

ashen light
#

like for sure do the 7day stuff, just in a separate issue/thread

#

yeah but this is just chumming the fucking waters

#

my point is if you want focused discussion, leave out as much unnecessary info as possible

unique salmon
#

Alright

ashen light
#

(and when you've fixed that go make an issue detailing the 7day stuff and paste all the stuff removed from this pr into that)

unique salmon
#

It doesn't seem like it's worth it

#

The only thing it would achieve is making Yuki and Sound happier 🤣

ashen light
#

yeah who cares about them

unique salmon
#

Lol

ashen light
#

let them toil in the leech mines

unique salmon
#

I mean, no offense to them, but that doesn't seem like a strong argument to tweak fuzz, especially if simulations show that it makes like a 0.001% difference

ashen light
#

as they continue to fail short-term cards forever instead of just maybe studying that card harder or something I dunno

unique salmon
#

Maybe after the new tweak they will magically notice that fuzz now works better

#

I wouldn't bet on that, though

robust hill
#

noooo

#

i love leech destructinator 9000

hasty fractal
#

@unique salmon another thing: I notice that FSRS performs slightly worse in very low R range (for some of my deck I mean). Maybe we should consider that too. So, the threshold should depend not only on how many time you failed it but also on how many times you failed other cards in that R range if that makes sense. Wdyt?

unique salmon
#

That's a known issue, yes. However, I can't think of a way to incorporate that into the detector

hasty fractal
#

one step ahead of me haha

#

btw, have u thought about calculating the optimal threshold for suspension using the simulator maybe?

#

because I think it shouldn't just be smth arbitrary

unique salmon
#

That's why I'm asking people how many leeches they have

#

#general message
Another data point
So far we have 0.7%, 1.5%, 5% and 20%

#

But we need an actual survey. The problem is that there aren't that many people who will go over hundreds or thousands of cards to report to us

hasty fractal
#

I don't know why you want that honestly

#

that data in itself sounds arbitrary

#

I have 5% leech but I could've had more if I decided a different leech threshold

cursive badge
#

I just had a go with the successes <= k version and the threshold correction still seems to be wonky. I set the threshold to 0.9 and it still only found 3 leeches 😂 . Either I've messed it up or it doesn't work.

ashen light
#

I mean whos to say someone didn't make the leech threshold something completely silly like 300

#

and so they have zero leeches

#

I may or may not be referring to myself here

hasty fractal
#

sus

unique salmon
#

If people themselves can't decide on how many leeches they have, then this idea is doomed

hasty fractal
#

bro u r the numbers guy. say something objective.

#

fuck people

unique salmon
#

Like, what on Earth even is this...

#

Is this Linux stuff?

ashen light
#

could be mac

#

its the next attempt at making python packaging not absolute trash

cursive badge
unique salmon
#

Oh, yeah, I didn't know about uv

unique salmon
cursive badge
#

You just run it in the terminal from the repo directory.

unique salmon
#

So if I want to put print() everywhere to see what's going on, how do I do that?

ashen light
#

open your text editor and put prints in there

cursive badge
unique salmon
#

I get that

#

I don't get the rest

#

Btw, I made a PR, in case you didn't see it

#

Doing a bunch of math is the easy part, the hard part is running it on Anki data

ashen light
#

can there be a feature that instead of tagging, it just prints out the cards it flagged

ashen light
#

oh perfect I just saw "This will directly modify your Anki collection. If things go wrong you may experience data loss." and walked away

cursive badge
#

like this:

Found leech - cid:1722559395989 - p:2.7166874125840693e-06 - thresh:8.993004465296252e-05
Found leech - cid:1722559396215 - p:6.650463663121781e-08 - thresh:6.907390375819012e-05
Found leech - cid:1722559396279 - p:2.4991422915605144e-06 - thresh:9.991497291245872e-05
#

I just don't want to accidentally break someone's collection if I mess up.

cursive badge
# unique salmon So I downloaded the code from my github branch, what next?
  • Install uv https://docs.astral.sh/uv/getting-started/installation/
  • Find the location of your Anki DB (something like this on windows: C:\Users\USERNAME\AppData\Roaming\Anki2\PROFILE\collection.anki2)
  • Close Anki
  • Open a terminal and cd to the local copy of the repo
  • Run one of the example commands e.g. uv run -m leechkit PATH_TO_ANKI_DB

By default it will try to find leeches in the last deck you had open. See the README.md for options to twiddle with this.

#

You could also export a .colpkg, unzip it, and use the collection.anki2 inside there if you want to be extra safe. You cannot use --write and the deck browser to inspect cards if you do that though.

ashen light
#

Found 35 leeches

@unique salmon add this to your leech samples

#

@cursive badge your thing shouldn't count new cards in the process count, those don't exactly have the opportunity to be leeches yet

#

(I manually removed the new cards from my numbers above)

unique salmon
#

It prints a bunch of stuff, not the stuff I want, and I get an error

cursive badge
#

Oh. I have no idea why stability would be 0 😕

unique salmon
#

For the record, this card is suspended and has never been reviewed using FSRS

#

It also has this for some reason

cursive badge
unique salmon
#

Oh jesus, now it just goes brrrrrr
Can I limit it to, say, 10 cards?

cursive badge
#

I haven't put anything in to limit it because it didn't take too long with my collection. You could just manually tag the notes you want to play with and put that in the query.

unique salmon
#

How?

cursive badge
#

In the Anki browser add a tag e.g. test then give my script --query "tag:test". It should let you do anything you can search in the Anki browser I think.

west whale
#

Hello, watched the latest video about Anking 2025 settings, and got asked about using empty learning steps

are they recommended over single 10-30min learning steps as mentioned in FAQ?

Will empty learning steps be the default when FSRS are auto toggled in future releases?

ashen light
#

well with the default settings, I went through the cids that it deemed as leeches and those cards definitely are leechy

unique salmon
#

Ok, the correction is insanely conservative. Let's just turn it off and use a flat 2% threshold. Correction makes it basically impossible for anything to be identified as a leech

#

The math on probabilities checks out, btw

#

@cursive badge how do I configure the threshold in the query?

cursive badge
#

e.g. --leech-threshold 0.02

#

Oops. I did underscores in the README instead of hyphens

unique salmon
#

Man it's slow 😔

#

Btw, the "Checking cards" percentage seems to be bugged

#

Either it's bugged, or it will take longer than the age of the Universe to complete

#

But yeah, I'm going over cards that are identified as leeches at a 2% threshold, and they definitely feel like ones

#

I don't see any false positives so far

cursive badge
# unique salmon Man it's slow 😔

A lot of the slowness seems to be from jank, not inherently in the algorithm.
Anki was printing out lots of annoying messages about blocking the main thread. When I suppressed them the script became a lot slower for some reason.

bold terrace
unique salmon
#

Even cards that are just barely below the threshold, like 1.9%, feel like leeches

#

Interesting

#

I wonder if 5% would be ok

#

Then again, I have a ton of leeches

cursive badge
bold terrace
#

Let's hope Skynet doesn't use your algo at some point

#

At least it's more gentle than Thanos

#

(24% vs 50%)

unique salmon
#

Processed 34472 cards
Found 120 leeches

That's 0.35%. I definitely have more leeches than that

#

Man, we need a survey so bad

cursive badge
unique salmon
#

You still haven't merged my PR 😅

#

But yeah, turn off correction and set 2% for now

cursive badge
#

I left the PR open because I didn't know if you were going to rebase and add something else.

unique salmon
#

wat

#

how

#

Like, it's clearly used

#

But it doesn't exist

#

But it's used...

#

But it's nowhere to be found...

#

What kind of african american sorcery is this...

cursive badge
#

✨ magic ✨ (src/leechkit/pbd.py)

unique salmon
bold terrace
#

Processed 6262 cards
Found 44 leeches

unique salmon
#

Idk why Github can't find it

bold terrace
#

(Seems that for hierarchical decks it's not matching all of those)

unique salmon
bold terrace
#

(If I select "Japan", which contains a lot of decks, it found 0 out of 500 cards)

#

uv run -m leechkit '/Users/<user>/Library/Application Support/Anki2/User 1/collection.anki2' --query "Yomitan" --flag --write

cursive badge
bold terrace
#

┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Option ┃ Value ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ query │ Yomitan │
│ skip_reviews │ 3 │
│ leech_threshold │ 0.05 │
│ dynamic_threshold │ False │
│ tag │ maybe-leech │
│ flag │ True │
│ write │ True │
└───────────────────┴─────────────┘

#

the default values 🙂

#

I'm on master

#

On yours technically

unique salmon
bold terrace
#

commit c8379e0b52f3c61014c4f8d864f8cd954bcb3674 (HEAD -> master, origin/master, origin/HEAD)
Author: Ross Brown [email protected]
Date: Fri Mar 14 20:10:24 2025 +0000

#

Processed 6262 cards
Found 21 leeches
with threshold .02

cursive badge
bold terrace
#

It worked !

#

│ No such option: --leech_threshold (Possible options: --dynamic-threshold, --leech-threshold, --no-dynamic-threshold) │

#

it was --leech-threshold

#

but apart from that underscore->dash, it worked fine

#

let me review them 🙂

#

ah wait anki was open

#

seems I need to close it first

cursive badge
unique salmon
#

Alright, so 2% works well, at least for me. Now we need to see how much the p-value changes after every review. Give me a moment

#

Actually, no, I won't be able to do that

#

Since I need the number of successes after every review, not just the most recent value

cursive badge
#

Also note if you are using --write it does not remove the tags/flags it added last time. You have to manually remove them first.

unique salmon
#

Ross, can you plot how the p-value changes after every review?

#

For any card you want

#

We need to see the dynamic

#

Well, preferably for a card that is currently at around 2%

bold terrace
#

So first observation, it seems indeed to be quite related to Difficulty: my 21 results are among the "most difficult" cards

unique salmon
#

So of course there is some correlation with D

bold terrace
#

The one flagged that was in the "middle pack" of FSRS D, is one that indeed, seems fishy

#

it's the 2350th most "difficult" card out of 3186

unique salmon
bold terrace
#

but indeed, from the look of the recent shape, for those, it brings some value

cursive badge
unique salmon
#

It will be a huge pain to figure out how to do this

#

If anything, manually writing down a list of successes sounds easier 🤣

bold terrace
#

By default it takes the full history into account, right ? Also, the doc for
--skip_reviews - The number of days with reviews to ignore to let the FSRS state stabilise
Is not super super clear to me 🤔

unique salmon
#

Like [1, 2, 3, 3, 3, 4, 5, 5, 5, 6]

unique salmon
bold terrace
#

Another observation, it feels indeed like an evolution of the "Lapses" threshold, since most leech are indeed with high lapse, but certain high lapse are just "normal"

cursive badge
bold terrace
#

Ok !

#

It ignores same-day reviews also I guess ?

cursive badge
bold terrace
#

Ok ! But it's good, for example the top most lapses, I have a lot of 2-3 fail in a row, but often "the same day"

cursive badge
#

If there are same-day reviews it counts as one, but it calculates the elapsed time between reviews by looking at the first/last review of the day depending on which side you come from.
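
A rough sketch of that same-day collapsing, for anyone trying to picture it. Everything specific here is an assumption and may not match what the script really does: times are plain floats measured in days, the first answer of a day is the one that counts, and the gap is measured from the last review of the earlier day to the first review of the later day. `collapse_same_day` is a hypothetical name.

```python
from itertools import groupby


def collapse_same_day(reviews):
    """Collapse same-day reviews into one entry per day.

    reviews: list of (time_in_days, passed) tuples sorted by time.
    Returns (elapsed_days, passed) for every review day after the first.
    """
    days = []
    for _, group in groupby(reviews, key=lambda review: int(review[0])):
        group = list(group)
        days.append(
            {
                "first": group[0][0],   # time of the first review that day
                "last": group[-1][0],   # time of the last review that day
                "passed": group[0][1],  # ASSUMPTION: the first answer counts
            }
        )

    collapsed = []
    for previous, current in zip(days, days[1:]):
        # ASSUMPTION: the gap runs from the last review of the earlier day
        # to the first review of the later day.
        collapsed.append((current["first"] - previous["last"], current["passed"]))
    return collapsed


reviews = [(0.4, True), (0.5, True), (3.2, False), (3.3, True), (10.1, True)]
collapsed = collapse_same_day(reviews)
print(collapsed)
```

Five raw reviews become two scored ones here, so a fail followed by a same-day relearn only counts once.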

unique salmon
#

Anyway, I want to plot the p-value as a function of the number of reviews for some card. I guess I'll just write values down manually later, probably tomorrow

#

It's unclear how to treat successes and lapses, though. I mean, what would be on the X axis?

#

Reviews? Successes only? Lapses only?

bold terrace
#

One graph for each lol

unique salmon
#

Reviews makes the most sense, but then how do I differentiate between successes and lapses?

bold terrace
#

Maybe I'm oversimplifying, but isn't it Reviews-Lapse=Success ?

#

I check a few example here it seems to be that for smaller history cards

#

Hmm "Learn" steps messed it up

#

It's 16 reviews, 1 lapse, but the 4 learn are included in the 16 reviews but not in the lapse

unique salmon
#

How do you tell whether the p-value changed after a success or after a lapse?

bold terrace
#

I'll stop here for today, but for now at least, I think it looks consistent enough that it doesn't mark as a leech something that shouldn't be one (I tried with .02 and .01), so that's already better than the current leech = lapse > threshold of Anki

#

Where I'm a bit wondering if we shouldn't show the p-value for the cards to the user, and also maybe only look at X last reviews

#

It's not always super super clear why one is a leech and not the other with big histories

unique salmon
#

I definitely don't intend to show it to the user. I don't see the point in only using the last X reviews, though

bold terrace
#

For example this is not a leech :

#

But this is :

#

And this is not :

#

You see it has some consistency, but it's a bit difficult to understand why certain were not detected

unique salmon
#

Maybe anything with >15 lapses should just forever be labeled as a leech 🤣

bold terrace
#

I really don't think so

unique salmon
#

I don't think that using only the last X reviews will make it more consistent, though

#

And it can lead to the card's status changing all the time

bold terrace
#

It's just that right now, I don't know if something was detected as a leech because of a bad streak from last year

polar maple
#

does the code right now only detect a leech after looking at the full history or does it indicate if any prefix of the history would put the card in the leech threshold?

bold terrace
#

Also, after ~14months of learning japanese, this is my median stability over time

As you can see, for the past half year, my stability median was lower than 8d, for a 80% DR, it means having higher lapse count is not that "rare"

#

And that's why I think the current leech detection in Anki is just bad : It's perfectly normal to have high lapse count when your stability is still low in average/median-wise

#

So I don't criticize really that algo by itself, I think it might be more nuanced than simply the lapse>threshold

bold terrace
#

It's just a bit difficult to interpret ("Was it a specific event that caused that leech ? Was it something from the past ? Something recent ?")

#

Right now, I see some High D/High Lapse cards are leeches and others are not

polar maple
# unique salmon The full history

then isn't that maybe why you were finding so few leeches with the dynamic threshold? it is only considering one timestep rather than the sum of them

bold terrace
#

But I'm not sure which ones need some "extra care"

unique salmon
#

Well, not all, probably, since Ross got a few leeches

#

The thresholds just get too conservative too fast as the number of reviews increases

bold terrace
#

I got 0 leech with dynamic threshold

#

But once again most of my leeches have >70 reviews

#

And thank you @cursive badge for that implementation 🙂 Very instructive

polar maple
#

can try a 1.1^n threshold, it is less conservative for medium-sized reviews

#

like 5-70

unique salmon
#

I don't think it's necessary. For example, with a "nominal" threshold of 2% it actually tagged only 0.35% of my cards

#

So we probably have the opposite issue

#

Then again, we still haven't seen how it behaves if we calculate the p-value after every review

polar maple
#

so we don't know if it is necessary or not yet

unique salmon
#

@cursive badge I'd like you to add an extra command to iterate over all reviews and calculate the p-value after every review, and if it dropped below the threshold at any point, tag the card as a leech. So if the p-value has dipped below the threshold once in N reviews, it's a leech
So we calculate the p-value for each card AND each review
I want to see how many cards will be tagged
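
A minimal sketch of what such a per-review check could look like, assuming we already have the predicted R and the pass/fail outcome for each review. The function names and the example history are made up for illustration and are not taken from leechkit.

```python
def incremental_p_values(probs, outcomes):
    """Left-tail p-value P(X <= successes so far) after each review.

    probs[i] is the predicted recall probability for review i and
    outcomes[i] is True if that review was passed.
    """
    pmf = [1.0]  # PMF of the success count over the reviews seen so far
    successes = 0
    p_values = []
    for prob, passed in zip(probs, outcomes):
        # Extend the PMF by one Bernoulli trial.
        new_pmf = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new_pmf[k] += mass * (1.0 - prob)
            new_pmf[k + 1] += mass * prob
        pmf = new_pmf
        successes += int(passed)
        p_values.append(sum(pmf[: successes + 1]))
    return p_values


def ever_below(probs, outcomes, threshold):
    """True if the p-value dipped below the threshold after any review."""
    return any(p < threshold for p in incremental_p_values(probs, outcomes))


# Made-up history: two early fails, then four passes, all predicted at R = 0.9.
probs = [0.9] * 6
outcomes = [False, False, True, True, True, True]
p_values = incremental_p_values(probs, outcomes)
print([round(p, 4) for p in p_values])
print(ever_below(probs, outcomes, threshold=0.05))
```

In this example the final p-value is back above 5%, but the card still gets tagged because it dipped below 5% after the second review. That is exactly the difference between checking once at the end and checking after every review.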

#

Also, I feel like we're better off just using the Bonferroni correction and assuming some average number of reviews per card's life

#

We can estimate that average from the 10k dataset
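
For reference, the Bonferroni correction itself is just "divide the nominal threshold by the number of tests". A tiny sketch, where the 10 tests per card is a placeholder guess rather than a measured value:

```python
def bonferroni_threshold(nominal, n_tests):
    """Per-test threshold that keeps the overall false-positive rate <= nominal."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return nominal / n_tests


nominal = 0.05
n_tests = 10  # ASSUMPTION: placeholder for the average number of reviews per card
per_review_threshold = bonferroni_threshold(nominal, n_tests)

# If the tests were independent, the chance of at least one false alarm
# across all of them would be:
family_wise_rate = 1.0 - (1.0 - per_review_threshold) ** n_tests
print(per_review_threshold, family_wise_rate)
```

The p-values after consecutive reviews of one card are strongly correlated, so in practice the correction is likely to be more conservative than the independent case suggests.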

polar maple
#

why not exponential decay?

unique salmon
#

?

polar maple
#

1 / 1.1^n or something

unique salmon
#

I don't like the infinite series dumbassery

#

Let's just analyze 10k and find the average number of reviews per card

#

Btw, I ran the code with a 5% threshold and 0.74% of my cards got tagged

polar maple
#

what does the FSRS calibration graph look like on your collection?

unique salmon
#

Wait, if we only use the last N reviews, then we can just use N in the Bonferroni correction. Problem solved

unique salmon
#

Why?

polar maple
#

if a card has n reviews but you only use a sliding window of the last m reviews, you are still calculating the p value n times but the bonferroni correction is using m

unique salmon
#

Oh, yeah

#

Right

#

Then let's just analyze the 10k dataset and find the real average

polar maple
#

for now just make a random guess like 15 and see if it gets what you want

polar maple
#

a perfect memory model should always achieve around 5% tags

cursive badge
# bold terrace And this is not :

Maybe it is because this metric defines a leech as something that FSRS does not predict well. You have a big clump of reviews that had a really low R at the start. If FSRS predicts that you would fail a lot and you do, that is not considered a leech.
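
A small worked example of that point, using made-up numbers: two cards with the same history (10 reviews, 5 passed) but different predicted R. To keep it short it assumes every review of a card had the same predicted R, so a plain binomial is enough.

```python
from math import comb


def left_tail(successes, n, r):
    """P(X <= successes) for n reviews that all had predicted recall r."""
    return sum(comb(n, k) * r**k * (1.0 - r) ** (n - k) for k in range(successes + 1))


# Same history for both cards: 10 reviews, 5 passed.
p_high_r = left_tail(5, 10, 0.9)  # FSRS expected about 9 passes
p_low_r = left_tail(5, 10, 0.5)   # FSRS expected about 5 passes

print(f"predicted R = 0.9: p = {p_high_r:.4f}")
print(f"predicted R = 0.5: p = {p_low_r:.4f}")
```

With a 2% threshold only the first card is flagged, even though both were failed the same number of times.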

polar maple
unique salmon
bold terrace
#

Or even sometimes way lower if I fail it multiple time in learning phase

#

I remember I had a few cards with 30-50% R when I really struggled to recall them just for one day

#

Which also means the algorithm here is a bit more lenient with very very very low stability cards

#

Well, it does what it's asked to do, it marks things that FSRS didn't predict well

#

But if a card has a very very low R every day, because of such a low stability, it won't necessarily see it as a leech, because FSRS indeed also agreed it was a very low R

#

Which is funny, because then it means there's even a third way of defining a "leech" !

  1. Card being failed way more than what FSRS would expect
  2. Card very difficult (High D)
  3. Card with very low stability (thus low R at time of review).

More Philosophy I guess, @unique salmon

#

It just reinforces my opinion that "leech" is a very ambiguous and not that helpful term haha

#

And that algo is then quite a very valuable one, since a "fail", more than just compared to the "expected DR", is actually compared to the "R" which might have been way lower than the DR. Nice !

unique salmon
#

If a card's S isn't <1, then, assuming the user reviews it when it's due, R should be close to DR

#

It's just that Anki is jank when it comes to short-term reviews

bold terrace
#

(close to DR you mean ?)

unique salmon
#

Yeah, my bad

bold terrace
#

Sure

unique salmon
#

Anyway, I wouldn't call cards with S<1d leeches

bold terrace
#

But my point is not really that I believe in those 3 points, but more that I can understand that for some people, having a card that stays at low stability could be a "leech" by their definition

unique salmon
#

Not "I sorta kinda disagree", but "I can't imagine any argument that could convince me otherwise"

bold terrace
#

Personally I don't know

#

Leeches are thing that suck blood, not sure how well it applies to memorization

#

It leeches if it decrease your performance ?

#

😄

#

Joke aside, the "leech" definition I see the most is "it leeches your workload"

#

Like, you do more workload because of it, for just one card

#

Buuuuuuuuut then people might be tempted to only add card they already know quite well

#

Isn't it a bit strange to make a lot of Anki cards for things you might not really need SRS for in the first place ?

#

If I'm able to remember "Self-Loathing" (a word I learnt a few months ago) many weeks from the first occurrence, and then for multiple months at the second occurrence ... Should I put it in Anki ?

#

Or should I add those words I keep forgetting again and again (Dweller, Delver, ...)

#

If you listen to "anti-leech" people, they'll say they would be leeches so not worth my time

#

Ok big boys, but when do I actually memorize those though

#

Next step : "Let's add only cards for things you'll never lapse more than N times !"

#

The goal is not to Anki-fy your life, it's the material that you want to learn

polar maple
# unique salmon I don't understand. If FSRS is inaccurate and we are measuring how likely it is ...

this metric doesn't necessarily measure where FSRS is inaccurate. You can do the same calculation with a fair coin, yet, how would you interpret the bottom 1%? It really just comes down to luck and no other underlying explanation for this case. Or alternatively, think of how you would interpret this metric if you had a perfect memory model. The hope is that with this metric, the detected leeches do actually correlate with incorrect FSRS predictions, but it's hard to tell how well the metric does this.

for your question, imagine if FSRS only ever predicts R = 50% but the truth is 100%. Cards would never lapse and nothing would fall below the 1% threshold. If the truth is 90% and the predicted is 50%, you get less detections, etc etc. This might be what is happening judging by the calibration graph, predicted 50% by FSRS corresponds to actual 70% for your collection
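
That argument can be checked with a few lines of arithmetic. The sketch below is only an illustration with made-up settings (20 reviews per card, a constant predicted R, a 1% threshold); `flag_rate` is a hypothetical helper, and real cards have a different R at every review.

```python
from math import comb


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def flag_rate(n, predicted_r, true_r, threshold):
    """Chance that a card is flagged after n reviews.

    A card is flagged when its left-tail p-value, computed with the
    *predicted* R, is below the threshold. How often each outcome really
    happens is computed with the *true* R.
    """
    rate = 0.0
    cdf = 0.0
    for k in range(n + 1):
        cdf += binom_pmf(k, n, predicted_r)  # P(X <= k) under the prediction
        if cdf < threshold:
            rate += binom_pmf(k, n, true_r)  # how often k passes really occur
    return rate


calibrated = flag_rate(20, predicted_r=0.5, true_r=0.5, threshold=0.01)
underestimated = flag_rate(20, predicted_r=0.5, true_r=0.9, threshold=0.01)
never_fails = flag_rate(20, predicted_r=0.5, true_r=1.0, threshold=0.01)
print(calibrated, underestimated, never_fails)
```

When the prediction is calibrated the flag rate sits just under the threshold. When FSRS underestimates R the rate collapses towards zero, and it is exactly zero if the card never lapses.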

unique salmon
#

imagine if FSRS only ever predicts R = 50% but the truth is 100%. Cards would never lapse and nothing would fall below the 1% threshold.
Nvm, I get your example now

#

Yeah, since we're only measuring p "from the left", if some cards are never failed, we will not detect those anti-leeches

#

I mean, we could try to detect those as well, it's just that in the context of Anki and spaced repetition I don't see a good reason to find anti-leeches

polar maple
#

to adjust for calibration we can instead define leeches to be in the bottom 1% of historical likelihood, aka use percentiles instead of comparing it to a fixed threshold
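
A minimal sketch of the percentile idea, assuming we already have one p-value per card. The card ids and p-values below are fabricated purely so the example runs, and `bottom_percent` is a hypothetical name.

```python
def bottom_percent(p_values, percent):
    """Return the card ids whose p-values are in the lowest `percent` percent.

    p_values: dict mapping card id -> p-value.
    percent: whole-number percentage, e.g. 1 for the bottom 1%.
    """
    n_flag = len(p_values) * percent // 100
    ranked = sorted(p_values, key=p_values.get)  # lowest p-value first
    return ranked[:n_flag]


# Fabricated data: 200 cards with evenly spread p-values.
p_values = {cid: (cid + 1) / 200 for cid in range(200)}
leeches = bottom_percent(p_values, percent=1)
print(leeches)  # prints [0, 1]
```

The trade-off is that this always flags the same share of the collection, no matter how well or badly calibrated FSRS is for that user.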

unique salmon
#

Nah, let's just use a fixed threshold

#

I'll try some plotting tomorrow, unless Ross will be ahead of me

#

I want to see how the p-value changes after every review

unique salmon
# west whale Bump
  1. Right now the steps from the Helper add-on are recommended
  2. Probably no, I doubt they will be the default
unique salmon
#

I'm analyzing 10k. Either I screwed up or a card gets reviewed on average...5.7 times in its lifetime. Without same-day reviews and the first review, that is

Cards=77559205
Average n reviews=5.7
Median n reviews=4.0
99th percentile=37.0

cursive badge
# unique salmon <@347088848854974465> I'd like you to add an extra command to iterate over all r...

I did a first pass at this, but will not have time to check I did not break things until later:
https://github.com/rbrownwsws/leechkit/pull/3
New flag: --incremental-check
I think _calculate_incremental_leech_probabilities should give you the data you want for plotting if you want to give it a go. (maybe draw a line then plot red/green points on top to show success/failure?)

cosmic hedge
# cursive badge I did a first pass at this, but will not have time to check I did not break thin...

you might wanna post a link to your repo on the forum post https://forums.ankiweb.net/t/automated-leech-detection/56887

sacred seal
#

Oh... I think I discovered why FSRS gives me such huge intervals. My desired retention is 0.90 but when I was studying with default settings my true retention was 98%
if I understood correctly

#

In this case I don't know what to do. I'm currently studying with default FSRS parameters

sacred seal
#

When I do it FSRS gives me HUGE intervals

#

For example 1.5 months for a newly learned card

sacred seal
#

I think I didn't use the "hard" button, but I'll try this "Remedy Hard misuse" and see how it goes

unique salmon
sacred seal
#

When it comes to adjusting desired retention another problem appears. New cards have optimal intervals but mature cards have too small intervals

sacred seal
unique salmon
sacred seal
#

So I think I finally managed to make FSRS useful for me. I made two different settings for two different decks, and based on my "true retention" I set desired retention at 98%. I started a session and it looks great.

#

thank you all for your help

ashen light
#

@unique salmon cg on the pr merge 🍃

#

nows the time to keep that momentum and do the leech thing 🍃

polar maple
unique salmon
polar maple
#

so like how you wanted S*R instead of just sum of R

#

also such a definition could function as a leech detector, if this expected cost is too high then you count it as a leech

unique salmon
#

@cursive badge I ran your code with --leech-threshold 0.05 --no-dynamic-threshold --incremental-check. It tagged 2405 cards as leeches out of 35024 cards, or 6.87%. That's actually not so bad, only a little above the nominal 5% value.
I re-ran it with a threshold of 3.6% and it tagged 1889 cards as leeches out of 35024 cards, or 5.39%. So it seems that at least on my collection we just need to divide the nominal value by a constant (1.3-1.5) to get good coverage. Good = close to nominal.
@polar maple @bold terrace try it out, guys.

polar maple
#

but don't you get 0.3% with only one detection at the end of the history? So it's like a 15x increase or something?

unique salmon
#

?

polar maple
#

your collection was already way lower than the 5% to begin with

unique salmon
#

Ah, yeah, x9 increase

polar maple
#

so for a properly calibrated FSRS it could overshoot 5% by a ton

#

if x9

unique salmon
#

Try it and report the results, I'm curious. Sound, you too

cursive badge
#

For science:

--query "deck:JapAnki is:review" --leech-threshold 0.05

Processed 5774 cards
Found 77 leeches
(~1.3%)

--query "deck:JapAnki is:review" --leech-threshold 0.05 --incremental-check

Processed 5774 cards
Found 435 leeches
(~7.5%)
unique salmon
#

I really hope that we can just divide the threshold by 1.3-1.5 and everything will work just fine 😅

cursive badge
#

I can see why some of these cards will trip the leech detector at some point with --incremental-check, but not really be what I would consider a leech right now.

e.g. I obviously went through a rough patch with this card (interference?) but recovered:

unique salmon
#

Ok, I want you to implement one more thing: report the average number of "cross-overs" per card: when a card dips below the threshold or above the threshold.
Something like:
    if p[this_review] < threshold and p[last_review] >= threshold:
        crossover = True
    elif p[this_review] >= threshold and p[last_review] < threshold:
        crossover = True
    else:
        crossover = False

#

If this number is high, that means that this method will cause a lot of "This card is a leech...oh, sorry, it's not...oh, nvm it is...oh, nvm, it isn't..."
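
Counting those cross-overs for one card could look roughly like this. It assumes the per-review p-values are already available as a list; the numbers are made up and `count_crossovers` is a hypothetical name.

```python
def count_crossovers(p_values, threshold):
    """Number of times consecutive p-values land on opposite sides of the threshold."""
    crossings = 0
    for previous, current in zip(p_values, p_values[1:]):
        # A cross-over happens when exactly one of the two is below the threshold.
        if (previous < threshold) != (current < threshold):
            crossings += 1
    return crossings


# Made-up p-value history for a single card.
history = [0.30, 0.04, 0.06, 0.03, 0.02, 0.20]
print(count_crossovers(history, threshold=0.05))  # prints 4
```

Averaging this number over all cards gives the "flip-flop" statistic described above.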

polar maple
#

because it becomes equivalent to flipping a coin

unique salmon
#

Hmmm
Well, I guess we can always just implement a "don't change the card's leech status more often than once per 2/3/n reviews" rule

polar maple
#

even easy cards will, given a long enough time, become a leech with high probability

unique salmon
#

Wait, I re-ran it without the incremental check and got 1.56% leeches, even though it was <1% last time 🤔
@cursive badge I wasn't really following the PRs, but wasn't there a bug with using the wrong S value?

unique salmon
#

Yeah, that's what I thought

#

Ok, so the incremental check increases the chances of a card being tagged as a leech by x4.5 times, not x9

polar maple
#

also need to see how many times the test happens

#

maybe its a more straightforward formula

unique salmon
cursive badge
#

Some are very flip-floppy. This apparently went back and forth 12 times @0.05:

#

I guess you would want a higher "reset" threshold (hysteresis?) to prevent the flip-flopping.

unique salmon
unique salmon
#

I'm not sure I understand

#

You want to have 2 thresholds?

cursive badge
#

i.e. there is a low threshold you have to go under to become a leech, then once you are a leech there is a higher threshold you need to go over to become non-leech again.
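
A minimal sketch of that two-threshold (hysteresis) rule, assuming a list of per-review p-values for one card. The function name, the example history and the 0.05 / 0.20 pair are only illustrative.

```python
def leech_states(p_values, enter_below, exit_above):
    """Leech status after each review, using two thresholds.

    A card becomes a leech when p drops below `enter_below` and only stops
    being one when p climbs above `exit_above`.
    """
    if exit_above < enter_below:
        raise ValueError("exit_above must not be lower than enter_below")
    is_leech = False
    states = []
    for p in p_values:
        if not is_leech and p < enter_below:
            is_leech = True
        elif is_leech and p > exit_above:
            is_leech = False
        states.append(is_leech)
    return states


# Made-up p-value history for a single card.
history = [0.30, 0.04, 0.06, 0.03, 0.15, 0.25, 0.10]
states = leech_states(history, enter_below=0.05, exit_above=0.20)
print(states)
```

With a single 0.05 threshold this history would change status four times. With the 0.05 / 0.20 pair it changes only twice: once in, once out.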

unique salmon
#

Alright, implement that then

cursive badge
#

I haven't done it properly, but I've had a play by just hard-coding things. It's interesting but might be annoying tuning two thresholds.

I just tried 0.05 -- 0.20 and one card took 12 reviews to climb out of leechdom only to fail the next review and immediately become a leech again 😂

bold terrace
#

I don't have "--incremental-check", is it on a specific other branch?

#

ah wait I might not be on that branch

#

Ok now it's running

#

Hmmm

#

"--incremental-check - Check if card is a leech after every review. Mark as leech if card ever drops below threshold."

How is it different from before 😄 ?

#

Aah gotcha

#

Get flagged if at half the reviews it would have been marked as leech, even if it wouldn't with the full history

#

My only complaint is that having decks from 1.3y with a lot of review 80-12, it's never super super clear if the leech was detected because of something recent (that I should fix) or something old (that might be fixed since then)

#

So basically when you play with the threshold to get a certain %-age like 5%, the problem is that it might be too conservative because otherwise you would flag your entire collection, BUT, maybe it's OK that a lot of cards went through the "leech" status at some point

#

For example that one was detected as a leech

#

While it's on a good track right now

#

So I could make the threshold lower ...

#

But this one is not flagged already, so it wouldn't really make sense

#

So I don't know ? Some kind of "limit", or some "decay" that older reviews are less important than newer one ?

unique salmon
#

@polar maple I finally made a good O(1) time and memory approximation
You use it like this:

```
approximator = HighProbabilityBinomialApproximatorCDF()
for p in p_succ:
    approximator.add_trial(p)
p_value_binom_approx = approximator.cdf(n_succ)
```
Later I will try to integrate it into FSRS. For now I'm trying the decay=f(D) stuff
#

It's fine-tuned for high probabilities, so it's not that great when there are low probabilities, but still WAY better than the normal distribution approximation
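
For reference, the quantity being approximated here (successes across trials with different per-trial probabilities) follows a Poisson binomial distribution. The sketch below is not the approximator above; it is the exact dynamic-programming CDF (O(n²) time, O(n) memory) next to the normal approximation it is being compared against. Function names are hypothetical.

```python
import math


def poisson_binomial_cdf(p_succ, n_succ: int) -> float:
    """Exact P(successes <= n_succ) for independent trials with probabilities p_succ."""
    if n_succ < 0:
        return 0.0
    dist = [1.0]  # dist[k] = P(exactly k successes so far)
    for p in p_succ:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # this trial fails
            nxt[k + 1] += mass * p     # this trial succeeds
        dist = nxt
    return sum(dist[: n_succ + 1])


def normal_approx_cdf(p_succ, n_succ: int) -> float:
    """Normal approximation with continuity correction."""
    mean = sum(p_succ)
    var = sum(p * (1 - p) for p in p_succ)
    if var == 0:
        # Degenerate case: every trial is certain.
        return 1.0 if n_succ >= mean else 0.0
    z = (n_succ + 0.5 - mean) / math.sqrt(var)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


exact = poisson_binomial_cdf([0.9, 0.9, 0.9], 2)
approx = normal_approx_cdf([0.9, 0.9, 0.9], 2)
```

The exact version is useful as ground truth when checking how far any cheaper approximation drifts for high or low probabilities.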

tepid spoke
#

oO now how on earth did this happen. This was on default parameters.

#

Just wondered why that one graph looked so weird, having that one lonely card out there, in 2035

#

I hit reschedule on it via the FSRS helper, and now it's due on April first :D

#

(of this year)

unique salmon
tepid spoke
#

default 90%

#

never touched any of that on that deck

unique salmon
#

Huh. Yeah, that definitely shouldn't be >11 years

robust hill
#

😭

#

that is crazy

tepid spoke
#

1.8y matches what the FSRS addon rescheduled it to I think

tepid spoke
#

well, no. It did almost exactly 1y

#

rescheduling it didn't fix the graph though. The marked column is the last one with a card in it

bold terrace
#

I also see Ease=250%, so I guess it was already ease-maxed-out with SM2? You never pressed Easy but the ease factor was already 250%, strange

#

If I were you, I'd reschedule all cards with intervals >6-12 months

tepid spoke
#

That's how SM2 works though?

bold terrace
#

I think the initial ease factor is something like 150-170% no ?

#

But it's configurable

tepid spoke
#

wasn't it 250%?

#

I never touched any of that

#

neither SM2 params nor FSRS params, like I said

bold terrace
#

Maybe then !

tepid spoke
#

And it's only that one card which ended up so far away

bold terrace
#

I didn't use SM2 very long, maybe a few days lol

tepid spoke
#

so I'm really not sure how it happened

#

Like, did I accidentally optimize for one day, and then undid it? But then why only that one card?

bold terrace
#

🤷

#

BTW it would be great if the rescheduler had an easy option for people like "Recommended: Only reschedule >6m cards" or something

#

Rescheduling <1-3 month cards doesn't provide much benefit and creates huge backlogs

#

So often what I do is I just select all cards with intervals longer (or due dates further out) than 3-6 months and I reschedule only those

#

the ones shorter than that, it'll be rescheduled when I review them TohruShrug

#

Not that I need it (since I can do without it)

#

but I think most people are afraid of rescheduling because of that
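
A minimal sketch of the manual selection described above, assuming Anki's browser search syntax (`prop:ivl>N` matches cards whose interval is longer than N days, `deck:"..."` restricts to a deck). The helper name is hypothetical; it only builds the search string to paste into the browser before rescheduling the selected cards.

```python
from typing import Optional


def long_interval_search(min_interval_days: int, deck: Optional[str] = None) -> str:
    """Build a browser search for cards with an interval above the cutoff."""
    terms = []
    if deck is not None:
        terms.append(f'deck:"{deck}"')
    terms.append(f"prop:ivl>{min_interval_days}")
    return " ".join(terms)


# Roughly "only cards with intervals over 6 months".
query = long_interval_search(180)
```

The cutoff is a judgment call; anything between 90 and 180 days matches the 3-6 month range mentioned above.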

cosmic hedge
tepid spoke
#

nah, this deck is as vanilla as it gets

#

it's not really a memorization-based deck, so has to stay on default parameters

quasi shadow
robust hill
#

interesting

tepid spoke
#

whatever happened to that card in the past, clearly current Anki/FSRS realizes it's nonsense

quasi shadow
#

It means you need to review it right now...

#

I guess the built-in rescheduling doesn't work, due to some unknown problems.

#

Did you install the helper add-on?

#

You can reschedule this card in the browser.

unique salmon
tepid spoke
rotund summit
#

hey everyone would someone be able to help me turn on CMRR or is there a how-to guide pinned somewhere? I tried looking but couldn't find one

unique salmon
rotund summit
unique salmon
#

Also, CMRR was in deck options ever since the first version of Anki that supported FSRS, Anki 23.10, which came out more than a year ago

rotund summit
rotund summit
unique salmon
rotund summit
rotund summit
unique salmon
rotund summit
unique salmon
#

Time per button is estimated from your review history

rotund summit
# unique salmon How else would it calculate how much time you spend reviewing? Btw, it doesn't u...

Hmm I guess I just have a misunderstanding of what CMRR fundamentally is/does

I was thinking of it as setting a "lower bound" for a "desired retention range", with the regular DR being the ceiling. I expected it to use only the FSRS parameters and calculate the DR "floor" from there (I see that it does use some FSRS parameters, but I wasn't expecting the average review time to be factored in as well)

unique salmon
tepid spoke
#

How does FSRS actually determine what the learning phase is?

#

i.e. what makes it switch from Learning to Review in the card info

unique salmon
tepid spoke
#

That's what I thought, but my learning steps are just within one day, and cards are called "learning" for multiple days.

unique salmon
#

Show Card Info of those cards, and your deck options

tepid spoke
#

And that was with 20m 20m 3h

unique salmon
#

I guess "Learn" displays the status before you pressed the answer button, not after. So this one here is "it was in the learning stage before you pressed Good", not "it is in the learning stage after you pressed Good"

tepid spoke
#

yeah, looks about right.

bold terrace
#

I implemented the R*sqrt(S) to see how it would look

#

knowing I never stopped reviewing but I stopped adding cards for 3-4 months

#

With just R it was more like this

#

Knowing that the "rebound" around 2025 aligns with how my stability started growing again when I stopped adding too many words again

#

I think R*sqrt(S) gives a good idea of progress being made even if the sum of R doesn't seem to be that big
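
A rough sketch of how a sum of R*sqrt(S) over a collection could be computed, assuming the FSRS-4.5/FSRS-5 forgetting curve (R = 0.9 when the elapsed time equals the stability). The function names and the (elapsed days, stability) input format are assumptions for illustration, not the actual implementation behind the graphs.

```python
import math

# FSRS-4.5 / FSRS-5 forgetting curve constants.
DECAY = -0.5
FACTOR = 19 / 81


def retrievability(elapsed_days: float, stability: float) -> float:
    """Probability of recall after `elapsed_days` for a card with this stability."""
    return (1 + FACTOR * elapsed_days / stability) ** DECAY


def progress_score(cards) -> float:
    """Sum of R * sqrt(S) over (elapsed_days, stability) pairs."""
    return sum(retrievability(t, s) * math.sqrt(s) for t, s in cards)


# Two hypothetical cards: one just reviewed (S=4), one due today (t=S=9).
score = progress_score([(0.0, 4.0), (9.0, 9.0)])
```

Compared with a plain sum of R, this gives more credit to cards with long stability, so growth keeps showing up even when the number of cards stays flat.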

bold terrace
robust hill
#

Anki 25.02 (038d85b1) (ao)
Python 3.9.18 Qt 6.6.2 PyQt 6.6.1
Platform: macOS-15.3-arm64-arm-64bit

Traceback (most recent call last):
File "aqt.webview", line 53, in cmd
File "aqt.webview", line 169, in _onCmd
File "aqt.webview", line 728, in _onBridgeCmd
File "aqt.toolbar", line 429, in _linkHandler
File "aqt.toolbar", line 450, in _statsLinkHandler
File "aqt.main", line 1301, in onStats
File "aqt", line 149, in open
File "aqt.stats", line 77, in __init__
File "decorator", line 232, in fun
File "anki.hooks", line 92, in decorator_wrapper
File "anki.hooks", line 84, in repl
File "/Users/davidtrayanov/Library/Application Support/Anki2/addons21/1613056169/__init__.py", line 22, in new_refresh
innerJs = f.read()
File "encodings.ascii", line 26, in decode
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1034709: ordinal not in range(128)

===Add-ons (active)===
(add-on provided name [Add-on folder, installed at, version, is config changed])
Advanced Browser ['874215009', 2024-10-27T16:20, 'None', '']
AnKing Note Types Easy Customization ['952691989', 2025-03-14T14:40, 'None', mod]
Anki Leaderboard - Compete with friends to boost motivation Custom by Shige ['175794613', 2025-03-17T17:59, 'None', mod]
AnkiHub ['1322529746', 2025-03-17T18:25, 'None', mod]
AwesomeTTS - Add speech to your flashcards ['1436550454', 2025-02-15T10:23, 'None', '']
Bible Memorizer ['2012700632', 2023-12-23T13:52, 'None', '']
Colorful Tags Hierarchical Tags ['594329229', 2022-09-15T18:06, 'None', '']
Countdown To Events and Exams ['1143540799', 2022-06-27T14:50, 'None', '']
Deck duplication ['1779572689', 2022-06-11T17:24, 'None', '']
FSRS Helper Postpone Advance Load Balance Easy Days Disperse Siblings ['759844606', 2025-03-09T07:31, 'None', mod]
Fill the blanks - Multiple typecloze support ['1933645497', 2025-03-08T14:00, 'None', '']
FrequencyMan - Sort your new cards i1 ['909420026', 2025-02-02T05:22, 'None', mod]
History Visualizer ['1545338943', 2025-01-18T14:29, 'None', '']
More Overview Stats 21 ['738807903', 2025-02-17T23:14, 'None', '']
Multi-Decker ['1110722673', 2023-06-11T09:20, 'None', '']
Multiple Choice for Anki ['1566095810', 2023-11-17T22:59, 'None', '']
Progress Graphs and Stats for Learned and Matured Cards ['266436365', 2020-03-29T09:26, 'None', '']
Review Heatmap ['1771074083', 2022-06-30T04:43, 'None', '']
Search Stats Extended ['1613056169', 2025-03-02T01:16, 'None', '']
Study Time Stats ['1247171202', 2024-02-24T18:59, 'None', '']
show overview deck browser options name ['684236185', 2023-09-09T02:01, 'None', '']

===IDs of active AnkiWeb add-ons===
1110722673 1143540799 1247171202 1322529746 1436550454 1545338943 1566095810 1613056169 175794613 1771074083 1779572689 1933645497 2012700632 266436365 594329229 684236185 738807903 759844606 874215009 909420026 952691989

===Add-ons (inactive)===
(add-on provided name [Add-on folder, installed at, version, is config changed])
Anki Simulator ['817108664', 2023-11-06T19:26, 'None', '']
Ankimon by Unlucky-life ['1908235722', 2024-05-16T18:07, 'None', mod]
Custom Background Image and Gear Icon ['1210908941', 2025-01-29T10:31, 'None', mod]
Dynamic Text Window ['Dynamic Text Window', 0, 'None', '']
Remaining time for Anki 21 ['1508357010', 2024-04-16T05:15, '24.4.16i24', '']

bold terrace
#

hmmm I think I had something like that

#

I just restarted Anki though

#

and it was OK again

robust hill
#

i just deleted it

#

ok i reinstalled it

#

and restarted anki 4 times

#

still crashing

bold terrace
#

Strange 🤔

#

You still have it on this one ?

ashen light
#

someones got some unicode in their ascii

#

try opening the file as binary

#
```
        innerJs = f.read().decode('utf8')
    with open(addon_dir / "stats.min.css", "b") as f:
        innerCss = f.read().decode('utf8')
```
bold terrace
#

I see those lines were changed yesterday

#

I can rebuild with your snippet

#

Hmmm no luck

#
```
  File "/Users/.../Library/Application Support/Anki2/addons21/1613056169/__init__.py", line 22, in new_refresh
    innerJs = f.read().decode('utf8')
  File "encodings.ascii", line 26, in decode
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1034709: ordinal not in range(128)
```
ashen light
#

why is it trying to do ascii

#

maybe replace open with file

bold terrace
#

ChatGPT tells me to try rb instead of b

ashen light
#

that won't do anything probably but go ahead

#

also I'm a boomer file isnt a thing anymore

#

this feels like a platform difference? on linux it just works without needing "b"

#
```
<_io.TextIOWrapper name='stats.min.js' mode='r' encoding='UTF-8'>
```
it seems to correctly identify it as utf8
#

try encoding="utf-8" in the open call
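
A small self-contained sketch of why passing the encoding explicitly helps. In text mode, `open()` without an `encoding` argument uses the locale's preferred encoding, which can be ASCII in some environments (presumably what happens inside the packaged Anki here, though that is an assumption). The file name and contents below are made up for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.min.js"
    # The arrow character encodes to the bytes E2 86 92 in UTF-8.
    path.write_bytes("const arrow = '→';".encode("utf-8"))

    # Decoding as ASCII fails on the first non-ASCII byte (0xe2).
    try:
        path.read_text(encoding="ascii")
        ascii_error = None
    except UnicodeDecodeError as exc:
        ascii_error = exc

    # Naming the encoding makes the result independent of the environment.
    text = path.read_text(encoding="utf-8")
```

Opening in text mode with `encoding="utf-8"` replaces the binary-mode-plus-`.decode()` approach; the two should not be combined, since binary mode does not accept an encoding argument.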

bold terrace
#
```
>>> open('stats.min.js')
<_io.TextIOWrapper name='stats.min.js' mode='r' encoding='UTF-8'>
```
too it seems

ashen light
#

based on his traceback hes also on mac so not a platform issue...?

bold terrace
#

Yeah strange suddenly I get it too

#

Tried with

```
    with open(addon_dir / "stats.min.js", "rb", encoding="utf-8") as f: # Putting this inside the function allows you to rebuild the page without restarting anki
        innerJs = f.read().decode('utf8')
    with open(addon_dir / "stats.min.css", "rb", encoding="utf-8") as f:
        innerCss = f.read().decode('utf8')
```

btw

robust hill
#

this is some nerd speak

ashen light
#

the encoding was to replace the rb/decode stuff

bold terrace
#

ah ok

#

smth like

#
```
    with open(addon_dir / "stats.min.js", encoding="utf-8") as f: # Putting this inside the function allows you to rebuild the page without restarting anki
        innerJs = f.read()
    with open(addon_dir / "stats.min.css", encoding="utf-8") as f:
        innerCss = f.read()
```
ashen light
#

yeah

bold terrace
#

ValueError: binary mode doesn't take an encoding argument

#

at least it's a different error now lol

#

let's revert to stackoverflow

#

gpt has its limit

ashen light
#

binary mode implies the b is still there?

bold terrace
#

Don't seem so

#
```
    with open(addon_dir / "stats.min.js", "r", encoding="utf-8") as f: # Putting this inside the function allows you to rebuild the page without restarting anki
ValueError: binary mode doesn't take an encoding argument
```
#

it raise on the "open" so nothing before it

ashen light
#
```
stats.min.js: ASCII text, with very long lines (807)
```
its not even utf8 its ascii why is it choking on this
bold terrace
#

eeeerrr

#
```
    with open(addon_dir / "stats.min.js", "r") as f: # Putting this inside the function allows you to rebuild the page without restarting anki
ValueError: binary mode doesn't take an encoding argument
```
#

there's some kind of caching somewhere

ashen light
#
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1034709: ordinal not in range(128)
>>> k = open('stats.min.js', 'rb').read().decode('utf8')
>>>
```
¯\_(ツ)_/¯
bold terrace
#
```
>>> k = open('stats.min.js', 'rb').read().decode('ascii')
Traceback (most recent call last):
  File "<python-input-0>", line 1, in <module>
    k = open('stats.min.js', 'rb').read().decode('ascii')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1034709: ordinal not in range(128)
```
yep same
ashen light
#

are you editing the right file 🍃

bold terrace
#
```
>>> k = open('stats.min.js', 'rb').read().decode('utf8')
>>>
```
#

then all good

#

I think there is some caching

#

because now I did :

#
```
def new_refresh(self: NewDeckStats):
    with open(addon_dir / "stats.min.js", "r", encoding="utf-8") as f: # Putting this inside the function allows you to rebuild the page without restarting anki
        innerJs = f.read()
    with open(addon_dir / "stats.min.css", "r", encoding="utf-8") as f:
        innerCss = f.read()
```

#

I rebuild, I update the plugin : Fail
I stop anki, restart anki : All good

ashen light
#

oh

#

you gotta restart anki to reload the new code

#

this is just a python thing
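
The "caching" is Python's module cache: once a module is imported, later imports return the already-loaded object from `sys.modules`, so edits to the file on disk are not picked up until the interpreter restarts (or the module is explicitly reloaded). A generic sketch, using a throwaway module named `demo_addon` that is purely hypothetical and unrelated to how Anki loads add-ons:

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep stale .pyc files out of this demo

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / "demo_addon.py"
    module_path.write_text("VERSION = 'old'\n", encoding="utf-8")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()

    demo_addon = importlib.import_module("demo_addon")
    first = demo_addon.VERSION

    # "Update the plugin" on disk while the process keeps running.
    module_path.write_text("VERSION = 'new build'\n", encoding="utf-8")

    # Importing again returns the cached module object, still the old code.
    after_reimport = importlib.import_module("demo_addon").VERSION

    # Only an explicit reload (or a restart) re-executes the file.
    after_reload = importlib.reload(demo_addon).VERSION

    sys.path.remove(tmp)
    del sys.modules["demo_addon"]
```

Files that are read inside a function on every call (like the JS and CSS here) do refresh without a restart; the Python source of the function itself does not.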

bold terrace
#

This is the last build

#

@robust hill , can you try to install it in Anki, if you get the error, just restart anki and reclick on "Stats" ?

#

maybe the first version I was not able to reproduce because I had still the "good code" in my anki or something

#

I pulled changes from upstream afterwards but I'm not sure I restarted Anki at first

#

so it's maybe why I didn't notice it

robust hill
#

looks good now

bold terrace
#

Thanks for the check, I'll update the PR and ping @cosmic hedge (oops just did it)

ashen light
#

@bold terrace you gotta up your git skillz and add me as a coauthor on those commits

bold terrace
#

I tried to @mention you

#

but once again

#

Ah no

#

This time Jake did not refuse to help

#

But he refuse to take credits

#

😄

#

Thanks

robust hill
#

thank you for the sigma tag

#

i just got emailed in the middle of my anki reviews which caused me to panic which caused me to fail my card. i will be contacting my lawyer soon. ||/j||

ashen light
#

man I wish I was cool enough to have jake as my github account name

bold terrace
#

ahahahahah

#

There was no autocomplete

bold terrace
#

so I was like, ok @ashen light doesn't exist but who cares

#

But when I saw @robust hill message I was like "But who did I tag then ?"

#

No wonder the guy has so many contribution

#

When in doubt, tag @ashen light

ashen light
#

dude has a real job where he codes

#

I ostensibly do but I really can't confirm that anymore (yes we use github at my job)

bold terrace
#

And most of those are just issues opened

#

Or 1-line PR

ashen light
#

I was memeing about how I no longer program at my job, but mostly produce suffering

bold terrace
#

I've been spending my last ~5-7 days trying to build a project we need to "follow" since it upgraded to Java 21

#

And all the stupidest shits are happening at the same time

#
  • Can't do any "apk update" in a docker image because the corporate proxy decided that we were not worthy of having access to internet again
  • Can't curl sonatype for security scan because the same fucking proxy
  • The CI/CD has a UID different than the UID of the process in the docker container, leading either to not being able to create new folder in the container, or not being able to clean them in the host
#

Came to plot nice graphs, stayed to solve encoding issues

#

:/

#

So now I'll spend the last 30min of my day spending the money I got in some waifu-gachas, see ya

#

BTW @robust hill I see @cosmic hedge added a zoom for the difficulty graph :

I know you were interested so check it out 🙂

#

(my last build should include it)

cosmic hedge
#

not the first time i screwed that up

#

@bold terrace could you not branch off your other prs in future and just use the main branch please XD

bold terrace
#

I tried

#

I added your remote as an upstream `git remote add upstream ....`, then `git pull upstream main`, `git checkout -b ...` and then cherry-picked normally, but it seems there are many things coming with it 🥲

#

I'll check more calmly tomorrow where I might have screwed up

cosmic hedge
#

do

git checkout upstream/main
git checkout -b new-branch

see if that works

bold terrace
#

Oooh

#

"git pull upstream main" instead of upstream/main and not reading well enough

#

So clean 🥲

cosmic hedge
#

thank you 😌

#

i have reached git zen now

bold terrace
#

Deep inside me I was curious if there was a limit on the number of commits and PR depending on each other 😦

#

😄

cosmic hedge
#

not so deep apparently 😂

lapis hearth
rotund summit
#

why should we do it via the add-on instead? and how do we do it? will clicking the highlighted option bring me to another window where I can choose the new DR or is it gonna reschedule based on something I need to change beforehand?

wind palm
# rotund summit why should we do it via the add-on instead? and how do we do it? will clicking t...

why should we do it via the add-on instead?
Because it doesn't add an extra entry to every card's review history.

will clicking the highlighted option bring me to another window where I can choose the new DR or is it gonna reschedule based on something I need to change beforehand?
There's no second window. It will use the current parameters and DR for every options group/deck/card. You should have all of that set how you want before you run the reschedule-all.

tepid spoke
#

right click the deck, there should be an FSRS helper submenu, with a reschedule option

#

or you can reschedule individual or groups of cards via the browser with it

rotund summit
tepid spoke
#

probably cause they would have already been added with 90%

#

lowering the desired retention should otherwise always lower the load

rotund summit
tepid spoke
#

well, they weren't overdue until you hit reschedule

rotund summit
# tepid spoke well, they weren't overdue until you hit reschedule

How does rescheduling via the add on work with decks and subdecks, esp if they have different DRs?

both were at 90 but I'm trying to change the main deck to 88 and the subdeck "UWorld" to 89 for starters

do I need to reschedule one before the other? does rescheduling the main deck from the home screen apply to the subdeck as well?

tepid spoke
#

a card is only in exactly one deck

#

no idea if hitting the button on a deck also reschedules all subdecks

#

there's one easy way to find out though

rotund summit
#

yeah I went ahead and tried different combinations of rescheduling to find the lowest card burden but did confirm that hitting reschedule on the main deck did affect/lower the number of reviews in the subdeck as well

for anyone else in a similar situation and curious

tepid spoke
#

You still seem to be confused about what is actually happening.

#

There is no load for a specific deck or its subdecks

#

It's just cards being rescheduled, which can push them away further when you lowered DR, but if you optimized since the last reschedule, it might as well just pull a ton of them to the front

quasi shadow
#

🤔 I made a simulation based on a naive atomic-memory model without Difficulty.

quasi shadow
quasi shadow
#

Good News: I found an extra parameter that improves RMSE(bins) by about 2% and log loss by 0.0018 in the preliminary benchmark.

#
$ python evaluate.py --fast
Model: FSRS-5-dev
Total number of users: 634
Total number of reviews: 19967510
Weighted average by reviews:
FSRS-5-dev LogLoss (mean±std): 0.3344±0.1687
FSRS-5-dev RMSE(bins) (mean±std): 0.0549±0.0372
FSRS-5-dev AUC (mean±std): 0.7077±0.0817

Weighted average by log(reviews):
FSRS-5-dev LogLoss (mean±std): 0.3682±0.1717
FSRS-5-dev RMSE(bins) (mean±std): 0.0734±0.0481
FSRS-5-dev AUC (mean±std): 0.7028±0.0879

Weighted average by users:
FSRS-5-dev LogLoss (mean±std): 0.3709±0.1729
FSRS-5-dev RMSE(bins) (mean±std): 0.0758±0.0493
FSRS-5-dev AUC (mean±std): 0.7022±0.0900

parameters: [0.29835, 1.36215, 2.9813, 15.3283, 7.15575, 0.5576, 1.75015, 0.0036, 1.52275, 0.1081, 1.0048, 1.9078, 0.1162, 0.30265, 2.2646, 0.2315, 2.99725, 0.6267, 0.5094, 0.14805]

Model: FSRS-5
Total number of users: 634
Total number of reviews: 19967510
Weighted average by reviews:
FSRS-5 LogLoss (mean±std): 0.3362±0.1698
FSRS-5 RMSE(bins) (mean±std): 0.0561±0.0368
FSRS-5 AUC (mean±std): 0.7048±0.0788

Weighted average by log(reviews):
FSRS-5 LogLoss (mean±std): 0.3708±0.1735
FSRS-5 RMSE(bins) (mean±std): 0.0744±0.0478
FSRS-5 AUC (mean±std): 0.7001±0.0865

Weighted average by users:
FSRS-5 LogLoss (mean±std): 0.3736±0.1748
FSRS-5 RMSE(bins) (mean±std): 0.0767±0.0490
FSRS-5 AUC (mean±std): 0.6996±0.0887

parameters: [0.4014, 1.2663, 2.6931, 15.3206, 7.1801, 0.53625, 1.75455, 0.0065, 1.50165, 0.1313, 0.9943, 1.925, 0.11, 0.28685, 2.27865, 0.23165, 2.99595, 0.454, 0.60975]
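
A quick arithmetic check of the headline numbers, using only the review-weighted means and parameter lists printed above (variable names are arbitrary):

```python
# Review-weighted means copied from the benchmark output.
fsrs5 = {"log_loss": 0.3362, "rmse_bins": 0.0561}
fsrs5_dev = {"log_loss": 0.3344, "rmse_bins": 0.0549}

# Parameter lists copied from the benchmark output.
params_dev = [0.29835, 1.36215, 2.9813, 15.3283, 7.15575, 0.5576, 1.75015,
              0.0036, 1.52275, 0.1081, 1.0048, 1.9078, 0.1162, 0.30265,
              2.2646, 0.2315, 2.99725, 0.6267, 0.5094, 0.14805]
params_fsrs5 = [0.4014, 1.2663, 2.6931, 15.3206, 7.1801, 0.53625, 1.75455,
                0.0065, 1.50165, 0.1313, 0.9943, 1.925, 0.11, 0.28685,
                2.27865, 0.23165, 2.99595, 0.454, 0.60975]

# Absolute log-loss gain and relative RMSE(bins) gain of the dev model.
log_loss_gain = fsrs5["log_loss"] - fsrs5_dev["log_loss"]
rmse_relative_gain = (fsrs5["rmse_bins"] - fsrs5_dev["rmse_bins"]) / fsrs5["rmse_bins"]
extra_parameters = len(params_dev) - len(params_fsrs5)

print(f"log loss gain: {log_loss_gain:.4f}")
print(f"RMSE(bins) relative gain: {rmse_relative_gain:.1%}")
print(f"extra parameters: {extra_parameters}")
```

The gains come out at roughly 0.0018 log loss and 2.1% RMSE(bins), with 20 parameters in the dev model against 19 in FSRS-5.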
#

😎 Maybe we will have FSRS-5.5 this year.

hasty fractal
#

do we have plans to fix this abomination

#

(deleted/reposted to not destroy jarrett's message)

quasi shadow
cosmic hedge
#

oh wait no that features not out yet

hasty fractal
#

is there an option for that now?

#

that's a solution certainly

cosmic hedge
quasi shadow
#

Bad News: the extra parameter may require refactoring the simulator FeelsBadAnki

#

Because the short-term memory formula is changed