#FSRS Megathread
dont
bro i introduced one of my friends to anki
and he had no clue deck options existed
i literally told him over call so many times
peak anki dank
i just turned desired retention from 93 to 90%
then i rescheduled with fsrs helper addon
dropped me from 170 to 79
then i did it again cause why not
it went to 81
then 82
then no more
Learning steps are the only practical way imo
Thank you, this is very helpful. With the help of this, the conclusion can be drawn that even cards above 1k per day are OK?
What good would an algorithm be that tells you you need to learn that card in 2h, that one in 3h, the other one in 1.5h, and maybe that one in 7h?
...you want to learn 1000 new cards PER DAY?
When you only got one or two timeslots in your day where you can Anki
I don't think this is possible, just wondering
With learning steps I can configure it to adhere to my schedule
If we're talking per day, the limit is most likely motivation rather than raw brainpower
1000 new cards per day is pure insanity oO
Tbf, even if we're talking "per 50 years", Woz (the author of that article) seems to imply that motivation plays a larger role than raw brainpower
lowkey forgot which reschedule is better to do
I also legitimately still wonder how accurate any algorithm can be in reality. Since you inevitably end up encountering and using what you learn elsewhere.
And then any forgetting curve the algorithm is assuming gets messed up
Yeah, do you have any good ideas for boosting motivation? Also, I’ve come across the memory palace and the Feynman Technique—how effective are these two methods?
That's likely why in FSRS-6 a lot of users get very flat forgetting curves where the probability of recall takes a million years to fall to 10%
Like, I experience that in my deck all the time
There are some oddball words I only ever see in Anki
And it's usually an Again, unless they're odd enough to be memorable
While a lot of others I hear almost every day
Btw, once FSRS-6 comes out, splitting material into different decks and presets will be more beneficial, thanks to the shape of the curve adapting
idk
nope
Also, today I had 17 "Agains" where I got a Rendaku wrong.
Most of those leeches. Because of Rendaku.
I need a 5th button just for "Wrong Rendaku" at this point
to load balance or not to load balance
Anki already has a built-in load balancer
What version are you using?
...I need to make a card for that as well 😅
@quasi shadow
Wasn't rescheduling via the AddOn superior?
no idea
just tell me whats better so i can do my cards
now i am cooked
paralyzed
Why is rescheduling this important for that?
because
i have 307 reviews to do
and one rescheduling changes that to 160
and the other to 85
Just do the 307 then, safest option!
Unless they take you really long
okay what about for a new deck with clean fsrs options
307 would be a normal day for me, but I only take 10~15 seconds per card
30 new a day leading to 300+ in just 5 days sounds odd
this is for a different deck
no worries
For a new deck there is very little point rescheduling
You're most likely still on default parameters anyway, and have been the whole time
So I'd expect it to do nothing
From what I remember, you want to reschedule via the AddOn
it does it in a nicer way somehow
Forgot the details, but I think the native one puts an entry into the revlog, while the addon actually recalculates stuff
noooo not the revlog
FSRS-5 with a flatter curve on 1000 users
In case you don't know how to read this: ideally, predicted retention should match observed retention for some group of cards. So we put cards into sufficiently many small groups and measure the average retention within each group, as well as the average of FSRS predictions. They should match as closely as possible. If FSRS predicts an average probability of recall of 99% for these cards, you should recall 99% of those cards. If FSRS predicts an average p(recall) of 50% for these cards, you should recall 50% of them.
Orange line - perfect algorithm
Blue curve - FSRS-5 with an extra flat curve. FSRS-6 will be better thanks to an adaptive (not fixed) curve + one more parameter for same-day reviews
So as you can see, on a large dataset FSRS performs very well. Then again, it may perform poorly for some individual user or deck
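A minimal sketch of the grouping described above, assuming the predictions and outcomes of all users have already been pooled into two flat lists. The name `calibration_bins` and the equal-count grouping are assumptions for illustration, not the benchmark's actual plotting code.

```python
from typing import List, Tuple


def calibration_bins(
    predicted: List[float], recalled: List[int], n_bins: int = 20
) -> List[Tuple[float, float, int]]:
    """Group reviews by predicted p(recall) and compare with reality.

    Returns one (mean predicted, observed retention, group size) tuple per
    group. A well-calibrated algorithm has the first two values close to
    each other in every group.
    """
    if len(predicted) != len(recalled):
        raise ValueError("predicted and recalled must have the same length")
    # Sort reviews by prediction so each group covers a narrow range of p.
    pairs = sorted(zip(predicted, recalled))
    n = len(pairs)
    n_bins = max(1, min(n_bins, n))
    bins = []
    for b in range(n_bins):
        group = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not group:
            continue
        mean_pred = sum(p for p, _ in group) / len(group)
        observed = sum(y for _, y in group) / len(group)
        bins.append((mean_pred, observed, len(group)))
    return bins
```

Plotting the first value of each tuple against the second gives the blue curve; the orange line is simply y = x.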
People often say that studying over 1,000 flashcards a day is really difficult, but is there a way to make it easier? If there is, that would be amazing—it would let us finish learning as fast as possible, which is definitely a good thing.
1000 new ones or just in total?
new ones
That still seems like insanity to me
At least for me, I'd be so fatigued after not even half of them, that I'd remember jack all
The add-on doesn't generate revlog entry.
so
And in terms of fuzz/Easy Days/whatever other stuff?
1000 new ones a day is around 2800 reviews from just new alone
and then take into account the reviews which go up to 8x if i recall correctly from that one rule
you would be doing around 10000 ish reviews a day
dont know if anyone does that
I'd guess it'd be 1000 new ones once?
if you had the entire day to anki yea you could do that
1000 new ones a day for extended periods of time would make you run out of hours in a day fast
I have an entire day and I would never, EVER do 1000 new cards 🤣
11.95s per review on my avg so if i had to learn 1000 new
2800 rev times 12s
around 9 hours
for me to do 1000 new
💀
close enough
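The back-of-the-envelope math above, as a tiny sketch. The 2.8 reviews per new card comes from the "1000 new is around 2800 reviews" estimate in the chat, and `daily_study_hours` is just a made-up helper name.

```python
def daily_study_hours(
    new_cards: int, reviews_per_new_card: float, seconds_per_review: float
) -> float:
    """Hours spent on the reviews caused by one day's new cards."""
    total_reviews = new_cards * reviews_per_new_card
    return total_reviews * seconds_per_review / 3600


# 1000 new cards, ~2.8 reviews each, 11.95 s per review
hours = daily_study_hours(1000, 2.8, 11.95)
print(f"{hours:.1f} hours")  # prints "9.3 hours"
```

This only counts the reviews of the new cards themselves, not the older reviews that pile up on top.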
Btw @polar maple how did you plot this for multiple users? Can you give me the code? I have no clue how to plot the calibration graph for >1 user
I'd love to plot this for 1000 users for all versions of FSRS
That would be graph porn
It would give us a clear visualization of how much FSRS is wrong, on average
Would be nice if it was added to the benchmark code
Plotting the averaged calibration graph, I mean
They should be same.
Do we even need the add-on reschedule option then?
Yup, workload = ~5-10 times the number of new/day is a good estimate. But even without any new/day, if your backlog is big enough (~3000 cards), you'll easily have a few hundred reviews JUST to maintain it. It's not dropping that fast (unless your overall retention improves because you do other activities outside Anki)
So when you have free time : Do more real-life exposure, don't do more anki
they are not the same
Except if your workload is very very low
but i love ANKI
its the best way to measure my success
instantly
I love it too, but I prefer learning Japanese even more
im all for delayed gratification but if im studying for my exam
Because I cannot use the same random number generator.
Am I misremembering or were you talking to Dae and others about the add-on using the same random number generator with the same seed as in Anki?
I vaguely remember this being discussed
cooked
This is a very good point that I never thought about, and it illustrates well why a "short term memory model" in terms of "seconds, minutes, hours" might not really mean anything in the first place.
Long time ago, FSRS Helper's fuzz was inconsistent with Anki's. So I asked Dae to expose the fuzz API.
And it's still not 100% consistent?
before in my days of cramming
i would cram so much new material
take a nap
and cram more new material
then sleep, and it worked pretty effectively
until one day the nap didnt nap and i was on my phone for 4 hours.
A nap like 10min or 1h30? Because if it messes with your normal sleep schedule, it might backfire a bit, no?
it would be 30 minutes
Of course. The generator is implemented differently between Rust and Python.
but it would be no later than 3pm
poop
crap, even
shit, if I may
I guess we don't need it after the next release?
The idea of finding the best way to group cards to base a preset on has died down. Will it come back
Nope
Why not, when you basically just said FSRS 6 would benefit more from this
Unclear how to do it algorithmically + users already can do it themselves based on the content of the cards
Jarrett asked me to open this as an issue on github. I guess this will not come to fruition...
It has been a while
Hey guys, I came across this thing about tackling knowledge fragmentation called incremental reading. Not super clear on it, but I think it’s some kinda card-making process. Why does this even help with the issue?
Like, say i make cards from a PDF, then use pdf.js to display ‘em in a card interface—wouldn’t that give me the same vibes?
Oh, wait, I have the rights to close other people's issues?
dont
@quasi shadow for some reason I can close this issue https://github.com/open-spaced-repetition/fsrs4anki/issues/709, but not issues in the benchmark repo 🤔
Permissions are wack
I am not worthy of such power
Dont close it yet. Let Jarrett decide
😎 Leave it open. Maybe some random guys would solve it when I quit.
If I remember correctly real "short-term memory" (academic definition) tops out at only a few minutes e.g. 5 mins.
What we call "short term memory" in the Anki community is actually often a kind of in-between point where multiple different memory processes are happening at once before things are solidified in "long-term memory" which is why it is so weird and hard to model.
Yep sometimes I also wonder if "long term" memory is not "long term" simply because how "strong" the connections are, instead of just being a question of "stability" getting longer simply by repetitions alone
I already said it a few times, so sorry if I sound like a broken record, but in Anki I see that the relation between reps and stability is negative: cards with higher reps have lower stability. And even if it might just be a matter of inherent difficulty, when I look at cards I rep'ed a lot, it feels like each lapse cycle doesn't necessarily grow stability that much
They look like this
That was kind of how I was trying to detect leeches in my January attempt. I was looking at the largest passed intervals in "pass chains" and seeing how they changed / how much total time you spent studying to reach your last peak.
Yep that's a nice idea
I still think that something like that needs to be mixed in with the Poisson Binomial stuff to get a good detector.
But I think to detect those cards, it might be as simple as asking: "Which cards have you already lapsed 3 times?"
And I know I was the first one to say it was meaningless to just watch lapses with FSRS, which predicts probability
But I think I was terribly wrong lol
I was trying to find an "objective" measure of leechness after the fact so you could try training a NN to detect them early.
Lapse Distribution, Average Load by Lapse ... it feels as soon as I start to lapse, things don't get better
Yeah I think effort put in this is quite useful, don't get me wrong @cursive badge
Because clearly, I think some cards might go from "leechy state" to "healthy state" at some point
But I have very, very few examples
I search for things like "prop:s>N and prop:reps>M", and it's depressing
Out of 1789 mature cards (prop:s>20), the 1st percentile (17 cards) with the most reps has 25 reps with 840 cards that have more than 25 reps
The perfectionist in me wants Anki to have a full version control system so we can do things like go back in time and see if tweaking card templates/field info affected card leechness.
We need a DeLorean
That's why I do those searches, I try to see what triggered that switch from a leechy state to a high stability one
I'm even wondering if the answer is in the revlog
The only 4 matches I have for prop:s>20, prop:reps>28 are:
- 眠る : I think I failed it a lot until I got so used to seeing 寝る that I stopped confusing the two, so it's in fact 寝る that stopped me from confusing it again and again
- 何とか : I think I was never sure if it was なんとか or なにとか, which now sounds awfully off when I say it aloud.
- 恋 : I confused it with 愛 since they had the same meaning, and later with 窓 because I felt it looked the same, but with time I came to recognize those 2 more easily, which led me to recall this one more easily
- 攻撃 : That one I think increased simply because I had 4-5 cards with 撃 in them, so I cross-reviewed it a lot without realizing (爆撃、砲撃、出撃、直撃....)
Sooo it seems what helped the most were OTHER related cards
Because you have the Write right.
This poc, was it included in the leechkit ?
Because I think it makes sense: when I look at cards with high stability, the only ones that had lapses are the ones that had a few lapses, sure, but then had a long uninterrupted sequence of just Good answers
No, leechkit was just my Poisson Binomial PoC. I only shared graphs from my pass-chain experiments.
ok
I seem to come back to leeches about once a month 😅
Damn, I'm looking into it: of 1077 cards with prop:s>60, I have 250 that have lapsed at least once, and indeed afterwards they never lapsed again
This month is cursed though, so no PoC this month.
(I mean, they have their number of lapses, but clustered in the beginning)
When they were good to go, they just never failed anymore
Lot of issues IRL ?
This week end I want to finish the graphs I was doing (Workload by Lapse/Repetition 5-percentile), but I was curious to create my first addon from scratch. I might take a look
One family member has cancer and had to have invasive surgery. Another had to be rushed to hospital in an emergency and have surgery for other reasons. ☹️
Damn ... Sorry to hear that ...
In more positive news I did your prop:s>20 prop:reps>28 search and have 303 cards. So some leeches may recover.
Trying to brainstorm that idea
Don't know if there are things that could contradict the last 2 points
I started writing a whole article (about leeches) and then remembered I hate writing and abandoned it 😂
Yeah same 
But I like to write a few chunks just to gather/reflect/discard
"Maybe add a factor to avoid unleeching too fast ? For ex if the new successful interval is 5d instead of 4d."
just use my detector man
(it doesn't currently exist in an easy to use form)
No it's not really good
It sure sounds better than only counting lapses
not really
yes really
It marked as leeches things that lapsed 2-3 times in a row in the first days but never failed for many days afterward
it's also focusing on rep probability, not really increasing interval
I used it, tweaked it, experimented with it, the things marked were not useful
Don't mean that as an offense but it's really not great unfortunately
To be fair I was also convinced it could lead to something great
I remember being the one to always push for that idea at first
so a bit sorry if now I say the opposite, but it's when you use that you realize if the idea was good or not
and it was not 😦
I feel like it gives you useful info, but it needs to be combined with other things to make it a "leech detector".
At the moment it's just a "something is wonky with the historical FSRS predictions detector" which I do not think is exactly the same thing.
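For anyone wondering what "Poisson Binomial stuff" can mean in practice, here is a minimal sketch. It is NOT leechkit's actual code; the function names and the independence assumption are mine. Given the recall probability that FSRS predicted at each past review of a card, it computes how likely it was to see at least as many lapses as actually happened.

```python
from typing import List


def lapse_count_distribution(recall_probs: List[float]) -> List[float]:
    """dist[k] = probability of exactly k lapses.

    Assumes each review i fails independently with probability
    1 - recall_probs[i] (the Poisson binomial distribution).
    """
    dist = [1.0]  # before any review: 0 lapses with certainty
    for p in recall_probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * p                # recalled: lapse count unchanged
            new[k + 1] += mass * (1.0 - p)    # lapsed: lapse count goes up
        dist = new
    return dist


def prob_at_least(recall_probs: List[float], observed_lapses: int) -> float:
    """Tail probability P(lapses >= observed_lapses) under the predictions."""
    return sum(lapse_count_distribution(recall_probs)[observed_lapses:])
```

A tiny tail probability says "these lapses were very unlikely if the predictions were right". That can mean the card is a leech, or just that the predictions were off, which is exactly the ambiguity discussed above.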
Yep, having an imprecise model would lead to things being detected as leech
And for example if your model overestimates your early retention, a lot of things just get leeched because you got them wrong 2-3 times in a row
I think the benchmark of "is it better than the current system" is the best way to approach this, otherwise nothing will ever happen because nitpicking is fun for everyone
I'm just not convinced it has passed that yet. It can give some odd results.
I'm not convinced the current system is useful in any capacity
It feels silly to spend time refactoring the reviewing code, insert the Poisson Binomial stuff and then not be sure it is doing much better than manually reviewing cards you have lapsed several times.
I'm not saying you shouldn't do it if you are convinced. I'm just saying I'm not.
@unique salmon why do we use the median of parameters as the default rather than something like training a set of parameters that minimizes average loss on the combined revlogs of 100 users?
¯_(ツ)_/¯
i think as we add more and more parameters, the median of parameters makes less sense
because the parameters can interact in strange ways
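A minimal sketch of what "median of parameters" means here, assuming one optimized parameter vector per user. `median_parameters` is a hypothetical helper, not the benchmark's code.

```python
from statistics import median
from typing import List


def median_parameters(per_user_params: List[List[float]]) -> List[float]:
    """Element-wise median across users: each w[i] is handled on its own."""
    if not per_user_params:
        raise ValueError("need at least one user's parameters")
    n = len(per_user_params[0])
    if any(len(p) != n for p in per_user_params):
        raise ValueError("all parameter vectors must have the same length")
    return [median(p[i] for p in per_user_params) for i in range(n)]


# Toy example with 3 users and 3 parameters
users = [[0, 1, 5], [1, 5, 0], [5, 0, 1]]
print(median_parameters(users))  # prints [1, 1, 1]
```

In the toy example the result `[1, 1, 1]` is a combination that no user actually has, which is one way to picture the worry: the median ignores how parameters go together.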
Ask Jarrett to do this for some subset of users (100 is too few IMO)
I ain't going down the rabbit hole of combining revlogs
you can just concatenate them
but remove the code for recency or precompute the recency weights before concatenation
you don't need to store it into a file
concatenate them in memory and train on them in the same program execution
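A rough sketch of that idea with plain Python lists instead of dataframes. The linear weight ramp is only a placeholder assumption (the optimizer's real recency weighting may look different), and both function names are made up for illustration.

```python
from typing import Dict, List

Row = Dict[str, float]


def add_recency_weights(revlog: List[Row], min_weight: float = 0.25) -> List[Row]:
    """Copy one user's chronologically ordered revlog, adding a 'weight' field.

    Placeholder scheme: weights rise linearly from min_weight (oldest review)
    to 1.0 (newest review). Computed per user, BEFORE concatenation, so that
    "recent" means recent for that user and not recent in the combined data.
    """
    n = len(revlog)
    weighted = []
    for i, row in enumerate(revlog):
        frac = i / (n - 1) if n > 1 else 1.0
        weighted.append({**row, "weight": min_weight + (1 - min_weight) * frac})
    return weighted


def concatenate_users(revlogs_by_user: Dict[int, List[Row]]) -> List[Row]:
    """Build one combined revlog in memory; nothing is written to disk."""
    combined: List[Row] = []
    for user_id, revlog in revlogs_by_user.items():
        for row in add_recency_weights(revlog):
            combined.append({**row, "user_id": user_id})
    return combined
```

The combined list can then be handed straight to the training step within the same program run.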
not really, it's already there in pretrain.py
😭
I felt pain just from reading this sentence
i mean it like probably 95% of the code is ready but you might need to change a line here or there
median of parameters makes some sense for optimization, but we could keep the default parameters separate from the parameters that the optimization starts with
Can't tell if big brain or overengineering
it's just two different sets of default parameters, not too bad
i mentioned that LSTM's default parameters cannot be used without training, this is just a similar idea
the set of parameters that optimizes performance after training is not necessarily the same as the set that performs best without training
So overengineering then
huh? not sure why such a small change would be overengineering
I doubt that the gain in accuracy would be significant, if it exists at all. And it would require adding extra logic in Anki, so you'd have to convince Jarrett
Like, for using one set of parameters as default for users and the other as a starting point
I expect that the set of parameters that performs better as default is also better as a starting point, btw. Aka I expect that this doesn't work
I guess once this PR is merged, I'll test it
https://github.com/ankitects/anki/pull/3929
wait for open-spaced-repetition/fsrs-rs#313
I will release FSRS-rs v3.0.0 after the above PR is merged.
FSRS 6.0 Integration
Changes
Updated FSRS dependency to version 6.0
Added support for FSRS 6...
i think accuracy might increase, iirc there are users where simply increasing # of epochs would increase performance
and even if this is not the case, if we can decrease training time for the same performance then it would still be worth it
like -20% training time for the same perf would be great but that's a stretch
this should become more and more false the more parameters that we add
you can optimize directly for initial parameters that do well with whatever LSTM uses
for now if we can get a set of parameters from joint optimization, we can check if it then does better or worse after finetuning
if it does much better or much worse then we would know that the optimization is sensitive to initial parameters, would be worth exploring more with an actual meta-learning algorithm
I'm trying to think when exactly parameters that are the best as a starting point are not the best as default
So imagine a high-dimensional parameter space. The closer the starting point is to the optimal point in the parameter space, the faster optimization will converge. So we want to pick the starting point that is as close to optimal as possible.
At the same time, if two points are close in the parameter space, their respective loss must also be close. Unless there are some insanely sharp valleys such that a small difference in parameters results in a large difference in loss.
So what I'm trying to say is that as long as there are no crazy sharp valleys, I don't see how a closer-in-the-parameter-space starting point would result in worse loss somehow
Aka I'm assuming that loss doesn't look like this
Ok, here's a better illustration because I suck at explaining it
As you get closer in the parameter space, the loss also gets more similar
Hence "Closer to the optimal point in the parameter space" = "Has a lower loss"
Aka "if it's a better starting point for optimization, it will also be better for users in terms of accuracy"
Unless the loss landscape looks like a jagged mess
fsrs now has 20+ parameters its probably quite complex
@unique salmon https://arxiv.org/pdf/1803.02999
take a look at "4 Case Study: One-Dimensional Sine Wave Regression" for a toy problem where joint optimization is obviously much worse
I asked Gemini 2.5 Pro to modify the pretrain file for me, lel
We'll see how it goes
why tf it takes 1.5 hours to concatenate users...
bro
This isn't training, THIS IS JUST CONCATENATION
That is tragic
it's 20 minutes now, yippie
Still seems two-three orders of magnitude slower than what I would expect
Wow, is that just for loading the parquet files into memory?
yep
Actually, wait, will I even have enough RAM for 500 users...
oops
Guess we'll see
I'm sad to say that discord has no good GIFs for "download more RAM" 😢
#1282005522513530952 message do this then group by user id maybe idk
this is why i say that processing the revlogs is the slowest part of the code and there is much room to improve here
I've heard that polars is the new hotness for doing stuff like pandas but faster.
btw you should make sure that the script works on fewer users, you don't want to wait ~2 hours just for it to error
Alright, I'll do it on 10 users. I'm restarting anyway because >150 will make my RAM explode
I HATE YOUR JARRETT
NOT EVEN ONCE
HAVE I MANAGED
TO RUN
THE
FUCKING
BENCHMARK
CODE
ON
THE
FIRST
TRY
AAAAAAAAAAAAAAAA
Cursed 😂
This isn't even modified pretrain, just other.py
Like, this is literally unmodified
@quasi shadow just add FSRS-6 to pretrain so you can test the idea of optimizing FSRS parameters on a combined revlog to obtain better default parameters, I will have a mental breakdown trying to modify anything in the benchmark code
...or even run it at all
make sure fsrs-optimizer is up to date
Oh, yeah, I keep forgetting that I need to explicitly specify the full path, otherwise it doesn't work
if DEV_MODE: sys.path.insert(0, os.path.abspath("../fsrs-optimizer/src/fsrs_optimizer/"))
This is the code that works for Jarrett but not for me, so I have to modify it every time
bro why
I updated the optimizer AND specified the full path
I made 100% sure I'm using the latest optimizer code and the benchmark code from the FSRS-6 branch
maybe add a print to the optimizer code to see if it shows up
Memo for Jarrett:
The idea is to optimize FSRS on a combined revlog of a lot of users (I wanted to do 500 btw), get parameters from that and see if they work better than the current median parameters
- Pretrain FSRS-6 on 200 or 500 or whatever users
- Get the parameters from that
- Run FSRS-6 without optimization ("dry run"), with new parameters as defaults
- See if log loss/RMSE are better with these parameters than with median parameters
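The last two steps of the memo could be sketched roughly like this. Everything here is an assumption for illustration: `predict` stands in for whatever turns a parameter list and a review's features into p(recall), and the option to leave out the pretraining users is an extra that the memo itself does not ask for.

```python
import math
from typing import Any, Callable, Dict, Iterable, List, Sequence, Tuple

Review = Tuple[Any, int]  # (features, recalled: 1 or 0)
Predictor = Callable[[Sequence[float], Any], float]


def log_loss(predictions: List[float], labels: List[int]) -> float:
    """Mean binary cross-entropy, with predictions clipped away from 0 and 1."""
    eps = 1e-7
    total = 0.0
    for p, y in zip(predictions, labels):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def dry_run_loss(
    params: Sequence[float], users: Dict[int, List[Review]], predict: Predictor
) -> float:
    """Average per-user log loss with fixed params and no optimization."""
    if not users:
        raise ValueError("no users to evaluate on")
    losses = []
    for reviews in users.values():
        preds = [predict(params, features) for features, _ in reviews]
        losses.append(log_loss(preds, [label for _, label in reviews]))
    return sum(losses) / len(losses)


def compare_defaults(
    median_params: Sequence[float],
    pretrained_params: Sequence[float],
    users: Dict[int, List[Review]],
    predict: Predictor,
    exclude_ids: Iterable[int] = (),
) -> Tuple[float, float]:
    """Return (loss with median params, loss with pretrained params)."""
    excluded = set(exclude_ids)
    kept = {uid: r for uid, r in users.items() if uid not in excluded}
    return (
        dry_run_loss(median_params, kept, predict),
        dry_run_loss(pretrained_params, kept, predict),
    )
```

If the second number is lower, the pretrained parameters are the better defaults on that set of users.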
I have added FSRS-6 to other.py.
I don’t know why you still have that error.
Please use git to manage your code.
added
It's worse.
I used the parameters pretrained in first 100 collections as the default parameters.
But the algorithm performs worse in first 100 collections.
i think you are supposed to avoid finetuning on the users, to simulate anki users who never press Optimize
and make sure to test on users that were not pretrained on
Do you mean this?
yeah
It has been there.
but were those default parameters from the median of parameters?
But the model doesn't perform well even on users pretrained on.
Yes of course.
the idea is to try to find a better set of default parameters from joint training only for users who never press Optimize
it is normal that the set of parameters that performs best for users who never press Optimize, is not necessarily the same as the set of parameters that does well after user-specific optimization
this is the whole point of why LSTM uses the Reptile algorithm rather than doing joint optimization like GRU
OK, it performs better on users pretrained on.
I will evaluate it on users not pretrained on later.
Interesting
It truly works better.
I have excluded the first 100 users from the evaluation.
nice!
😂 So we will have two sets of parameters.
A set of parameters for initial parameters of optimization.
yep
A set of parameters for users who haven't optimized the parameters.
Should I pretrain the parameters with more collections?
for the median of parameters method, why is the decay exactly 0.2?
Because the median is calculated from the previous experiment with fixed decay=0.2.
icic
Very newbie question, but do initial parameters influence the optimization result? Is it because of some kind of "local maximum vs global maximum"?
@polar maple, do you have any resources to recommend for learning a bit about all this?
I think that in this case it's more that for optimization we use a fixed computational budget, so the median parameters just makes each parameter closer to the final value on average. if the computational budget is increased then perhaps both methods would converge to something similar
Ah, makes sense. I noticed for example that my optimization result in Anki is better (more precise) than the one in the playbook, while the playbook takes long to compute, but the one in Anki starts with almost the same parameters already. So the computational budget makes sense: the closer to the goal you start, the easier you get there
you can check the paper I linked to expertium or take a look at OpenAI's blog on the Reptile algorithm
so it seems that initial parameters might be worth looking into more, I'll try the reptile algorithm on FSRS later
Yeah but I was thinking a bit more globally. I've done the coursera course on AI but it really goes like : This is linear regression, this is gradient descent, this is neural network, you create X layers of Dense layers, you train, tada
maybe decrease logloss by 0.001...
But it seems it's only the first first step towards building something useful
ok I'm sorry if im wrong this isnt my thing but I thought there was an actual penalty for the weights going away from the defaults?
```python
loss = (self.loss_fn(retentions, labels) * weights).sum()
penalty = torch.sum(
    torch.square(self.model.w - self.init_w_tensor)
    / torch.square(DEFAULT_PARAMS_STDDEV_TENSOR)
)
loss += penalty * self.gamma * real_batch_size / epoch_len
```
yeah it's true but I don't think it is too related
but depending on how jarrett implemented the test for the joint optimization params, that code might have prevented parameters from properly reaching optimal user parameters
since the joint optimization params will have values that are further than the median
so maybe you're onto something
I also remember to do gradient descent you can have variable "step size" to converge faster at first (and reduce them when you're getting closer to the optimal)
I don't know if it's applied here
fsrs uses a cosine annealing scheduler, using a larger learning rate at first that decays to 0 over time following a segment of a cosine curve
but with the fixed computational budget I think it doesn't reach optimal params for some users
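For reference, a minimal sketch of a cosine annealing schedule in plain Python. The function name and the numbers in the example are hypothetical and are not FSRS's actual training settings.

```python
import math


def cosine_annealed_lr(
    step: int, total_steps: int, max_lr: float, min_lr: float = 0.0
) -> float:
    """Learning rate at a given step: starts at max_lr, ends at min_lr,
    following half a cosine wave in between."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))


# Hypothetical budget of 100 steps with a starting rate of 0.04
for step in (0, 25, 50, 75, 100):
    print(step, round(cosine_annealed_lr(step, 100, 0.04), 4))
```

Since `total_steps` is fixed up front, the rate reaches zero when the budget runs out, whether or not the parameters have converged. That is the "fixed computational budget" issue mentioned above.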
Please post the results on all 10k users once it's done. And add it to the readme as FSRS-6 def. param, of course
And don't exclude those 100 (or however many, if you plan to pretrain FSRS on more) from evaluation, since for the table in readme we want all algorithms to be evaluated on exactly the same data
"As much as your RAM allows", as Alex said
You can see the effect of L2 regularization here: https://github.com/open-spaced-repetition/srs-benchmark/commit/f64e6e13dacc8d86cf29ee3c885404921f905e6a
OK, I will try to pretrain it in first 500 collections.
PID PGRP USER PRI NI VIRT RES S CPU% MEM%▽NLWP TIME+ Command (merged)
45793 45793 jarrettye 17 0 448G 54.9G ? 98.3 55.9 21 1h19:08 python3.9│python pretrain.py --algo FSRS-6
It takes ~55GB RAM😅
[0.1251, 0.7461, 1.8218, 9.947, 7.3195, 0.8448, 3.5258, 0.001, 1.9802, 0.156, 0.8337, 0.5343, 0.001, 0.4795, 0.8497, 0.7124, 1.0011, 0.7391, 0.3847, 0.1437, 0.1373]
I'm worried that w[16] is low
w[16]?
yeah
it's weird
[0.2172, 1.1771, 3.2602, 16.1507, 7.0114, 0.57, 2.0966, 0.0069, 1.5261, 0.112, 1.0178, 1.849, 0.1133, 0.3127, 2.2934, 0.2191, 3.0004, 0.7536, 0.3332, 0.1437, 0.2]
If we compare it with the median
w[11] and w[14] are significantly lower than the median
fine, I will do more research in the next week.
Also, w[19] is the same in both of your lists somehow
[0.2172, 1.1771, 3.2602, 16.1507, 7.0114, 0.57, 2.0966, 0.0069, 1.5261, 0.112, 1.0178, 1.849, 0.1133, 0.3127, 2.2934, 0.2191, 3.0004, 0.7536, 0.3332, 0.1437, 0.2]
[0.1251, 0.7461, 1.8218, 9.947, 7.3195, 0.8448, 3.5258, 0.001, 1.9802, 0.156, 0.8337, 0.5343, 0.001, 0.4795, 0.8497, 0.7124, 1.0011, 0.7391, 0.3847, 0.1437, 0.1373]
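A quick way to eyeball the two lists index by index. The values are copied from the messages above; the "less than half of the median" threshold is an arbitrary choice for illustration.

```python
from typing import List

median_params = [0.2172, 1.1771, 3.2602, 16.1507, 7.0114, 0.57, 2.0966,
                 0.0069, 1.5261, 0.112, 1.0178, 1.849, 0.1133, 0.3127,
                 2.2934, 0.2191, 3.0004, 0.7536, 0.3332, 0.1437, 0.2]
pretrained_params = [0.1251, 0.7461, 1.8218, 9.947, 7.3195, 0.8448, 3.5258,
                     0.001, 1.9802, 0.156, 0.8337, 0.5343, 0.001, 0.4795,
                     0.8497, 0.7124, 1.0011, 0.7391, 0.3847, 0.1437, 0.1373]


def identical_indices(a: List[float], b: List[float]) -> List[int]:
    """Indices where both lists hold exactly the same value."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x == y]


def much_lower_indices(
    reference: List[float], candidate: List[float], ratio: float = 0.5
) -> List[int]:
    """Indices where candidate is below `ratio` times the reference value."""
    return [i for i, (r, c) in enumerate(zip(reference, candidate)) if c < ratio * r]


print(identical_indices(median_params, pretrained_params))   # prints [19]
print(much_lower_indices(median_params, pretrained_params))  # prints [7, 11, 12, 14, 16]
```

So besides w[11], w[14] and w[16], the pretrained w[7] and w[12] are also far below the median, and w[19] is the only exact match.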
median parameters:
1:again, 2:hard, 3:good, 4:easy
first rating: 1
rating history: (1,3,3),3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,0.0d,1.0d,2.0d,6.0d,16.0d,1.3m,3.1m,7.1m,1.2y
factor history: 0.0,0.0,0.0,0.0,2.00,3.00,2.67,2.50,2.35,2.26,2.14
difficulty history: 0,7.0,7.0,6.9,6.9,6.9,6.9,6.8,6.8,6.8,6.7
stability history: 0,0.2,0.3,0.5,2.3,6.1,16.1,40.0,94.4,211.6,454.2
first rating: 2
rating history: (2,3,3),3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,0.0d,2.0d,6.0d,18.0d,1.6m,4.2m,10.0m,1.9y,4.1y
factor history: 0.0,0.0,0.0,0.0,3.00,3.00,2.72,2.55,2.41,2.29,2.18
difficulty history: 0,6.2,6.2,6.2,6.2,6.1,6.1,6.1,6.1,6.0,6.0
stability history: 0,1.2,1.5,1.8,6.1,17.8,49.0,125.3,301.5,687.9,1496.9
first rating: 3
rating history: (3,3),3,3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,4.0d,14.0d,1.5m,4.5m,1.0y,2.6y,6.3y,14.5y,31.6y
factor history: 0.0,0.0,0.0,3.50,3.21,3.00,2.76,2.57,2.42,2.29,2.18
difficulty history: 0,4.9,4.9,4.9,4.8,4.8,4.8,4.8,4.8,4.8,4.7
stability history: 0,3.3,3.5,13.7,45.2,134.6,371.9,957.4,2316.1,5301.7,11546.9
first rating: 4
rating history: (4),3,3,3,3,3,3,3,3,3,3
interval history: 0.0d,16.0d,2.2m,7.9m,2.1y,6.4y,17.6y,45.2y,100.0y,100.0y,100.0y
factor history: 0.0,0.0,4.06,3.65,3.27,2.99,2.76,2.57,2.21,1.00,1.00
difficulty history: 0,2.5,2.5,2.5,2.5,2.5,2.5,2.5,2.5,2.5,2.5
stability history: 0,16.2,65.4,236.5,775.7,2321.7,6413.6,16500.8,36500.0,36500.0,36500.0
pretrained parameters:
1:again, 2:hard, 3:good, 4:easy
first rating: 1
rating history: (1,3,3),3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,0.0d,1.0d,2.0d,6.0d,17.0d,1.4m,3.3m,7.2m,1.2y
factor history: 0.0,0.0,0.0,0.0,2.00,3.00,2.83,2.53,2.33,2.15,2.03
difficulty history: 0,7.3,7.3,7.3,7.3,7.3,7.3,7.3,7.2,7.2,7.2
stability history: 0,0.1,0.2,0.4,2.2,6.5,17.2,43.0,99.5,214.9,436.0
first rating: 2
rating history: (2,3,3),3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,0.0d,1.0d,5.0d,17.0d,1.7m,4.7m,11.6m,2.2y,4.7y
factor history: 0.0,0.0,0.0,0.0,5.00,3.40,3.06,2.71,2.48,2.28,2.13
difficulty history: 0,6.0,6.0,6.0,6.0,6.0,5.9,5.9,5.9,5.9,5.9
stability history: 0,0.7,1.0,1.4,4.7,16.9,51.6,140.9,348.9,796.9,1698.0
first rating: 3
rating history: (3,3),3,3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,2.0d,12.0d,1.8m,6.6m,1.8y,5.1y,13.1y,31.0y,68.2y
factor history: 0.0,0.0,0.0,6.00,4.42,3.75,3.24,2.87,2.59,2.37,2.20
difficulty history: 0,2.9,2.9,2.9,2.9,2.9,2.9,2.9,2.9,2.8,2.8
stability history: 0,1.8,2.2,11.5,52.9,198.7,644.6,1849.0,4781.2,11324.5,24883.8
first rating: 4
rating history: (4),3,3,3,3,3,3,3,3,3,3
interval history: 0.0d,10.0d,1.8m,7.9m,2.4y,7.6y,21.5y,55.0y,100.0y,100.0y,100.0y
factor history: 0.0,0.0,5.40,4.37,3.69,3.19,2.83,2.55,1.82,1.00,1.00
difficulty history: 0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
stability history: 0,9.9,53.9,236.4,870.4,2776.7,7853.7,20062.7,36500.0,36500.0,36500.0
Idk, try it on different 500 users ¯_(ツ)_/¯
I feel like pretrain is bugged. Easy bonus being almost exactly 1.0 is too sus. And the fact that w[19] is the same
why can't fsrs find my reviews?
all of my decks have a preset (appropriately named FSRS) set to them which has fsrs enabled, and as you can see here one of my add-ons lets me view which preset is applied to where easily
sorry i'm very new to fsrs (and anki)
am i doing something wrong guys
Do you have any cards in these decks? They look empty on this screenshot
welp
idk
DM me and send me your collection, if you want
What's the name of the add-on btw?
it's probably this
oh okay
hang a sec
DM sent.
note that i didn't include media in the collection i exported to you
All of your reviews were today. You don't have cards that have been reviewed over the course of >1 day
yea i just started anki
This is an edge case I've never considered before 😅
In 1.5 years of FSRS being a thing, you are the first one to run into such a problem
Basically, just do more reviews
i literally just told someone i downloaded anki and i got told "great, now setup fsrs"
so i did it XDDDDD
how many more reviews mate
You'll need 1-2 dozen probably. I mean, for optimization to do anything
And I don't mean "review more new cards today"
As I said, you need to review cards multiple times (at least 2) over the course of multiple days
i'll give it two weeks
TLDR: many reviews, many days
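A rough way to check the "many reviews, many days" condition yourself. This is only a sketch: the `(card_id, unix_timestamp)` input format and the function name are made up for illustration, and it uses UTC calendar days rather than Anki's own day cutoff.

```python
from collections import defaultdict
from datetime import datetime, timezone


def cards_with_multi_day_history(revlog):
    """Return ids of cards reviewed on at least two distinct (UTC) days.

    revlog: iterable of (card_id, unix_timestamp_seconds) pairs.
    This input format is an assumption, not Anki's real revlog schema.
    """
    days_per_card = defaultdict(set)
    for card_id, timestamp in revlog:
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).date()
        days_per_card[card_id].add(day)
    return [cid for cid, days in days_per_card.items() if len(days) >= 2]
```

If that list is empty (every review happened on the same day), there is nothing for the optimizer to learn from yet.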
hopefully two weeks of consistent anki reviews should give me enough stuff for fsrs to work with
yep
unfortunately that means i'll have to suffer in the beginning with the archaic scheduling system
the default parameters are there if i don't input anything in that box, right
yes
N.B. FSRS is globally on/off. You cannot set it per preset/ per deck. (That's what the little globe icon next to settings means)
what globe icon? 🤔
It's also on a few other settings like "Limits start from top"
"Desired retention" and "FSRS parameters" are per-preset (they don't have the globe icon) but enabling FSRS is global, you cannot have some presets use FSRS and others use SM2.
oh understood
Does FSRS weigh all reviews the same?
Since Anki 25.02, nope. Now more recent reviews have a greater weight
Or 24.11
Ic
Where can I put suggestions for FSRS
I don't remember if recency weighting was added in 24.11 or in 25.02
I think it was 25.02
- Here 😉. Devs visit this place a lot
- Forums: https://forums.ankiweb.net/c/anki/fsrs/19
Forums seem scary
I don't know anything about coding etc, so tell me if it's weird/dumb, but this is my idea.
FSRS could maybe be more optimal if it ignores reviews from cards that are the easiest and most difficult.
I assume that a lot of users will have some cards that just aren't great. Maybe too easy, maybe too difficult. I think ignoring these reviews would lead to a (little bit) better scheduling for all the other cards.
Ignoring easy/hard cards isn't a good idea. It would be better to improve the calculation of difficulty, but we don't know how. We've tried a bunch of things and never managed to significantly improve the difficulty formula, beyond very minor tweaks
if you are sleep deprived/tired
is it better to do the reviews and risk lower retention bc of sleep deprivation
or to leave them until next day
forums are nicer imo
most of us don't code anyway so dw
same i did that
Where did dae disappear to
Resting after a long day of running away from automatic optimization
He has been off for a couple of days now and the day he decides to do so is the day FSRS 6 is waiting to be merged 💀
I really hope he makes a build soon
Guess it was on his schedule that he releases the security update before easter
Before that, we want to figure out a better way to make default parameters
Alex's idea of training FSRS on a combined history from 500 users is good, but then we get really weird parameters
I feel like pretrain is bugged. Easy bonus being almost exactly 1.0 is too sus. And the fact that w[19] is the same
The last params also seem quite close as well
Isnt this the same problem from before
Did this not get fixed
?
The last 2 params for me always end up the same as default or just turns 0 whenever i optimize
Ah, no, that's a different matter
The last two parameters (they won't be the last anymore, but you get what I'm talking about) will not be set to 0 anymore
okay what about them being almost always the same as default then
0.5166, 0.6621 I almost know these numbers by heart now
Idk about that. But basically, they will have dynamic max. values in a way that prevents them from becoming too large and causing the "the interval after Again and a few learning steps is longer than before Again" issue
@quasi shadow I wonder if something about concatenating revlogs from different users is broken, or if the optimizer itself is bugged. The latter would be much worse
I prefer to take the hit personally. Sometimes it's a good way to find out which cards were not that great in the first place. Also, the hit was never more than ~10% on avg in my case
Try using the same batch size as with normal training 🤷‍♂️
Idk why the hell batch size of all things would cause easy bonus to be 1, but might as well try because why not
For Automatic Preset Assigning, has anyone tried grouping presets based on similar retention in the past month or so? I have a preset with a desired retention of 85%, with a true retention of 85.3% in the past week for all decks in the preset combined, but one deck has a retention of 94.4% in the past week (102 passes, 6 fails) and another deck has 78.9% in the past week (165 passes, 44 fails).
I know some Med students have a shit ton of different decks to review, so perhaps being able to select which parent deck to assign the presets under for all of its subdecks based on similar retention in the past month could be a convenient feature to make sure each individual deck hits the desired retention rate more accurately
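A minimal sketch of that grouping idea, using the two decks from the message above. The function names and the 5% bucket width are assumptions for illustration; nothing like this exists in Anki as far as this thread says.

```python
def retention(passes, fails):
    """Observed retention as the fraction of passed reviews."""
    return passes / (passes + fails)


def group_by_retention(decks, bucket_width=0.05):
    """Group decks whose observed retention falls into the same bucket.

    decks: {deck_name: (passes, fails)}
    Returns {bucket_lower_bound: [deck_names]}; each group could then
    share one preset. The bucket width is an arbitrary choice.
    """
    groups = {}
    for name, (passes, fails) in decks.items():
        r = retention(passes, fails)
        bucket = round(int(r / bucket_width) * bucket_width, 2)
        groups.setdefault(bucket, []).append(name)
    return groups


decks = {"deck A": (102, 6), "deck B": (165, 44)}
print(group_by_retention(decks))
```

With the numbers above, deck A (94.4%) and deck B (78.9%) land in different buckets, so they would end up in different presets.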
Yes that has been more or less my idea.
Grouping cards similar in difficulty into one group
other difficulties in other groups
Imagine a deck with 90 easy(-ier) cards and 10 hard(er) cards. The preset will be skewed by the easy cards and the intervals scheduled will never be close to appropriate enough for the hard cards
i havent done it by retention but ive done it by difficulty
lets see how my retention has changed actually
it seems my mature retention has shot straight down but my young has gone up slightly
if we zoom in tho its like
but it has dropped bc the amount of mature cards used to be on avg like 20 a day
now its like 3 a day
I don't think heterogeneity pulls in all the other cards to "compensate." I'd say that FSRS groups together the different card cases that share similar behavior to a certain extent and treats them differently from the rest, as long as there's enough data on these cards.
If, on the other hand, the data on heterogeneous cards is scarce, then this can be problematic, and they will tend to be leeches. Therefore, perhaps if these cases are scarce in a preset, it would be better to move them to other preset. Otherwise, I don't think it's necessary. But of course, I'm no expert.
Also did it for the past 20 days really glad I did
what is heterogeneity in this case
- Slightly Less Workload (10%)
- The low D cards have a Difficulty curve that is more useful than a lapse-proxy
- The low D have good precision even with low default params
- The High D have very very low interval so I see them way more
It's also quite easy to operate :
Low D : Lapse at 6, tag only
High D : Lapse at 12, suspend auto
once per week I move the tagged one from Low D to High D
once per week, I reset the High D, and I set a "new card/day" count to 1, and for that card, I really focus on improving my understanding of it (outside Anki)
Retention has still been quite predictable by FSRS, still at 90%, and even today my first time at 96%
The 96% being more about how I treat cards i fail now
Spent the whole day coding some algo to analyze the lapses yesterday
Card : 1710712242101 Past Lapses : [0, 2, 20, 25] Current Max Interval : 42
Card : 1708207159229 Past Lapses : [0, 1, 0, 3] Current Max Interval : 91
Card : 1723541612090 Past Lapses : [4, 8] Current Max Interval : 28
Card : 1708259347988 Past Lapses : [0, 0, 1, 0, 0, 5, 6, 0, 27, 30, 21] Current Max Interval : 12
Card : 1708787872864 Past Lapses : [1, 28, 9, 3, 27, 15, 0, 1, 14] Current Max Interval : 8
Card : 1716748875647 Past Lapses : [2, 9, 7, 9, 10, 0, 2, 3, 0, 2, 18] Current Max Interval : 0
Card : 1715717287839 Past Lapses : [0, 0, 5, 9, 4, 8, 8, 32, 9, 4] Current Max Interval : 8
Card : 1711230892107 Past Lapses : [42, 38, 6, 2, 5] Current Max Interval : 1
Card : 1708440946044 Past Lapses : [0, 0, 0, 0, 4, 15, 11, 0, 14, 6, 5, 19, 15] Current Max Interval : 9
Card : 1727897994100 Past Lapses : [0, 0, 0, 1, 0, 1, 0, 1, 6, 12, 0, 9, 7, 5] Current Max Interval : 5
For each lapse cycle, I isolate the longest successful interval
It clearly shows that the ones I lapse the most don't even follow a somewhat-increasing pattern
They might succeed at a 28d interval, then fail after a 4d interval, etc.
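For anyone who wants to reproduce arrays like the ones above, here is a minimal sketch of the idea. It is not the add-on's actual code (that is linked further down); the `(elapsed_days, rating)` input format and the function name are assumptions.

```python
def max_interval_per_lapse_cycle(reviews):
    """Split a card's history at each lapse and keep the longest
    successfully recalled interval of every cycle.

    reviews: list of (elapsed_days, rating), where rating 1 means Again.
    Returns (past_cycles, current_max): one entry per finished lapse
    cycle, plus the best interval of the still-open cycle.
    """
    past_cycles = []
    current_max = 0
    for elapsed_days, rating in reviews:
        if rating == 1:
            # A lapse closes the current cycle.
            past_cycles.append(current_max)
            current_max = 0
        else:
            current_max = max(current_max, elapsed_days)
    return past_cycles, current_max


history = [(0, 3), (2, 3), (5, 1), (0, 3), (1, 3), (3, 3),
           (8, 3), (20, 3), (45, 1), (1, 3), (4, 3)]
print(max_interval_per_lapse_cycle(history))  # ([2, 20], 4)
```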
So while we focus a lot on time stability, I think there is also some stuff to think about in terms of knowledge stability. (Low K-Stability would be cards with some kind of "coin flip" errors, while T-Stability would be more like "normal memory degradation")
We could model it so that a card has some maximum p(recall)<1, but then scheduling at high retentions would be impossible
How do you schedule a card with DR=99% if the card's maximum R is 95%?
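To make the problem concrete, here is a sketch that caps the forgetting curve at a maximum recall probability. It assumes the fixed-shape FSRS-4.5/5 curve, R(t) = (1 + 19/81 · t/S)^(-0.5), scaled by a hypothetical `r_max`; the ceiling itself is not part of FSRS.

```python
DECAY = -0.5
FACTOR = 19 / 81  # chosen so that R(S) = 0.9 when there is no ceiling


def interval_with_ceiling(stability, desired_retention, r_max=1.0):
    """Days until R(t) = r_max * (1 + FACTOR * t / S) ** DECAY
    falls to desired_retention.

    Returns None when desired_retention is above the ceiling, since
    no interval (not even 0 days) can reach it.
    """
    if desired_retention > r_max:
        return None
    ratio = desired_retention / r_max
    return stability / FACTOR * (ratio ** (1 / DECAY) - 1)


print(interval_with_ceiling(10, 0.90))        # about 10 days, no ceiling
print(interval_with_ceiling(10, 0.90, 0.95))  # shorter, the curve starts lower
print(interval_with_ceiling(10, 0.99, 0.95))  # None, DR is above the ceiling
```

So with a ceiling of 95%, DR=99% simply has no solution, and every DR below the ceiling gets a shorter interval than it would without one.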
Sure, but you see, if some knowledge has terrible K-Stability (sorry for the made-up name, but it'll be easier like that), meaning reviews are coin flips more than truly recalling information in a logical way, then no matter the DR, you might fail it every day. That might make your T-Stability (the one FSRS tries to predict) drop very low, when in fact you just have a big bunch of very low K-Stability cards
Would explain why 40-50% of my cards almost never lapse, and when they start to lapse, they can just go to >10 lapses
By splitting decks into Low-D/High-D, that's in fact us trying to separate those
And I think it's easy to fall into the low K-Stability trap in Anki, because you can fail a card because you lost your daily coin flip, but feel confident afterward because you recalled it very easily 5 min later... when in fact you're just compensating for your terrible K-Stability with a very, very short-term review.
Sorry if it's a bit vague but it's still a bit fresh in my head too 😆
What I'm trying to do with my code above is observe whether we're looking at something that behaves according to our expectation of T-Stability ([0, 2, 20, 25], [0, 0, 5, 9, 4, 8, 8, 32]), or at something super messy that makes no sense in terms of "recall of stable information" ([1, 28, 9, 3, 27, 15, 0, 1, 14])
But it needs to be analyzed outside the realm of FSRS, because by definition, here we're trying to assess things not in terms of probability, but in terms of actual performance
Also, the leech detector with the Poisson stuff is a bit different: if FSRS predicts 90% recall each time and the user fails in line with those 90%, but T-Stability never increases long term, the card won't be detected as a leech. But if leeches are "cards that don't seem to have an increasing T-Stability over time", it would be one
So it's a bit of a different goal I think
But it's still interesting because it shows that T-Stability might not be the only form of Stability
Well, what could be interesting is, based on those sequences, to have a function that would detect instability 😄
Linear regression with a slope < 0?
A threshold based on a cost function comparing actual perf to that regression?
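A minimal sketch of the regression idea, assuming plain least squares over the lapse-cycle index. The function names are made up, and treating a slope of exactly 0 as stable is an arbitrary choice.

```python
def slope(values):
    """Least-squares slope of values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx


def looks_unstable(cycle_maxima):
    """Flag a card whose best interval per lapse cycle trends downward."""
    return slope(cycle_maxima) < 0


print(looks_unstable([0, 2, 20, 25]))                   # False
print(looks_unstable([1, 28, 9, 3, 27, 15, 0, 1, 14]))  # True
```

A slope alone ignores how noisy the sequence is, which is where the cost-function threshold would come in.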
The code to test it on your collection and some card id : https://github.com/JSchoreels/anki-addon-leechdetector/blob/main/leechdetector/test.py
The code to compute the array itself : https://github.com/JSchoreels/anki-addon-leechdetector/blob/main/leechdetector/LeechDetector.py#L41-L63
I'll continue next week on it 🙂 Saturday's new ritual 😆
My hope is that if we can cluster cards based on previous performance into different "K-Stability" clusters, then FSRS could run better on each of those
That K-Stability would be some kind of Difficulty rating in some way
@unique salmon using the simulate config for CMRR works a small bit I think 🎉
haven't tried changing the formula at all but hey maybe I don't need to
still seems to be a much smaller value than before though
Yippie!
I do recommend doing the integral thing that I described
this
Remove the part for handling decay=-1, it won't be needed
Use an integral over the next 3 or 5 years
And it's using real card states of already learned cards and all that, right?
nope no real card states
i'd have to change fsrs-rs for that
i guess i could try and dash to do it before fsrs-rs 3.0.0 releases XD
In the final implementation it definitely should
What about loss aversion? The dumb 2.5 multiplier that should not be used when plotting time?
I was thinking of not using it when plotting time, but still using it for CMRR
Since for plotting time we want accurate real time
i might be wrong but i'm pretty sure loss aversion is gone already
I don't think so. IIRC it's in fsrs-rs and not in Anki itself, so it should be gone only when the PR with FSRS-6 is merged
I merged the fsrs-6 pr into the CMRR pr while i was testing it so if its gone with FSRS-6 then it will be there for the FSRS-5 ones and gone for the FSRS-6 ones
Ah, ok
Just tried it and FSRS-5 and 6 are both 70% now interestingly enough
i'd guess its because i'm using a deck thats "done"
wait i can just add more cards hold on XD
nope that doesnt fix it either
weird XD guess its RNG.
task
these parameters
are very interesting
for my deck with low difficulty
fair enough
Total Cards reviewed : 3080
Cards That lapsed at least once: 1905 (61.85%)
Cards That had 0 dropping lapse: 1159 (37.63%)
Cards That had 1 dropping lapse: 457 (14.84%)
Cards That had >1 dropping lapse : 289 (9.38%)
(Lapse Distribution : {10: 1, 9: 5, 8: 26, 7: 55, 6: 95, 5: 152, 4: 214, 3: 269, 2: 365, 1: 579, 0: 144})
(Dropping Lapse Distribution : {2: 212, 3: 65, 4: 11, 5: 1, 1: 457, 0: 1159})
Damn damn damn damn
Some examples of those "recurring" droppers
Card : 1732830933345 Past Lapses : (8) [5, 4, 2, 7, 4, 3, 5, 3] (now:4) Drops:5 BiggestDrop:3 Mean:4.12 Median:4.0
Card : 1708383941132 Past Lapses : (8) [1, 22, 7, 13, 6, 28, 11, 7] (now:12) Drops:4 BiggestDrop:17 Mean:11.88 Median:9.0
Card : 1708787872864 Past Lapses : (8) [1, 28, 9, 3, 27, 15, 1, 14] (now:8) Drops:4 BiggestDrop:19 Mean:12.25 Median:11.5
Card : 1709156003043 Past Lapses : (7) [3, 15, 10, 8, 3, 52, 9] (now:8) Drops:4 BiggestDrop:43 Mean:14.29 Median:9
Card : 1711663821086 Past Lapses : (8) [5, 11, 3, 17, 19, 11, 9, 3] (now:7) Drops:4 BiggestDrop:8 Mean:9.75 Median:10.0
Card : 1711664036084 Past Lapses : (8) [1, 23, 6, 13, 6, 5, 4, 13] (now:11) Drops:4 BiggestDrop:17 Mean:8.88 Median:6.0
Card : 1711750511210 Past Lapses : (7) [1, 22, 10, 21, 14, 5, 4] (now:8) Drops:4 BiggestDrop:12 Mean:11.00 Median:10
Card : 1717096940018 Past Lapses : (8) [1, 8, 4, 15, 9, 6, 5, 6] (now:5) Drops:4 BiggestDrop:6 Mean:6.75 Median:6.0
Card : 1727125905731 Past Lapses : (7) [2, 4, 2, 21, 8, 4, 3] (now:6) Drops:4 BiggestDrop:13 Mean:6.29 Median:4
Card : 1727810008064 Past Lapses : (5) [15, 9, 8, 5, 2] (now:7) Drops:4 BiggestDrop:6 Mean:7.80 Median:8
Card : 1729017180622 Past Lapses : (9) [1, 1, 4, 3, 4, 2, 12, 9, 4] (now:6) Drops:4 BiggestDrop:5 Mean:4.44 Median:4
Card : 1730934966226 Past Lapses : (7) [8, 5, 5, 2, 8, 5, 2] (now:6) Drops:4 BiggestDrop:3 Mean:5.00 Median:5
Card : 1708463669829 Past Lapses : (8) [4, 8, 3, 4, 8, 5, 29, 17] (now:20) Drops:3 BiggestDrop:12 Mean:9.75 Median:6.5
It's interesting to see that some could already be detected at a lapse count of 5
If I sort by lapse descending, I have this
Card : 1732919779145 Past Lapses : (10) [1, 2, 4, 6, 6, 3, 1, 1, 1, 7] (now:6) Drops:2 BiggestDrop:3 Mean:3.20 Median:2.5
Card : 1716748875647 Past Lapses : (9) [2, 9, 7, 9, 10, 2, 3, 2, 18] (now:0) Drops:3 BiggestDrop:8 Mean:6.89 Median:7
Card : 1729017180622 Past Lapses : (9) [1, 1, 4, 3, 4, 2, 12, 9, 4] (now:6) Drops:4 BiggestDrop:5 Mean:4.44 Median:4
Card : 1730241729680 Past Lapses : (9) [1, 1, 1, 2, 10, 4, 4, 6, 5] (now:8) Drops:2 BiggestDrop:6 Mean:3.78 Median:4
Card : 1731449637757 Past Lapses : (9) [2, 5, 4, 5, 9, 2, 4, 3, 6] (now:6) Drops:3 BiggestDrop:7 Mean:4.44 Median:4
Card : 1732573034249 Past Lapses : (9) [1, 5, 2, 2, 1, 3, 6, 4, 8] (now:6) Drops:3 BiggestDrop:3 Mean:3.56 Median:3
I have the feeling that I should also sort by "increase of interval", even when there aren't many drops
Lapsing 10 times with only 2 drops, but never going above a 6d interval, is pretty bad
Or maybe instead of talking about "drops", talk about "non-increasing performance across lapses" (so the condition would be strictly increasing perf)
Card : 1719523268011 Past Lapses : (5) [2, 2, 4, 4, 19] (now:7) Drops:0 BiggestDrop:0 Mean:6.20 Median:4
Card : 1722962900727 Past Lapses : (5) [1, 1, 2, 7, 18] (now:14) Drops:0 BiggestDrop:0 Mean:5.80 Median:2
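For reference, a sketch of how the Drops / BiggestDrop / Mean / Median columns can be recomputed from a "Past Lapses" array, plus the stricter "strictly increasing" check. This is reverse-engineered from the printed rows, not taken from the add-on, and the function name is made up.

```python
from statistics import mean, median


def drop_stats(cycle_maxima):
    """Summarize a card's per-lapse-cycle best intervals.

    A "drop" is a cycle whose best interval is lower than the
    previous cycle's best interval.
    """
    pairs = list(zip(cycle_maxima, cycle_maxima[1:]))
    drops = [prev - cur for prev, cur in pairs if cur < prev]
    return {
        "drops": len(drops),
        "biggest_drop": max(drops, default=0),
        "mean": mean(cycle_maxima),
        "median": median(cycle_maxima),
        "strictly_increasing": all(cur > prev for prev, cur in pairs),
    }


print(drop_stats([5, 4, 2, 7, 4, 3, 5, 3]))
print(drop_stats([2, 2, 4, 4, 19]))
```

The second card has 0 drops but still fails the strictly increasing condition because of the repeated 2 and 4, which is the difference between the two criteria.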
Or maybe something like "Card with Lapse Cycle #i, have in average, a max interval of #ivl"
Hmmm no, never mind, a card could have a rough start but recover later.
Cards That had [2/3, 1.0] failed outperformance ratio lapse: 1174 (61.63%)
Cards That had [1/3, 2/3] failed outperformance ratio lapse: 478 (25.09%)
Cards That had [0.0, 1/3] failed outperformance ratio lapse : 253 (13.28%)
Card : 1708778862604 Past Lapses : (2) [55, 9] (now:11) Drops:1 BiggestDrop:46 Mean:32.00 Median:32.0 FailedOutperforamnce:1 FailedOutperforamnceRatio:100.00
Card : 1708786226533 Past Lapses : (2) [166, 11] (now:4) Drops:1 BiggestDrop:155 Mean:88.50 Median:88.5 FailedOutperforamnce:1 FailedOutperforamnceRatio:100.00
Card : 1709075502403 Past Lapses : (2) [52, 20] (now:29) Drops:1 BiggestDrop:32 Mean:36.00 Median:36.0 FailedOutperforamnce:1 FailedOutperforamnceRatio:100.00
[...]
Card : 1719510273220 Past Lapses : (4) [1, 13, 11, 2] (now:4) Drops:2 BiggestDrop:9 Mean:6.75 Median:6.5 FailedOutperforamnce:2 FailedOutperforamnceRatio:66.67
Card : 1719525561629 Past Lapses : (4) [2, 2, 12, 15] (now:6) Drops:0 BiggestDrop:0 Mean:7.75 Median:7.0 FailedOutperforamnce:2 FailedOutperforamnceRatio:66.67
Card : 1723392698570 Past Lapses : (4) [2, 25, 9, 11] (now:0) Drops:1 BiggestDrop:16 Mean:11.75 Median:10.0 FailedOutperforamnce:2 FailedOutperforamnceRatio:66.67
Card : 1723392771593 Past Lapses : (4) [2, 22, 14, 5] (now:7) Drops:2 BiggestDrop:9 Mean:10.75 Median:9.5 FailedOutperforamnce:2 FailedOutperforamnceRatio:66.67
[...]
Card : 1730241623530 Past Lapses : (4) [3, 5, 12, 9] (now:8) Drops:1 BiggestDrop:3 Mean:7.25 Median:7.0 FailedOutperforamnce:1 FailedOutperforamnceRatio:33.33
Card : 1730241741880 Past Lapses : (4) [1, 4, 10, 4] (now:15) Drops:1 BiggestDrop:6 Mean:4.75 Median:4.0 FailedOutperforamnce:1 FailedOutperforamnceRatio:33.33
Card : 1730495865747 Past Lapses : (4) [1, 4, 1, 2] (now:23) Drops:1 BiggestDrop:3 Mean:2.00 Median:1.5 FailedOutperforamnce:1 FailedOutperforamnceRatio:33.33
Card : 1730563415299 Past Lapses : (4) [2, 4, 17, 6] (now:8) Drops:1 BiggestDrop:11 Mean:7.25 Median:5.0 FailedOutperforamnce:1 FailedOutperforamnceRatio:33.33
Failed Outperformance Ratio looks a bit better 🙂
Cards with high number of lapse also fall in the [1/3, 2/3] ratio of failed outperforming lapses
Card : 1730066278081 Past Lapses : (8) [2, 2, 5, 5, 10, 4, 2, 12] (now:5) Drops:2 BiggestDrop:6 Mean:5.25 Median:4.5 FailedOutperforamnce:4 FailedOutperforamnceRatio:57.14
Card : 1731854761426 Past Lapses : (8) [1, 4, 18, 4, 5, 2, 3, 2] (now:4) Drops:3 BiggestDrop:14 Mean:4.88 Median:3.5 FailedOutperforamnce:4 FailedOutperforamnceRatio:57.14
Card : 1732461461198 Past Lapses : (8) [1, 1, 1, 2, 10, 3, 1, 9] (now:6) Drops:2 BiggestDrop:7 Mean:3.50 Median:1.5 FailedOutperforamnce:4 FailedOutperforamnceRatio:57.14
Card : 1732919779145 Past Lapses : (10) [1, 2, 4, 6, 6, 3, 1, 1, 1, 7] (now:6) Drops:2 BiggestDrop:3 Mean:3.20 Median:2.5 FailedOutperforamnce:5 FailedOutperforamnceRatio:55.56
question
if i have 2 cards that is like
glucagon does what to cAMP levels
(increase)
insulin does what to cAMP levels
(decrease)
today i got the 1st wrong
i said decrease, when it should be increase, now obviously after like 5-10 other cards, i know that since the 1st one is increase now it will be decrease
should i mark this card wrong because of recency bias? i would've gotten it wrong if i didn't have the exposure
I'd say hit good and move on with your life
or maybe i should rework that card
the shame of hitting good will stick in your mind
TBH doesn't matter that much
I kinda like to put one at "Again" and one at "Good" to not keep them in sync forever
But really depends
hmmm that sounds like a good idea
I did a bit of everything, both Again, one Again/one Good, one Again/one Hard ... didn't really notice much difference
have you restarted?
its ok 😂
i blame the anki overlords
i'm pretty blind myself most of the time
Truth is, as IT guys, rebooting when in trouble is our #1 reflex
yea
And if a reboot is not helping, a second one sometimes is
i had rebooted anki 15 times today
so it kinda skipped my mind
@bold terrace you ever get an answer thats like half right and half wrong
like you say it
the card asked me where is this molecule located in the cell
most of the answers will be
cytosol or mitochondria, depending on the molecule, this time, i said
"mitosol" 🔥
mitosol does not exist.
Yeah ...
I think that's the kind of good example on how knowledge itself can be not stable
And you can be super strict on yourself and mark it as Again even if you guessed it right ...
... But what will make your future you not try to guess it again ?
wasnt even a guess is the worst part
When we do Anki as a chore, we just burst through reviews thinking the sheer number of reviews will fix everything
me rn
But then you get people who have 1M reviews at 2s/review on average asking for a short-term memory model that allows steps <10min
I spent the last week taking a lot of time, each time I answered a question wrong, to note what I answered in a field, note when I hesitated, take a few minutes to check examples on the internet, etc.
And I don't know if I'll be able to replicate it, but today's 96% retention instead of 90% seems to be a sign it wasn't just time lost
i mean
i am trying to be like this
but i think to myself "ok ill start doing this tomorrow" everyday
atleast for now when the answers wrong, i rewrite it correctly
and speak it outloud
like physically rewrite it
But then you have cards with 200 reviews, 20 lapses, and stability of 5-6d
And you start to wonder if you didn't just lose more time on the hamster wheel instead of doing the things you had to do
Yep, for all my cards I always type the answer
Even sentences etc
would be nice
Make me type faster and faster in japanese haha
but a majority of my cards are not handmade
To make a field typeable it's quite easy, you just have to add type: before the field name
{{type:answer}} for ex
{{type:Front}}
For example
yeah but the thing is
this deck im using is very strange
genuinely i have no idea why he made it like this
You'll hate me
its so strange
I know
but i cant
for this deck
im not looking at this deck's retention yet
because i started it 5 days ago
and it's made to go with some videos, basically, i watch the video for the first go, and then go do the deck for this video
😅 Pretraining FSRS is mysterious.
I get three values of w[16], 3.0004, 1.7976 and 2.4303, in the first three collections with pretraining.
Once I concat them into one collection and run the pretraining, I get 1.0898.
It's smaller than all the w[16] values of the first three collections.
reminds me of Simpson's paradox
[0.1643, 1.2527, 2.1041, 10.311, 7.0821, 0.8304, 3.478, 0.001, 2.3956, 0.1686, 0.5738, 0.6228, 0.001, 0.4833, 0.8923, 0.7427, 1.0, 0.6286, 0.001, 0.1439, 0.1736]
I fixed several bugs, but the pretrained parameters still look weird.
I started using FSRS some time ago to increase my retention but that didnt happen. I didnt change the number of new cards a day but the daily reviews are now twice as many as they were before. Should I just turn it off again or will that break something?
Did you optimize parameters?
Yes
Do you know what your retention was before?
I suggest downloading Anki 25.02 (if you haven't yet), going to Stats, scrolling all the way to the bottom and screenshotting the True Retention table
As far as I know it hasnt changed much even tho the reviews are way more than before
Im not done with my reviews for today yet so today might still be inaccurate
And what's your desired retention?
90%
Last question: how long ago have you switched to FSRS?
Probably 3-4 weeks ago. Before I had around 200-250 reviews, now its 390-450 and still rising
welp
And I thought 250 were a lot already but this is brutal rn
Idk how to help
oof
Do you use Hard as "fail"?
Maybe FSRS just failed me ig
No, I only use Again and Good
I did change one thing tho
@quasi shadow I found one of the 0.6% users for whom FSRS works worse than SM-2 🤣
What is it?
I changed the Review sorting order on my "mother deck" to ascending retrievability
I recommend descending instead (review cards that you are least likely to forget first), but it probably doesn't matter unless you have a backlog
Hmm thx. I never have backlog so I didnt think it mattered much
you have an over 10% gap between real and desired retention, which will give you a lot of reviews
Meaning I should lower my desired retention?
until its reached and then increase it again?
if you want fewer reviews then lower it to about 80-85%, if you want to try to reach 90% then keep the reviews up and try to "remember better"
I obviously want better retention but the reviews are skyrocketing and my retention isnt
Like its not changing at all
Its not feeling like they will stabilise
you fail a lot of mature cards, which is very common for sm2 as it doesn't show them often enough. your mature cards aren't really mature
Hmm meaning at some point the increase of reviews has to stop right?
But my young retention is also not even close to 90%
Yeah, but he still fails the same amount with FSRS
I guess I'd advise him to give it 1-2 more months
To see if things get better as more cards use FSRS scheduling
young cards can be unstable. I would say you need 1-3 months of no new cards to reach some kind of equilibrium at the fsrs desired retention
Damn
I sadly cant afford to drop new cards at all so I will probably have to endure it for now
thanks for the help nonetheless
from the workload perspective: fewer new cards first, then lower retention if there are still too many reviews. you can find values for both that work for you
since you just switched it will take some time to switch from sm2 intervals to fsrs intervals, you can also reschedule but this will most likely show you thousands of cards to review right now
I already rescheduled so It should already be doing that, no?
Should I turn on "reschedule cards on change"?
yeah, if you switch desired retention in fsrs you can reschedule to decrease the daily load immediately
I dont really understand what this option does
reschedules all cards after optimization, otherwise the change is only done during reviews
Ohh so this is the option that will give me thousands of reviews?
Also is this something i need to worry about?
generally yes, because the bigger the retention gap you have and the higher the desired retention you set, the more reviews will get moved to right now, which results in a huge backlog
Alright, im on break rn so this is probably a smart thing to do right now
not really, just as the msg says, lower generally means a better fit of fsrs predictions to your review history
Ah okay
also get fsrs helper addon which has useful options for rescheduling / pushing review forward etc
I enabled the reschedule cards on change thingy but if i save and reenter the options menu its disabled again lol
Thx, will do
That is expected. It only does it as a one-off when you save your current changes.
I guess to stop you accidentally rescheduling all your cards when you make other changes later.
But how do i turn it on now
You just turn it on and save your settings. It will reschedule your cards when you save, but go back to "off" when you look at the settings next time.
Maybe it didn't do it because your params/DR did not change? I use the FSRS Helper addon to reschedule cards because it is a bit nicer (and does not add junk to your card revlog).
Hehe I have unleashed hell 
Now if i dont fix this and just bruteforce it, wont I have thousands of reviews tomorrow as well? 
Not necessarily thousands tomorrow. Most of that will be cards that are overdue according to FSRS (backlog). It may still be higher than you had previously for a while.
Alr wish me luck on finishing this today lol
If you look at "Future Due" in stats and click "backlog" you will see a lot of cards due in the past.
At least that is how it used to work. I'm not sure if we try to do something fancy now to help manage the backlog.
Well this is going to be one heck of a day ig
I see that the fsrs helper rescheduled my cards to the past meaning i have this small backlog but ill just go with it and hope I dont have to go through the same tomorrow again
I mean its just 4k thats only like 10x my usual daily reviews
N.B. this is where "retrievability descending" can shine. If you cannot fully complete the backlog you at least should maintain your DR on the ones you can manage (at the cost of worse results on the ones you don't get to). "retrievability ascending" can lead to you doing badly on lots of cards.
Apparently the stats say that "retrievability descending" will lead to better progress on your backlog.
👍
retrievability descending is also good for morale as you get the cards you remember best first, so it goes fast at first
Idk man, try it on only 2 users with similar retention and a similar number of reviews
I found IDs of 2 very similar users
'id': (reviews, retention)
'513': (6975, 0.9733333333333334)
'664': (6933, 0.973316024808885)
Or these
'43': (44119, 0.8620548969831592)
'186': (43434, 0.8619975134687111)
The idea is to try the pretrain code on users with very similar data. If EVEN THEN the issue with parameters arises, then either FSRS is borked or the code that combines multiple revlogs into one giant revlog is borked
@quasi shadow I ran my similarity score code on the entire dataset, here are two most similar users:
'449': (5468, 0.8433691236215902, 2.869341265235055, 14.920850261172374, 1.753688261706222)
'2066': (5509, 0.8433691236215902, 2.869341265235055, 14.914320951828207, 1.7668377164849263)
Format: {'id': (reviews, retention, avg. rating, avg. interval length, avg. number of reviews per day)}
I suggest running pretrain on these two
If even on these guys parameters are weird, then we're screwed
Also, here's a .json file with similarity scores (lower = more similar) for every pair of users, just because why not
https://drive.google.com/file/d/1MpGLzZhhwAs_Q3XqSZULC8J_SyZztE4M/view?usp=sharing
Assuming we don't see anything weird in parameters optimized on the two users I mentioned above, we can use this to gradually try pretrain on less and less similar users to see when the problem occurs. Maybe parameters only go haywire if there is a large enough difference between users
Actually, I should use ln(n reviews) rather than n reviews. And exclude same-day reviews for calculating the avg. interval. Oh well, time to recalculate this
Ok, here are the new two most similar users
Format: {'id': (ln(reviews), retention, avg. rating, avg. interval length, avg. number of reviews per day excluding same-day reviews)}
{'7876': (8.61794309451638, 0.843607388627309, 2.8693227091633466, 43.37703435804702, 1.761707550175215)}
{'3350': (8.617762246337932, 0.8433516801853997, 2.8689889918887603, 43.37746427925484, 1.7619502868068833)}
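The actual similarity metric isn't spelled out here, so the following is only a sketch of one way to compare such feature tuples: a Euclidean distance over per-feature relative differences (an assumption), skipping exact duplicates like the search above did. Lower means more similar.

```python
import math


def distance(a, b):
    """Euclidean distance over relative per-feature differences."""
    total = 0.0
    for x, y in zip(a, b):
        scale = max(abs(x), abs(y), 1e-12)  # avoid dividing by zero
        total += ((x - y) / scale) ** 2
    return math.sqrt(total)


def most_similar_pair(features):
    """Return (distance, id_a, id_b) for the closest non-identical pair.

    features: {user_id: tuple_of_numbers}; pairs at distance 0 are
    skipped because they are probably duplicated collections.
    """
    ids = sorted(features)
    best = None
    for i, u in enumerate(ids):
        for v in ids[i + 1:]:
            d = distance(features[u], features[v])
            if d > 0 and (best is None or d < best[0]):
                best = (d, u, v)
    return best
```

This brute-force loop is quadratic in the number of users, so for the full dataset it would need something smarter than a dict of every pair.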
so similar, i wonder if they're the same user/collection repeated twice with a small change between them 🤔
ill go take a look
or is this just birthday paradox going on lol
There are actually exactly identical users, you can check the .json file
But I thought "nah, ain't no way" and printed the closest users with non-zero similarity metric
are there more user pairs like this?
you can check the .json file
I get MemoryError with pandas when trying to open it 🤣
I tested 43 and 186.
[0.1317, 1.5555, 2.4396, 11.1197, 4.6111, 0.7629, 2.9693, 0.0013, 1.7683, 0.1634, 0.7567, 0.9032, 0.001, 0.4082, 0.6454, 0.5804, 1.3254, 0.627, 0.038, 0.1525, 0.1665]
I pretrain FSRS-6 in first 500-1000 collections.
Now the w[16] is not very close to 1.0.
😅 Fine. It's still a mystery.
if i have a leech
i just thought of an idea
cant i duplicate it and have 2 cards of that leech
sure it might be ineffective if it is a lot of leeches
and perhaps this could help get the card out of the leech phase
@polar maple @unique salmon ... The weird parameters have the best performance.
We need a better method to optimize the default parameters.
What [vibe/connection] do those numbers communicate, that they might be weird?
If w[16] is equal to 1.0, it means you will have the same interval when you press good and easy.
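To make that concrete, here is a rough sketch of the post-recall stability update as I understand it from the published FSRS formulas, where w[15] is the hard penalty and w[16] is the easy bonus. The function name and the example inputs (S = 10, D = 5, R = 0.9) are made up for illustration.

```python
import math

def stability_after_recall(w, s, d, r, rating):
    """New stability after a successful review (rating: 2=Hard, 3=Good, 4=Easy)."""
    hard_penalty = w[15] if rating == 2 else 1.0
    easy_bonus = w[16] if rating == 4 else 1.0
    growth = (
        math.exp(w[8]) * (11 - d) * s ** (-w[9])
        * (math.exp(w[10] * (1 - r)) - 1)
        * hard_penalty * easy_bonus
    )
    return s * (1 + growth)

# Parameters quoted earlier in the thread (w[16] = 1.3254)
w = [0.1317, 1.5555, 2.4396, 11.1197, 4.6111, 0.7629, 2.9693, 0.0013, 1.7683,
     0.1634, 0.7567, 0.9032, 0.001, 0.4082, 0.6454, 0.5804, 1.3254, 0.627,
     0.038, 0.1525, 0.1665]

w_flat = list(w)
w_flat[16] = 1.0  # easy bonus switched off

good = stability_after_recall(w, s=10.0, d=5.0, r=0.9, rating=3)
easy = stability_after_recall(w, s=10.0, d=5.0, r=0.9, rating=4)
easy_flat = stability_after_recall(w_flat, s=10.0, d=5.0, r=0.9, rating=4)
print(good, easy, easy_flat)
```

With w[16] = 1.0 the Easy and Good results coincide for this review. Difficulty is still updated differently for the two buttons, so later intervals can drift apart even then.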
I think I'm okay with that. (But what do I know.) (Based on my reading of the discussion that follows, I have no idea.)
maybe this parameter is being dominated by only a few users? maybe among the users that were used to produce w[16]=1.0, find the ones whose optimized params also have w[16]=1.0 and check their revlog size.
or perhaps you can try pretraining but you equalize the weight of each user so that their entire revlog contributes the same amount as any other revlog regardless of the size
Maybe only a few users regularly use Easy and they are heavily affecting the result
At least they are the minority.
yeah looks like there are basically none.. weird!
would the results be similar if you completely removed wd?
I have removed.
For example, the avg w[16] in the first 10 collections is ~3.
However
Wait...
I find another bug(
I forgot to set the wd
😅
🙏
how long does it take to run the training code? Maybe you can binary search to find the first user such that including this user makes w[16] become 1.0 after pretraining. Then we can investigate this user further, but this assumes that the problem is from specific users rather than it being a gradual process.
basically find a user n such that pretraining on [1, n] causes w[16] = 1 but pretraining on [1, n-1] causes w[16] >> 1
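A minimal sketch of that search. `is_bad(n)` is a hypothetical callback standing in for "pretrain on users [1, n] and check whether w[16] ended up at ~1.0". As noted above, this only works if the problem is monotone: once a prefix goes bad, every longer prefix stays bad.

```python
def first_bad_prefix(n_users, is_bad):
    """Smallest n in [1, n_users] with is_bad(n) True, or None if there is none.

    Assumes is_bad is monotone: False for short prefixes, True from some n on.
    """
    if not is_bad(n_users):
        return None
    lo, hi = 1, n_users  # invariant: is_bad(hi) is True
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

# Stand-in for the real pretraining run; user 3 is the culprit by construction.
calls = []
def fake_is_bad(n):
    calls.append(n)
    return n >= 3

culprit = first_bad_prefix(1000, fake_is_bad)
print(culprit, len(calls))
```

For 1000 users this needs about 11 pretraining runs instead of 1000, which matters if each run is slow.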
maybe you want to confirm this but i just checked and found that user 3 has no 'easy' grades, strange that w[16] is so different between 2 users and 3 users
yeah
user3 changes other parameters.
true
If it's a don't-care term for a specific user, is there any insulation against contaminating the aggregate analysis? (The output to a don't-care input value is undefined, so any instance would be noise added to the larger collection of data, if there isn't a culling/ameliorating method in place, which there might be.)
It's the calibration graph for user 1 & 2 with last rating = easy.
It's the calibration graph for user 1, 2 & 3 with last rating = easy.
It's the calibration graph for user 1 & 2 with last rating = good.
It's the calibration graph for user 1, 2 & 3 with last rating = good.
@polar maple user 3's retention is higher than user 1 & 2.
I guess it increases the average stability of the dataset.
yeah i was thinking whether the people who use the Easy button just have lower stability overall
So the stability with last rating = easy is lower than average stability.
Then, w[16] will tend to 1.0 or less.
i wonder how low w[16] can go if you remove the clipper
It is close to 0 when I debug the code.
i changed the initial value of w[16] to 1.5 instead and got this
but i dont think this is necessarily related
It's the calibration of the user 1 with all reviews.
It's user 2.
It's user 3.
😅 OK, these distributions are very different.
an alternative method for pretraining that could result in more reasonable parameters but would definitely perform worse:
- take the final parameters from the separately pretrained users
- take a set of card histories by random sampling or some other method
- for each card history: for each parameter set, compute R. Then take the median of these computed Rs, to get a final card history -> R association
- optimize FSRS on these pairs to summarize the results
but this is only for if we really need to try something else
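A sketch of the "median R per card history" step from the list above. `predict_r(params, history)` is a hypothetical callback that would run FSRS with one parameter set over one card history; the toy version below uses a plain exponential curve with a single made-up "stability" value just so the example runs.

```python
from statistics import median

def median_targets(card_histories, param_sets, predict_r):
    """Pair each card history with the median R predicted across all parameter sets."""
    return [
        (history, median(predict_r(params, history) for params in param_sets))
        for history in card_histories
    ]

# Toy stand-in, NOT the FSRS formula: R = 0.9 ** (elapsed / stability)
def toy_predict_r(params, history):
    return 0.9 ** (history["elapsed"] / params["stability"])

param_sets = [{"stability": 5.0}, {"stability": 10.0}, {"stability": 40.0}]
histories = [{"elapsed": 10.0}]

targets = median_targets(histories, param_sets, toy_predict_r)
print(targets)
```

The resulting (card history, R) pairs would then be the training targets for the final optimization step.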
👍
btw for other.py i can't reproduce the result for FSRS-6
nvm i can't for FSRS-6-recency either
The result is different?
yeah, i'm testing the first 3 users
Did you upgrade fsrs-optimizer to v6.0.0?
i'll try that now
I changed the sorting method in v6.0.0 to make it consistent with the rust version.
So the result would be slightly different.
😅 The default sorting of pandas is unstable.
It cost me an entire day to become aware of it 😅
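For anyone hitting the same thing: an unstable sort is free to reorder rows that compare equal, so two implementations can disagree on ties even when both are "correct". In pandas, `sort_values` accepts `kind="stable"` (the default is quicksort). Another option, sketched below in plain Python with made-up review rows, is to sort on a key that has no ties at all, so the result doesn't depend on the algorithm or on the input order. This is only an illustration, not necessarily what v6.0.0 does.

```python
# Made-up rows: (card_id, review_time, rating). Cards 1 and 2 each have two reviews.
reviews = [
    (2, 1700000300, 3),
    (1, 1700000200, 1),
    (2, 1700000100, 4),
    (1, 1700000050, 3),
]

# Sorting by card_id alone leaves ties. Python's sorted() is stable, so ties keep
# their input order, which means the result changes when the input order changes.
by_card_only = sorted(reviews, key=lambda r: r[0])
by_card_only_reversed_input = sorted(reversed(reviews), key=lambda r: r[0])

# Sorting by (card_id, review_time) has no ties, so every correct sort agrees.
by_full_key = sorted(reviews, key=lambda r: (r[0], r[1]))
by_full_key_reversed_input = sorted(reversed(reviews), key=lambda r: (r[0], r[1]))

print(by_card_only == by_card_only_reversed_input)  # False
print(by_full_key == by_full_key_reversed_input)    # True
```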
it is still slightly different, python other.py --algo FSRS-6 --recency on fsrs-optimizer v6.0.0 and branch Expt/S-decay-for-short-term-memory
```
{"metrics": {"RMSE": 0.37155, "LogLoss": 0.443219, "RMSE(bins)": 0.063689, "ICI": 0.02619, "AUC": 0.680656}, "user": 2, "size": 35900, "parameters": {"0": [0.0578, 2.156, 9.0984, 18.6335, 6.977, 0.4176, 2.8603, 0.001, 1.2517, 0.4143, 0.8218, 1.6942, 0.127, 0.2813, 2.6004, 0.1842, 1.7016, 0.6977, 0.2398, 0.1437, 0.1]}}
{"metrics": {"RMSE": 0.265645, "LogLoss": 0.267574, "RMSE(bins)": 0.039605, "ICI": 0.010973, "AUC": 0.640927}, "user": 3, "size": 4255, "parameters": {"0": [4.6381, 10.8228, 11.4011, 10.9234, 7.1465, 0.7717, 2.1306, 0.001, 1.4827, 0.1098, 0.9757, 1.8976, 0.0819, 0.476, 2.3429, 0.1742, 3.0004, 0.7536, 0.3332, 0.1437, 0.1056]}}
```
I'm just curious but would easy_bonus still converge to 1 if you set 'easy=good' for all users except 1
if our theory is correct, if this 1 user that we exempt is one of those who have short stability & use Easy then i think that easy_bonus should still converge to 1
easy_bonus just has to give information that when it is pressed, FSRS knows that the user fits this type of profile
😅 OK, they are inconsistent.
I will debug it.
OK, I figured it out.
@quasi shadow try what Alex suggested - normalize the weight of reviews from different users in such a way that each user contributes equally
For example, if someone has 100 reviews, the weight of each of their reviews should be 1/100
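A small sketch of that weighting, using plain Python and made-up data. `per_user_weights` and `weighted_log_loss` are illustrative names, not functions from the optimizer.

```python
import math
from collections import Counter

def per_user_weights(user_ids):
    """One weight per review, chosen so that each user's reviews sum to 1."""
    counts = Counter(user_ids)
    return [1.0 / counts[u] for u in user_ids]

def weighted_log_loss(y_true, y_pred, weights):
    """Binary log loss where each review contributes according to its weight."""
    total = sum(
        -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        for y, p, w in zip(y_true, y_pred, weights)
    )
    return total / sum(weights)

# User "a" has 4 reviews, user "b" has 1; both end up with total weight 1.
user_ids = ["a", "a", "a", "a", "b"]
weights = per_user_weights(user_ids)

y_true = [1, 1, 1, 1, 0]
y_pred = [0.9, 0.9, 0.9, 0.9, 0.9]
loss = weighted_log_loss(y_true, y_pred, weights)
print(weights, loss)
```

Here user "b"'s single failed review counts as much as all four of user "a"'s reviews combined, which is the intended effect and also why it can be noisy for tiny collections.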
it doesn't solve the distribution issue.
is there a known retention increase
when we make our own cards compared to using premade cards
Fixed
Please pull the latest commit.
could filtering the columns help?
def process_user(user_id):
    dataset = pd.read_parquet(
        DATA_PATH / "revlogs",
        filters=[("user_id", "=", user_id)],
        columns=["card_id", "rating", "elapsed_days"],
    )
    dataset["delta_t"] = dataset["elapsed_days"]
    dataset = create_features(
        dataset, model_name=MODEL_NAME, secs_ivl=SECS_IVL
    )
    return user_id, dataset
    if user_id != 3:
        dataset.loc[dataset["rating"] == 4, "rating"] = 3
    # dataset = create_features( ...
for each 3 users
converges to 1 for just user 2
No idea
Here's a better method: keep pretraining FSRS on randomly selected subsets of users until we get parameters that look reasonable 🤣
It's somewhat like bogosort, but with FSRS parameters 🤣
Instead of "try sorting the list using a random number generator until you get a sorted list by pure chance", it's "try optimizing FSRS on random subsets of users until you get reasonable parameters"
What are the criteria of "reasonable parameters"?
- Calculate the 10th and 90th percentile of each parameter (use the same data that you use for plots), those will be the boundaries
- Run FSRS on a subset of users
for n in range(len(fsrs_pretrain_params)):
    if not (percentile_10[n] <= fsrs_pretrain_params[n] <= percentile_90[n]):
        raise Exception('Parameter is outside of reasonable boundaries')
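A runnable take on the check above, including the percentile step. The per-user parameter lists are made up (two parameters instead of 21), and `percentile_bounds` / `looks_reasonable` are illustrative names.

```python
from statistics import quantiles

def percentile_bounds(per_user_params):
    """10th and 90th percentile of each parameter across users."""
    lows, highs = [], []
    for values in zip(*per_user_params):  # one tuple of values per parameter
        cuts = quantiles(values, n=10)    # 9 cut points: 10%, 20%, ..., 90%
        lows.append(cuts[0])
        highs.append(cuts[-1])
    return lows, highs

def looks_reasonable(params, lows, highs):
    """True if every parameter lies within its [10th, 90th] percentile band."""
    return all(lo <= p <= hi for p, lo, hi in zip(params, lows, highs))

# Made-up optimized parameters for 11 users, 2 parameters each
per_user_params = [[float(i), 10.0 - i] for i in range(11)]
lows, highs = percentile_bounds(per_user_params)

print(looks_reasonable([5.0, 5.0], lows, highs))  # True
print(looks_reasonable([0.0, 5.0], lows, highs))  # False
```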
What's the size of the subset?
Good, so you will be done in a weekend
505C500 is equal to 268318178226
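Quick sanity check of that number, assuming it means "choose 500 users out of 505":

```python
import math

# Number of distinct 500-user subsets that can be drawn from 505 users
subsets = math.comb(505, 500)
print(subsets)  # 268318178226
```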
🤣 You cannot be serious.
two thousand years later
Just in time for Dae to review the PR then 👍
Ok, here's another stupid but less stupid method: The Average Joe Method
- Take parameters of a user
- Use them for FSRS "dry run"
- Repeat for all 10k users
This way we will find the guy whose parameters fit all other Anki users the best. And it only requires running FSRS ten thousand times instead of 10^1000 times!
So you try using parameters of user 1, user 2, user 3, etc.
That way one user will be chosen as the "donor" of parameters, and he will never even know 🤣
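A sketch of the donor search described above. `loss(params, user)` is a hypothetical stand-in for an FSRS "dry run" (evaluate fixed parameters on a user without optimizing); the toy data reduces each user and each parameter set to a single number so the example runs on its own.

```python
def pick_donor(param_sets, users, loss):
    """Return the id whose parameters give the lowest mean loss across all users.

    param_sets: {user_id: params}
    loss(params, user) -> float, a stand-in for a dry-run evaluation.
    """
    def mean_loss(donor_id):
        params = param_sets[donor_id]
        return sum(loss(params, u) for u in users) / len(users)

    return min(param_sets, key=mean_loss)

# Toy stand-in: the loss is the squared gap between a user's value and the params.
toy_users = [1.0, 2.0, 2.5, 3.0, 9.0]
toy_params = {i: u for i, u in enumerate(toy_users)}

donor = pick_donor(toy_params, toy_users, lambda p, u: (p - u) ** 2)
print(donor, toy_params[donor])
```

With N users this is N dry runs over the whole dataset, so it is expensive but nowhere near the number of random subsets.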
Though, after thinking about it, it probably won't improve the metrics by much compared to our current median method
Man, this whole situation sucks
I was hoping pretraining FSRS would be straightforward 
Dae is back
Could someone notify him about FSRS 6
In other news, I still can't run the new benchmark code without getting errors. Except that's not news at this point. No amount of updating the optimizer and downloading the correct code from the correct branch helps
@quasi shadow I wanted to make calibration plots for each version of FSRS, for that I need .jsonl file with --raw flag for each version of FSRS, on all 10k users
Actually, wait
FSRS-6 runs, but now FSRS v1 doesn't
Maybe you made a change that broke older versions of FSRS?
Give me a minute
FSRS-6: ✅
FSRS-5: ✅
FSRS-4.5: ✅
FSRS v4: ✅
FSRS v3: ✅
FSRS v2: ✅
FSRS v1: ❌
huh
  File "C:\Users\Andrew\srs-benchmark-final\other.py", line 126, in iter
    stabilities, difficulties = outputs[
ValueError: too many values to unpack (expected 2)
It only affects FSRS v1 🤔
try
stabilities, difficulties, *_ = outputs[
on line 126
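Why that works, in a standalone sketch. `model_outputs_v1` is a made-up stand-in that returns one value more than the caller expects; what the extra value is in the real FSRS v1 code isn't stated here.

```python
def model_outputs_v1():
    # Stand-in for a model whose output has a third element
    return [1.2, 3.4], [5.0, 6.0], "extra"

error = None
try:
    stabilities, difficulties = model_outputs_v1()
except ValueError as exc:
    error = str(exc)  # same kind of ValueError as in the traceback

# A starred target soaks up everything after the first two values
stabilities, difficulties, *_ = model_outputs_v1()
print(error)
print(stabilities, difficulties)
```

The starred form also still works for versions that return exactly two values, since `*_` can be empty.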
Nice, thank you
Alright, now it will take me two weeks to run all versions of FSRS on all 10k users 😅
But as a result we will get the best visualization of algorithmic advancements!
Here's FSRS v1 on the first 169 users
FSRS v1 vs FSRS-6 with recency weighting on 307 users
🥹
Surely now nobody will complain about TR<DR, right...
FSRS 5 is already doing a good job (i said that before), though what i feared was that FSRS was actually artificially selecting cards so that it pumps my TR up to my DR
Because my DR is 95% and I dont believe that I actually know 95% of my cards. If I open them and reread them outside of Anki, it is as if I forgot them
We'll see what FSRS-5 looks like compared to FSRS-6-recency. It's a bit unfair because only one will have recency, but I already will have to run this stuff for 10-14 days and I'd rather not add another 30 hours to it 🤣
Or even 70 hours
Well, I don't think FSRS can help with that
And I'm not sure what you mean by "selecting" - FSRS schedules intervals for every card according to DR
It doesn't pick like "Hmmm, I will schedule this card as if DR was a few % lower and I will schedule that other card as if DR was a few % higher"
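Roughly, the interval is just the time at which the card's own forgetting curve hits DR. A sketch using the FSRS-6 style power curve as I understand it, with w[20] as the decay; the real scheduler also applies rounding, fuzz and maximum-interval limits, which are left out here. The function names and the stability of 30 days are made up.

```python
def forgetting_curve(t, s, decay):
    """Predicted probability of recall after t days for a card with stability s."""
    factor = 0.9 ** (-1 / decay) - 1  # chosen so that R(t = s) = 0.9
    return (1 + factor * t / s) ** (-decay)

def next_interval(s, desired_retention, decay):
    """Days until predicted recall drops to desired_retention."""
    factor = 0.9 ** (-1 / decay) - 1
    return s / factor * (desired_retention ** (-1 / decay) - 1)

decay = 0.1665  # w[20] from the parameter list quoted earlier in the thread
ivl_90 = next_interval(30.0, 0.90, decay)
ivl_95 = next_interval(30.0, 0.95, decay)
print(ivl_90, ivl_95)
```

Nothing in this calculation looks at other cards: the only inputs are this card's stability, the parameters, and DR.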
I actually wonder what'll happen to the retention now, that I've run out of new cards
I always had ~85% on average, with the newer cards kinda saving it, and the old ones being in the mud
FSRS might be systematically wrong on different sets of cards and balance itself out for the final TR
e.g. can have a poor calibration graph but still reach the TR perfectly
or can have a perfect calibration but still perform badly
Yeah, something that we won't see on the averaged calibration graph is whether FSRS performs equally well for everyone or if overestimations and underestimations cancel each other out
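A tiny made-up example of that cancelling effect: the model predicts R = 0.90 for every review, the overall numbers match perfectly, and yet both groups are miscalibrated.

```python
# Made-up numbers, not real data
groups = {
    "easy": {"reviews": 900, "predicted": 0.90, "recalled": 855},  # actual 0.95
    "hard": {"reviews": 100, "predicted": 0.90, "recalled": 45},   # actual 0.45
}

total_reviews = sum(g["reviews"] for g in groups.values())
overall_actual = sum(g["recalled"] for g in groups.values()) / total_reviews
overall_predicted = (
    sum(g["predicted"] * g["reviews"] for g in groups.values()) / total_reviews
)

# Positive = the model overestimates recall for that group
per_group_error = {
    name: g["predicted"] - g["recalled"] / g["reviews"]
    for name, g in groups.items()
}

print(overall_predicted, overall_actual)
print(per_group_error)
```

The overall predicted and actual retention are both 0.90, while the "hard" group is overestimated by 0.45 and the "easy" group underestimated by 0.05.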
i think it should be expected to overestimate and underestimate sometimes especially for small collection sizes
Though we can tell that at least FSRS-6 recency isn't systematically wrong in one particular direction, except at very low R
the forgetting curve isn't flat enough still 😁
Meanwhile FSRS v1
Yeah, that one is definitely systematically wrong
I'm curious what the graph for FSRS-5 will look like, whether it will be noticeably different from FSRS-6-recency
Also, I wonder if maybe we could use some other "tricks" other than recency weighting
you mean FSRS-6 compared to FSRS-6-recency?
No, I mean FSRS-5 vs FSRS-6-recency
The only version I will run with recency is FSRS-6
God I hate having these four lines/curves on the same graph
It's so cluttered 😭
Any clever ideas how to improve performance, other than recency weighting?
Like, idk, some kind of loss smoothing or something
Then again, I don't expect that there are sharp minima in the loss landscape of FSRS
ill check if rewriting the initial stabilities as e^x instead of just x will help it learn better from backpropagation
i noticed that they barely move from training
but rn i implemented Reptile for FSRS so im just waiting for it to finish
hopefully we get 0.0001 log loss improvement lol, im not hoping for much
Idk i just find it a bit sus that it schedules cards that i will have no way of forgetting at all whilst there are hundreds of different cards that are leech or I have most definitely forgotten and it doesn't schedule them
I ask myself, if FSRS scheduled me all those cards at once without the easy fodder, my TR would be actually 50-60%, like just imagine that.
Do you get what I am saying
Maybe if there is a way to visualize learning progress on these difficult cards then i wouldnt be so apprehensive
So I have the suspicion that FSRS knows indeed that I have forgotten a particular 50 cards, gives me 5 a day of these 50 cards. Give me 95% easy cards, and I am fooled by hey look, your TR is actually 95% so everything is okay.
But the progress on those 5 difficult cards and ultimately the 50 difficult cards in total has not changed.
And I keep wondering, why it seems as if my learning progress on those 50 cards has reached a stalemate
Do you get what I am saying
Again if there is a way to visualize learning progress on those 50 difficult cards as a whole, i could be relieved
I get that there are leeches that just don't "stick", so you never feel like you've made progress memorizing them. But still, FSRS doesn't do this kind of "global" thing. When a card is scheduled depends directly only on the history of that card. I guess it indirectly depends on other cards in the sense that reviews of other cards affect parameters, and parameters affect this card. But FSRS doesn't directly use review histories of other cards to schedule card A
And if I were to custom schedule those 50 cards, the actual cards which are in need of relearning and learning, my TR would drop to 50%
This is my fear
And this is why I have a subjective feeling that FSRS is scheduling "around" the leech cards
So the ultimate question here is, when FSRS says I am achieving a true retention of 95%. Am I actually achieving progress on my leech cards or not. That is the million dollar question
Alex's neural net has "memory states" (if we loosely borrow FSRS terminology) not just for cards, but also for notes, decks, presets and the entire collection
So instead of "I will use the memory state of this card to schedule this card", it's "I will use the memory states of this card, the note, the deck, the preset and the whole collection to schedule this card"
On leeches? Probably not
i guess it also depends on whether your leeches have > 1 day stability
If you install this addon and you check the Card Info of one of those leeches, what do you see in the last 4 stats ? Those 4 values