#FSRS Megathread
¯_(ツ)_/¯
how do i make my own graph
of my parameters of which dr is effective for me
like
yk
this fella
is it possible without coding experience
70%
The problem is that with FSRS-6 Compute Minimum Recommended Retention (which does what you want) very often returns the minimum allowed value of 70%, so it will most likely be removed
noooo
i just wanted this graphhh
IMO we also need to look at CMRR a bit differently : It fucking depends on your own fucking goals
pleaseee master
Jarrett said "Remove CMRR"
I said "No, let's make CMRR great again!"
I asked Luc to hook CMRR up to the simulator...it didn't help much
I think CMRR is right in a sense that if you want to maximise your score at a certain date, you should only look at sum(R) and reduce your DR low enough to be able to crank up the number of new/day
But if the goal is not a certain point in time but to "master" things (Higher and Higher S), you gotta look at something else than sum(R)
Also, if your test requires you to do 90% ... Who cares about finding the optimal value : Set your DR to 95% and grind those cards
What'd also be cool would be not setting DR, but "Desired Time Spent Per Day"
Then everyone will set it to 5 minutes, lol
But more seriously, it would be much, and I really do mean MUCH harder to control
well, setting that AND DR together makes no sense :D
IMO this is possible but this is best done outside anki 😆
i need anki
Retention is relatively easy to control with FSRS, workload is much harder to control
The simulator is already making a prediction of time spent
so the formulas must be somewhere
We don't know how accurate the simulator is. And we don't have the tools to find out in a similar way to how we measure predictive accuracy
Look man, trust me, it's not worth pursuing
Desired Workload, I mean
From my experience, the Simulator has been surprisingly on-point the last couple weeks/months for me
@unique salmon Can you provide some s t a t i s t i c a l proof. It is your time to shine
ts j came out
😭
pretty graphs
top one is 90% DR, then 1% down with each. Very bottom one is 70%.
I miiight want to drop to 88% or something. It's already a huge difference.
Then crank your DR to 99% and do 8h of anki per day hehe
My point being getting 99% DR requires a lot of work, and I said "outside anki" because personally I can't do that much work inside Anki XD
this is with 99% added in :D
does the simulator consider relearning steps?
You mean when you have multi-day ones? I'd doubt it
if you dont have a backlog i am just so confused
how
review sort order can affect retention
Yes
like im looking w the simulator
and changing to difficult cards first can be better than the default
not likely to be a good enough improvement
Reminder that the only reason DR is allowed to go above 97% is because ONE (!!!) guy asked for it in 2023
💀
A bit unrelated but a simple change of mindset made my daily R go from a weekly 90% to 95% 😛
DR much higher than 90% quickly spirals into insanity it seems
yea
I went higher than 99% with my filtered deck
Changing the default to 95% seems insane to me as well
95% would result in almost 1000 reviews per day within a few days
which is absolutely insanity
prop:r<0.891, 0.892, 0.893 ...
he must not have many cards if the extra review save time lol
Yeah but for things like Kanas, Alphabets, a few key numbers, it's quite good
Yeah, I could see it working for my Kana deck
Also there is something that people don't really realize, but when you have a 98% R, your review time will fucking shrink
I could review the whole thing every day in a few minutes
my cards are type-in, there is kinda a hard limit on how quick I can possibly do that
Sure, I don't know how fast or slow you type, but even for long words it takes ~1-2s when you know the word
plus, a lot of them are hard, so I actually do need to think. And if I do that with literal thousands of cards, I don't think I'll get faster over time :D
If you have at least 60-80 Words Per Minute typing speed, it's really a trivial amount of time
My average is probably 10 seconds per card, in Anki-Time-Measurement
Usually it shows ~15, but I pretty much always go AFK a bunch, so there'll be multiple 1m ones in the mix
I like to refer to this graph. My 2-3s cards are 95% R, just because when you know something really well, like a DR=98% should make you do, you can potentially answer those cards super super super fast
The simulator takes your average speed per good answer (at least that's my assumption)
Where is that graph?
So yeah, if the simulator thinks a good answer takes ~10s on average and a wrong one ~15, sure it will think getting a shitton of good answers doesn't count
But you can blast through things when you don't let them go in <90% territory
That's why I'm Team90DR until I die now
Look at my average review speed when I was increasing DR to 85%, it was shrinking and shrinking
Now it's getting bigger again because I do put a lot of time into adding things to the back etc, and the timer also counts the back side
It's 8 values: 4 for the first reviews per each grade, 4 for non-first reviews (again, 1 per each grade)
So like
Again, first review:
Hard, first review:
...
Easy, non-first review:
Plus like 12 values for modelling same-day reviews
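A minimal sketch of how those 8 per-grade values could be laid out and used to estimate time spent, assuming nothing about how the simulator actually stores them. `ReviewTimes`, `estimated_seconds` and every number below are hypothetical placeholders, and the ~12 same-day-review values are left out entirely.

```python
from dataclasses import dataclass

GRADES = ("again", "hard", "good", "easy")


@dataclass(frozen=True)
class ReviewTimes:
    """Average seconds per answer: 4 values for first reviews, 4 for later ones."""
    first: dict  # grade -> seconds on a card's first review
    later: dict  # grade -> seconds on every non-first review

    def seconds(self, grade, is_first):
        if grade not in GRADES:
            raise ValueError(f"unknown grade: {grade}")
        return (self.first if is_first else self.later)[grade]


def estimated_seconds(times, reviews):
    """Sum the per-answer averages over a list of (grade, is_first) pairs."""
    return sum(times.seconds(grade, is_first) for grade, is_first in reviews)


# Placeholder numbers only, not measured from any collection.
EXAMPLE = ReviewTimes(
    first={"again": 20.0, "hard": 15.0, "good": 12.0, "easy": 6.0},
    later={"again": 15.0, "hard": 12.0, "good": 8.0, "easy": 4.0},
)

day = [("good", True), ("again", False), ("good", False), ("good", False)]
total = estimated_seconds(EXAMPLE, day)
```

With averages like these, a Good answer costs the same whether the card was recalled instantly or after a long pause, which is the limitation being discussed here.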
Wow didn't expect that
Is there a rationale for having a different value for the first review time ?
Like, people take a lot of time the first time ?
Idk, Jarrett just wanted to give it a special treatment
Btw, in this release a bug was fixed that made the simulator overestimate t(Again)
And Jarrett and Luc added a sophisticated way of modelling same-day reviews
Also this shows something : If your goal is truly 90% R, sure you can squeeze out a few more good reps if you think a bit longer (10-15s), but you might still have <90% on average for those ... So you can take the approach : If it takes me more than 10s, I don't know it well enough, so I press Again. The R will drop, but you'll see the card more, leading to less and less thinking time
I observed that if I pressed Good after 10s on a card, it was quite rare that at the next interval I would press Good in 2s
Why not press Hard if it was Hard and you had to think about it?
To be honest, it's the cognitive load of thinking about what to press, and I'm also not sure it really matters, because if you press Hard and get the same rate of "Success", FSRS makes Hard/Good more similar in terms of interval spacing
well, you need to be honest in your ratings then
Good if it was Good, Hard if it was Hard :D
Hmmmm but the second point
If that leads to FSRS giving them the same curves, it means you overestimate the impact of it having been hard on how well you will remember.
If you press Hard when it takes more than 5s, Good when it takes less than 5s, but get the same % of success no matter Hard/Good
FSRS will just adapt and learn that Hard/Good doesn't change anything in terms of precision -> The differences will shrink
Having 80% R with thinking time at 10s or 5s won't change the prediction
Then how would you define Hard ?
It's purely Subjective really
As long as you're somewhat consistent, it's fine enough
Yeah but won't change your review speed
Reminder that FSRS is surprisingly NOT shit if you make it use Hard = Good = Easy
Also if "hard" means "It was a coin flip", then IMO it shouldn't even be considered a success, otherwise you'll get cycles of high lapse perf, low lapse perf
For me I'm pretty sure it'd be shit if I got rid of Hard
a lot of cards would be so far into the future then, that it'd take a few years until Anki/FSRS will find out I forgot it
But then FSRS would adapt
Maybe, but only in 3 years or so
And if it's really "a few years", increase your DR
in the meantime I'd have already forgotten it 10 more times
Aaah
You're the guy that has a very high stability when pressing Good/Easy right ?
Because you added cards you already know so FSRS learnt from that ?
No, this also happens on Cards that didn't have Good as initial rating
a chain of 133333333... will also eventually lead to really long intervals
which is fine on a lot of cards
Ok but FSRS had to learn that from somewhere
FSRS is not guessing
Can you give your params ?
Also
This is what the optimizer produces:
2.5047, 13.7081, 25.7752, 44.1370, 6.1131, 0.9025, 3.6718, 0.0010, 1.2131, 0.1499, 0.8535, 2.1096, 0.0343, 0.3953, 2.6885, 0.0000, 3.0175, 0.1000, 0.1000, 0.2702, 0.1003
Maybe using Hard means FSRS doesn't have enough inputs from your Good 😄
I then change it to be
2.5047, 13.7081, 5.000, 44.1370, 6.1131, 0.9025, 3.6718, 0.0010, 1.2131, 0.1499, 0.8535, 2.1096, 0.0343, 0.3953, 2.6885, 0.0000, 3.0175, 0.1000, 0.1000, 0.2702, 0.1003
Hmm isn't it the same ?
to deal with the old baggage of misrated cards
no, the third parameter is different
Initial Stability for "Good"
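The manual tweak described here can be sketched in a few lines. This is only an illustration, assuming the posted list uses the FSRS-6 layout where the first four values are the initial stabilities for Again, Hard, Good and Easy. `override_initial_stability` is a made-up helper name, not part of Anki or any FSRS library.

```python
# Parameters as posted above (21 values).
OPTIMIZED = (
    "2.5047, 13.7081, 25.7752, 44.1370, 6.1131, 0.9025, 3.6718, 0.0010, "
    "1.2131, 0.1499, 0.8535, 2.1096, 0.0343, 0.3953, 2.6885, 0.0000, "
    "3.0175, 0.1000, 0.1000, 0.2702, 0.1003"
)


def override_initial_stability(params_text, grade, new_s0):
    """Return a new parameter string with S0 for `grade` replaced.

    grade: 1 = Again, 2 = Hard, 3 = Good, 4 = Easy (assumed layout).
    """
    if not 1 <= grade <= 4:
        raise ValueError("grade must be between 1 and 4")
    values = [float(v) for v in params_text.split(",")]
    values[grade - 1] = new_s0
    # Keep the four-decimal formatting used in the posted lists.
    return ", ".join(f"{v:.4f}" for v in values)


tweaked = override_initial_stability(OPTIMIZED, grade=3, new_s0=5.0)
```

Only the third value changes; every other parameter, including the growth-related ones, stays exactly as the optimizer produced it.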
Ok but doesn't change much for everything else
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6 Ivl-7
13333333 1 2 6 15 36 82 177 365
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6 Ivl-7
13333333 1 2 6 15 36 82 177 365
Yeah, and especially after the Ivl-3 or 4 step there I often have cards I felt pretty shaky about, so I hit Hard on them
I doubt it's FSRS-6 yet
I mean the Simulator
FSRS-5 has 19
Or Visualizer rather
ah
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6 Ivl-7
33333333 26 73 188 444 976 2015 3939 7341
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6 Ivl-7
33333333 5 17 50 133 324 731 1543 3077
I know that my params are v6
Ah yeah indeed
it was not complaining but it was ignoring the params
Yeah but basically your tweak just shifts the problem 1-2 reps away
Personally, but it's just me
It found a bunch of cards that you had 333333 on without a single failure
Those you already learnt before, or the ones that were way too easy
Same would happen though if it was 1333333333
I'd move them in another preset
just one or two ratings later
optimize on cards that have more 33331333331 etc
There is no good criteria to split them up on
So FSRS has some data to optimize on
also, taking them out of this deck would completely break the AddOn that syncs them to WK
-rated:600:1 -is:new
lol
600d here
but adapt to find a good chunk of those
and put them in a separate deck
with DR=99%
That just seems unnecessary
or anything to get them in a low load way, but still not thousands of years apart
I have almost no cards with a legitimate initial Good rating
so making the ones with initial Good be basically only a little more stable than the ones with an initial Again is a fine workaround
You mean all the initial Good rating, you already knew them all ?
No, the huge majority of them should not have had Good as initial rating
Easy then ?
I learned them on their Sibling cards earlier
Oh
so of course the other sibling was "Good" when it came up a few minutes later...
When that happened, I was still using SM2
IMO, I'd prefer to have good parameters rather than fighting my way to hack it
and the initial Good was very inconsequential there
But FSRS puts a HUGE importance on that one rating
I really don't see the problem. I have maybe a handful of cards where the first rating was legitimately Good
Is it though ? @unique salmon, the initial rating is important for the initial stability, and sure, without lapses D might also help getting very big intervals directly, but the initial rating is not much more special than any other review apart from the initial stability ?
So I just accept shorter than necessary intervals there, and otherwise treat "Good" as if it was almost Again, by setting its Initial stability lower.
Another trick is to put a higher initial D
In general first lapses have way more optimistic stability
the more you lapse, the lower they will grow
so instead of waiting for something to fail 5 times (with intervals of 1y)
You can just make it start at high D
FSRS treats the first review differently because all FSRS formulas require the previous value of D or S (or both), but for the first review there are no previous values since there is no "zeroth review", so D and S are calculated differently
But yeah, to me it's also why Again/Good is a bit safer than Again/Hard/Good/Easy. You fear the long interval of Good, so you press Hard, so FSRS calibrates on Hard, and your Good becomes even larger
Basically, you can't apply recurrent formulas to the first review
Unless I'm missing something in the formulas, all I'm doing is setting the third parameter equal to the first one, which turns the "Good" button on the first review into an "Again" button, and only there
Not directly, but obviously it affects future S indirectly
For the stability yes, but doesn't solve the problem in a bigger picture
What problem?
That your intervals grow like crazy
Interval Growth seems fine for cards that didn't have that initial Good rating
S[n] depends on S[n-1], which depends on S[n-2], and so on
So S[0] affects all future values of S
if I hit Good on that card for 10 times, even after 2+ years of not seeing it, it deserves that ultra long interval
Like a chain
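The chain can be made concrete with a small sketch. It assumes the commonly documented FSRS formulas for initial difficulty, stability after a successful review, and the FSRS-6 interval at a given desired retention. It also makes two loud simplifications: difficulty is held constant at its initial value (real FSRS updates it after every review), and every review is assumed to happen exactly when predicted recall equals the desired retention. `good_chain` is a hypothetical helper, not the Visualizer's code.

```python
import math

# Parameters posted earlier in the thread (w0..w20).
W = [2.5047, 13.7081, 25.7752, 44.1370, 6.1131, 0.9025, 3.6718, 0.0010,
     1.2131, 0.1499, 0.8535, 2.1096, 0.0343, 0.3953, 2.6885, 0.0000,
     3.0175, 0.1000, 0.1000, 0.2702, 0.1003]


def initial_difficulty(w, grade):
    """D0 for a first rating (1 = Again .. 4 = Easy), clamped to [1, 10]."""
    d0 = w[4] - math.exp(w[5] * (grade - 1)) + 1
    return min(max(d0, 1.0), 10.0)


def stability_after_good(w, s, d, r):
    """S[n] from S[n-1] after a Good review at retrievability r."""
    growth = (math.exp(w[8]) * (11 - d) * s ** (-w[9])
              * (math.exp(w[10] * (1 - r)) - 1))
    return s * (1 + growth)


def interval_days(w, s, desired_retention):
    """Days until predicted recall falls to desired_retention."""
    decay = w[20]
    factor = 0.9 ** (-1 / decay) - 1
    return s * (desired_retention ** (-1 / decay) - 1) / factor


def good_chain(w, s0, n_reviews, desired_retention=0.9):
    """Intervals for a run of Good ratings starting from stability s0."""
    d = initial_difficulty(w, 3)  # simplification: D never changes
    s = s0
    intervals = []
    for _ in range(n_reviews):
        intervals.append(round(interval_days(w, s, desired_retention)))
        s = stability_after_good(w, s, d, desired_retention)
    return intervals


optimized = good_chain(W, W[2], 6)   # S0(Good) as optimized
tweaked = good_chain(W, 5.0, 6)      # S0(Good) manually set to 5
```

Under these assumptions the two chains never meet: each S[n] is computed from S[n-1], so the lower starting value trails the higher one at every step, roughly one review behind.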
I mean, even with my EASY decks :
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6 Ivl-7
33333333 3 7 14 26 45 74 116 174
Compared to You
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6 Ivl-7
33333333 5 17 50 133 324 731 1543 3077
This is insane lol
Yeah, an initial Good rating is dang powerful by default
Sure but if for 2 reviews, you have S[N] = X and S[M] = X, with the same D, the next review for "Good" will always be the same
Cause it's usually quite rare, at least if you are using Anki to actually learn the material, and not just review it
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6 Ivl-7 Ivl-8 Ivl-9 Ivl-10 Ivl-11 Ivl-12 Ivl-13 Ivl-14 Ivl-15 Ivl-16
33333333333333333 25 42 68 106 158 229 323 446 601 796 1037 1330 1684 2105 2601 3182 3856
I mean, even with an Initial Stability of 25, I'd need 15 reviews to get to your ~3000 interval
I'm not sure what you are trying to say
If it's "identical inputs produce identical outputs", then that is trivially true
So yeah, putting a 5 instead of a 25 just gives you "1 free Good" to put before it goes BOOM
Exactly
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6 Ivl-7
33333333 26 73 188 444 976 2015 3939 7341
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6 Ivl-7
33333333 5 17 50 133 324 731 1543 3077
So in this case, putting a 5 might give you one free rep before reaching ~25 (17 here), but once you get to 17 ... It's boom-time again
The first review isn't that special for the 50 133 324 731 1543 3077 that follow
If you keep hitting good, the intervals will always keep going up and up, with larger and larger intervals
that's the whole point of an SRS system
But not to 3000d in 7 reviews 😆
well
in your case it is because it's the history of the cards
But yeah
The second line would look the same if it was 13333333
If I hit "Good" on the card after not seeing it for 324 days, 731 days seems appropriate
and if I then STILL hit Good, even 1543 seems fine
Yeah because the first review isn't that special apart from initial S and D
Like, this seems more like you discovering that an SRS system will push cards far away if you do in fact not forget them :D
If you're not comfortable with 1500 day intervals, you gotta set the max interval accordingly
Yeah but then don't complain "Good" intervals are getting too big
If the system learnt you won't ever fail them
They are though
But your whole history shows the contrary
Thing is, I do fail those cards. Way too much, cause A lot of them already had gotten those way too long intervals
Yeah but then, fail them a bit, and FSRS will adapt
Man, I wish Alex adapted his neural net to be used in Anki and then nobody would ever have to worry about suboptimal intervals
Or about some information not being used
pretty sure a neural net would also struggle if you misused buttons
IMO in the case of @tepid spoke it might be worse because he couldn't easily tweak the params
At least with FSRS you can still cheat the system
A neural net would likely figure it out
In my case it's such a specific misuse, that it's easy to fix
Misusing Hard would still screw it up though
Like, I could also go into the DB and change all initial 3 reviews to 1 with a simple script
IMO, another option, Oromit, is : Don't take your optimized params. Use the default ones for a while. It won't be ideal, but at least next time you press optimize, it will have more data to put less weight on your rebellious SM2 years
Or even just ignore all those cards reviewed with SM2
Can't you use "Ignore Cards Reviewed Before" ?
It's just that a lot of those cards are coming out now. Cause that's how far away they were sent back then
That ignores half of all reviews though
It's just that I have seen the metrics for Alex's net and like...it's insane
cause it ignores the entire card, not just the initial reviews
Going from 60k reviews to 20k when I split my deck didn't change my precision much
I tried the setting, and if I use it, I go from 200k+ reviews to below 100k
And now I'm like "If only we could harness this power..."
Yeah but as long as you still have 20-30k reviews left, all good
Also, the problem is: Doing that actually makes it WORSE
LOL
Cause then Anki does not see any cards with an initial Good review
which means it defaults to a default initial stability of 100
Yeaaah OK but then you tweak the initial stability but the whole multiplier/exponent are sane
...that is not default S0
It is? If there's no cards for it to work with, the parameters just end up being 100
First of all, no
Second of all, if there are no cards with reviews, you get this
Didn't know it was added, that's really nice !
We even talked about that exact scenario here before. You must be mixing something up.
If there are no cards with an initial rating of 3 or 4 at all, the 3rd and/or 4th parameter end up being 100
Nope
Yep
I even get a better RMSE by sacrificing those cards !
From : Log loss: 0.3515, RMSE(bins): 2.92%. Smaller numbers indicate a better fit to your review history.
To : Log loss: 0.3520, RMSE(bins): 2.88%. Smaller numbers indicate a better fit to your review history.
Here is default S0
More filtering, better fit : Log loss: 0.3519, RMSE(bins): 2.87%. Smaller numbers indicate a better fit to your review history.
@unique salmon : Have you benchmarked different recency weights ?
Or maybe could we have user-optimized recency weight ?
It seems impossible to search in Threads
Somewhat. Me and Jarrett tried 9 variations
but we discussed parameters like that here before, and I think it was even you who explained to me then that if there are no such reviews for the Optimizer to work with, it'll just put 100 there
I didn't want to make it too aggressive, so we stopped
hell nah man
Imagine if you get the same gain as with user-optimized decay @quasi shadow : user-optimized recency weight 😄
FSRS-7 is just a few weeks ahead !
Just scroll up a couple months or even a year or so then
Can't do that without gaming metrics
and you'll find the Discussion about it
The only way to have S0=100 is to have crazy high retention after the first review (at the second review)
If in all your reviews cards, there is not a single one with an initial rating of Good or Easy, parameters 3 and 4 will end up as 100.0000
As long as you're happy with how you fixed the issue, it's all good bro 🙏
But still, why not change that 100 to 5 ?
You change 25 to 5, but 100 to 5 you say "NO"
It's discrimination
Cause then I'm cutting out a huge stack of reviews
As long as you have ~20K reviews it's all good
And more so: I'm cutting out all the reviews of the first levels of WaniKani, which are substantially easier than the later ones
Better have 20K good reviews than the mess you have
So you'd be cutting cards that are leading to big intervals? Isn't it the perfect situation 😆
I don't want to be swamped with reviews...
Sorry bro but at this point the discussion goes into delusional state just like the 20 reviews in 10 minutes of @lapis hearth
😄
You are severely misunderstanding something I feel like
I want to fix a set of a couple 1~2k cards that get WAY longer intervals than they should
not make myself end up with 1k+ reviews per day
It's a very specific problem with those cards, and there is a very straight forward fix for it with minimal casualties.
So I'm not sure what you keep arguing about
RE: Wonky CMRR. What happens if you remove the 0.7 limit? Does it just entirely break? If it keeps on bottoming out it would be interesting to see what the algorithm thinks is the "real" optimal value.
You could lower the DR until you "recover", or do "partial reschedule" with the helper addon 🙂
Recover from WHAT?
Optimal Retention : 0%
There is nothing to recover from
You can do a lot of workload with 0% DR !
From the 1k+ reviews you'd get
You could start with 200/day
0 cards per day. The dream 😮
The 1k+ reviews wouldn't be a one time thing from rescheduling
they'd be the normal daily review load
Well actually you could do a lot of Learning cards, but yeah, 0 reviews for you 😄
if I did what you suggested
Yeah but it'd be 200/day with lower DR, and you raise the DR slowly when the workload gets lower
For example, if I set the "Ignore reviews before" setting to 1.1.2025, I end up with these parameters:
1.7660, 11.2130, 40.5090, 100.0000, 6.9411, 0.5599, 2.6559, 0.0010, 1.2671, 0.2320, 0.7813, 1.9189, 0.0961, 0.4168, 2.3744, 0.1251, 3.0004, 0.1000, 0.1000, 0.2508, 0.1024
Running these through the simulator produces this:
That is stupidly many reviews per day, that also do not look like they're in any mood to recover
From :
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6 Ivl-7
33333333 5 17 50 133 324 731 1543 3077
To :
Grade Ivl-0 Ivl-1 Ivl-2 Ivl-3 Ivl-4 Ivl-5 Ivl-6
3333333 41 71 118 187 285 421 604
Still more manageable !
(Also, shocker, Easy Initial Stability fell back to 100.0000)
cause I never pressed Easy
Yeah but basically it's just because your current parameters don't represent at all what you should have
They do though?
Just for the entire set of all cards
The older ones are just generally easier than the later ones
So when the parameters can take all cards into account, not just a relatively small fraction, that averages out a bit
Well
the hard ones get a bit longer intervals than they might really need, and the easy ones a bit shorter than they could get away with
The new cards you do for WK right now, are they getting Easier, or do they keep the same level and/or get Harder ?
WK generally only gets harder and harder
So why do you care so much about your easy cards review history ?
Just throw them away
like they don't exist
the higher in level you go, the more obscure stuff there is, Kanji that are almost never used, and stuff like that
And get yourself a new set of parameters
...?
What you are suggesting would lead to the Easy cards being treated like Hard cards
I'm not gonna suspend thousands of cards lol
And if they are very easy, their stability will go up, you won't see them ever again
There simply is no problem. It's working out fine
So I'm just getting more and more confused what problem you are trying to solve
Oh sorry I thought you had some issues to fix 😦 My bad ! Enjoy then 😄
Well, the Issue I'm fixing is those 1~2k cards where I initially pressed Good, even though it was not Good at all
And that's dealt with by changing that one parameter
While doing that, it's working very fine
Changing the cutoff date really only makes the parameters produce shorter and shorter intervals, since it only sees more higher-level cards, the more into the future I shift the date
Maybe that curve of 13333333 seems so steep to you cause you never use the Hard button?
For me that combination of reviews means I am pretty confident about the card, so a multi-year interval seems appropriate
if I wasn't confident, there'd be a 2 or two in there somewhere, and it wouldn't be nearly as steep in return
@unique salmon user 617, S = 3, decay = 0.18 (the median)
S = 20
perhaps the jank at R = 0.2 and below is from a lack of data
I know the answer is probably just "the CMRR algorithm has issues" but it would be hilarious if it turned out the actual optimal DR was something silly like 55%.
e.g. it turns out you learn faster if you just throw loads of new cards at the wall and see what sticks instead of trying to remember the harder 45%.
IS THIS GOOD ENOUGH FOR YOU, MF?!
YOU KNOW WHAT THIS IS, MADAFAKA?!
THIS IS A DECK WITH ONLY CARDS WHERE FIRST RATING=AGAIN, MADAFAKA!
IT DOES NOT SET STABILITY TO 100, YOU BITCH ASS MF
(btw, the reason why even other values are different is because since FSRS-5 FSRS is now allowed to change S0 values together with other parameters, whereas previously the first four values would be estimated completely separately and then "frozen")
AAAAAAAAAAAAAAAARGH I AM MAAAAAAAAAAD!
STOP SAYING THAT IF NO FIRST RATING IS GOOD THEN STABILITY WILL BE 100, GET YO ASS OUTTA HERE!
wtf is wrong with you, I'm just gonna block you, bye
I'm joking
(kind of)
But for real, I put a bunch of cards with first rating=Again into a single deck, and it definitely didn't set anything to 100
I literally just tested it again by setting a very short "ignore cards before"
and it happens again
see the parameters I just posted
#1282005522513530952 message
There is not a single card with an initial Easy review, and the parameter just snaps to 100
I think there are a handful of actual initial Goods, so that's just 40
I used "Ignore cards reviewed before" to ignore a few cards. Still nope
This is the deck where every first rating is Again
Well, the 100 appears reliably for me
Actually, wait, I have no clue why the other ratings would change 🤔
Wait
Other S0 values, I mean
Wait, that's actually weird
Could it be because the only initial "Good" examples are cards you knew really well.
(N.B. I have not followed your entire conversation)
Good didn't get 100 in my case
only Easy, for which there are exactly 0 examples
I guess if there are not enough reviews, it just pulls out some defaults, and estimates a bit?
And that's why with your, presumably relatively low total review count test deck, it's so low?
Which seems much more sensible to me than just pegging it to 100
@quasi shadow erm...
Am I going insane or on some cards dates are in reverse order?
Nvm, Luc will fix it
Pretty much
In my case there is still a lot of reviews, like A LOT
just no cards that had an initial Easy rating, and only VERY few with an initial Good (legit Good in this case)
Where is that odd revlog from?
Cause yeah, that's... messed up? Not even upside down, just, in a random order
wait no, the second one is fine
first one is upside down
No, really, what the f-
in this case, the first one looks normal to me
and the second one completely upside down
Where did you find those??
Is that a fascinating new bug in 25.05?
I have no idea...
The big question would be if it's only visual, or if FSRS would also see it in reverse
Alright, time to see if it persists in 25.02.4
It wouldn't surprise me if it was a JS issue in the UI. JS is silly.
Nope, it's new. Only reproducible in 25.05
Ok, now how do I turn this into a collection with a minimal reproducible example...
missing a .rev i think...
This could simply just be an oversight where the overview displays it in the order the DB returns it in, cause it's missing sorting entirely
and when shifting the card, it touches the revlog entries in descending date order
Meh, screw it
Let Dae figure it out 🤣
i have fixed it XD
Well, thank you, Oromit. I found a bug while trying to prove that S0 is not 100 if there are no Good and no Easy first ratings 🤣
See, being mad can be useful
What is that if/else it's in?
i.e. what triggers the branch with the missing reversal?
if it has a memory state
thats why it flips when you move it because it clears the memory state
i'll just double check i did actually fix it...
I guess that's purely visual for the card info screen?
Or does it have actual implications?
purely visual
you can even see that the difficulties line up with what they should be
Ok, but that still leaves a question - why does S0 of other first ratings change when no card has a first rating other than Again? 🤔
If there are no Good first ratings, then even though FSRS can change S0(Good), it won't
Because changing it doesn't do anything
Or so I thought, at least
I think it's not that complex, it's just a matter of how you express the goal function. Right now, it's a Memorized/Workload, so sum(R)/workload
Hopefully it's a convex function. But since sum(R) is pretty linear in the number of cards you know, you'd wish the Workload really goes way up when R is low (because of the cost of redoing the same card again and again), and way up when R is high (because of the forgetting curve shape).
But there are many biases :
- What about considering Memorized is also a function of the Stability/AVG Interval of R ?
- FSRS might predict your DR well, but will the Recommended DR be predicted correctly ?
- What about cards that don't follow a R=DR pattern ? The leeches that will make your workload higher than expected if you just add too many new cards/day without really taking time to stabilize them ?
IMO, including some Stability-related weight in the Memorized function would already help to make "Memorized" less linear in New Cards/Day.
Right now, Memorized is just a proxy for "New Cards/Day" until there is no more new cards and then it becomes "Deck Size" proxy.
I guess it might have done weird things in addons that use the CardStats method and assume it is sorted correctly...
(I think I used it once because it was the only way I saw to get rslib to return the memory state it calculated for the revlog)
The only way to get big number, is to take a deck with no more new cards, and ask for it in 3 days
@quasi shadow I want you to experiment: put a bunch of cards with the same first rating (doesn't matter which one specifically) in the same deck with the same preset and optimize parameters, and see if S0 values change
For example, if all cards have Again as their first rating, see if S0(Good) changes
It shouldn't, but my tests suggest otherwise
lets hope people with those addons haven't updated yet 😂
Days to Simulate, DR
1, .94
2, .94
3, .90
4, .86
5, .80
6, .71
7, .70
#1282005522513530952 message you might be interested that lowering the review count/day also has the same effect
You mean that ?
i edited it it was review count, i was running a special branch there
can you compile anki?
Ok give me asec
a problem with / workload formulations is that with the way we train FSRS, it can be described as a behavioral model, not just a memory model. With the new learnable decay, we get very flat forgetting curves, but this is more just a prediction from FSRS that you will end up studying this card's information outside of Anki. So when workload is "workload within Anki", it is no surprise that a low recommended DR is optimal, just leave the work to outside of Anki!
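To put numbers on what "very flat forgetting curves" means, here is a small sketch assuming the FSRS-6 power forgetting curve, in which stability is the interval where predicted recall equals 90%. The decay values 0.5 and 0.1 are chosen only for illustration (0.5 being the fixed value used before decay became learnable, as far as I know).

```python
def retrievability(t_days, stability, decay):
    """Predicted recall after t_days, for a card with the given stability.

    The factor is chosen so that retrievability(S, S, decay) is 0.9
    for any positive decay.
    """
    factor = 0.9 ** (-1.0 / decay) - 1.0
    return (1.0 + factor * t_days / stability) ** (-decay)


# Same card (S = 10 days), looked at long after it was "due".
steep = retrievability(100, 10, decay=0.5)
flat = retrievability(100, 10, decay=0.1)
```

Both curves pass through 90% at t = S, but the low-decay curve loses much less recall afterwards. A workload-per-knowledge objective therefore sees little cost in waiting, which is the effect described in the message above.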
It's a very very very good point, didn't think about that
Ah yeah indeed, 20 days with 1n/d I get .74, with 2n/d I get .70
Isn't it possible to add in that graph a Memorized/Workload ?
Would help to see the shape of the curve no ?
Because we try to imagine what is going on but I think in that view we already have 99% of what we need
for different dr's?
Hmmm, "Memorized/Workload" on y-axis, DR on x-axis I guess
so we could see : What is the goal function for 70%, 71%, ...
I've been telling Jarrett to do that ever since CMRR became a thing...
But Alex's point is really insightful, and it also explains why, if we don't adjust the "Memorized" function by something that compensates for the very flat forgetting curve, it's way better to do 1 rep every 2 years at ~40R ...
You’re in luck, I prototyped this already. As you say, for each data-point, you have to run an entire simulation. That is massively unresponsive, even if as you suggest you recalculate after every optimisation. Here’s how it runs with 1 data-point per percent on my actual computer. (It takes 40 seconds)
I'm trying to catch up on the CMRR discussion, from what I gather is it not as efficacious as we'd like it to be? Is it over/underestimating the recommended DR?
Yes exactly
But to see how it behave from 0 to 1
not just from 0.70
With FSRS-6 it almost always outputs 70% aka the minimum value that it's allowed to output
https://github.com/Luc-Mcgrady/anki/tree/dr-curve it might be a pain because i didnt pr it but heres the code
CMRR is not "wrong" in the sense that if you truly want to maximise SUM(R), it's indeed better to just set your DR low. But we (at least I) think there is a lot of bias behind this ... which might make the result of CMRR not really something you'd want to use
I already added you as a remote
give me a sec
its also not as fast as it could be because i programmed it badly
I just want to see the shape, I don't care if it takes 3min to compute lol
I should maybe delete my "out" ?
oh make sure your submodules arent outa whack
.
Shhhh. You're denying yourself the chance to later go "Look at me. I'm an optimisation wizard! I made it run 5x faster!"
Does it mean "yes" ? 😛
I'm sorry, could you explain what maximizing SUM(R) means?
A card has a retention predicted, the "Memorized" function right now is the sum of all of them
If my output is above 70%, is it safe to presume that it's at least accurate?
@cosmic hedge ah got it
So the way CMRR works is it finds the minimum of workload/knowledge. Knowledge is defined as a sum of probabilities of recall of cards at the end of the simulation
In other words, it finds the value that allows you to learn the most while spending the least time on reviews
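A minimal sketch of that search, as described above: run one simulation per candidate DR and keep the DR with the lowest workload/knowledge ratio. `simulate` and `toy_simulate` are hypothetical stand-ins with made-up numbers, not the real Anki simulator or its API.

```python
from typing import Callable, Iterable, Tuple


def recommend_retention(
    simulate: Callable[[float], Tuple[float, float]],
    candidates: Iterable[float],
) -> float:
    """Return the desired retention with the lowest workload/knowledge ratio.

    `simulate(dr)` is a hypothetical stand-in for one full simulator run.
    It must return (total_review_time, sum_of_R_at_end_of_simulation).
    """
    best_dr, best_cost = None, float("inf")
    for dr in candidates:
        workload, knowledge = simulate(dr)
        if knowledge <= 0:
            continue  # nothing memorized, so the ratio is undefined
        cost = workload / knowledge
        if cost < best_cost:
            best_dr, best_cost = dr, cost
    if best_dr is None:
        raise ValueError("no candidate produced positive knowledge")
    return best_dr


def toy_simulate(dr: float) -> Tuple[float, float]:
    # Made-up model, NOT the real simulator: workload explodes as DR
    # approaches 1, knowledge grows linearly with DR.
    workload = 1.0 / (1.0 - dr)
    knowledge = 100.0 * dr
    return workload, knowledge


# One data point per percent, bounded below at 70% like CMRR.
candidates = [p / 100 for p in range(70, 100)]
print(recommend_retention(toy_simulate, candidates))
```

In this toy model the ratio keeps improving as DR goes down, so the search lands on the lower bound (0.70), which is the same failure shape people are describing here.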
submodules it was
I think it's currently still a bit wonky. Hence Jarrett wanting to get rid of it.
Not really. CMRR is not using the same settings as the simulator, and it makes too many unrealistic assumptions
Sadly, making it use the simulator config makes it only mildly more likely to output something other than 70%
@cosmic hedge should it appear here ?
yeah i didnt change it from being dates because im lazy apparently XD
it's something i made super quickly out of personal curiosity and then expertium asked for it and now its just awful XD
With a big decay 0.6621
did you merge fsrs-6 into it?
aah no
i'll do it unless theres thousands of merge conflicts
i pulled it off 😎
yeah i had 0 conflicts i was confused 😂
Well that would explain why it would bottom out at 0.7. Workload doesn't go up again as you reduce DR so it literally does want 0% 😂
like it's some kind of frankenstein
With decay 0.2344
Decay 0.10
I'll check if I can tweak easily the memorized function
to add stability
With decay 0.8
I wonder how much of a clusterfuck it'd be if for every review, Anki sends an API request to ChatGPT, gives it the revlog, and asks it for the next interval :D
...why
i guess in a pinch you can replace review count here
using result.cards and then mapping over the stabilities?
thats in simulator.rs
I see
good opportunity to learn rust i guess 😂
at 1AM
yeah you'd need to sum it though i think
result.card.map(|card| card.retrievability * card.stability)
Now this is where ChatGPT can be useful
nvm you have to return an array sum it on the javascript side or something good god this is awful XD
I remember enjoying rustlings when I was learning Rust:
https://rustlings.cool/
Small exercises to get you used to reading and writing Rust code!
Yeaaah let's do it quietly tomorrow
*by you 😆
Even JetBrains wants to upsell
As much as I love JetBrains it is a bit much that you need 3 separate IDEs open to work on Anki
Yeah
I just noticed I'm paying for Ultimate but my toolbox is not signed in
😆
Right now i'm even paying for Github Copilot that I disabled after 5 days
i think i've done it...
I mean when I fetched your remote I got 50 branches popping up
This may or may not be strictly FSRS-related, but is there a severity of overdueness where, based on what FSRS determines, it would be better to simply reset the card than pick up where the card left off?
I don't think so personally
I even think it's a great opportunity to give broader input to your data FSRS will use later
Hmm interesting, I'll probably keep chugging away then, thanks!
https://forms.gle/rPkfkrPd2hasembM6
Alright, it's time for advanced democracy, with ranked voting
@quasi shadow @ashen light @wind palm @hasty fractal @lapis hearth get pinged get pinged get pinged get pinged get pinged get pinged
This is the last time we're doing this
dr-curve-stability branch
only if i'm Pinky
30+60=90DR
But this is just sum(S)
so I guess with SUM(SQRT(S)) it would be a bit different
daily_new_count: vec![result.cards.iter().map(|c| sqrt(c.stability)).sum::<f32>() as u32],
no funky import to do ?
count: _.sum(v.dailyTimeCost) / v.dailyNewCount[0] you can try that on the js side
Also @polar maple and @cursive badge
Foiled! You made it actually work ;p
oh wait no you cant try that because you need every card i'm dumb
hehe
daily_new_count: vec![result.cards.iter().map(|c| c.stability.sqrt()).sum::<f32>() as u32],
I'll post this on r/Anki tomorrow
Although for a "health check" I would be tempted to hide the numbers in an advanced/debug section.
@cosmic hedge I think something is wrong
Even with
daily_new_count: vec![result.cards.iter().map(|c| c.stability).sum::<f32>() as u32],
I don't have the same graph as earlier
did you forget to roll back your js changes?
doesnt seem like it
this is my only diff
ah wait
Now it's there again
I'll put the sqrt again to see
The thing I don't understand @cosmic hedge , is the fact that we're adding stability in "daily_new_count", but daily_new_count is on the denominator
count: _.sum(v.dailyTimeCost) / v.dailyNewCount[0],
At a certain point your DR is just so low that cards get scheduled off into infinity so don't add to workload ;p
workload/knowledge is minimized
ah yes
We could maximize knowledge/workload or minimize workload/knowledge, it's equivalent
Yeah so with the sqrt (on the left) we're back to square one
daily_new_count: vec![result.cards.iter().map(|c| c.stability.sqrt() * c.retrievability(&req.params)).sum::<f32>() as u32],
R*S
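For anyone trying to keep track of which "Memorized" function is which, here is a small sketch of the three variants that have come up so far: sum(R), sum(S*R) and sum(sqrt(S)*R). The `Card` class and the example numbers are made up for illustration. This is not the code in simulator.rs.

```python
import math
from dataclasses import dataclass


@dataclass
class Card:
    stability: float       # S, in days
    retrievability: float  # R, probability of recall at the end of the simulation


def memorized_sum_r(cards):
    """Current definition discussed in the thread: sum of R over all cards."""
    return sum(c.retrievability for c in cards)


def memorized_sum_sr(cards):
    """Stability-weighted variant: sum of S*R."""
    return sum(c.stability * c.retrievability for c in cards)


def memorized_sum_sqrt_sr(cards):
    """Damped variant: sum of sqrt(S)*R, which rewards high S less strongly."""
    return sum(math.sqrt(c.stability) * c.retrievability for c in cards)


# Hypothetical two-card collection: one stable card, one shaky card.
cards = [Card(stability=100.0, retrievability=0.9),
         Card(stability=4.0, retrievability=0.5)]
print(memorized_sum_r(cards), memorized_sum_sr(cards), memorized_sum_sqrt_sr(cards))
```

The point of laying them side by side: the more weight S gets, the more the stable card dominates the total, which is what changes the shape of the workload/knowledge curve.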
Blue : 100 new/day, Orange 10, Green 2
On 30d days to simulate
2y
I think the intuition might be : Stability is such a bitch, it takes long time to grow
So, for short term, adding word with lower DR is better
for long long term, then you start to have a min around 90%
Just so you know, I won't approve a function with two minima, and I bet 100 bucks Jarrett won't either
It's a clusterfuck
I'm not at the point of getting anyone to approve anything lol
Just trying to have an intuition
Well, actually, if the second minimum is very far on the left, it's fine
If it's veeeeeeery far from 70%
more like 1% ?
Yeah
IT also depends on the decay
Decay 0.3 here, not to compare with the other
I'll run again on the old one lol
Decay 0.23 that one
Meanwhile I made another graph
You can see how much of an improvement FSRS-6 + recency weighting is
Alright, that's fine. As long as the flat part is so far to the left that realistically nobody will ever have to deal with it as long as min(DR)=70%
Maybe some decks just dont work well with it
this one has a lot of review and it gives something more logical
I'll try with sqrt(S) again on that one
Try S*R
It was S*R this one
Which one?
This
R*sqrt(S) this
I think I see some pattern here
Based on how S is rewarded, the curve has a more pronounced convex part
So sqrt(S) gets less convex
and for some graphs, you get back to the old situation where the min is completely at the left
I like S*R (well, average of it, but whatever) because it has an interpretation: #1282005522513530952 message
I'd rather not use something that has no interpretation
It's super decay dependent
I think with low decay, since cards never go below a certain R threshold, you get a minimum on the left
It can be interpreted as "average S after taking into account that you won't be able to recall 100% of cards, and S of forgotten cards will drop"
The same deck but with decay 0.1
You see it's very low workload at ~35-40% DR because basically those cards won't ever get there with such a low decay
You should really specify what function you use every time
It's annoying, but makes it easier for us
Sure, it was sqrt(S)*R for those
But the point is, since with low decay you will never really drop to very low R, we should be careful with it
for ex, if the minimum recommended DR is 70%, but with low decay 0.10, it takes forever to get there, 70% will be again the new recommendation
So it just tells you to review cards in the year 1002025, kek
Yup
Alright, I'm going to sleep
I think both @cursive badge and @polar maple points are very good :
- @polar maple : Since in Anki R never drops below a certain level because people review outside of it, those low R values are unreliable
- @cursive badge : If it takes forever for R to drop once R is already low, it's more optimal to just have things closer to 0%
Yup same
but at least we have a bit more insights
It's intended.
Nvm, I'm stoopid
I even fine-tuned that function a while ago and then yesterday didn't realize it's doing its job 😅
I switched to FSRS and now I see really long times... even when I would choose hard. I re-optimized the values in Anki and it said they were already optimal. Is this intended?
I didn't reschedule cards when I switched to FSRS.
- Copy-paste your parameters here and show your desired retention
- Show Card Info of this card (either find it in the card browser then right click then Card Info; or press I while reviewing the card)
https://github.com/Luc-Mcgrady/Anki-button-usage/blob/loss-to-retention/loss-to-retention.ipynb could we solve it by adding a "retention factor" when comparing them, maybe something like this?
Mh, something seems off in the settings, I can't copy the parameter numbers...
After clicking on optimize I could:
1.6603, 14.4036, 20.6832, 20.6832, 6.3285, 0.4861, 3.3194, 0.0010, 1.9391, 0.0320, 1.4193, 1.9950, 0.0771, 0.5318, 2.4091, 0.0021, 3.2024, 0.5423, 0.7565
Oh, ok, so you haven't optimized parameters and are using the default ones
This isn't what I meant by Card Info though. In card browser, right click that card and then click "Info..."
Alright, now try optimized parameters and see what intervals you get
For the same card as here
I marked that one as a lapse already, sorry 🙈 .
This one is similar, even not as long in the future. But is > 6 years really a reasonable time frame?
Mh, when I click on optimizing all presets the whole dialog just closes in on itself...
I'm sure I saw this before but as you said the defaults were written in the dialog when I opened it here. Perhaps because I've just upgraded the Anki version?
You never failed this card in 4 years, so long intervals seem perfectly reasonable to me
Huh
How many presets do you have? And I assume you've been using Anki for a while and have a lot of reviews?
I only have the one. It's about 140k reviews.
It should show "Optimizing preset 1/1", probably
Well, whatever, if you have only one preset you can just click "Optimize Current Preset"
As for intervals, you can increase desired retention to make them shorter, if you want to
Yes, I think so. The console didn't throw an error. It just posted those values when the window was closed by itself:
Starting Anki 25.02.4...
2025-04-26 16:30:05,331:INFO:aqt.mediasrv: Serving on http://127.0.0.1:58249
Starting main loop...
Default: [1.6623251, 14.380958, 20.716198, 20.716198, 6.317123, 0.4848409, 3.328925, 0.001001084, 1.9343293, 0.031222602, 1.4210147, 1.9930667, 0.07823932, 0.532778, 2.4047449, 0.0027229392, 3.2026215, 0.5412417, 0.7573364]
Thanks a ton for your support. I feel assured now 🙂
Is the v3.0.0 included inside the first beta because what does this actually mean
It's for the simulator, it now supports "review sort order"s.
Seems sorting by Retrievability is bugged right now ?
I disabled all addons thinking it was mine at first 😆
Seems someone already saw it
Still the issue with a build from after the merge though
It's bizarre, the decay seems to be retrieved from card_data but I don't find it in the DB or the debug info of the card
https://github.com/user1823/anki/blob/bfdfab76faf6582ecbed48977157be071caea744/rslib/src/storage/sqlite.rs#L379
{"pos":543,"s":0.614,"d":7.4,"dr":0.9,"cd":"{"v": "reschedule"}"}
It should fix it.
https://github.com/open-spaced-repetition/fsrs4anki-helper/pull/552#issue-3022435570 Does anyone here speak Portuguese and would be willing to check this?
Long shot I know XD
@west whale

Tested on my problematic decks and all good now ! Thanks
@cosmic hedge Did you implement sum(S*R) instead of sum(R)?
https://github.com/ankitects/anki/pull/3947#issuecomment-2833389135
I don't see it 😭
Man...
nope I'll do it later

i thought all those graphs @bold terrace made had something weird to say about it or something
This one is good
(I assume it's sum(S*R) because Sound doesn't annotate every graph, so I have to do some guesswork)
@bold terrace !!!! which graph is that!!
oh wait he's already pinged XD
that was fast. i guess the fsrs fixes warranted it
Rejoice
Btw, considering how slow CMRR is, I think we need to add a "This may take a few minutes" text that appears when it's running
maybe the build's broken?
Have you tried uninstalling Anki and reinstalling it again?
(not joking)
No
try that
Multiple times
This issue has been always present with me
welp
With every release
then how did you get it to work
https://www.python.org/downloads/release/python-394/
Try downloading one of these files (depends on what OS you use)
In the previous laptop of mine, i have had this error. Usually I would just restart my laptop and then the installer removes the files installs them and puts them in some place
Right so installed
What now
Restart your PC and try installing Anki again
Right
So restarted
I am installing anki again
Should I try putting it in a different path
folder
Nah
Forum time it is
Also, this is an issue I've been having for a while, but never reported
Why does Anki insist on making backups when CMRR is running?
It happens surprisingly consistently
Way too consistently to be a coincidence
@cosmic hedge I guess I need to write things down at this point
- Use sum(S*R) instead of sum(R)
- Add a "This may take a few minutes" text
- Investigate why Anki loves making backups while CMRR is running (not just in this beta)
I think its because CMRR blocks the backup from completing somehow so while normally its done faster than you can notice, while CMRR or optimise is running it will hang there.
How long is CMRR taking you? do you have the days to simulate cranked up to max?
More than 1 minute for 1825 days to simulate
I haven't measured precisely
With 3650 days to simulate and a very large number of cards it probably takes >3 minutes
That doesn't explain why "make a backup" and "run CMRR" miraculously coincide
Right so I think i found out what the problem is
The shortcut to Anki in my Desktop got removed
The shortcut in my taskbar as well
i guess a warning for the higher day numbers would be a good idea?
Or at least is not up to date
So I followed the python file to the destination folder
And I found my actual Anki app there
I think a "This may take a few minutes" text while CMRR is running is better
Also, idk if this will be a pain or no, but maybe prevent Anki from making backups while CMRR is running?
your anki is installed on your desktop?
install it in the normal place and make a shortcut
yeah the convex aspect is more pronounced the higher the S contribution, so probably S*R. But it is also related to decay, with low decay you'd just get a strictly increasing curve
so at the lower decays it still goes to 0.7?
Or lower if you don't bound it, yes
Which is quite logical since at low decay, R would never go below a certain threshold
maybe backups should be made on not the main thread? its mostly a visual problem as far as i know.
though it may lock you out of hitting "cancel"
soooo except if you REALLY use big S contribution, you'd still get your good old .7
Let's try sum(S*R) and see
so maybe the integral is better considering decay can be low
because S is just the measure from 100% R to 90% R
Nah, I think the integral will only increase the output a little bit
but best is to plot and adjust
I expect that with sum(S*R) the output value will be higher than with sum(avg_R_over_the_next_5_years)
Would be great if Luc tried both, of course
this was the integral i think
Or we could do sum(S*avg_R_over_the_next_5_years) 🤣
But then it no longer has a nice interpretation
So let's do either one or the other, but not both
Or maybe use the "natural point of no decrease for the current decay" as the min bound for the desired R ?
Because maybe .7 made sense with decay=0.5
but now with a lot of decks at decay=.2 or even .1
.7 might even be WAY too low
?
I mean
I remember with decay=.1, you could have to wait years and years before you go down to R=70%, no ?
Because by nature of a slow decay, the R drop would slow down more and more quickly
so if you have a decay of .1, things like waiting for R to drop to a certain level might not make any sense anymore
while for decay=.5, it might
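To put rough numbers on this, here is a small sketch, assuming the FSRS-6 forgetting curve has the form R(t) = (1 + factor * t / S)^(-decay), with factor chosen so that R(S) = 0.9 for every decay. Decay is written as a positive number, the way it is quoted in this thread. The function names are made up for illustration.

```python
def curve_factor(decay: float) -> float:
    """Factor that pins R(S) to exactly 0.9, whatever the decay is."""
    return 0.9 ** (-1.0 / decay) - 1.0


def retrievability(t: float, s: float, decay: float) -> float:
    """Assumed power forgetting curve: R after t days for stability s."""
    return (1.0 + curve_factor(decay) * t / s) ** (-decay)


def days_until(r_target: float, s: float, decay: float) -> float:
    """Invert the curve: days until R has dropped to r_target."""
    return s * (r_target ** (-1.0 / decay) - 1.0) / curve_factor(decay)


# Time to fall to R = 70%, expressed as a multiple of S (here S = 1 day).
for decay in (0.5, 0.2, 0.1):
    print(decay, round(days_until(0.70, 1.0, decay), 1))
```

Under these assumptions, a card needs roughly 4x its stability to fall to 70% with decay 0.5, but roughly 18x its stability with decay 0.1. So a 70% floor means something very different depending on the decay.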
Anyway, @cosmic hedge , I'd like you to try both sum(S*R) and with sum(avg_R_over_the_next_5_years) with identical simulator config and report the results
But how do you define the cutoff?
Maybe based on the parameters ? Bound the minimum R to be the R in which the interval would go at a certain rate that would be considered "acquired", like 1y ?
If it takes 1y to go from 100% to 75% with some params, for example, the new min bound would be 75%
but yeah depends on the stability too so difficult to generalize that
it would also mean different DR based on different S
so quite the shift from what we do now
Nah, it's fine
Alright if you say so
Try setting DR to 70% and see your intervals 🤣🤣🤣
Upon optimizing and rescheduling (I haven't rescheduled in 6 months), my mature cards skyrocketed from 25% to 45%
Which seems really weird
However my Difficulty increased from 82% to 87%
Last Parameters = Decay, Low Decay = Things decay more slowly, will drop to 70% in muuuuuuch more time than with a decay of 0.5
so having way more mature is normal
it means you're doing well
Oh, yeah, some stats use medians now
yep
If I'm remembering well, low decay means slower decrease from 90% to 80% let say, but faster from 100% to 90% right ?
so might fit your history better
Since you like to do very short interval until you know well (and I guess never forget anymore)
is there any science behind mature cards
I'd be curious to see your decay without the .1 minimum haha
Depends on the kind of mature
why is it 21d interval specifically
Shit you edited too fast
hahah
Random threshold not even that useful anymore since it's based on 21d interval and not 21d stability
in some custom graph I like to consider mature S=21d
Are you saying that FSRS 6 might f*** me over
Pardon my bad reading comprehension
🤷
we must discover this
and once we do
we will sky rocket
and like also
why only 3 stages to a card
You know the term "broscience" ? It refers to the science bro teach each other in the gym
why not 5
sometimes I think SRS is just big bro science 😆
yes of course i am a science based workout master
like why dont we have
Spaced Recall is a thing, but Spaced Repetition System are like ... the broscience of it
It's the "I'm doing evidence based", but by taking anything that could reduce the amount of cognitive effort you have to do 😆
Add -> Acquired to the end, and now you're the new SRS guru
question @bold terrace
if i search in the stats bar
for cards that have intervals over 21d thats basically just searching for mature cards correct
That rationale works better for SM2 though
yes
perfect
hmm now i must figure out a way to check interval
and difficulty
difficulty based stages
i think i am losing my mind
yeah i am onto nothing
@bold terrace chatgpt is onto something
and he defines "(Stability = FSRS's internal number predicting how many days you can go without forgetting.)"
wait its possible to have stability > 21
but young cards ?
o lala
sound how is the search stats extended plugin going
do you have a version for me now
you might be interested in the readme here:
https://github.com/Luc-Mcgrady/anki-10k-notebooks
thats why i moved it to bad graphs in case you were wondering
thank you
Nothing better than the owner (@A bloke) last released version I'm afraid 😦
I still need to finish the average load by 5-percentile graphs
Most of the logic is there but it needs some polishing
And right now I'm more focused on the leech detection addon
With all due respect to GPT (which unfortunately is very little), I think this is very audacious to pretend you have very strong memory because you remember it for 15 days lol
IMO SRS stability doesn't mean that much. It doesn't take external reviews into account, so you might have very shaky knowledge that external exposure makes you somewhat remember, even when the stability in Anki is >1y
It's only reflecting how you can retrieve an information based on being exposed to the "Front" of your card
SRS is just practice tool
It's a nice practice tool, but it doesn't mean much in my opinion
Those past weeks/months I even repeated quite a few time that I start to see Anki more like a "Grader" than a "Teacher". If tomorrow I really understand what "A" is, by practical experience, usage, connection with other things, the "A" card in Anki will skyrocket in terms of interval
But on the contrary, if I learn "A" through Anki, SRS-ing it alone won't make my interval skyrocket
How much interval do you think most of the english word you use (based on the assumption english is not your mothertongue) should have, for example ?
To me, SRS-ing non acquired things is like keeping them at "warm temperature" until you have enough exposure to it in different settings to be able to acquire it
But acquiring something can go super super super fast.
But it often require some kind of interest in the subject, emotional engagement in the situation where the knowledge is useful, some impact on your daily life, etc
When Trump wins the election, you don't need to SRS the information to remember it "at greater interval"
You get the info ? Info is very important ? Ok registered, not to be reviewed again.
You might think "Yeah but I get a lot of repetition through medias", but that would still translate into some surprise like "Ah yeah, it was HIM that won the election ! Forgot it ! Would've pressed Again on Anki !"
Problem is, pieces of knowledge taken separately don't carry much emotional impact, any direct necessity of retaining the info, or any understanding of where that info meshes into the grand scheme of things, at all
So saying "After 20days, a piece of knowledge is "Solid"", feels super super super super off
And, by extension, maybe the true benefit of using Anki or any SRS is not really to remember something, but to be reminded that you're still liable to forget it (and thus, you might need to expose yourself to it in more settings). Once again, SRS being more a "diagnosis/grading" tool than a "teacher"
well
this is a different type of memory, no?
yeah i guess
my issue is just trying to use 1 platform at a time
i struggle with using outside info
the biggest thing im afraid of is overlapping info = waste time
ffs, first time in 5 years 
lol
I still wonder why Anki bothers with mpv at all
instead of just using the built-in media stuff of Qt
I assume MPV supports a much wider range of codecs
instead of sum(S*R) we should try the area under the forgetting curve without taking the average, as Sound suggests S doesn't really make too much sense on its own anymore with trainable decay
But then it's not interpretable anymore 😭
Also, I don't see how taking the area instead of average R is better
def average_f_power_forgetting_curve(t1, t2, s, decay):
    if not t2 > t1:
        raise ValueError("t2 must be greater than t1")
    # Calculate F(t2) - F(t1) where F is the antiderivative
    integral = integral_power_forgetting_curve(t2, s, decay) - integral_power_forgetting_curve(t1, s, decay)
    # Divide it by the difference in time to get the average
    return integral / (t2 - t1)
We just divide the raw integral number by the time delta to get average R instead of the area. I don't see why using the area would somehow be better
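The function above calls `integral_power_forgetting_curve` without showing it. Here is one way it could look, as a self-contained sketch. It assumes the curve R(t) = (1 + factor * t / S)^(-decay) with factor = 0.9^(-1/decay) - 1 and decay given as a positive number. Those are assumptions about the convention, not necessarily what the original helper does. `average_r` repeats the averaging step so the block runs on its own.

```python
import math


def power_forgetting_curve(t, s, decay):
    """Assumed curve: R(t) = (1 + factor * t / s) ** (-decay), with R(s) = 0.9."""
    factor = 0.9 ** (-1.0 / decay) - 1.0
    return (1.0 + factor * t / s) ** (-decay)


def integral_power_forgetting_curve(t, s, decay):
    """Antiderivative F(t) of the assumed curve (up to an additive constant)."""
    factor = 0.9 ** (-1.0 / decay) - 1.0
    base = 1.0 + factor * t / s
    if abs(decay - 1.0) < 1e-12:
        # Special case: the integral of 1/base is a logarithm.
        return (s / factor) * math.log(base)
    return (s / (factor * (1.0 - decay))) * base ** (1.0 - decay)


def average_r(t1, t2, s, decay):
    """Average R between t1 and t2: (F(t2) - F(t1)) / (t2 - t1)."""
    if not t2 > t1:
        raise ValueError("t2 must be greater than t1")
    area = (integral_power_forgetting_curve(t2, s, decay)
            - integral_power_forgetting_curve(t1, s, decay))
    return area / (t2 - t1)


# Average R over the next year for a card with S = 30 days and decay = 0.2.
print(average_r(0.0, 365.0, 30.0, 0.2))
```

Dropping the final division gives the raw area instead of the average. For a fixed window the two differ only by the constant 1/(t2 - t1).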
to me the integrals are more interpretable than S*R given how much decay affects the curve, the integrals describe an exact average over a time interval
maybe S in this case should be shifted to being your S to reach your DR
since imagine if our standard for S is at 99% rather than 90%, I'm sure you'd start to see the problem with it
Fair point
Guess we're waiting for Luc to try the integral
@cosmic hedge I'd like you to try the integral in CMRR with the following numbers:
- 1 year (starting from the last day of the simulation, aka 1 year beyond "days to simulate")
- 2 years
- 3 years
- 5 years
- 10 years
- 50 years
And report the results. Sorry if I'm being annoying
Both Qt and mpv just sit on top of ffmpeg, so format support would be the same
I thought that you meant that using the area under the forgetting curve is better than using average R, even though they would be different only by a constant factor of 1/(t2-t1)
the full area was suggested as an alternative to S*R, if you take the average then it is similar to R
Btw, I leave it up to you to decide whether to implement new RMSE or no: https://github.com/open-spaced-repetition/srs-benchmark/pull/207
I basically said "idk lol"
@1DWalker I brought you some delicious AI slop :D
Check if this is indeed what you have been talking about or if I got it wrong
Actually, wait. Regardless of decay, S is the same as long as we don't change how it is defined
If you vary decay, S doesn't change because of the way it's defined
The whole point of the complicated forgetting curve formula that we are using is that f(S)=90% for any decay
ffmpeg codec/format support (and quality) depends on what libs you compile it with. Maybe Qt didn't enable as much compared to mpv 🤷♂️. There must have been a reason to use mpv in the first place.
That's up to whoever builds Qt and mpv
And playback of every common format needs zero external deps anyway
With the notable exception of av1, but I doubt that's a concern for Anki
Alright, the final survey on Evaluate has been concluded: https://docs.google.com/forms/d/17F4KM188RFHU9s-jyd9VDltk52ZgceTMY3qNhWeXYxE/viewanalytics
"Health check" is a clear winner with 73% of first-preference votes, so I don't even have to do the "Eliminate the option with the least first-preference votes, redistribute votes of people who voted for the eliminated option, repeat until some option has >50% of votes" part of ranked voting since we already have a winner.
I'll write about this in the Github issue and ping Dae
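For reference, the elimination procedure quoted above (instant-runoff counting) can be sketched as below. The option names "A", "B" and "C" are placeholders, not the actual survey options, and the alphabetical tie-break is an arbitrary choice made for this sketch.

```python
from collections import Counter


def instant_runoff(ballots):
    """Return the winner under instant-runoff counting, or None if no ballots count.

    `ballots` is a list of rankings, each listing options from most to least preferred.
    """
    remaining = {option for ballot in ballots for option in ballot}
    while remaining:
        # Count each ballot for its highest-ranked option still in the race.
        tally = Counter()
        for ballot in ballots:
            for option in ballot:
                if option in remaining:
                    tally[option] += 1
                    break
        total = sum(tally.values())
        if total == 0:
            return None
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > total:  # strictly more than 50% of counted ballots
            return leader
        # Eliminate the option with the fewest votes (ties broken by name).
        loser = min(remaining, key=lambda option: (tally[option], option))
        remaining.discard(loser)
    return None


ballots = [
    ["A", "B", "C"],
    ["A", "C", "B"],
    ["B", "C", "A"],
    ["B", "C", "A"],
    ["C", "B", "A"],
]
print(instant_runoff(ballots))
```

In this example nobody has a majority in the first round, "C" is eliminated, its ballot transfers to "B", and "B" wins 3 to 2. When one option already has more than half of the first preferences, the loop returns on the first pass.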
Also, I finished the cool calibration graphs
@polar maple @quasi shadow @hasty fractal @anyone_who_likes_cool_graphs
Every graph individually here: https://imgur.com/a/calibration-of-different-fsrs-versions-KfJ32EV
Can someone tell me in relative terms how much FSRS 6 recency is better than FSRS 5 recency by, so that I could start feeling myself
@unique salmon stat-man
Just look at graphs ¯_(ツ)_/¯
Seriously, it's way better than looking at logloss and RMSE because these graphs convey a lot more information about the weak points of FSRS
Like FSRS 6 is better than FSRS 5 by this much-%
It is also gaining on the top players
FSRS-6 recency vs FSRS-5 recency
-2% logloss
-10% RMSE
FSRS-6 recency vs FSRS v1
-20% logloss
-49% RMSE
do note that the calibration graphs do not tell the full story about FSRS performance
for example i can take FSRS v1's graph and apply a function on the raw FSRS v1's outputs to transform it to have nearly perfect calibration
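One possible version of such a transform is histogram binning: replace every raw prediction with the observed recall rate of its bin. This is only a sketch of the general idea with made-up data, not necessarily the function meant here, and the names are hypothetical.

```python
def fit_bin_recalibration(preds, outcomes, n_bins=10):
    """Learn one corrected probability per bin of raw predictions.

    preds: raw predicted recall probabilities in [0, 1]
    outcomes: 1 if the card was recalled, 0 otherwise
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(preds, outcomes):
        b = min(int(p * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    # Empty bins fall back to the bin midpoint.
    return [sums[b] / counts[b] if counts[b] else (b + 0.5) / n_bins
            for b in range(n_bins)]


def recalibrate(p, table):
    """Map a raw prediction to the observed recall rate of its bin."""
    n_bins = len(table)
    return table[min(int(p * n_bins), n_bins - 1)]


# Made-up data: the raw model says 95% but only 8 out of 10 reviews succeed.
preds = [0.95] * 10
outcomes = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
table = fit_bin_recalibration(preds, outcomes)
print(recalibrate(0.95, table))
```

After this mapping the calibration graph looks close to perfect on the data it was fitted on, yet the model is no better at telling apart cards inside the same bin. That is why the calibration graph alone does not tell the full story.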
almost certainly codec amounts, but mpv can be used as a library so the real question is why isn't that being used
Cause interfacing with that from Python would be a royal pain
just that S is pretty arbitrary when the forgetting curve is more dictated by decay, and it reminds me of the weirdness of using R*S, would you rather have 1 card at 1000 day stability or 1000 cards at 1 day stability?
or for this one, is the black forgetting curve really worth 10x more than the purple one?
Yeah, here the integral would definitely make more sense
waiting
@polar maple @quasi shadow Something that bothers me: if FSRS mostly errs on the underestimation side (as you can see from the graphs https://imgur.com/a/calibration-of-different-fsrs-versions-KfJ32EV), and only for very low values, how come we see so many posts like "My desired retention is 80% and my true retention is 60%"?
I've said this before, but with every day my suspicion that FSRS adapts to a specific level of retention and then shits its pants if retention during deployment is different from retention during training grows stronger.
- the other way around, "my desired retention is 80% but my true retention is 90%" is just something that a user would feel less likely to need to complain about. they'd likely feel proud of themselves for "beating the algorithm" rather than realizing that this is rather a potential failure in the algorithm
- the distribution of 'true forgetting curves' may have high variance such that it just averages out to FSRS's current curve. Recall that perfect algorithms still need to make a best guess based on limited information
but if deployment consistently results in overestimation then it likely has to do with mental fatigue from the increased workload which is something that we can't model well
I don't know about mpv but I've used ffmpeg's libav stuff a long time ago and I can see why you would be tempted to just use the CLI if it is anything similar.
Even if the library does most of the heavy lifting there can be a lot of faffing around with buffers and passing things between various parsers and decoders vs CLI just doing the magic for you with a few flags.
I really should get back to my idea of using the N number of reviews done on that day before the (N+1)th review as an input feature
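A minimal sketch of how that feature could be computed, assuming the review log is already sorted chronologically and each review has been reduced to a day index. The input format is hypothetical, not the actual revlog schema.

```python
def reviews_before_on_same_day(review_days):
    """For each review, count how many reviews were already done that day.

    review_days: chronologically ordered day indices, one entry per review.
    Returns a list of the same length. The (N+1)th review of a day gets N.
    """
    done_so_far = {}
    feature = []
    for day in review_days:
        n = done_so_far.get(day, 0)
        feature.append(n)
        done_so_far[day] = n + 1
    return feature


# Three reviews on day 0, two on day 1, one on day 3.
print(reviews_before_on_same_day([0, 0, 0, 1, 1, 3]))
```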
Have a look at this https://github.com/Luc-Mcgrady/anki-10k-notebooks/blob/3d38effd7f5b0c2a72eb57079f1c719b77e11d11/fatigue.ipynb
Unless i've done it wrong (😅) fatigue has basically no effect
Look at the readme of the repo theres one guy it seemed to work for
How do you reconcile it with #1282005522513530952 message