#FSRS Megathread
a better way of testing this is to see whether, as you get more data from a card, the curve eventually approaches an exponential forgetting curve
The data is generated from a single forgetting curve.
So it’s unrelated to the heterogeneity problem.
@quasi shadow a problem in that code is the low sample size, if you increase it to 2000+ points then the gap disappears
and if you change t_sample to t_sample = np.random.randint(50, 2000, 20000) then the exponential fit becomes better
But in the real world, the data of forgetting curve in given review history is very sparse..
okay but this point is completely unrelated right? you did not give evidence about your prior at all here so why should we use this prior in the real world
The state space increases exponentially with the length of review history.
But the optimization actually has a problem. It may lead to the phenomenon where decay=-0.2 has a better log loss.
maybe it is just the truth that decay = 0.2 models memory better. do we have evidence to reject this idea?
no your simulations don't count here because they also decreased generalization performance
but in the 10k set, a lower decay increased generalization
In the same reasoning, GRU-P models memory better. But its forgetting curve’s shape is weird.
So the reasoning is problematic.
if GRU-P's curve shape is what is perfect then sure we should strive to use that, but we don't have reason to when LSTM does better with a power curve already
btw FSRS calibration is what inspired the transformation that i chose
and based on calibration it seems that a flatter curve is indeed probably better
def forgetting_curve(self, t_nh, w_nh, s_nh, d_nh):
    return (1 - 1e-7) * (
        torch.sum(w_nh * (1 + t_nh / (1e-7 + s_nh)) ** -d_nh, dim=-1)
    )
This power forgetting curve’s decay is trainable, right?
Honestly, I would be fine with it if it didn't make intervals even longer
Which is why I did this: https://www.desmos.com/calculator/lzzgd5gz0s
The idea was to make sure that intervals aren't longer or even are shorter than now, but the curve provides a better fit. I technically succeeded, but only by a small margin
But underestimating the stability also would cause this kind of calibration.
yes it is so if you use it in FSRS you might need some very heavy regularization
I don't think so. Underestimating stability would shift the "offset" (b in ax+b), but wouldn't change the slope, no?
i see longer intervals as an absolute win
i set my DR to 0.2 and never have to study again? we'll take it
Lol
Fine. I have an alternative idea: let the forgetting curve converge to non-zero value.
Something like a * R(t) + (1-a).
It could improve the metrics.
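A minimal sketch of the a * R(t) + (1-a) idea, assuming the fixed-decay power curve with DECAY = -0.5 and FACTOR = 19/81 as the base R(t). The function names and the example value of `a` are made up for illustration, this is not code from FSRS.

```python
DECAY = -0.5       # assumed fixed decay of the base power curve
FACTOR = 19 / 81   # chosen so that the base curve gives R = 0.9 at t = S


def power_curve(t, s):
    """Base curve: R(t) = (1 + FACTOR * t / S) ** DECAY."""
    return (1 + FACTOR * t / s) ** DECAY


def floored_curve(t, s, a):
    """a * R(t) + (1 - a): starts at 1 and converges to 1 - a instead of 0."""
    return a * power_curve(t, s) + (1 - a)


if __name__ == "__main__":
    # With a = 0.8 (arbitrary example), predicted recall never drops below 0.2.
    for t in (0, 10, 100, 10_000):
        print(t, round(floored_curve(t, s=10.0, a=0.8), 4))
```

Note that with a < 1 the curve no longer passes through 0.9 at t = S, so S would lose its "interval at 90% retention" meaning unless the curve is rescaled.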
@unique salmon if the predictions are more accurate though, the intervals can be shortened by the scheduler on top, by ignoring DR and adding a flat value if DR becomes too low or something
Then we have a complete mess
i would just rather FSRS's memory model be as pure as possible and leave messy scheduling to a layer on top
one of the appealing things about FSRS over SM-2 is that FSRS predicts R
but not really? we need to now shift the R prediction a bit to be better
we shouldn't mix scheduling with the memory model together
Yeah, but it doesn't make intuitive sense. Why would the probability of recall never be 0?
Or, well, never <1-a
Some vocabularies you would never forget.
Then we should make it depend on D or something
I can believe that for some cards R never gets too low, but not for very difficult ones
It’s pretty common in language learning. In my formal work, I often observed this kind of forgetting curve.
also many users will review the same content outside of anki in one way or another
so R will not ever go to 0
But not for all cards though
Definitely not for leeches
For leeches it's more like the other way around - there is an upper limit to R
Making it configurable requires a lot of refactoring…
Then you want card-level decay (
It will be a pain.
Btw, making decay or the lower limit depend on D would screw up interpretation of S, so I'd rather not go that way
I'm talking about this
If the forgetting curve depends on D, we can no longer display this message
Fine. Just let me complete the work of FSRS-5.5.
just display "on average" or just leave the message unchanged but with some additional info somewhere
Let’s delay the decay-stuff to FSRS-6.
We still have neural D to explore
And by "we" I mean "I try stuff, you guys extract the parameters" 🤣
"I try stuff, I send you the code, you run it again on your own, then you extract the parameters"
I wonder when FSRS-5.5 will be the default. Then we can introduce FSRS (experimental version) or some other algorithms.
Could you build Anki from source? I can develop a branch of FSRS with decay=0.2, and then you can play with it.
Actually it is pretty simple. Just modify a constant.
very interesting
it says 1.67 months from 2024-12-11
today 2025-03-30 is most definitely not 1.67 months, and then the next interval is 7.2 months lol
Was it overdue? Have you re-optimized and/or rescheduled in the past 3 months? Or upgraded to a new version of Anki/FSRS?
Have you rescheduled twice? I think only the first reschedule gets a revlog and the interval doesn't get updated if you reschedule again.
Am I allowed to say that I find that Gemini PR bot annoying 😭
I have to scroll past a page or 2 of text no one reads before I get to read the comments.
Is there some way it could be more concise or just do the code review bit? (would still be annoying but less)
I will ask for Minato. He introduced the review bot.
does no one else read the occasional haiku it shits out 
how to get multiple leech definitions
so that i can have like first leech threshold be 8 and tags with leech
second leech threshold 16 and tag with super leech
an addon could do that pretty easily, probably?
hey it almost stopped being a leech
if only we had that cool new leechkit tech in anki I bet that could mark something as a superleech
is:leech -> prop:leech-p>0.95, is:superleech -> prop:leech-p>0.99 or something 🍃
is this sarcasm
I don't even know anymore
gg
the sarcasm is mostly how it will never come to be
the underlying suggestion is real I guess
like if someone writes up a full spec and gets dae to sign off on it, I could theoretically do the work
FSRS uses the parameters of the deck the card is actually in, right? Not of the parent-deck I'm studying from?
yea
Ok, cause I wanna try splitting my huge deck with 18k cards into subdecks, and see if it helps with generating sensible parameters
Speaking of splitting decks, I tried to split it into High D (>0.9) and the rest. Got this
ALL : Log loss: 0.4800, RMSE(bins): 3.10%. Smaller numbers indicate a better fit to your review history.
Low D : Log loss: 0.3500, RMSE(bins): 3.42%. Smaller numbers indicate a better fit to your review history.
High D : Log loss: 0.5054, RMSE(bins): 4.37%. Smaller numbers indicate a better fit to your review history.
Card Difficulty looks a bit more like a real distribution now. The log loss for low D is way better; for High D it's not that good, but the High D forgetting curve is still quite simple I guess: review every day before hoping for increasing intervals 😛
Low D difficulty change when having something wrong
(Params 0.3186, 1.2715, 3.3267, 29.9833, 7.4308, 0.4635, 1.4153, 0.0829, 1.3248, 0.2916, 0.8265, 1.9742, 0.0870, 0.2977, 2.2780, 0.0855, 3.6419, 0.3560, 0.6826)
For High D :
1.5258, 4.4873, 12.0269, 59.4754, 7.1157, 0.6141, 1.3824, 0.0824, 1.6252, 0.0397, 1.0985, 2.0185, 0.0311, 0.3749, 2.3486, 0.2315, 3.0695, 0.5960, 0.7416
It feels like how D evolves is somewhat similar but slightly shifted
Not sure if it's super worth it though
Any thoughts ? Anyone brave enough to try out ?
Do you add new cards to the high or low difficulty deck? How are you going to sort them?
Reminds me of this a bit https://www.reddit.com/r/Anki/comments/1hb1esz/comment/m1gvqn7/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
I thought a bit about it ... Basically, since D "only increases" in the "all" deck, the ones going into "High D" are the ones with lapses. In fact, in my "low D" deck, no card has a lapse count higher than 5.
So I guess you add it to your main deck, you ask Anki to flag leeches at a lapse count of 5, and when a card reaches it, you move it to the High D one?
Now the question is : When to bring back to Low D ? (If it would ever happen). I'd say, when Stability>Threshold ? For example, I have no cards with S>37 in High D. The Median is 6 (instead of 40 for Low D).
So maybe a protocol is :
- If Leech Threshold is reached : Low D -> High D deck
- If Stability > Threshold (21 ? 30 ?) : High D -> Reset -> Low D
The reset I think would be nice to not "poison" it
The "un-poison" might make sense to avoid the Low D FSRS model to get too much influenced by the previous mess it was
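The protocol above could be sketched roughly like this. The function name, deck labels and default thresholds are hypothetical (taken from the numbers floated in the messages), and nothing here is an Anki API; it only encodes the decision rule.

```python
def next_deck(deck, lapses, stability, leech_threshold=5, stability_threshold=21.0):
    """Return (new_deck, reset_history) for one card.

    deck: "low_d" or "high_d" (hypothetical labels).
    lapses: lapse count of the card.
    stability: current FSRS stability in days.
    """
    # Leech threshold reached: Low D -> High D, history kept.
    if deck == "low_d" and lapses >= leech_threshold:
        return "high_d", False
    # Card recovered: High D -> reset -> Low D, so the old revlog
    # does not "poison" the Low D parameters.
    if deck == "high_d" and stability > stability_threshold:
        return "low_d", True
    # Otherwise the card stays where it is.
    return deck, False


if __name__ == "__main__":
    print(next_deck("low_d", lapses=6, stability=3.0))    # moves to high_d
    print(next_deck("high_d", lapses=12, stability=30.0))  # moves back with reset
```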
I'm wondering now do you have a lower log loss on both the separate decks compared to before?
No High D took a .02 hit
But low D improved a lot
oh how did i not see that XD
you should look at how the ALL deck's parameters perform on the low D and high D decks to measure the actual improvement
Yeah I can check again in 30min
I also think that since High D has, in my case, a lot of reps compared to Low D (it might be the opposite if you only have a small number of High D cards), the ALL deck just chose to sacrifice Low D
That’s actually a nice point. Adding in the low D one means you might detect it a bit too late
I’d say it also depends if you have a strategy on how you gather new cards, introductory rate, how you can also kind of feel if the element of making it a leech are there or not (kanji looking like another one, etc)
is it possible to mature all these in 100 days?
10 a day since oct 1st
new idea dropped, maturity forecaster 🙏

Log loss: 0.4800, RMSE(bins): 3.10%. Smaller numbers indicate a better fit to your review history.
Low D : Old Params : Log loss: 0.3757, RMSE(bins): 9.20%. Smaller numbers indicate a better fit to your review history.
Low D : New Params : Log loss: 0.3503, RMSE(bins): 3.44%. Smaller numbers indicate a better fit to your review history.
High D : Old Params : Log loss: 0.8177, RMSE(bins): 21.65%. Smaller numbers indicate a better fit to your review history.
High D : New Params : Log loss: 0.5053, RMSE(bins): 4.34%. Smaller numbers indicate a better fit to your review history.
@cosmic hedge @polar maple there you go
It's interesting that just the fact of splitting, even with the same params, improved the log loss of Low D already
Mature in the sense of Anki is a bit dependent on your DR (Mature = Interval >21d, which means Stability >21d for DR=90%, but lower for DR=80%)
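To put numbers on that, here is a small sketch assuming the fixed-decay power curve with DECAY = -0.5 and FACTOR = 19/81, and ignoring fuzz, rounding and interval limits. The function names are made up for illustration.

```python
DECAY = -0.5       # assumed fixed decay
FACTOR = 19 / 81   # makes the interval equal to S at DR = 90%


def interval_for(stability, desired_retention):
    """Interval (days) at which predicted R drops to the desired retention."""
    return stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)


def stability_for_interval(interval, desired_retention):
    """Stability (days) needed for a given interval at a given DR."""
    return interval * FACTOR / (desired_retention ** (1 / DECAY) - 1)


if __name__ == "__main__":
    # Stability needed to reach the 21-day "mature" interval.
    for dr in (0.90, 0.80):
        print(dr, round(stability_for_interval(21, dr), 2))
```

Under these assumptions a card needs S = 21 days to be mature at DR = 90%, but only about 8.8 days at DR = 80%, because intervals at 80% are roughly 2.4 times those at 90% for the same stability.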
the evaluate function on anki cheats by testing how well parameters fit the very same data that it was trained on, so if there didn't exist some regularization mechanisms in FSRS, you would expect the metrics to keep decreasing as you split the deck further and further. But this doesn't necessarily increase the performance on new unseen data. We can check with the 10k dataset by doing a similar procedure as what you have done
90% dr
21d of stability is difficult, but I also noticed my stability was improving with a constant trend. If you want ALL your cards to have stability >21, your blue curve would need to have absolutely 0 cards in the 0-20 range, which would be very, very difficult to achieve quickly. But the question is also "Why is it so important to have them all matured?"
its not
was just curious
my young true retention for the deck is 87.9% and mature retention is 92.2%
1154 mcqs, of which 30 are on the exam
so i should be fine to pass anyways
I see ... It's also where even if we can come up with some clustering based on D, the question is : when you introduce a new card, how to classify it correctly to the right cluster and the right params
Right now it feels like D is a modifier for the same big function, but it feels like different params for different "D" could be beneficial
You don't have any new cards to introduce? You already have the full set of 1150 questions in your Young+Mature?
I see ! So basically now if you really want to try to perfect the score, you just have to increase the DR as far as your workload can handle it
Because having 80% retention at 40d stability won't necessarily get you a better grade than 80% at 2d stability
So young/mature doesn't matter that much for Grade/one-shot knowledge test
i just need to pass
because if you dont pass, you dont go onto the next part of the exam
When is it ?
100 ish days
how can i check
wait
is this something checkable
how can i see % retention for x stability
Well right now it seems your avg retention should be ~92-95% (You can check it with the "Memorized" graph of the stat addon that does SUM all the R)
and the stability you see it already there in your blue graph
There's also that one in the addon
interesting...
But my point is, let's say on test day you have a retention of 99% but with very, very low stability, like 4-5 days
Who cares, you'll do 99% 🙂
If your stability was 900 days, you'd still do 99%
Stability is more useful for long term knowledge
things you want to build up
For example, alphabets, if you have a 100% recall but with a stability of 2d, you won't be able to read much in a few weeks haha
So normally, the long game is often worth it
but if it's like a specific test, a one-shot with specific questions/knowledge, not necessarily foundational for the future ... stability is not that big of a deal
It is, however, helping you to have a lower workload everyday 🙂
ah but thats only backwards stability
With a stability of 1d, you'd be doing those 1150 reviews everyday
Yeah but you have a trend
For ex, my trend is to gain 0.25 avg stability per day
Median is a bit lower
If stability decreases, it just means you are adding more new cards than your "older ones" are able to compensate for
Can be beneficial to widen as much as possible your SUM(R) though
i know the day before this exam
For example if you tell me you have still 800 new cards to do over those 1000 total, I'd say "Fuck High DR and High Stab, just add as much as possible per day with DR=70%"
But sometimes it can backfire
For example for vocabulary, if you add so many 10k words with low low stab, it will be a nightmare to figure out the ones that interfere with other, etc etc
yea makes sense
i wonder what retrievability number to go off
i mean lets be honest
if i get the 30 questions on the exam that are the ones that i have the most struggles, then it wasnt meant for me to pass 🤣
Yeah but I would not worry too much about that
If you have 90% DR with your young cards
It means that even with the very, very low stability ones, you shouldn't be failing them that much
You can always just take the ones with the most lapses and make a Filtered Deck to do them every single day, or find ways to remember them with mnemonics etc
Filtered deck can be nice for that
Things like prop:lapses>10 prop:r<0.98
"Everything that lapses more than 10 times, now I want a DR of 98%"
All the ones you didn't fail that much, you just let them live with 90%
and you cram those bitches if I may 😄
indeed
that might just be the plan
it is gonna feel weird if i pass this exam and no longer do the deck
See it also like this :
Right now Anki IS giving you the most problematic ones the most often
Sooo if even by doing so, Anki can't make you fail more than 15%
I don't think the examiners will be able to find a better way to make you fail 😄
I have had my DR set to 95% for 236 days now. The workload is surprisingly low, even though 95% is supposed to be on the high end. So in my opinion, it’s manageable and one can remember more, so it would be a good change. (though I’m not sure if the bump from .90 to .95 is going to make that big of a difference)
so they can just be like "i want money"
depends on the deck
if i change my deck to 95% dr
im going to die
90% dr
95% dr
Takes you a long time to do those reviews ?
yes cause i just cant lock in
It's about the default value
ill just turn it off
already on 90% dr bro
i see half the deck every 5 days lol
today i have half the deck in backlog
ill give it an attempt in easter break
IMO
With short intervals, don't reschedule 🙂
I only reschedule cards with >90d intervals
i reschedule daily
If a card is due in 15 days, I'll do it in 15 days
Well, actually it's quite bad, because the load balancer with the FSRS plugin flattens it
gg
I took the time to note the first 4-5 columns of Future Reviews for ~2 weeks
After a reschedule
As you can see, it starts flat but then the true curve emerges after 5-6 days
if you reschedule every day, you just re-flatten it
making the R even lower than it should be
Made a few subdecks. Even just for looking at their stats they're worth it.
240 extra decks though :D
@unique salmon to avoid cheating on the Anki metric we can train on the first 4/5 of the revlog and evaluate on the last 1/5. This value is then reported to the user. Then the last 1/5 is included for some epochs to maximize future performance. I think this could strike a decent balance in computational cost and the validity of the metrics
I'm not sure what you are talking about
You mean within Anki itself?
Previously I told Jarrett "Why do we use different procedures in the benchmark and in Anki? Let's use the same 5-way split in Anki" and he was just like "lol no"
yeah
tbh i just assumed that the problem was the added computational cost
why else wouldn't we do a 5-way split?
currently i think the optimization does 5 epochs over the entire revlog
@quasi shadow
Yep
so perhaps we can instead do 3 epochs over the first 4/5 of the revlog, then evaluate on the last 1/5 revlogs, then do 2 more epochs on the full revlog for the same total of 5 epochs
but we no longer cheat on the metric
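A rough sketch of that schedule: 3 epochs on the first 4/5 of the revlog, report the metric on the held-out last 1/5, then 2 more epochs on everything. `train_epoch` and `evaluate` are hypothetical callbacks, not the FSRS optimizer's API, and the toy "model" is just a single constant recall probability so the example runs on its own.

```python
import math


def toy_train_epoch(p, outcomes):
    """Toy stand-in for one optimizer epoch: move p halfway to the mean outcome."""
    return p + 0.5 * (sum(outcomes) / len(outcomes) - p)


def toy_log_loss(p, outcomes):
    """Mean log loss of a constant prediction p on 0/1 outcomes."""
    p = min(max(p, 1e-7), 1 - 1e-7)
    return -sum(math.log(p) if y else math.log(1 - p) for y in outcomes) / len(outcomes)


def split_then_full(revlog, train_epoch, evaluate, init, pre_epochs=3, post_epochs=2):
    """Train on the first 4/5, evaluate on the last 1/5, then finish on all data."""
    cut = len(revlog) * 4 // 5
    train, held_out = revlog[:cut], revlog[cut:]
    params = init
    for _ in range(pre_epochs):
        params = train_epoch(params, train)
    reported = evaluate(params, held_out)  # metric on data not seen so far
    for _ in range(post_epochs):
        params = train_epoch(params, revlog)  # same total of 5 epochs
    return params, reported


if __name__ == "__main__":
    revlog = [1] * 80 + [0] * 20  # chronological 0/1 review outcomes (made up)
    print(split_then_full(revlog, toy_train_epoch, toy_log_loss, init=0.5))
```

The reported number describes the parameters after the first 3 epochs, not the final ones, which is the trade-off being made for keeping the cost close to the current 5 epochs.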
I think Jarrett just doesn't want extra work 🤣
That sounds complicated, and it's not how we do it in the benchmark, so why?
so that the computational cost is nearly the same as what it is right now for anki users
a 5-way split is like 2x the total computation i think
but if we can take that sacrifice then sure we should use a 5-way split instead
Oh, wait, I remembered a reason why this is undesirable - "Evaluate" would be as slow as optimization itself. Granted, I doubt many users care or even use Evaluate at all
Even I barely use it
Probably more like 2.5, but yeah
i think people would normally only use evaluate after an optimization so it shouldn't be a big problem
Yeah, I guess we could do something like that
- Save logloss and RMSE, and the corresponding parameters
- When user clicks Evaluate, check current parameters in the parameter field
2.1) If they are the same as the saved ones, it means the user didn't manually tweak parameters, so we can show the logloss and RMSE values
2.2) If they are different, it means user manually tweaked parameters, so we can't show stored values, we have to re-evaluate
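The steps above could look something like this. It is only a sketch: the class name and the `recompute` callback are hypothetical, and it deliberately ignores anything other than the parameters when deciding whether the stored values are still valid.

```python
class MetricCache:
    """Remember the metrics produced at optimization time, keyed by parameters."""

    def __init__(self):
        self._saved = None  # (params tuple, log_loss, rmse)

    def save(self, params, log_loss, rmse):
        # Step 1: store log loss and RMSE with the parameters that produced them.
        self._saved = (tuple(params), log_loss, rmse)

    def evaluate(self, current_params, recompute):
        # Step 2.1: parameters untouched, show the stored values.
        if self._saved is not None and tuple(current_params) == self._saved[0]:
            return self._saved[1], self._saved[2]
        # Step 2.2: parameters were tweaked manually, so re-evaluate.
        return recompute(current_params)


if __name__ == "__main__":
    cache = MetricCache()
    cache.save([0.4, 1.2, 3.1], log_loss=0.35, rmse=0.034)
    slow_eval = lambda params: (0.40, 0.050)  # placeholder for a real evaluation
    print(cache.evaluate([0.4, 1.2, 3.1], slow_eval))  # stored values
    print(cache.evaluate([0.4, 1.2, 3.5], slow_eval))  # recomputed
```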
Couldn't it also change from new reviews?
it also becomes difficult to evaluate parameters that users manually tweak, since they would be the ones cheating instead with the full revlog
it would be pretty easy for a user to tweak the values such that they beat the parameters that were trained from the first 4/5 of the revlog
which wouldn't be wise of them to do but imagine the complaints
I don't follow. In the 5-way split procedure the final metrics are averaged across 5 splits
Why would it be easier to cheat than the current "evaluate once on the entire history" approach?
you cannot evaluate on the entire history right, so how would you evaluate parameters if a user manually modifies them
so if we do a 5-way split we get an uncheated metric, but if we allow users to manually modify parameters and their evaluate button will check the parameters against the entire revlog, then the 5-way split metric will naturally be higher than what is possible when a user manually modifies the parameters
a user can very easily modify parameters by a bit to account for the last 1/5 of the revlog that a 5-way split could never have considered
so if we do use a 5-way split we might need to change how the evaluate button works when a user manually modifies parameters
or remove it
No complaints from me, lol
or we can separate the metric into two parts, one for generalization and another for data fit. The 5-way split would generate a generalization metric, and after we finally train on the entire revlog we can also produce a data fit metric.
If a user manually modifies the parameters, their evaluate button would only generate metrics for data fit
Then what's the point of the "generalization metric"?
Also, no. I'd rather not make this even more complicated
the 5-way split produces the generalization metric and indicates how well FSRS can generalize on the revlog
but anything else the user does to manually modify parameters will fall into a train loss regime where they cannot be certain if it would actually improve test loss
People already think that log loss means that their reviews have been corrupted (yes, really), I don't expect that anyone will understand this level of complexity
ok we should remove the evaluate button then
I’ll first briefly note down the two issues with evaluation: Log-loss and RMSE (bins) are technical ideas not easy to understand for everyone. There was never a good way of telling people what log-loss is. Now, after we add a recency weighting function in evaluation, it’ll become even harder. As an exemplar for the first point, this user ...
https://github.com/open-spaced-repetition/fsrs4anki/wiki/The-Metric
should write a section to clarify that the new version of RMSE (bins) can also be cheated
and also give reasons as to why RMSE (bins) is still a decent metric to use
So...do it?
you won't find me giving reasons for why it is a decent metric to still be using
if it were up to me it'd be gone everywhere
soo i'm just saying my bias
i think you should write it
hm, I feel like optimizing 60*3 decks individually is a bad idea. By far not enough reviews/cards
Even the 60 levels individually is still not all that much. ~250 cards in each, and the later ones only with 1000~2000 reviews
Just have like 5-10 presets, man
I reeeeeeeeeeeeeeeeeaaaaaaaaaaaaaaaaaaaaaaaaaally doubt you need more than that
I'm playing around with what works
That's what it optimizes for level 1: 2.6012, 2.6863, 2.7470, 2.7999, 7.0829, 0.6672, 1.3630, 0.1048, 1.6653, 0.0051, 1.1366, 1.9341, 0.1317, 0.3160, 2.2616, 0.2886, 3.0035, 0.0000, 0.0000
And this for level 57: 1.7663, 12.3438, 47.6679, 100.0000, 7.2883, 0.5351, 1.5596, 0.0010, 1.4359, 0.1967, 0.9115, 1.9473, 0.0968, 0.2825, 2.2758, 0.2232, 2.9898, 0.4103, 0.5488
What surprised me a bit is that level 1 and 57 somehow have nearly the same number of reviews. 16xx
Even though the cards in level 1 are all 2 years old now
FSRS-5.5 is coming soon!
Ok, this is epic
I finally got around to it and I fixed almost 1000 cards suffering from interference, being poorly designed (e.g. multiple fact recall instead of single), or incomplete/wrong information. After this, the success rate of all these cards is going to increase a lot.
Do I have to do anything or will FSRS automatically adjust with to these cards magically becoming much easier? I considered resetting history, but I only did that for a few cards that I had to completely rewrite
lmao
honestly we shouldn't have that thingy
normal homo sapiens don't give a shite about log loss and RMSE
Something I tried yesterday, but am still experimenting with, is to move all the high-difficulty cards (prop:d > .95) into a separate deck/preset and have different parameters for those. In the evening, I selected the 20 cards with the highest intervals that were still high difficulty/high lapses, and I noticed that some of them had been all successful for the past year. So I reset their history completely. I think it's only dangerous if you lose so much review history that FSRS would not have enough data to optimize.
Not before you test neural D 💢
okok
Hmm question, I reset this card, and I reset the review/lapse count. But I still see the revlog. So I guess FSRS will still train on it, right?
So it means, if it was a leech back then, but now it's a more "normal card", the leech phase will still "poison" a bit the parameters ? (Which could be also be a good thing I guess, but I don't know)
I was thinking about another potential work flow
- Card Mined, Gathered initially in low-D deck.
- Leech Threshold at 5-6 (or anything similar that leads to the "high cluster" of D). When the threshold is reached, move to the high-D deck with its own preset
- Set Leech Threshold at 10-15 (or anything similar so it gets in the highest D of the high-D). When happens -> suspend
Why suspend? I read a bit about why people suggest suspending; a good argument is that you could build a higher word count in memory with easier words than by having a huge portion of your reviews be cards you can't really remember even after 100 reviews. Better to suspend and re-introduce them later (when you have introduced all the ones that are easy to learn)
BTW I tried resetting 200-300 cards and log loss/RMSE got worse and I was able to re-optimize, so I guess the reset cards, even if their revlog is still there, are not taken into account (?)
@quasi shadow I can never remember how resetting is supposed to work with FSRS 😅
Maybe I need to make a card for it
https://github.com/ankitects/anki/blob/ccab18b7ba624d888f3d881e14f04c830e3eaa44/rslib/src/scheduler/fsrs/params.rs#L327-L345 Nope, forgotten cards aren't included.
If its just a test deck use a different profile
Well I wanted to make a test for several weeks
split my hard words in SM2 and FSRS and see how it would behave
why're manually reset cards labelled forgotten btw? is resetting a common thing to do?
unless significantly changing the card or its content i suppose, after which an "again" wouldnt cut it
Yeah, thing is, cards that fall into the "Hard Difficulty" realm never come back to "Easy Difficulty" anymore
I checked, some were top difficulty but I haven't failed them for a year
but their intervals are still lower than what they could be if they were in a "healthy deck"
but adding them in the healthy deck with all their messed up revlog, would mess up the model for normal cards
This is why I'm glad it works that way at least.
does setting due date without changing the interval (such as just putting 2 for 2 days from now) mess with the algorithm at all?
Normally no but what's the point though ? I mean, setting a due date equal to the current due date ?
It does mess with the retrievability value in the browser I think https://forums.ankiweb.net/t/set-due-date-doesnt-update-the-interval-of-card/52646
It makes the ivl not align with due, which may induce this problem: For example: I learnt this card today, and the due date is 2024-12-0, the interval is 4 days and R = 100%: However, if I set due date to 3 days later: The interval is still 4 days, and the R = 96% which is incorrect: I must set due date in this way to get correct...
It doesn't impact the scheduling, because the current interval isn't used in computing the next interval (unlike in SM-2).
FSRS ignores the review history before the reset.
@polar maple I find a new problem when I test the power forgetting curve with decay = -0.2. The optimal retention is extremely low (0.01 << 0.8).
😂 It makes this feature useless.
@bold terrace
so lets say i have a bunch of material to learn for my exam in around 70 days
was it you that told me/recommended i should try to get through each new card as fast as possible
lets say DR 70% (i think you said this number)
and then once i get them all seen, then up the DR to like 80 and then to 90 maybe
I tried several decay values, -0.25 seems to be a threshold.
$ python evaluate.py --fast
Model: FSRS-5-preset-dev
Total number of users: 469
Total number of reviews: 13682985
Weighted average by reviews:
FSRS-5-preset-dev LogLoss (mean±std): 0.3273±0.1629
FSRS-5-preset-dev RMSE(bins) (mean±std): 0.0506±0.0349
FSRS-5-preset-dev AUC (mean±std): 0.7127±0.0786
Weighted average by log(reviews):
FSRS-5-preset-dev LogLoss (mean±std): 0.3615±0.1693
FSRS-5-preset-dev RMSE(bins) (mean±std): 0.0670±0.0438
FSRS-5-preset-dev AUC (mean±std): 0.7031±0.0862
Weighted average by users:
FSRS-5-preset-dev LogLoss (mean±std): 0.3633±0.1712
FSRS-5-preset-dev RMSE(bins) (mean±std): 0.0690±0.0446
FSRS-5-preset-dev AUC (mean±std): 0.7024±0.0883
parameters: [0.35885, 1.2747, 3.14435, 15.5999, 7.1998, 0.531, 1.5584, 0.00455, 1.51725, 0.1241, 1.00435, 1.92805, 0.11405, 0.28615, 2.2698, 0.2315, 2.9898, 0.5166, 0.5966, 0.3429]
Model: FSRS-5-dev
Total number of users: 469
Total number of reviews: 13682985
Weighted average by reviews:
FSRS-5-dev LogLoss (mean±std): 0.3269±0.1624
FSRS-5-dev RMSE(bins) (mean±std): 0.0510±0.0356
FSRS-5-dev AUC (mean±std): 0.7145±0.0817
Weighted average by log(reviews):
FSRS-5-dev LogLoss (mean±std): 0.3599±0.1669
FSRS-5-dev RMSE(bins) (mean±std): 0.0672±0.0434
FSRS-5-dev AUC (mean±std): 0.7017±0.0886
Weighted average by users:
FSRS-5-dev LogLoss (mean±std): 0.3616±0.1685
FSRS-5-dev RMSE(bins) (mean±std): 0.0692±0.0441
FSRS-5-dev AUC (mean±std): 0.7011±0.0906
parameters: [0.2769, 1.2174, 2.7866, 15.3263, 7.1784, 0.5269, 1.7207, 0.006, 1.5184, 0.1399, 1.0231, 1.8932, 0.1165, 0.2662, 2.2709, 0.2191, 3.0002, 0.5799, 0.487, 0.202]
I have a new finding: with trainable decay, preset-level optimization has a larger decay (compared with collection-level).
0.3429 vs 0.202
🤔 preset-level optimization has a lower heterogeneity than collection-level.
Well there's a catch: adding "as much as possible" only really works when you won't have time to get good retention on the whole deck.
If you have a clear deadline, and the deck is already complete, you can try to use the "Compute minimum recommended retention" and see what it gives you
The thing is, let's say you need to learn 1000 cards and the test is in 100 days. Ideally, you'd like to do 10 new cards per day, right? But that might increase your workload, so if you can't handle that workload, you can drop DR to 70% so you can. But if it's totally manageable, you can add 20 new cards per day for example.
But since going from 80% DR to 90% will multiply your workload by 2.4, and if you pass the test by getting 60% of the material right, it's better in general to start by just trying to cover as many cards as possible at 70-80% DR than at 90%
Because if you can only handle 5 new/day with DR=90%, at the end of the 100 days you'll know 500 cards at 90% retention ... which means you will probably fail if you get the other 500 cards on the test
If you have 70% retention on 1000, well, you might have more trouble recalling them on the test, but at least you covered them all during training
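The arithmetic behind that comparison, as a back-of-the-envelope sketch. It assumes that cards never introduced are always answered wrong and that retention on test day equals the desired retention; both are simplifications, and the function name is made up.

```python
def expected_score(cards_covered, total_cards, retention):
    """Expected fraction of exam questions answered correctly."""
    return (cards_covered / total_cards) * retention


if __name__ == "__main__":
    # 5 new/day at DR=90% for 100 days: 500 of 1000 cards covered.
    print(round(expected_score(500, 1000, 0.90), 2))
    # 10 new/day at DR=70% for 100 days: all 1000 cards covered.
    print(round(expected_score(1000, 1000, 0.70), 2))
```

Under these assumptions the first plan lands around 45% expected score and the second around 70%, so only the second clears a 60% pass mark on average.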
But since going from 80% DR to 90% will multiply your workload by 2.4
Are you saying this based on interval lengths?
You can't get a 100% accurate number that way, we wouldn't need simulations and CMRR if it was just "longer intervals = less work" all the way down to 1% DR
Ok not the same but is it wrong to say "longer intervals = less work" ? The CMRR is not interested really in "less work" but "more optimal work", which is sum(R)/load if I'm not mistaken, so CMRR is not that simple, but isn't "less DR less work" really false ?
with 90% dr
my avg card retention is like this
is there a way to see retention per rep
not per interval
1st rep 2nd etc
learning retention is like 70%
then young gets to 80%
and mature is 92%
@quasi shadow thoughts on having a "I use Hard as "fail"" toggle right next to "Optimize"?
Algorithmically, it would mean that Hard=0 if it's enabled. The only pain in the ass is that we would have to also include an alternative version of FSRS, one where the formulas are slightly different: the current main SInc formula that is used for Hard, Good and Easy would only be used for Good and Easy, not for Hard. Instead, the PLS formula would be used for Hard. hard penalty parameter would be repurposed to be used in the PLS formula instead.
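On the labeling side only, "Hard=0" would amount to something like the sketch below. It uses Anki's button numbering (1 = Again, 2 = Hard, 3 = Good, 4 = Easy); the function and flag names are hypothetical, and the changes to the SInc and PLS formulas described above are not shown.

```python
def recall_label(rating, hard_is_fail=False):
    """Map an answer button to the 0/1 recall label used for training.

    rating: 1 = Again, 2 = Hard, 3 = Good, 4 = Easy.
    hard_is_fail: hypothetical toggle, treat Hard as a failed recall.
    """
    if rating not in (1, 2, 3, 4):
        raise ValueError(f"unknown rating: {rating}")
    if rating == 1:
        return 0
    if rating == 2 and hard_is_fail:
        return 0
    return 1


if __name__ == "__main__":
    print([recall_label(r) for r in (1, 2, 3, 4)])
    print([recall_label(r, hard_is_fail=True) for r in (1, 2, 3, 4)])
```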
we could limit the possible values like > 0.5 retention only
also if we use trainable decay we can add a loss function to bias the decay towards higher values
is there a reason why ‘hard’ messes up the FSRS thingy? Should I just not click ‘hard’ at all if I have FSRS enabled?
They are talking about hard misuse. It's when somebody forgot a card and instead of "again" they click "hard". That messes with the algorithm. If you use hard as intended (I remembered it, but it was really hard) then it doesn't mess up anything.
but how does the algorithm know if you’ve misused hard?
It doesn't, that's the problem they are trying to solve above with the "I use Hard as "fail"" toggle. Expertium is making the case for adding a toggle that will treat hard as again
ohh alrlr
could you argue that clicking hard when FSRS is enabled, regardless if it was actually hard or you just forgot, has no definite change in the algorithm? what i mean is that when clicking again or easy, the algorithm knows how to change your review intervals while when clicking hard, its random and may result in an unreasonable review interval.
Manual says again is failing grade, and hard/good/easy are passing grades.
And now potentially there will be option for users to define what is passing/failing grades.
It feels wrong.
Also more stuff for users to get confused about.
If we support this case, everything related to retention should support it...
we should work on auto-detecting hard misuse. I think @cosmic hedge has some working ideas on this
FSRS Optimizer Package. Contribute to Luc-Mcgrady/fsrs-optimizer development by creating an account on GitHub.
I hope its effective at least XD
should I pr it?
BTW with @cosmic hedge we're also looking at helping detect leeches by checking the ratio of "Lapse/Day" since the card was last introduced/reintroduced.
For example, if you lapsed 10 times in the last 100 days, you have a ratio of 10%. The higher the ratio, the worse it is.
Could help when you've been using Anki for a long time, so a lapse count of 27 for a card that was added 360 days ago is not that bad compared to 10 lapses over 50 days
JS info :36413 Lapse Ratio: 30, Card ID: 1736119027215, Lapses: 3, Introduced Duration: 10
JS info :36413 Lapse Ratio: 27, Card ID: 1736364326036, Lapses: 3, Introduced Duration: 11
JS info :36413 Lapse Ratio: 33, Card ID: 1736623228139, Lapses: 2, Introduced Duration: 6
JS info :36413 Lapse Ratio: 27, Card ID: 1736623425408, Lapses: 3, Introduced Duration: 11
JS info :36413 Lapse Ratio: 60, Card ID: 1737490967897, Lapses: 3, Introduced Duration: 5
JS info :36413 Lapse Ratio: 25, Card ID: 1738269963453, Lapses: 1, Introduced Duration: 4
JS info :36413 Lapse Ratio: 50, Card ID: 1739465889403, Lapses: 1, Introduced Duration: 2
JS info :36413 Lapse Ratio: 42, Card ID: 1740067012962, Lapses: 3, Introduced Duration: 7
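A minimal sketch of the ratio as it appears in the log lines above. The function name is made up, and the truncation to a whole percent is inferred from the log (3 lapses over 7 days shows as 42, not 43); the actual add-on code may differ.

```python
def lapse_ratio(lapses, days_since_introduced):
    """Lapses per day since (re)introduction, as a whole percent.

    Truncates rather than rounds, which is an inference from the
    logged values, not a confirmed detail of the add-on.
    """
    if days_since_introduced <= 0:
        return 0  # assumption: avoid dividing by zero on day 0
    return 100 * lapses // days_since_introduced


# (lapses, introduced duration) pairs taken from the log above
for lapses, days in [(3, 10), (3, 11), (2, 6), (3, 5), (1, 4), (1, 2), (3, 7)]:
    print(lapses, days, lapse_ratio(lapses, days))
```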
that would be useful! I have a few cards that I failed a lot when I began to learn the language and had to get used to everything including the writing system. Now I'd like them to be like any other cards .. but because of the leech threshold they're not allowed to fail ever again..
there you go https://github.com/open-spaced-repetition/fsrs-optimizer/pull/173 hope it's not awful somehow 😭
The idea goes as follows (correct me if I'm wrong):
w[15] is a measure of the degree to which hard differs from good.
If the user is abusing hard, handling hard reviews with stability_afte...
Same! Over time I'm also leaning towards resetting some cards, not many, but some. The workflow I started yesterday is:
- Deck "Low Difficulty" : Leech Threshold at 7, if it reaches it, just tag them. I'll move them in the "Deck Hard Difficulty" once per week
- Deck "High Difficulty" : Reset Leech Tag, and set Threshold at 14. If it reaches it, I just suspend this time.
Once in a while, I check whether High Difficulty has cards with high stability and no failures for quite some time; if yes, I reset them and move them to Low Difficulty again
Yesterday I did the split, today I had 185 reviews instead of 300, did 90% retention instead of 86%, and I reset ~60 cards more or less
I don't like to fiddle that much with the cards, the main reason I prefer FSRS over the old algorithm is because you don't have to set your own steps and care about the infamous "ease hell".. moving cards around in decks to trick the algorithm sounds like there is a problem with the algorithm in the first place.
My idea is that the problem comes from the leech detection algorithm, and not the FSRS algorithm, so I was thinking about stopping to use the Anki leech tool, which is too naive, and coding my own.
https://github.com/rbrownwsws/leechkit there was recently some work on a new method of detecting leeches
and I can only imagine all this extra discourse lately surrounding leeches is @unique salmon making sockpuppet accounts to convince me that people actually want this feature so I will actually implement it in anki proper. but that is a level of commitment that I think I can no longer ignore and I guess I'll look into it for real 🍃
I would certainly make sock puppet accounts if that convinced Dae to implement automatic optimization
this is just the test run
I am more easily manipulated, apparently
easy mode, if you will
Imagine if some other app that uses FSRS got automatic optimization before Anki
Like...IMAGINE
I mean they probably already do because they hide away all knobs
I would shit myself. Or call Dae a motherflipper. Or both.
Remnote
Some random "literally who" apps use FSRS
IIRC Remnote still uses FSRS-4.5 though
do you still have to optimize in it manually?
if you want auto-optimize you gotta make optimizing faster 🍃
make it a 1second background task
I thought the main issue was sync clashes.
I think it is, yes
I'm not a puppet! I swear!
can someone actually explain sync clashes and why its a problem
on the surface it sounds like a total bullshit answer
Optimising params changes the memory state of every card.
Dae ¯ \(ツ)/¯
does sync not work on a per-attr level?
I've never actually looked at the sync code.
I have not looked too deeply at it
so is the idea that some reviews happen on device A, optimization on device B, A's reviews could disappear cause B took priority in the sync?
I think that's it, and like for other possible sync clashes, the app asks you to make a choice
also I should probably read how fsrs even works and why it needs to mod every card
I guess something along those lines. It might be something that requires a change in how syncs work, which could be hell with all the people who do not immediately use the latest version of the app.
If you change params, values of difficulty, stability and probability of recall have to be recalculated
mods, check his ip
surely those can be derived on the fly 🍃
If you mean "fast", then yes
It's faster than optimization itself...I think
Unless Anki works in some extremely janky way
I mean rather than store something on the disk that needs to be synced, calculate them every time they're needed
Theoretically it should be faster
Then they will have to be recalculated after every review, which means that cards with a long review history would be laggy
how laggy
You need the entire revlog for the card (at least until the last reset). I don't know if that would cause a noticeable delay.
is 1ms laggy?
Cause in order to calculate intervals you need the entire review history
So you go over review 1, review 2, review 3, etc.
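Roughly what "go over review 1, review 2, review 3" looks like as code. This is only a sketch of the replay idea: `toy_step` is a placeholder update rule I made up, NOT the real FSRS stability/difficulty formulas, and `MemoryState` is a hypothetical stand-in for whatever Anki stores.

```python
from functools import reduce
from typing import NamedTuple


class MemoryState(NamedTuple):
    stability: float
    difficulty: float


def toy_step(state, rating):
    """Placeholder update rule, NOT the real FSRS formulas."""
    if rating == 1:  # Again
        return MemoryState(max(0.1, state.stability * 0.5),
                           min(10.0, state.difficulty + 1.0))
    # Hard / Good / Easy all treated the same in this toy rule
    return MemoryState(state.stability * 2.0,
                       max(1.0, state.difficulty - 0.1))


def replay(ratings, initial=MemoryState(1.0, 5.0)):
    """Fold the whole review history into one memory state."""
    return reduce(toy_step, ratings, initial)


print(replay([3, 3, 1]))
```

The cost is linear in the number of reviews on the card, which is why long histories are the case being worried about.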
how many reviews would a card need for there to be a notable delay
Idk
my point is it might be slower but it also might be 1ns vs 1ms
aka completely irrelevant
mods, check his ip
not his, her 🙂
brb digging in anki to find where this shit is mathed out
(don't ask me where I dug this up)
So I guess the delay wouldn't be bad with Rust
Microseconds. In Rust
No, wait, milliseconds
I'm looking at the last one
ms is milliseconds
yes it is
also it'll be worse on phones
in that 10k dataset, how many cards are there with 1000 reviews though
and like, maybe this code is optimizable
do you have the code setup for that benchmark
It might be 10-20% faster now, IIRC Jarrett did some optimizations
Nope
rip
I don't even remember where this is from
well you saved me from wasting a day doing optimization attempts
but I think if this could be fast on shitty phones, it could possibly remove the sync issues and get fsrs autooptimize? I might tug on this thread a bit this weekend
But maybe there's a middle ground between calculating on the fly and preventing smooth sync when the params changed? Maybe the new review could be merged without losing the new params?
Hold on, let me find an actual proposal that Dae rejected 🤣
yeah I'm curious what he vetoed
I think there is also the problem that we now have Anki clients that expect the FSRS memory state to be in the card info. You cannot just drop that because it would break all the old clients.
I mean its as much a breaking change as any other update?
yeah sync sounds like a real difficult way to go
The sync tools seem to be very basic today: some changes on a note/card could be merged without losing data, but when in doubt the sync is rejected. I like that the app is very cautious, but it's also frustrating sometimes.
hence the plan to just make there be nothing to sync
(old clients can still sync it if they want it, but new clients won't bother!)
I think its a pretty safe change re: backwards compat?
I might write up an issue on the repo and see what dae thinks on the matter
(I might need some hard numbers about perf first though)
calculating things on the fly instead of fixing the issue with sync doesn't seem like a good idea to me, for the project as a whole
and for clarification: pub(crate) memory_state: Option<FsrsMemoryState>, is the thing that fsrs recalculates on optimize?
But if the new clients change the params and don't update the card info the old clients will go wrong. They will keep scheduling based on the stale memory state in the card info.
sync is a mess and literally no one is going to fix it ever
Btw, the actual times will be longer because of the overhead of loading the review history from the hard drive
I think the big problem with sync is that the AnkiWeb stuff is all closed source so updating it relies on Dae having the time to do it himself.
This is just pure "math time", without counting the time it took to load the review history
oh but.. theoretically, one could have their own sync server, right? there's a "Self-hosted sync server" option in the computer app.
There is, but it isn't the full blown AnkiWeb. I don't know if you can just update the sync logic in the publicly available code or if Dae would have to do more work in the closed source part of AnkiWeb.
I don't fully understand the problem yet. A user could press the optimize button after each and every review and there wouldn't be a sync issue. But can't we just simulate this effect by doing an optimization automatically in the background after every review? Where in this process does the sync issue appear?
If AO is done on a second device
- Device A: do reviews, optimise, don't sync
- Device B: do reviews, optimise, sync
- Device A: sync
so from what I can tell this isn't an AO issue, this is an O issue
it's perfectly legal for a user to go out of their way to press the optimize button after every review
Yes, but AO makes it a problem that will occur much more frequently.
then however this is currently solved for O, we can just do the same thing for AO
surely the mechanisms already exist, since this problem already exists
🤷♂️
Asked GPT to summarize since my last message
Asked if Jake was nice
"Total bullshit answer"
- Maybe a bit blunt
lol
this will certainly suck for the two weeks the mobile versions are behind, but people can also just wait for everything to be updated, it doesn't take long anymore
thats hardly a reason to not do this though
so in my r&d of this idea I did find one roadblock: rescheduling cards when fsrs-ing
how would an automatic fsrs optimization handle that case?
cause I'm pretty sure my proposed system would not work when that is desired (and that is most certainly going to cause a syncing mess)
I think ideally you would mark cards as either automatically or manually scheduled.
Automatic cards don't cause sync conflicts, you just re-optimise and recalculate due dates after you merge the differing revlogs.
I think the hard part is still playing nice with old clients and updating the sync protocol in a way that is not annoying to Dae.
It might also be nice to have a way of doing more granular resolution of sync conflicts than just choose local or remote, but that is another feature entirely.
I think based on dae's response about sync that anything touching sync is by definition a shitshow and so anything that requires a change to that is not an option
https://github.com/ankitects/anki/blob/main/rslib/src/scheduler/answering/mod.rs#L439-L452 turns out the logic is already here! just only triggered in cases where the values aren't already there for some reason
I can't wait for @quasi shadow to show up and tell me I'm very dumb and wrong in my entire approach here
Yep. It took 1.5 years for three of my colleagues to refactor the backup module to a real-time sync system...
Yeah, it is.
We are using Operational Transformation for sync.
sounds like my idea isn't totally stupid then? 🍃
Of course not. I also suggested it long time ago.
I forget it. I'm searching it😂
Sorry, I'm lost in the message records...
The user is potentially switching back and forth between apps (think copy+paste to create cards). If the one month period elapses at an inconvenient time, the user may end up having to wait for the optimization to finish, or will need to repeatedly cancel it each time they switch back. That benchmark ignores the I/O costs of reading that data ...
But I find this reply.
i/o complaint
I want to know what sort of storage medium he uses where reads on the order of a kilobyte cause problems
is he using s3fs for his anki decks? like thats the only reason to think about io speeds in this context
I still have a fuckin magnetic spinning disk in this machine, I should benchmark on that to show how stupid it is here
does anyone have a deck laying around that has cards with an absurd number of reviews?
I don't know the strats for generating absurd edge case decks
In my benchmark, reading the entire collection and convert them into fsrs items only takes 30ms.
not surprising!
actually I also have another spinning disk in here thats on its last legs and basically dead, fails reads half the time
I should benchmark on THAT too
running 1 test
Time taken: 205.6825ms
[src/convertor_tests.rs:329:5] revlogs.len() = 88158
but are you including the disk read? 🍃
are you on one of those newfangled nvmes? I need to optimize for 90s tech
Of course.
I think 2010s is enough.
Because some people still use AnkiMobile on an iPhone 4s.
my joke was any ssd should be fine in terms of speed for the sheer....lack of volume of reading of data that occurs
wait why the fuck does any of this matter, you only need this calculated at the end. you can just precalculate these numbers after rendering the front, and then anything under 100ms is acceptable
you said it was refactored, but I guess you're talking about another project?
I was talking about my formal work.
It's a test of the improved same-day S formula with six learning steps.
In previous version, the stability goes crazy😅
whats your strat for making so many reviews so quickly, I need something like that to benchmark disk speeds
Create a test deck and a test card.
Then you can make so many reviews so quickly via rating the card with again and easy
oh nice
Oops. I guess storing the memory state in Card is still necessary if we want to display the S/D variables in card browser and stats.
can it not just be calculated on the fly there too?
browser maybe not
that'll actually be expensive
I assume those are values that weirdos like @bold terrace would want to query
@bold terrace thanks for killing auto-optimize 🍃
You will feel the lag significantly with 10k cards.
rip
modified from https://github.com/rbrownwsws/leechkit
patch: fsrs4anki-helper.ankiaddon.zip
You can feel the lag with this PR.
But it could be improved.
oh well
at least I can continue my streak of not being helpful
I was awfully close to doing something useful there
Yeaaah not being able to query them would be a bit sad. I think even the FSRS plugin needs it, no? Or maybe it recomputes it live each time?
There's always the possibility of creating graphs based on recomputation of those, but sometimes you get different results from the FSRS forgetting curve function if you use the JavaScript one instead of the Rust one etc., so it would definitely be a bit sad
I also don't know why they can't just be normal values in the database instead of being json-sneakily-stored
This is why we can't have nice things
gonna separate a deck out by difficulty
lets see what happens
@bold terrace
here was before, in all of the deck
then i divided by 10% 4 times, so basically into 4
these are the parameters before, and now i optimized them
60-70% difficulty - .0411, 4.20% rmse, 4067 reviews --> .0069, .57%
70-80%- .2417, 7.95%, 4803 reviews --> .2174, 6.39%
80-90%- .4618, 11.33% -11,681 reviews --> .3856, 5.83%
90-100 %- .6039, 16.00% - 5,130 reviews --> .4781, 8.47%
future due looks like this
woah life changer
but i truly do wonder if i have messed something up
oh is that the actual sync problem?
and this is after
the average stability went from 19 days to 8.2 months?
yes
i wonder if it shot the reviews out so far because maybe its comparing them to each other
what is the average stability of the 60-70% difficulty subdeck?
If anybody wants better D formulas, go kick Alex and Jarrett in their butts
70-80 turns into 1.7 months, 77.72% - 92.97% - 88.42%
80-90 turns into 8 days, 78.55%, - 86.04% - 55.13 % (78 reviews)
90-100 turns into 3 days, 78.86%,- 79.66% (3800 reviews) - 44.44% (9 reviews)
d formulas are your job
i guess it makes sense, if you make a subdeck with only cards that only pass then you would expect the stability to be very high like the 2 years that you got
should i keep them like this
and review them like this from now on?
cause like looking at the first 1 with 2 years stability
its 2754 reviews with 99.92% young answers
and 428 reviews with 100% mature answer
you can check to see if your method actually generalizes to new cards or not by doing something like:
for each new subdeck, only optimize the params on the first 4/5 of the deck by splitting into two new subdecks with the first 4/5 and the last 1/5 respectively
then evaluate the params on the last 1/5
My job is as follows:
- Make a non-trivial (more than changing one number in one line) change to the benchmark code
- Get an untraceable error like "Something broke. No, I will not tell you the variables and functions involved. Just something broke somewhere"
- Complain that the benchmarking code is dogshit
- Ask Jarrett to implement my idea
first 4/5, what do you mean by this
by which sort
to sort them by difficulty, first review etc?
@robust hill don't worry about it, Alex just wants to do complicated things that are extremely cumbersome to do in Anki
can you write a formal spec on the new leech detection stuff so it can be posted as an issue on the repo dae can reject it?
I wrote a comment in the PR for the Helper add-on, we'll test it there
legit I think the prop:leech-p thing is the best part of the idea
is that leech to review ratio
thats the percent confidence that its a leech
o
the problem is in the way FSRS parameters are evaluated in Anki. The parameters are evaluated on the very same data that it was trained on. i can easily come up with an "algorithm" that achieves nearly 0 on both log loss and rmse by just doing a database lookup
so to properly evaluate FSRS parameters you need to manually evaluate parameters on data that FSRS wasn't trained on.
yes
so how can i do that
You can't
gg
It's not implemented yet
you can do it manually
- make a subdeck A with the first 4/5 of reviews and another subdeck B with the last 1/5 review
- optimize on subdeck A
- copy the parameters from step 2 to subdeck B
- press evaluate on subdeck B
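The same 4/5 vs 1/5 idea as a sketch outside Anki. It assumes "first" means chronological order (the thread doesn't settle which sort to use) and that each review is a dict with a `timestamp` key, which is a made-up layout rather than Anki's revlog schema.

```python
def split_reviews(reviews, train_fraction=0.8):
    """Split reviews so the test part is strictly later than the train part."""
    ordered = sorted(reviews, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Ten fake reviews, deliberately out of order
reviews = [{"timestamp": t, "rating": 3} for t in (5, 1, 9, 3, 7, 2, 8, 4, 10, 6)]
train, test = split_reviews(reviews)

# Optimize on `train`, then evaluate the resulting parameters on `test` only
print(len(train), len(test))
```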
what if i set the due date of all the cards to today and do them out and see what the interval says
First Jarrett needs to implement it in the Helper add-on the way I described in the
Then we can show it to Dae
Then: https://youtu.be/VyQOnn2K_zw
yes but what do you mean first 4/5
is there a way i should sort them
or just by whatever
yeah idk about that actually i don't know how anki stores its data
okay ima just do some random
okay so card stability of the 4/5 = 2.33 years 10% avg diff
and 1/5 becomes 8.8 months 7% diff
on 4/5 deck its .007 logloss, .59%
on 1/5 0.3579, 10.83%
actually this method is very hard to test properly because when you made the 60%-70% difficulty split you are already sending back information from the future in the way that you are constructing the deck. You know already that these cards will have low lapses
i think just forget about this idea
probably not
lemme try by dividing the deck into 2 sections with half of the cards
.4083, 4.64% - 7 days, 71%d 89% r, - before opt- 13 days avg interval, after - 8 days
.1395, 3.60% - 6.7 months, 64%d, 98%r - before opt - 1 month, after - 6.2 months
stability matches the interval when i split them apart
i have the ultimate geography deck but only for
name of country to the flag shown
i will review this deck for the next month and see how it goes, but month is probably not long enough
@spring adder i summon you
but how bro how 100% mature correct for 25% of the deck
i think you all think im crazy
The problem with the 4-way split is that cards won't easily move between them, and you'll have to find ways to manage this yourself
With two you can still have heuristics to help you (lapse etc)
But with 4 it feels a bit tedious no ?
Where to gather new cards, when to move them up, or down
The original motivation of the split was for all the people whose parameters make a card go into a new "Higher D" cluster after every lapse and never be able to come back
But your Difficulty in the old deck feels like it was allowing that
so you didn't really have something to fix
Of course, you get better params for each partition, but how to know in which partition new cards should go ? And how will they move between ?
In my case, I had two spikes, one at 60% D, one at 95% D; when something failed 2-3 times, it would go into the 95% D cluster. Which led my FSRS to really have a single equation trying to describe 2 very different clusters by modulating D
But in your case (here)... it feels your cards were distributed quite nicely.
This is with my old params: a single fail would take FOREVER to be compensated by Goods
With the new ones, ~10 would be sufficient
I think the problem with D is that, in an ideal setting, it would smoothly modulate your higher and lower difficulty cards so it's able to represent them both, and represent how successfully rating one would lead you to lower D again. I think your old params were able to give you that situation.
In my situation, D evolved more like a "binary selector" where you switch from low to high D super quickly, and with enough amplitude between its starting and ending value that it can drastically change the prediction, emulating 2-functions-in-1
By Splitting the deck, I basically allowed D to play again a smoother role 🙂
But I'm not sure it's really what we see with you
What we do when we split is, in some way, first running FSRS to do a clustering based on D, and then optimizing on each cluster. It will give better parameters (as long as the clustering has logic, reduced entropy) because now instead of having ~18 params to describe the collection, it has N * 18, with N the number of clusters you created.
Problem is, in my case, the clustering is quite obvious: cards with 0-2 leeches vs the others. So I can play the first layer myself. With your params, I'm not sure in the long run you'll be able to keep those clusters "different enough" so the params will still be able to yield better predictions
That's also why neural networks are a godsend for that kind of application: you can create a shitton of parameters so the model can do that for you in one go, without having to manually conceptualize what S, D, R are
(Well you can still define them, but then you just feed the model with them as input and the model, if they can find relation between Stability and what you want for example, it will)
(I don't know @polar maple for your tests, if you just feed what you got from Anki or if you did some Feature Engineering to define more input variables ?)
You could even imagine using FSRS output parameters as engineered feature (Or thing like "Max Stability in history", "Max Good Answers Streak", etc)
Thank you for this
I will read tomorrow
I completely forgot about one thing but good that you reminded me
About the moving them
Honestly, how dare you.
I'm not a big thinker, but...
- Your metrics seem to show that FSRS gets better at predicting if you do this split.
- The massive growth and shrinking in stability per new split along with the improved metrics suggest that the cards you've split are very different from each other and should be split.
- Like Sound says, this seems cumbersome to maintain*.
I generally prefer splitting on card type (TL -> NL vs NL -> TL; image -> name vs name -> description) because it's easy to do automatically, but splitting on difficulty is something that I'm also toying around with.
I noticed that a lot of the cards I have are things I'd like to keep (because I'll probably forget them one day), but some of them are very, very easy and, at least in the short term, I don't get them wrong. So I've split one of my decks into "normal" and "super easy" subdecks. All of my cards are introduced through the "normal deck" and every now and then I look for cards there that I introduced more than 100 days ago and have never pressed again on.
-introduced:100 -is:new -rated:9999:1 ||was not introduced in the last 100 days AND is not new (so was introduced at some point) AND was not rated again in the past 9999 days||
Those go into a new "super easy" deck. This has generally, I feel (and by the metrics), made FSRS better at scheduling the cards and my retention in the "normal deck" has actually improved despite pulling out the 98+% retention cards into a separate deck.
But I'm probably in the wrong for even having cards that I never get wrong, lmao.
--
* The least cumbersome way I could see to retain this split would be...
-1. Introduce all of your new cards into your hardest (or most populous?) subdeck for initial study.
Why hardest subdeck? ||If you put them in the hardest deck, you'll study easy cards a little more often than you'd like before you shunt them off to the subdeck, but hard cards will get reviewed optimally, which is probably more important.||
Why biggest subdeck? ||If you put them in the most populous deck, you increase the percentage of cards that are already in the right place to start with, and reduce the effect the "temporary" cards have on your parameters for that deck, but that's only a problem if you optimize more frequently than you split.||
Once a month
- Move all of your cards back into the parent deck.
- Optimize as one to get updated D.
- Send them to subdecks again based on their new D.
- Reoptimize to get new parameters for subdecks with new members.
Should take like five minutes or less, especially if you're smart and save the queries in Anki
Or, if you can identify specific types of cards that you rarely get wrong, just preemptively shunt those off into the easy deck.
--
Finally, the super long long intervals on some cards might be concerning. If you're using this for academic study or something, you should probably use a filtered deck before your test to cram those to (A) see if the interval prediction is even close to right (do you actually get almost all of them right) and (B) you know be safe or whatever.
--
Finally finally, it seems like you and I are both fucked up because we have a large number of cards that we almost never get wrong.
In my case, those cards seemed to be messing with FSRS' ability to properly predict stabilities for the cards I was getting wrong, and things have subjectively (and by metrics) been better since I pulled those out into a separate deck. (I like @bold terrace's explanation of this above with the N^18 parameters instead of just 18 parameters.)
So you can take some amount of heart that what you're doing isn't the worst idea, maybe. We can hope.
Your metrics seem to show that FSRS gets better at predicting if you do this split.
We can't be sure with the way Anki implements Evaluate. Since training set = test set, it may just be overfitting. Without implementing the procedure from the benchmark, we can't rule out overfitting
Remember kids, never test on the same data that the model was trained on
Though, even then there is another concern - RMSE being correlated with N reviews, so splitting decks can easily make RMSE worse
I like @Sound's explanation of this above with the N^18 parameters instead of just 18 parameters.
ACHCKUALLY 🤓 N*18, not N^18
But whatever, without a proper evaluation procedure this is all moot anyway
By "proper" I mean "no overlap between training data and testing data"
If both (A) reducing the number of reviews generally worsens RMSE and (B) splitting the decks resulted in better RMSE across the board in this specific case
then, while it's not definitive, doesn't that actually increase the confidence that it's doing something useful, because the expected negative (reduced reviews = worse RMSE) was countered, despite limitations in the current Evaluate?
let's add an advanced advanced tab under FSRS with the real evaluate button 🤔 
in this case the RMSE can fall due to a similar phenomenon as the following procedure:
We have a biased coin and we want to estimate its proportion of heads. Suppose you have results from several coin flips [0, 0, 1, 0, 1, 1, 1, 0, 0]. A good estimate for the proportion of heads is roughly p = 0.5, which gives a non-zero MSE. But similar to your Anki procedure, you decide to gather the 0s into one subgroup and the 1s into another subgroup. Then you would just predict p = 0 or p = 1 for the respective subgroup and get an MSE of 0.
this is related to D since D is highly correlated to the lapse count and also to the proportion of reviews to lapses. And as dyzur saw with their deck, in the lowest difficulty subdeck they have pretty much no lapses on young and mature cards.
so RMSE should be expected to fall if you split a deck by D but it remains to be seen if doing so will actually help the subdecks to generalize on new unseen cards
Actually the above example isn't completely accurate since RMSE (bins) has a different calculation than just MSE. RMSE (bins) in the p=0.5 case would be 0. But log loss would still have the wrong behavior and log loss is the superior metric
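The coin example in numbers, as a minimal sketch. It uses plain per-flip MSE and log loss (not Anki's RMSE (bins)), and the `eps` clipping is my own addition so that log(0) never happens.

```python
import math


def mse(outcomes, p):
    return sum((y - p) ** 2 for y in outcomes) / len(outcomes)


def log_loss(outcomes, p, eps=1e-7):
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y in outcomes) / len(outcomes)


flips = [0, 0, 1, 0, 1, 1, 1, 0, 0]
zeros = [y for y in flips if y == 0]
ones = [y for y in flips if y == 1]

# One prediction for everything
pooled_mse = mse(flips, 0.5)
pooled_ll = log_loss(flips, 0.5)

# "Split by outcome": each subgroup gets its own prediction
split_mse = (len(zeros) * mse(zeros, 0.0) + len(ones) * mse(ones, 1.0)) / len(flips)
split_ll = (len(zeros) * log_loss(zeros, 0.0) + len(ones) * log_loss(ones, 1.0)) / len(flips)

print(pooled_mse, split_mse, pooled_ll, split_ll)
```

Both metrics collapse to about zero after the split, even though nothing was learned about the next flip.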
Are there any updates about the leech detector
I feel like Hayao Miyazaki. I always say I'm going to retire, but it never actually happens.😭
well i think
i wouldnt move them anymore
i would just keep them how they are
and i have no more new cards
so
this seems like a good solution
over the break I have, I will rework my Language learning deck
to divide this, into just
Vocabulary
-Speaking/Writing
-Reading/Listening
afaik the smallest syncable unit is a table row, i.e a card, revlog entry, a single tag and an entire preset
what do you mean by 0-2 leeches?
like the card gets leech tagged 2 times?
how in anki can i start with
greater than or equal to
and
less than or equal to
=>
=<
or i cant
prop:lapses>7 by lapses
what if I divide the decks
thats kinda what i understood from you
its not so hard for me to do this every day
all cards = 77% difficulty
0 lapses = 66% difficulty - 431 cards - 37.3% of deck - 100% true retention
1 lapse = 78% - 244 cards - 21% of deck - 92% true retention
2 lapses = 83% - 133 cards - 11.5% - 88%
3 lapses = 84% - 123 cards - 10.6% - 86%
4 lapses = 86% - 77 cards - 6% - 84%
5 lapses = 87% - 47 cards - 4% - 82.5%
6 lapses = 87% - 19 cards - 1.6% - 83.2%
7 lapses = 87% - 19 cards - 1.6% - 81%
and then after 8 lapses = 99% for all of them - 44 cards - 3.8% - 78%
for now i have 3
if a card has 0-2 lapses, its in the normal deck, if it has 3-7 its in the leech deck and more than 8 = super leech deck
and as obezag said, in around 2 months i will go through all of them to see how this works out
i did this and, it cut my workload in half for the 0-2 lapse cards, but 2.5x the 3-7 lapses, which makes sense
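A rough sketch of the three-deck split described above, in Python. The cutoffs come from the messages above, but putting exactly 8 lapses into the super leech deck is an assumption (the chat says both "after 8 lapse" and "more than 8"). The search strings use Anki's `prop:lapses` search with `>=` and `<=` (not `=>` / `=<`), where a space between terms means AND. `bucket_for_lapses`, `SEARCHES` and `LAPSE_COUNTS` are names invented for this example.

```python
from collections import Counter


def bucket_for_lapses(lapses: int) -> str:
    """Map a card's lapse count to a deck name (cutoff at 8 is an assumption)."""
    if lapses <= 2:
        return "normal"
    if lapses <= 7:
        return "leech"
    return "super leech"


# Anki browser searches that select each bucket.
SEARCHES = {
    "normal": "prop:lapses<=2",
    "leech": "prop:lapses>=3 prop:lapses<=7",
    "super leech": "prop:lapses>=8",
}

# Card counts per lapse count as posted above; key 8 stands for "8 or more".
LAPSE_COUNTS = {0: 431, 1: 244, 2: 133, 3: 123, 4: 77, 5: 47, 6: 19, 7: 19, 8: 44}

bucket_counts = Counter()
for lapses, n_cards in LAPSE_COUNTS.items():
    bucket_counts[bucket_for_lapses(lapses)] += n_cards

for name, query in SEARCHES.items():
    print(f"{name}: {bucket_counts[name]} cards, search: {query}")
```

Pasting one of the search strings into the browser and moving the results into the matching deck would reproduce the manual daily routine.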
Hey guys, how can I set a weekly easy day schedule for all my presets at once? (I have a lot of presets but I want this weekly easy day schedule applied to all my cards)?
you'll have to go through each preset individually.
that's the only way sadge
I want to ask something. This has been on my mind for quite a while now, possibly ever since I started using Anki 3 years ago.
I know Anki was not originally thought to be a tool you could learn and encode new stuff in your brain with. That was the SM2 era. Then came FSRS v4 - back then up until FSRS v4, I was using my own learning steps, and then finally FSRS v5 came along 6 months ago which takes short term reviews into account, but also comes with the short term scheduler (which has no short term memory model as of writing this)
However I DO use Anki as a tool to learn new stuff for the first time, not just for retaining already learnt stuff. It organizes stuff for me, and makes me keep my learning pace without dwelling too much on things.
How much is it hampering my learning when I use Anki as my primary learning tool
And would FSRS be able to cover for first time learning completely in the future, not saying that it doesnt already, just probably not optimally for the lack of an actual short term model
Well, the next version will have one new parameter for that + the "set params to 0" thing will be changed so that instead the last params just have specific maximum limits that vary from one user to another. So your situation will improve.
As for a proper short-term memory model, idk, maybe at some point in the future
I do actually use Anki to learn things from scratch. I mean, I mine them in real settings, but in general when I see them for the first time in Anki I have, most of the time, 0 chance of recognizing them, and thus when I do, of course, it will be longer (Initial Stability: A:0.4, H:1.4, G:3.3, E:33).
In my opinion, and to directly criticize what I saw with you, you seem to brute-force reps in Anki, and from what I remember you have an average time per review of 1-2 sec. With such a habit, you're basically completely destroying your Stability, and no amount of "Good Algorithm" will compensate for your lack of "brain effort" to learn/encode it.
Don't get me wrong, I don't blame you for the sake of blaming, I also have done the same error in the past, blasting through reps like "FSRS will adapt, same-day review will fix it for me", but what it led me to is to fail 30 times the same card for ~5 days. Only when I started taking actual time to look at those, trying to understand those, that suddenly "reps" become more meaningful.
That's also why I'm pretty convinced now that failing is "NOT OK". By not OK, I mean sure it's inevitable, if you really want to never fail you'd be reviewing every word every day. But forgetting might often mean dropping stability potentially by a lot and also having Difficulty increasing, Difficulty that might never go back. So personally, I'm really really not convinced that dropping DR lower than 90%, is a smart move, if you care about Stability.
Having said that, I still think SM-2 has a lot of advantages over FSRS in that sense: each card has its own ease factor, and each time you fail you start completely over, reinforcing that idea "I want to have longer and longer cycles, and if not, I'll start a new cycle a bit slower". It's for sure "less optimal" than FSRS and less accurate in terms of retention prediction, but it has the big big big advantage of making learning predictable. With FSRS, if you keep doing 90% R for your card, your stability will just converge to some value after declining, since every lapse will increase D. Playing with the visualiser is a very easy way to see it.
Still, I'm using FSRS and not SM2 because I think that if you find ways/counter-measure to compensate for that lack of card-by-card customization (like splitting deck in hard and easy one, keeping a high DR instead of being tempted to do a 60% DR for the sake of optimal memorized/workflow), FSRS accuracy is still a fucking god sent for predictable workload.
IMO, the next big steps to improve Anki would be to stop chasing the perfect logloss/RMSE but building tools to find optimal ways to order cards to learn, based on some N+1 principles (For example, finding cards very similar to the one you have, to be able to add as much new card/day as possible without screwing everything, etc etc)
yeah honestly fair enough
if i actually sat down for a second, maybe hand write my leeches out a few times
i could eliminate that
lack of card-by-card customization
I have no clue what you mean, considering that each card has its own review history (interval lengths and grades), and FSRS is using all of that
he says it there in the parentheses
@quasi shadow so is the idea with neural D dead?
Well, less of a "dead" and more of a "never began"
Since nobody wants to fucking do it
you can try to do it yourself
Can you at least make a combined revlog from 20-100 users?
just look at pretrain.py, it concatenates revlogs from multiple users
😭
this reassures me a bit. Thank you🙏
- It uses all the cards as a whole to build only one set of parameters that will be used for all cards, which means outliers will be sacrificed "for the greater good". So if you don't exclude those outliers, you'll either poison the model, or those outliers might become mega-leeches. With SM2, the ease hell will be confined to some cards.
- This could be mitigated if multiple parameters would be generated for clusters of cards. A good clustering would be on D for example.
But forgetting might often mean dropping stability potentially by a lot and also having Difficulty increasing, Difficulty that might never go back. So personally, I'm really really not convinced that dropping DR lower than 90%, is a smart move, if you care about Stability.
Yes I think I have fallen in that trap already, but I dont think that is because of a mistake of mine necessarily. My D has converged to some high value that is very tough to move because of this.
For some cards, I have 240 reviews just in their learning phase, i can't get past the >1d mark for them!! The leech of all leeches
Yeah but you see, if you can't remember something more than 1d ... It really means you don't spend enough time with the thing you really want to learn, or you just put too much of them
My deck looks like this rn in terms of difficulty. But that doesnt mean I wasnt trying to encode the cards in different ways. i tried mnemonics (imagery, mnemonics, acrostics) you name it. To the point I am not trying anymore. I just try to brute force it out of despair
Stop recording VTuber drawing dicks and you'll be able to do it 😄
I am trying to spend more time with my card. Not just pattern recognition if that is the word you are looking for. I have fallen into that trap before and I still do from time to time
Have you tried to learn outside Anki ?
I mean, I really have trouble imagining someone not being able to really learn some stuff for more than 1d
Maybe it's the format that is too "in a vacuum" ?
People always advocate how cards should be atomic, but if you can't relate anything between them, you have less "links" in your brain to actually retain them
My pace is very erratic outside of Anki. I have enormous ADHD, Anki helps to keep me in line to the point I have grown dependent on Anki and cannot learn outside of it
Ok ok ...
It is atomic enough. If you call one word answers not atomic then I do not know what atomic even is 😓
Yeah sure but I'm saying sometimes having atomic things to remember is the culprit
Sometimes it's easier to remember things by connecting them together
Not in Anki for sure
Sometimes (or maybe more often than not) my brain just outright refuses to learn stuff
but Anki is more for testing yourself in those situations really
It is a deadlock
I see, maybe no FSRS update will really solve this for you
but you'll figure it out I'm sure
🙂
IMO, the next big steps to improve Anki would be to stop chasing the perfect logloss/RMSE but building tools to find optimal ways to order cards to learn, based on some N+1 principles (For example, finding cards very similar to the one you have, to be able to add as much new card/day as possible without screwing everything, etc etc)
This is why I agree with you on this. I am all for sacrificing a bit of retention if it is about making the learning experience easier
It was suggested at one time to @quasi shadow
1-FSRS-sec
2-Reducing the minimum stability to 0.001
This is to make FSRS produce even smaller intervals so that I can work my way up slow and steady if it is needed for difficult cards
But both suggestions ended up worsening RMSE and logloss though.....
So I have nothing to wait for except this long awaited short term memory model
- It makes metrics worse. And since short-term S formula doesn't even use interval lengths in the first place, I really, really don't see the point
This is to make FSRS produce even smaller intervals so that I can work my way up slow and steady if it is needed for difficult cards
I know it does make the metrics worse which is why Jarrett canceled the idea
short-term S formula doesn't even use interval lengths in the first place
Hmm tf
it makes metrics worse by way less than a smaller decay improves metrics. if a smaller min stability helps some users then we should use it, just like with a higher decay in order to screw 90% of users and help 10%?
Like I said, I wouldnt mind sacrificing a bit of retention if it would make learning with Anki (the encoding part) much easier
Yeah but minimum stability is what, 10min ?
It is 13 min
💀
for me at DR 95%
more learning steps
Sorry to be that direct, but if you can't find a way to remember something for 13 min, no amount of FSRS improvement will help you. Take a fucking pen, a fucking piece of paper, write it down 20 times or for 10 min, and you'll remember it
Seriously though, people learnt things before Anki just fine
Your destiny is not to be a hamster in an Anki-Wheel
If you have ADHD, I'm not sure having flashcard flashing in front of your eyes every 3 secs will help you getting enough focus to actually focus on ONE thing for 5-10min
I tried so, my brain just has enormous difficulty. I have gotten to the point where I started questioning my mental capabilities because of this
It's like trying to learn something through TikToks
We need to figure out why different decay is optimal at different retentions, and then either make decay adaptive OR admit that something about optimization is messed up and the optimizer is overcompensating for...idk, something
Yup, maybe a doctor could help you better than Jarrett on this one
You could put it that way. But no I have structured my cards well enough to not be too information dense.
My weakness is lists and processes where you have to recall things in series. My brain just short-circuits...
The problem too with those hamster-wheels, is that you believe that if you slow down or stop it, you won't ever be able to recover from it. But sometimes it's by taking things maybe a bit more slowly that you can build stable knowledge
Here is how my cards look like
my theory is that if the data was generated with SM-2 as the underlying scheduler, different user retentions correspond more to a different distribution in data difficulty and it is not surprising in this case that decay is different
I try to speak out the answer so that I dont zombie-review
And dont fall into pattern-recognition reviewing
I mean, if you're studying to become a healthcare professional, at some point, even with a degree, you'll need to be able to talk to patients and remember important things from them without having Anki recalling them for you
I know. I have grown too dependent on Anki 😞
TBH it's something I think too many people fall into
But when you look with facts, like how many new cards/day someone can handle etc, you realize there's no way to learn something through anki alone
Which is funny since there was a time where I found Anki too alien for me. I even thought it was the reason for my failure at one time to the point where I completely stopped Anki for a couple of months
I do have to say though, I am learning in a foreign language
100K words with 10 per day means spending 27 years before knowing them! And we estimate easy languages to have 100-200K, and Japanese 500-600k
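The arithmetic behind that figure, as a quick Python check. `years_to_learn` is just a throwaway helper for this calculation, and it assumes a constant number of new cards every single day with nothing forgotten or skipped.

```python
DAYS_PER_YEAR = 365.25


def years_to_learn(total_items: int, new_per_day: int) -> float:
    """Years needed to introduce every item once at a fixed daily pace."""
    return total_items / new_per_day / DAYS_PER_YEAR


# 100K words at 10 new words per day.
print(f"{years_to_learn(100_000, 10):.1f} years")
```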
Yeah, but didn't Jarrett show that even with synthetic data generated from a forgetting curve with decay X, the optimizer optimizes decay to Y instead?
I (and you) learnt English without Anki ...
German is not my language, which does make things considerably tougher
But I dont know how tough does learning medical subjects in German make it before I start blaming my brain powers
I also have autistic tendencies
I began self-diagnosing
which is debilitating mentally
I think anyone who uses Anki every day for more than a year is on the spectrum 🤣
he didnt generate the data in a way that can differentiate the two curves well, both types of curves have a similar training error and the power curve generalizes on unseen data worse. Whereas in the 10k set we know that a power curve with a lower decay generalizes much much better
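For anyone who wants to see the two curve families side by side, here is a small Python sketch. It assumes both curves are normalized so that R = 0.9 when t equals stability S (the usual FSRS convention), and it treats decay as a positive number, like the `** -d_nh` in the snippet quoted earlier. `power_curve` and `exp_curve` are names made up for this example, not functions from the optimizer or the benchmark.

```python
def power_curve(t: float, s: float, decay: float = 0.5) -> float:
    """Power forgetting curve, scaled so that R(t = s) = 0.9 for any decay > 0."""
    factor = 0.9 ** (-1.0 / decay) - 1.0
    return (1.0 + factor * t / s) ** -decay


def exp_curve(t: float, s: float) -> float:
    """Exponential forgetting curve, also scaled so that R(t = s) = 0.9."""
    return 0.9 ** (t / s)


S = 1.0  # stability in days (arbitrary for the comparison)
for t in (0.1, 1.0, 10.0, 100.0):
    print(
        f"t={t:>5}: exp={exp_curve(t, S):.3f}  "
        f"power(decay=0.5)={power_curve(t, S, 0.5):.3f}  "
        f"power(decay=0.2)={power_curve(t, S, 0.2):.3f}"
    )
```

All three agree at t = S by construction, so they only differ well before or well after that point. That is one way to read the point above about sparse data not being able to tell them apart.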
Oh you wouldnt know. My Anki app run time surpasses game time of all those hardcore COD players
Yes indeed. But this problem goes way back as well for me. I never really was able to absorb information quickly now that I think about it which makes me more and more inclined to think that I have some kind of mental impairment, but I am running from a formal diagnosis because frankly, I dont have time for it and too lazy to even bother about it
So all things Anki, FSRS is my life
I have discord just to be able to check in on Anki news
I follow Anki news like it is the stock market
Unironically don't follow stock market news if you want to invest in the stock market
I could write a bit about investing, but idk if anyone here would be interested
me too
I want some @unique salmon stock tips
I wanna know how he uses vtuber trends to predict the next best stock
TLDR: just read "The Little Book of Common Sense Investing"
Lol
There is actually a publicly traded vtuber company, but it's a lump of shite
Because minimum stability is set to 0.01 and not lower
what is minimum stability
I have understood it as such
And I don't think it's listed on any stock exchanges outside of Japan anyway
dude that just means it has opportunities for BIG GAINZ 🍃
Jargon that Jarrett and Andrew have concocted
so now I'm just imagining @unique salmon made it big on stocks a decade ago and is now just a total neet
The minimum value of memory stability (FSRS thing https://github.com/open-spaced-repetition/fsrs4anki/wiki/abc-of-fsrs)
what if we execute jake
Nah, it's Nijisanji aka the blackest fucking company in the industry. Minus that one company that doxxed its own talents
sacrifice jake to the algorithm
I don't think that'll give good returns in the market
But seriously, read "The Little Book of Common Sense Investing"
(this isn't addressed to anyone in particular, just to whoever may be interested)
I would if I were literate
You basically buy a thingy and then sit on your ass for 20 years
and had plans to do anything with my money
yeah same i cant read
Yeah, no "check the price every 314159 nanoseconds"
I need 0DTE options to feel alive
- Realize I have money I don't need for the next few years.
- Put money on the most standard ETF
- When it's crisis time, put a bit more
- When it's going up, and if I need money, sell a bit
- Forget about it
real
~10min every 4 months xD
like I have a 401k (us retirement fund) thats purely in market index stocks
Lol
I just bought stock in the beginning during Covid
Many such cases
its ok its gonna go up
remember: every graph goes up and right to infinity
a nice 100% that will never repeat itself lol
Quick summary of "The Little Book of Common Sense Investing"
Step 1: buy an ETF aka a "basket" with tons of stocks of largest companies in it
Step 2: ???
Step 3: profit
wow my 401k is up 20%! (over like 5 years)
maybe I should pull my money out before these tariffs hit though 🍃
They already hit 🤣
I am watching some red candles right now, btw
oh rip
anyways
To be honest, I don't complain xD
shows me for literally never looking at the news
If I see a -5% on a specific day, I put a bit more
😄
Red means invest
Green means remove
😄
remember to catch that falling knife
The only caveat is for HOW LONG will it fall lol
SP500 (weekly bars)
At this rate I'll need to sell my house lol
hey man this is just a great time to buy sp500 for when the tariffs get lifted because they actually just made things worse
Yeah, but I don't have money right now cause I already bought Japan225 (Japanese SP500, basically) a while ago and...
(the profit is in USD, btw)
TBF since the step1 of my workflow is "is it money I don't need in the next yearS", I'm fine
nice gainz
I mean
I don't know if there were a rationale behind it
I like japan, but that doesn't mean I'll put my money on things I like
There was. I bought at the bottom of a large dip, and then the price didn't go as high as I set my take profit (well, my strategy did, but whatever)
I kinda need cash immediately on hand, I'm probably eventually going to need to move and I'll need to buy a house then
You realize you just did the exact opposite of your advice
Put money, forget it, profit
of course I've been thinking I'd need to move for the last 3 years and have not had to yet ¯_(ツ)_/¯
Yes
The thing is - you CAN consistently make profit from stocks/crypto, but trading "by hand" is a death sentence. You need something that can be back-tested on historical data
Unrealized-wise, well XD
Aka you need to code shit
man I got wrecked by covid, I had money in the market and then covid hit and everything went down, and I was in the middle of buying a house so I had to take the L
Ah shit bad timing
(the return on the house was pretty rad though so it ended up fine?)
But you couldn't postpone the house ?
I got screwed for my car btw
Had to buy it when there was none available
spent 50% more
I could have postponed, but I basically still had exactly how much I needed still in the market and couldn't really risk it going down further
Obviously I don't expect that most people will code trading strategies, hence the "just buy an ETF and sit on your ass for 5/10/20 years" advice
Anyway, let's get back to FSRS, lol
Yes my situation 😦 I only mentioned stock market as a hyperbole
Idk man, if you need to do 1000+ reviews every day, I think you are cooked anyway
In what way am I cooked
Like cooked-cooked or cooked but there is a way back
Like, "you will be mentally exhausted" cooked
No, it's fine
can i switch to half a button ?
?
So, do I read this right, that Anki isn't designed to learn completely new content, even with FSRS? Cause that's what I'm primarily doing, and would explain why it's so messed up.
kinda yea
I have the no-no thoughts
did u really not know this?
anki is not for learning its for memorization
from what i understood the correct way and best way
is to learn something, then you put in anki to keep it memorized
Well, how would I learn 2000 random Kanji and 12000 random Vocab in the wild though?
it'd never get done
probably like
When I started, I knew some of them. But I'd say 95% of what's in this deck is completely new
one way which i want to do but ill never do cause i dont have energy for it
go on youtube, watch videos of that word etc
learn what it means and then put it into anki
That'd take me more time than a full time job to get the same result
you have to expose yourself to the words then use anki to keep it memorized
Especially cause a lot of those Kanji are extremely arcane and there's a good chance I'll never encounter them in the wild
yet I need to know them
same for a lot of the Vocab
if they are extremely arcane then why do you need to know them
Cause JLPT says so
I need that to be allowed to work lol
It could probably be reasonably fixed to make at least FSRS be somewhat compatible with learning-in-anki
but my attempt at adding that was shot down...
how tho
Hide the learning phase from FSRS
The first review FSRS sees is the first review-review
doesnt it already do that
nope
the very first review on a card is very important
agreed
Honestly, the whole distinction between the "learning phase" and "reviewing phase" feels extremely arbitrary to me
I think a lot of people treat them as real categories and not just arbitrary shit that is an artifact of the old scheduling
it's where the initial stability comes from
could be
well whys there a big gap in my percentages then
65% for learning cards, 80% for young 93% for mature
checkmate
Cause you kept hitting Again? :D
exactly
It's mostly that a review chain of 3133333 is very different for FSRS compared to 1333333 for some reason
Don't
Don't learn my cards?
Well, but a new card is new to me
The whole learning phase vs review phase
so I have to learn it first
I can sometimes make educated guesses due to Radicals and knowing the meaning/reading of Kanji in the vocab. But it is completely new material to me.
Literally the only reason this distinction exists at all is because Anki is janky when it comes to <1d vs >=1d intervals
And devs didn't bother to make Anki handle both the same way
And used integer intervals, in days, for everything
So that same-day reviews have an interval of 0
Because fuck logic
It kinda makes sense though, doesn't it? While you're still learning a card, multi-day intervals are too long. Meanwhile, while reviewing known material, to-the-second accuracy of due times is not needed.
And FSRS has to work around it
what if we get new devs
Wouldn't it make sense for FSRS to just flat out ignore any of the same-day reviews, and treat them as (re)learning, and only look at the ones that are >=1d?
To be fair, using accurate interval lengths (not rounded to the nearest whole day) won't do jack shit. For long intervals 100 days vs 100.5 days doesn't matter, for short intervals we don't have a good formula
That was the case before FSRS-5. FSRS-5 is more accurate than FSRS-4.5, even though it can't "see" real interval lengths and only uses grades from same-day reviews
So there is some benefit in using them, even in the most rudimentary way imaginable
Even before FSRS-5, the very first "learning phase" review was/is HIGHLY influential on the card's future intervals
since it sets the initial stability
My point is that using information from same-day reviews improves predictions for >=1d reviews
absolutely
Even if the current approach is very crude
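To make the "ignore same-day reviews" idea concrete, here is a minimal Python sketch. It is not the optimizer's real preprocessing. It assumes a review history stored as (elapsed_days, rating) pairs with whole-day intervals, so same-day reviews have elapsed_days = 0, and ratings 1-4 where 1 = Again and 3 = Good. It keeps the very first review, since that one sets the initial stability, and drops every later same-day review. `drop_same_day` is a made-up name.

```python
from typing import List, Tuple

# (elapsed_days since the previous review, rating 1-4); hypothetical layout.
Review = Tuple[int, int]


def drop_same_day(history: List[Review]) -> List[Review]:
    """Keep the first review plus every review that happened >= 1 day after the previous one."""
    kept = []
    for i, (elapsed_days, rating) in enumerate(history):
        if i == 0 or elapsed_days >= 1:
            kept.append((elapsed_days, rating))
    return kept


# First review is Again, two same-day learning steps, then multi-day reviews,
# one lapse after 3 days followed by a same-day relearning step.
history = [(0, 1), (0, 3), (0, 3), (1, 3), (3, 1), (0, 3), (2, 3)]
print(drop_same_day(history))
```

Because the dropped entries all have an interval of 0 whole days, the intervals of the kept reviews stay valid. The grades from the dropped reviews are exactly the information that gets thrown away here, and per the messages above, FSRS-5 gains accuracy by using them.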
I'm just looking at my deck, and seeing how 95% of my initial reviews are "Again", just by nature of it being completely new material


