FSRS Megathread | Anki | Page 10

quasi shadow Apr 11, 2025, 9:50 AM

#

😎 21 parameters ≈ 297 weights

unique salmon Apr 11, 2025, 9:52 AM

#

Now if only we could finally make D that depends on R, I would consider FSRS to be complete

lapis hearth Apr 11, 2025, 9:53 AM

#

unique salmon Now if only we could finally make D that depends on R, I would consider FSRS to ...

short term memory model

unique salmon Apr 11, 2025, 9:53 AM

#

...oh, right

lapis hearth Apr 11, 2025, 9:53 AM

#

You bet your sweet potato I won't forget about this

quasi shadow Apr 11, 2025, 10:45 AM

#

FSRS-6 with optimizable decay reduces RMSE(bins) by 16% in relative terms.

#

The absolute difference is 0.0085, which is equal to the difference between FSRS v4 and FSRS-5 recency

#

😎 So it's good enough for a major version.

#

😅 The only problem is I have to refactor fsrs-rs to support it...

cosmic hedge Apr 11, 2025, 10:51 AM

#

quasi shadow FSRS-6 with optimizable decay reduces RMSE(bins) by 16% in relative terms.

Nooo not my simple retrievability calculations! XD

quasi shadow Apr 11, 2025, 10:59 AM

#

cosmic hedge Nooo not my simple retrievability calculations! XD

We have to pass the decay value to the forgetting curve function.

#

So, there are two ways: 1) store the decay in parameters, or 2) store it in the card

unique salmon Apr 11, 2025, 11:00 AM

#

Oh, yeah, decay will probably have to be stored in card info

unique salmon Apr 11, 2025, 11:01 AM

#

quasi shadow So, there are two ways: 1) store the decay in parameters, or 2) store it in the ...

It will be stored in parameters anyway, no?

cosmic hedge Apr 11, 2025, 11:04 AM

#

quasi shadow So, there are two ways: 1) store the decay in parameters, or 2) store it in the ...

Yeah I'd say the best option is both like we do for desired retention?

lapis hearth Apr 11, 2025, 11:08 AM

#

quasi shadow 😎 So it's good enough for a major version.

When is Dae releasing the next Anki version

quasi shadow Apr 11, 2025, 11:10 AM

#

unique salmon It will be stored in parameters anyway, no?

Yep, I mean the source that the code needs to read it from.

quasi shadow Apr 11, 2025, 11:11 AM

#

lapis hearth When is Dae releasing the next Anki version

He will release it soon because of a security issue.

#

I’m afraid FSRS-6 won’t make it into this release.

bold terrace Apr 11, 2025, 11:11 AM

#

I was at first like “oh no gain then” then I saw the GRU haha

#

Well done 🙏🏻

unique salmon Apr 11, 2025, 11:17 AM

#

quasi shadow I’m afraid FSRS-6 won’t make it into this release.

Remember when Dae said "Let's make FSRS the default in the next update after 24.11?" 🤣

quasi shadow Apr 11, 2025, 11:18 AM

#

ankieyes

lapis hearth Apr 11, 2025, 11:18 AM

#

the next update could wait long enough

#

He could make betas

bold terrace Apr 11, 2025, 11:19 AM

#

Now I have wet dream about D shenanigans (clustered parameters etc …) but let’s celebrate first 😂

unique salmon Apr 11, 2025, 11:23 AM

#

Just let me run 50 more tests with neural D, surely I will find something good

#

🍃

hasty fractal Apr 11, 2025, 11:29 AM

#

we gotta throw a party!

#

hmm... so do we gotta update the manual too? optimise if u change DR.

#

@unique salmon please confirm before I open an issue.

unique salmon Apr 11, 2025, 11:32 AM

#

hasty fractal hmm... so do we gotta update the manual too? optimise if u change DR.

Not really

hasty fractal Apr 11, 2025, 11:33 AM

#

oh ok then it's a nice change

bold terrace Apr 11, 2025, 11:34 AM

#

unique salmon Just let me run 50 more tests with neural D, surely I will find something good

D has a lot of properties (“higher D lower Interval”) that might fit really well how it’s already computed no ?

unique salmon Apr 11, 2025, 11:34 AM

#

bold terrace D has a lot of properties (“higher D lower Interval”) that might fit really well...

We aren't sure. And the fact that D doesn't depend on R is theoretically crappy

bold terrace Apr 11, 2025, 11:35 AM

#

Sure

unique salmon Apr 11, 2025, 11:35 AM

#

Imagine two scenarios: the user presses “Easy” when R=99% and the user presses “Easy” when R=1%. Clearly, in the latter case this is a very surprising outcome, whereas in the former case it’s not surprising at all. Meaning that D should be updated by a different amount in those cases.

bold terrace Apr 11, 2025, 11:37 AM

#

Yeah, as I see it the optimizer already finds the best way to move D to fit the user's history, and the fact that it optimizes itself into very distinct clusters might be a sign that, instead of trying to bind the equation to those D parameters, we could just optimize all the other parameters based on D clusters

hasty fractal Apr 11, 2025, 11:38 AM

#

Wait, so you might need to optimise after you've changed your DR?

#

Because you'll have more reviews in a different R region?

#

someone other than expertium confirm it for me.

#

I think I'm confused. I'll leave it up to others. Signing off.

unique salmon Apr 11, 2025, 11:41 AM

#

bold terrace Yeah as I see it the optimizer find already the best way to set how D move to fi...

No, it's a sign that D is updated in fixed amounts that depend on the grade

#

Like
Again = +2
Hard = +1
Good = +0
Easy = -1

unique salmon Apr 11, 2025, 11:42 AM

#

hasty fractal Wait, so you might need to optimise after you've changed your DR?

No, I've already said that

#

If you change DR your data won't change

#

At least not until you actually do reviews

bold terrace Apr 11, 2025, 11:55 AM

#

unique salmon No, it's a sign that D is updated in fixed amounts that depend on the grade

But the optimizer still decided that those amounts should be almost zero for Good and Easy

#

It could have decided differently if that would have been the optimal way

unique salmon Apr 11, 2025, 11:55 AM

#

Nope. The linear relationship is hard-coded

bold terrace Apr 11, 2025, 11:56 AM

#

My low D has a more healthy D management for example

unique salmon Apr 11, 2025, 12:00 PM

#

The D update formula is basically just new_d = old_d - (parameter * (grade - 3))
Where Again=1, Hard=2, Good=3, Easy=4
So for Good new_d = old_d - (parameter * (3 - 3)), which is just old_d
For Again new_d = old_d - (parameter * (1 - 3)), which is old_d + 2* parameter
We've tried making the values associated with each grade optimizable, it didn't do shit

#

So overall
Again -> new_d = old_d + 2*parameter
Hard -> new_d = old_d + parameter
Good -> new_d = old_d
Easy -> new_d = old_d - parameter

#

Hence why you get clusters

#

Then there is extra stuff to make it a little smoother
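
A minimal sketch of the simplified update described above, assuming a single step size `step` stands in for the optimizable parameter and leaving out the extra smoothing (the w[7] term and the "linear damping" mentioned in this thread). It is an illustration, not the actual FSRS code, and the names `next_difficulty` and `reachable_difficulties` are made up for this example.

```python
from itertools import product

# Grade encoding as given in the thread.
GRADES = {"Again": 1, "Hard": 2, "Good": 3, "Easy": 4}


def next_difficulty(old_d, grade, step):
    """Simplified linear update: new_d = old_d - step * (grade - 3)."""
    return old_d - step * (grade - 3)


def reachable_difficulties(start_d, step, n_reviews):
    """All D values reachable from start_d after n_reviews graded reviews."""
    values = set()
    for grades in product(GRADES.values(), repeat=n_reviews):
        d = start_d
        for grade in grades:
            d = next_difficulty(d, grade, step)
        values.add(d)
    return sorted(values)


if __name__ == "__main__":
    for name, grade in GRADES.items():
        print(name, next_difficulty(5.0, grade, 1.0))
    # Every grade moves D by a whole multiple of the step, so cards that
    # start at the same D can only land on a small grid of values.
    print(reachable_difficulties(5.0, 1.0, 2))
```

Because each review shifts D by a fixed multiple of the step, the possible values form a grid, which is one way to read the clusters mentioned above.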

bold terrace Apr 11, 2025, 12:27 PM

#

unique salmon So overall Again -> new_d = old_d + 2*parameter Hard -> new_d = old_d + paramete...

Hmmm for Good this is not the implementation, it's controlled by w[7] for example

#

My "Low-Normal D deck" : 0.3888, 1.4114, 3.4578, 32.9702, 7.4108, 0.4662, 1.5312, 0.0677, 1.3478, 0.3241, 0.8557, 1.9796, 0.0889, 0.2942, 2.2884, 0.1258, 3.3983, 0.3663, 0.7039

#

And the distribution is quite nice too

unique salmon Apr 11, 2025, 12:29 PM

#

bold terrace Hmmm for Good this is not the implementation, it's controlled by w[7] for exampl...

w[7] + "linear damping" at high D smooths it, yes

unique salmon Apr 11, 2025, 12:29 PM

#

unique salmon So overall Again -> new_d = old_d + 2*parameter Hard -> new_d = old_d + paramete...

But overall most of the change is done like this

hasty fractal Apr 11, 2025, 4:12 PM

#

unique salmon At least not until you actually do reviews

u take things too literally and end up arguing with yourself

#

that's what I meant ofc

#

Because you'll have more reviews in a different R region

unique salmon Apr 11, 2025, 4:18 PM

#

Oh, ok, my bad. Still, I'm not sure if I would recommend optimizing parameters more than before FSRS-6

hasty fractal Apr 11, 2025, 4:33 PM

#

hmm, fair enough.

soft skiff Apr 12, 2025, 12:05 AM

#

I'd like to ask: if I have an exam at the end of next month, is one month of FSRS review enough?

quasi shadow Apr 12, 2025, 5:09 AM

#

I'm evaluating some collections with extremely low decay.

#

Their retention is > 93%.

#

And their decay is < 0.03.

polar maple Apr 12, 2025, 6:01 AM

#

quasi shadow

which user?

#

i guess decay would be low whenever you have data where R looks like it is increasing over time, which can happen by chance if the collection size is small

quasi shadow Apr 12, 2025, 6:21 AM

#

polar maple which user?

https://github.com/open-spaced-repetition/fsrs-optimizer/pull/169#issuecomment-2798453145

GitHub

Feat/FSRS-6 by L-M-Sherlock · Pull Request #169 · open-spaced-rep...

candidate for FSRS-6
Log Loss: 0.3273 -> 0.3257 (-0.0016)
RMSE(bins): 0.0518 -> 0.0510 (-1.5%)
Model: FSRS-5-dev
Total number of users: 9999
Total number of reviews: 349923850
Weighte...

bold terrace Apr 12, 2025, 7:37 AM

#

soft skiff I would like to ask if there is an exam at the end of next month, is one month o...

It will depend on the amount of material, the type of material, and the score you want :). But if you have already learnt all the material, learnt it in Anki, and have a good Desired Retention (above 80%), yes! If not, you should learn all the cards ASAP, in Anki and outside Anki.

bold terrace Apr 12, 2025, 7:38 AM

#

quasi shadow https://github.com/open-spaced-repetition/fsrs-optimizer/pull/169#issuecomment-2...

Can we try the new optimizable decay with our own collection somewhere? For example here? https://colab.research.google.com/github/open-spaced-repetition/fsrs4anki/blob/v5.3.3/fsrs4anki_optimizer.ipynb#scrollTo=wG7bBfGJFbMr

Google Colab

cosmic hedge Apr 12, 2025, 7:56 AM

#

bold terrace We can try the new optimizable decay with our collection somewhere ? For ex here...

https://github.com/open-spaced-repetition/fsrs-optimizer/pull/176 is this what you want?

GitHub

Expt/trainable forgetting decay by L-M-Sherlock · Pull Request #17...

quasi shadow Apr 12, 2025, 8:32 AM

#

bold terrace We can try the new optimizable decay with our collection somewhere ? For ex here...

You can install fsrs-optimizer via pip+git

quiet saddle Apr 12, 2025, 8:42 AM

#

I have a question about the parameters I use for FSRS:
When I switched to FSRS, I followed some explanations suggesting something along the lines of preset:"Bases" -is:suspended as the search query in the optimization field. But now that I think about it, since I suspend leeches, FSRS never uses their data to optimize. So maybe the -is:suspended part creates some kind of survivorship bias 🤔
Should I remove that part?

quasi shadow Apr 12, 2025, 10:03 AM

#

😅 The mystery of float 32.

#

I cannot solve this problem (

unique salmon Apr 12, 2025, 11:22 AM

#

@quasi shadow here are the results of trying optimizable decay
It's very slightly better than decay=-0.2, according to my tests. Regularization doesn't help much, and increasing learning speed makes results worse
Regarding clamping, I already said it on Github - even if for some users decay >-0.1 provides a better fit, we shouldn't use it for scheduling reasons. We don't want people to have intervals measured in thousands of years

quasi shadow Apr 12, 2025, 11:25 AM

#

unique salmon @quasi shadow here are the results of trying optimizable decay It's very...

Slightly better?

#

In my test, it’s ~5% better.

unique salmon Apr 12, 2025, 11:28 AM

#

Here it's like 2-4% better

#

In these tests the decay parameter is clamped between (-0.1, -0.7) btw

#

Again, as I said on Github, we can extend the lower limit to -1, but the upper limit must be -0.1. Anything closer to 0 than that will not be usable for scheduling

#

For example, with S=1 and decay=-0.025, the first interval at DR=80% would be something like 120 days, and the first interval at DR=70% would be something like 25000 days

unique salmon Apr 12, 2025, 11:38 AM

#

quiet saddle I have a question about the parameters I use for FSRS: When I switched to FSRS I...

If you have suspended cards, they are probably very different from the rest of your cards. So having FSRS learn from them is not going to make it better on your normal cards

unique salmon Apr 12, 2025, 11:52 AM

#

quasi shadow Slightly better?

S=1

Decay=-0.025, the first interval at DR=80% is around 120 days, the first interval at DR=70% is around 25000 days
Decay=-0.1, the first interval at DR=80% is around 4.5 days, the first interval at DR=70% is around 18 days
Decay=-0.15, the first interval at DR=80% is around 3.4 days, the first interval at DR=70% is around 9.5 days

On second thought, even -0.1 is a little crazy for scheduling. Let's make the limit -0.15. Again, I understand that for some users it won't provide the best fit, but we have to worry about the intervals being reasonable.
@bold terrace @polar maple @hasty fractal your input is welcome
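
For anyone who wants to check these numbers, here is a rough sketch. It assumes the forgetting curve has the form R(t) = (1 + factor * t / S) ** decay, with factor chosen so that R = 90% at t = S. That form is an assumption (it is not spelled out in this thread), and `interval_days` is a made-up helper name.

```python
def interval_days(stability, desired_retention, decay):
    """Interval at which predicted R falls to desired_retention.

    Assumed curve: R(t) = (1 + factor * t / S) ** decay, where
    factor = 0.9 ** (1 / decay) - 1 so that R(S) = 0.9.
    """
    if decay >= 0:
        raise ValueError("decay must be negative")
    if not 0 < desired_retention < 1:
        raise ValueError("desired_retention must be between 0 and 1")
    factor = 0.9 ** (1.0 / decay) - 1.0
    return stability / factor * (desired_retention ** (1.0 / decay) - 1.0)


if __name__ == "__main__":
    # First intervals for S = 1 day at the decay values discussed above.
    for decay in (-0.025, -0.1, -0.15, -0.5):
        at_80 = interval_days(1.0, 0.80, decay)
        at_70 = interval_days(1.0, 0.70, decay)
        print(f"decay={decay}: DR=80% -> {at_80:.1f} d, DR=70% -> {at_70:.1f} d")
```

Under that assumption the output should land close to the figures listed above. At DR=90% the interval equals S for any decay, which is why only the other DR values are affected.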

unique salmon Apr 12, 2025, 12:14 PM

#

I feel like we have a spectrum, where Alex is on the far end of "Screw scheduling as long as metrics look good", Jarrett somewhat closer to the center, and me at the other end (but not super far) of "Screw metrics as long as scheduling looks good"

bold terrace Apr 12, 2025, 12:16 PM

#

Personally, I think if a user has a profile that leads the optimizer to a decay very close to 0, it's fine as long as they realize that they will have to push the DR very high, or set some max interval limit. I don't think it's very healthy to put restrictions on decay itself if the problem is the interval.

After all, if S=1 and DR=90%, then the first interval would be 1d in all cases, by definition, right? So it's not per se a big issue, as long as the user is aware that, since they don't drop to 70-80% easily, they either choose a higher DR or choose to "never see those cards".

But I'm not 100% against putting some limit. I'm just worried about the kind of compounding effect it could have: if the decay is limited to, let's say, 0.15, but the user truly needs 0.10, then the other parameters will be optimized to make the intervals longer some other way. Sure, it won't have the same drastic effect as a lower decay, but if the user keeps outperforming the predictions, the other parameters will try to compensate for the decay we didn't allow to go to 0.10, for example.

bold terrace Apr 12, 2025, 12:17 PM

#

unique salmon I feel like we have a spectrum, where Alex is on the far end of "Screw schedulin...

So personally I'd put myself at: let's keep the scheduling as pure as possible so it gets the best metrics it can, and let's build ways for users to navigate what those parameters might mean for them (they need a high DR, or need to accept that they won't see many cards if they initially rate them Good, etc.)

unique salmon Apr 12, 2025, 12:19 PM

#

hasty fractal Apr 12, 2025, 12:20 PM

#

unique salmon I feel like we have a spectrum, where Alex is on the far end of "Screw schedulin...

I'm a centrist.

#

actually I'm a left-leaning moderate liberal.

unique salmon Apr 12, 2025, 12:21 PM

#

We now have SRS-left and SRS-right 🤣

bold terrace Apr 12, 2025, 12:21 PM

#

I'd be vertically aligned lol: the model itself should be metric-focused, but the UX could be controlled by factors external to the model 😄

unique salmon Apr 12, 2025, 12:21 PM

#

SRS political compass

bold terrace Apr 12, 2025, 12:21 PM

#

alt-left

#

or alt-right

#

But alt-right has bad connotation

#

😄

hasty fractal Apr 12, 2025, 12:22 PM

#

(rightly so)

unique salmon Apr 12, 2025, 12:22 PM

#

To me anything political has bad connotations tbh

hasty fractal Apr 12, 2025, 12:23 PM

#

expertium hides his political opinions behind an apolitical mask

#

we've seen it all

bold terrace Apr 12, 2025, 12:28 PM

#

I also think that people might consider things as "not looking good" when they don't necessarily realize that it's their history that led their current intervals to be what they are. And even if the scheduler overshoots, it will self-correct once those overshoots are actually evaluated as such. And even if a card scheduled 2 years out won't be reviewed again for the next 2 years, a lot of other cards will be; the optimizer will take those into account, and with some regular rescheduling, those intervals will be adjusted

#

So there are two scenarios: either you're in fact studying for something that's 1 month away, and you don't care about seeing cards as little as possible but want to maximise your score. Then crank up the DR, do mass reviews, set your max interval to 1d, whatever.

But if, on the contrary, your goal is long-term learning, forgetting a few things for a few months is, in the grand scheme of things, really trivial

unique salmon Apr 12, 2025, 12:29 PM

#

bold terrace Personally I think if an user has a profile that lead the optimizer to get a dec...

> Personally, I think if a user has a profile that leads the optimizer to a decay very close to 0, it's fine as long as they realize that they will have to push the DR very high, or set some max interval limit. I don't think it's very healthy to put restrictions on decay itself if the problem is the interval.

Sadly, I'm pretty sure the result of this approach will be 100 posts with "Why is my first interval 100 years?"

bold terrace Apr 12, 2025, 12:30 PM

#

unique salmon > Personally I think if an user has a profile that lead the optimizer to get a d...

Those guys can be recommended to put a max interval in the settings

#

Max Interval means: whatever the DR, I always want to be reminded of a card at least every X days

unique salmon Apr 12, 2025, 12:30 PM

#

And then every single interval will max out

#

Nah, we gotta set a reasonable limit to decay

bold terrace Apr 12, 2025, 12:31 PM

#

And if with time they realize they have a 99.99% retention in that hardcoded interval, they will gradually get more confidence increasing it 🙂

bold terrace Apr 12, 2025, 12:31 PM

#

unique salmon Nah, we gotta set a reasonable limit to decay

I'm not against a reasonable one, but maybe we should base it on observation, something like the 95th percentile of the lowest decay observed in real users

unique salmon Apr 12, 2025, 12:33 PM

#

The thing about decay is that the difference between -0.1 and -0.01 is not 10x longer intervals, more like x50000 longer intervals (at DR=80%, specifically)

bold terrace Apr 12, 2025, 12:33 PM

#

bold terrace And if with time they realize they have a 99.99% retention in that hardcoded int...

Because ultimately, I think most of those people aren't confident enough to trust an algorithm, so maybe the max interval route is not the worst for them. Let's not forget that with a max interval of 90d, you could have 1000 cards and it would still give you an average workload of ~11 reviews per day... The price to pay "to be sure to never go north of 90d" is not that big, right?
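
The arithmetic behind that estimate, as a tiny sketch. It assumes the worst case where every card sits at the max interval and gets reviewed exactly once per period, with no lapses and no new cards; `steady_state_reviews_per_day` is a made-up name.

```python
def steady_state_reviews_per_day(n_cards, max_interval_days):
    """Upper-bound daily workload if every card is capped at max_interval_days.

    Assumes each card is reviewed once per max interval and never lapses.
    """
    if max_interval_days <= 0:
        raise ValueError("max_interval_days must be positive")
    return n_cards / max_interval_days


if __name__ == "__main__":
    print(round(steady_state_reviews_per_day(1000, 90), 1))
```

Lapses and learning steps would add to this, so treat it as a floor for the capped cards only.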

#

And I'm pretty sure those guys have less than 500 cards in a review state

#

Personally at 3k and having done Anki for the past ~15 months, I'm in a state that if you tell me ~20% of my cards will have a 1y interval instead of 30d, I'll tell you thank you lol

unique salmon Apr 12, 2025, 12:35 PM

#

@quasi shadow what are the 5th and 95th percentiles here?
(if the 5th percentile is >-0.1 [aka <0.1 in absolute values], let's just pretend it's -0.1 😉)

cursive badge Apr 12, 2025, 12:40 PM

#

unique salmon > Personally I think if an user has a profile that lead the optimizer to get a d...

What if I have knowledge that was seared into my mind by the old ones the first time I saw it but I want Anki to make me revise it in 100 years just in case? ;p

unique salmon Apr 12, 2025, 12:41 PM

#

Lol

bold terrace Apr 12, 2025, 12:45 PM

#

cursive badge What if I have knowledge that was seared into my mind by the old ones the first ...

Set the max interval to 100 years 😄

#

I mean, I think algorithm should be as "pure" of any external alteration, while setting external limits are OK 🙂 By doing so, you can more easily troubleshoot if you have a clear information that : "Right now, the system thinks you'll need X months to remember it, but the interval will be 30d because you wanted it like that"

unique salmon Apr 12, 2025, 12:47 PM

#

unique salmon @quasi shadow what are the 5th and 95th percentiles here? (if the 5th pe...

Just by eyeballing it, it seems around 20% of users have decay <=0.1 (absolute values)

#

The thing is, even people for whom it provides a better fit wouldn't be happy, because nobody wants 100 years intervals

#

I really don't think that making R 0.1% more accurate is worth making users 100x more concerned about interval lengths

bold terrace Apr 12, 2025, 12:51 PM

#

unique salmon The thing is, even people for whom it provides a better fit wouldn't be happy, b...

Yeah but the restriction you'll put on the decay will impact the other parameters somewhat

#

And maybe those users have a big DR

cursive badge Apr 12, 2025, 12:52 PM

#

*cough* *points vigorously at book covered in sigils and giving off a menacing aura*

unique salmon Apr 12, 2025, 12:53 PM

#

bold terrace And maybe those users have a big DR

Oh, speaking of which
Can you guess what interval length with S=100 and DR=95% people will have at -0.01 decay? 🤣

bold terrace Apr 12, 2025, 12:53 PM

#

After all, the decay was optimized that way, because predicting the next interval to be 100 years later, was indeed the better prediction

bold terrace Apr 12, 2025, 12:53 PM

#

unique salmon Oh, speaking of which Can you guess what interval length with S=100 and DR=95% p...

Ok but 0.01 might be excessive xD

#

I'd say, let's see already what the top 95th percentile has as a decay

unique salmon Apr 12, 2025, 12:54 PM

#

Just guess
S=100 days, DR=95%, decay=-0.01

bold terrace Apr 12, 2025, 12:54 PM

#

maybe we're arguing for 0.10 instead of 0.09 or 0.11

bold terrace Apr 12, 2025, 12:54 PM

#

unique salmon Just guess S=100 days, DR=95%, decay=-0.01

100k day ?

#

273 year ?

unique salmon Apr 12, 2025, 12:55 PM

#

0.45 days

bold terrace Apr 12, 2025, 12:55 PM

#

wait xD

#

Ah

#

WAit

#

Isn't that maybe a sign that an exponential is not that great xD ?

unique salmon Apr 12, 2025, 12:55 PM

#

At DR=99% and decay=-0.01 and S=100 days, the interval would be something like 0.0045 days
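
Why the intervals shrink above DR=90% and blow up below it can be sketched as follows, assuming the curve is $R(t) = (1 + f\,t/S)^{d}$ with $f$ chosen so that $R(S) = 0.9$ (an assumption about the curve's form, not something stated in this thread):

```latex
\begin{align*}
R(t) &= \left(1 + f\,\frac{t}{S}\right)^{d}, \qquad f = 0.9^{1/d} - 1, \qquad d < 0 \\
I(r) &= \frac{S}{f}\left(r^{1/d} - 1\right)
\quad\Rightarrow\quad
\frac{I(r)}{S} = \frac{r^{1/d} - 1}{0.9^{1/d} - 1} \\
\left.\frac{I(0.95)}{S}\right|_{d=-0.01} &= \frac{0.95^{-100} - 1}{0.9^{-100} - 1}
\approx \frac{168}{3.8 \times 10^{4}} \approx 0.0045
\end{align*}
```

With S=100 days that ratio gives about 0.45 days, in line with the figure quoted above. The ratio is exactly 1 at r=0.9, drops below 1 for r>0.9 and rises above 1 for r<0.9, and both effects get more extreme as d approaches 0.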

bold terrace Apr 12, 2025, 12:56 PM

#

Strange that between [100,90] and [90, 0] you have a compressing/expanding effect on interval no ?

#

I mean, mathematically it make sense

#

but then it means people with low decay might just be people with DR<90%

#

and people with higher decay people with DR>90% ?

unique salmon Apr 12, 2025, 12:56 PM

#

https://www.desmos.com/calculator/6fwtu0dzbf
Here, have fun

Desmos

FSRS-6 curve+

bold terrace Apr 12, 2025, 12:57 PM

#

As long as the predictions are accurate 🤷

unique salmon Apr 12, 2025, 12:57 PM

#

This is decay=-0.5 (purple) vs decay=-0.01 (green) at S=100 days

bold terrace Apr 12, 2025, 12:58 PM

#

But once again, it might be a sign that the predictions are good because it was trained at that specific DR

#

Not because the forgetting curve is truly good

#

The optimized decay might just be a way to slightly accommodate the prediction around DR

#

but going from 90% DR to 60% or from 60% to 90% is asking for very bad prediction

unique salmon Apr 12, 2025, 12:58 PM

#

bold terrace The optimized decay might just be a way to accommodate slightly the prediction aro...

Nope, not according to this

#

It's only bad for people with REALLY low retention, like, 20-35%

#

This graph is kinda weird because people with really low retentions were combined into one bin and I'm not sure what their average retention within that bin is, I assume around 20%

unique salmon Apr 12, 2025, 1:02 PM

#

unique salmon Nope, not according to this

Then again, this is optimizable decay, not fixed

#

(I think)

#

(Jarrett, is it fixed here?)

bold terrace Apr 12, 2025, 1:04 PM

#

I think there is still a lot of stuff we don't understand about the relation between short-term and long-term memory 😅

unique salmon Apr 12, 2025, 1:04 PM

#

Man, this is getting tiresome
@quasi shadow, my guy, can we just agree to make the limits of decay (-0.1, -0.8) and be done with it? 😅

bold terrace Apr 12, 2025, 1:04 PM

#

Maybe it's something like "You have a certain level of recall probability in long term memory, and short term recall might both make you more able to recall it right now, as well as bumping SLIGHTLY your long term recall chance"

#

Which could explain that kind of "baseline" recall that people seem to never get below (~20-40% let's say), even though they drop like crazy initially

#

and maybe thus multiple decay rate would be necessary laughcry

sick moth Apr 12, 2025, 1:06 PM

#

unique salmon I feel like we have a spectrum, where Alex is on the far end of "Screw schedulin...

Can we have me on "screw most things as long as the standard user has a reasonable experience"

bold terrace Apr 12, 2025, 1:06 PM

#

One to represent the short term loss, one to represent the long term baseline

unique salmon Apr 12, 2025, 1:07 PM

#

bold terrace and maybe thus multiple decay rate would be necessary laughcry

Let's calculate TWO stabilities for TWO forgetting curves and then take their average

#

Actually, let's take it even further, like Alex
Let's calculate THREE stabilities for THREE curves with THREE different decays and then take their WEIGHTED average

#

Ngl, I actually kinda want to try that. It sounds horrible, but I want to see the metrics
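
Purely as a thought experiment, the idea could be sketched like this. Everything here is hypothetical: the curve form (a power curve with R = 90% at t = S), the helper names, and the example stabilities, decays and weights are made up for illustration and do not come from any FSRS release.

```python
def retrievability(t_days, stability, decay):
    """Assumed power forgetting curve with R(stability) = 0.9."""
    factor = 0.9 ** (1.0 / decay) - 1.0
    return (1.0 + factor * t_days / stability) ** decay


def blended_retrievability(t_days, curves, weights):
    """Weighted average of several curves.

    curves: list of (stability, decay) pairs, one per curve.
    weights: one non-negative weight per curve; normalized here.
    """
    if len(curves) != len(weights):
        raise ValueError("need exactly one weight per curve")
    total = sum(weights)
    blended = sum(
        w * retrievability(t_days, s, d) for w, (s, d) in zip(weights, curves)
    )
    return blended / total


if __name__ == "__main__":
    # Hypothetical short-, mid- and long-term curves.
    curves = [(1.0, -0.5), (30.0, -0.3), (365.0, -0.15)]
    weights = [0.2, 0.5, 0.3]
    for t in (0, 1, 30, 365):
        print(t, round(blended_retrievability(t, curves, weights), 3))
```

Whether the weights would be fixed or optimized per user is left open here; the sketch only shows how the averaging itself would work.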

bold terrace Apr 12, 2025, 1:09 PM

#

Yeah the weighted average would make more sense I think

cursive badge Apr 12, 2025, 1:09 PM

#

At a certain point you just end up putting it all into the Memotron 9000 Neural Network

bold terrace Apr 12, 2025, 1:09 PM

#

Short Term probability after 5d might be 0%
Mid term might be ~60%
Long Term might be 40%

A pure avg would account the 0% of short term as being as important as the other

bold terrace Apr 12, 2025, 1:10 PM

#

cursive badge At a certain point you just end up putting it all into the Memotron 9000 Neural ...

Still, the good thing with FSRS is that we can interpret the parameters afterwards

#

Memotron is all fun until it decides to kill everyone

unique salmon Apr 12, 2025, 1:11 PM

#

cursive badge At a certain point you just end up putting it all into the Memotron 9000 Neural ...

Alex is working on something like that 🤣

#

His unreleased neural net can achieve RMSE of around 1.4% and logloss of around 0.27, beating the hell out of everything you see here

lapis hearth Apr 12, 2025, 1:14 PM

#

unique salmon His unreleased neural net can achieve RMSE of around 1.4% and logloss of around ...

Let me guess, we dont use it because it gives some weird intervals❓

#

Holy shit I just realized the number of params

cursive badge Apr 12, 2025, 1:17 PM

#

The downside of going full NN is that it becomes much more difficult to fine tune and fine tuning can break it entirely if done wrong.

bold terrace Apr 12, 2025, 1:17 PM

#

Also, all of this is real fun but to be honest we tend to forget how much FSRS is already god tier

#

I mean since switching to DR=90% and splitting my deck into Low/High D

#

I have not a single day with a difference of more than 1.5% from my DR

#

Sure 1.3% would be better than 1.5%

#

But it's already completely god tier

#

Or how clean my average stability doesn't deviate from the trend

lapis hearth Apr 12, 2025, 1:19 PM

#

bold terrace Or how clean my average stability doesn't deviate from the trend

Where do you get these graphs

bold terrace Apr 12, 2025, 1:19 PM

#

sum(R*f(S)) same

bold terrace Apr 12, 2025, 1:19 PM

#

lapis hearth Where do you get these graphs

https://ankiweb.net/shared/info/1613056169

cursive badge Apr 12, 2025, 1:19 PM

#

It's kind of amazing that any of this works at all considering how little information we give the scheduling algorithms.

lapis hearth Apr 12, 2025, 1:21 PM

#

There is so much information FSRS is missing out on (Time of Day, Sleeping Time, Answer Time, Contextual Content of the cards, Interference and Similarity of cards with other cards etc.. etc.)

bold terrace Apr 12, 2025, 1:22 PM

#

lapis hearth There is so much information FSRS is missing out on (Time of Day, Sleeping Time,...

Might be a sign that all those things don't really matter much haha

unique salmon Apr 12, 2025, 1:22 PM

#

lapis hearth Let me guess, we dont use it because it gives some weird intervals❓

Well, that one in particular would be almost impossible to use for scheduling. He has another one with RMSE of around 2.5% and log-loss of around 0.3, which is more promising for scheduling, but I wouldn't bet that any of it will ever be used in Anki

#

Btw, his neural net uses answer time, deck ID, preset ID, sibling information and whatnot

lapis hearth Apr 12, 2025, 1:22 PM

#

unique salmon Well, that one in particular would be almost impossible to use for scheduling. H...

why not

bold terrace Apr 12, 2025, 1:22 PM

#

Next graph I'd like to do is this one but with percentile on x-axis instead of actual repetitions count

#

Would be great to see that the 90-100th percentile represent ~20% of your workload

lapis hearth Apr 12, 2025, 1:23 PM

#

unique salmon Btw, his neural net uses answer time, deck ID, preset ID, sibling information an...

Really ANKIPOGGERS So what is stopping it from being implemented

lapis hearth Apr 12, 2025, 1:24 PM

#

lapis hearth Really ANKIPOGGERS So what is stopping it from being imple...

FSRS is bound to reach its full potential. Any other improvement above what it could give would require a change in framework

unique salmon Apr 12, 2025, 1:25 PM

#

lapis hearth Holy shit I just realized the number of params

I think Alex's net has like 1 mil parameters

lapis hearth Apr 12, 2025, 1:25 PM

#

Full-blown Neural Network Mode

lapis hearth Apr 12, 2025, 1:25 PM

#

unique salmon I think Alex's net has like 1 mil parameters

Holy-

unique salmon Apr 12, 2025, 1:26 PM

#

And instead of optimizing it for each user and testing on the same user, it's just pre-trained on 5k users and tested on the other 5k

lapis hearth Apr 12, 2025, 1:26 PM

#

So it could be even better than expected

unique salmon Apr 12, 2025, 1:26 PM

#

So the optimization procedure is very different

unique salmon Apr 12, 2025, 1:26 PM

#

lapis hearth So it could be even better than expected

?
I'm not sure what you mean

lapis hearth Apr 12, 2025, 1:27 PM

#

Should it not be trained and optimized on the same users

unique salmon Apr 12, 2025, 1:27 PM

#

I'm saying that unlike FSRS, where you optimize parameters for each user, this one is trained on a massive dataset and then parameters are kept fixed
So there would be no "Optimize" if it was used in Anki

lapis hearth Apr 12, 2025, 1:27 PM

#

Hmm

#

Dae wouldnt like this

unique salmon Apr 12, 2025, 1:28 PM

#

I think FSRS-6 will be good enough that there won't be much of a reason to use a neural net

lapis hearth Apr 12, 2025, 1:29 PM

#

When is it coming out presumably. I cannot wait

unique salmon Apr 12, 2025, 1:29 PM

#

lapis hearth When is it coming out presumably. I cannot wait

Idk

#

Oh, and Alex's net uses fractional interval lengths too

cursive badge Apr 12, 2025, 1:30 PM

#

If it can inherently learn enough that it doesn't need fine-tuning it would be amazing. A lot fewer support requests if there are fewer dials to twiddle 😅

lapis hearth Apr 12, 2025, 1:30 PM

#

unique salmon Oh, and Alex's net uses fractional interval lengths too

Why are you trying to seduce me😭

unique salmon Apr 12, 2025, 1:30 PM

#

So it kind of has a short-term memory model somewhere within its matrices with tons of floating point numbers

lapis hearth Apr 12, 2025, 1:30 PM

#

unique salmon So it kind of has a short-term memory model somewhere within its matrices with ...

Now I am REALLY intrigued

lapis hearth Apr 12, 2025, 2:49 PM

#

bold terrace Next graph I'd like to do is this one but with percentile on x-axis instead of a...

I don't understand these graphs. Could you help me understand them?

#

#

cosmic hedge Apr 12, 2025, 2:58 PM

#

lapis hearth I dont understand these graphs. Could you help me understand them

Load is 1/interval, grouped by cards with that number of lapses
Distribution is just the number of cards with that number of lapses
Total is the total number of lapses on cards with that number of lapses

Replace the word "lapses" with "repetitions" and you get the explanation for the repetitions one
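
A minimal sketch of the three series described above, assuming each card is available as a `(lapses, interval_days)` pair and that the grouped "load" is the sum of 1/interval over the group. The function name and input shape are hypothetical, not taken from the add-on.

```python
from collections import defaultdict

def lapse_stats(cards):
    """cards: iterable of (lapses, interval_days) pairs (hypothetical input shape)."""
    stats = defaultdict(lambda: {"load": 0.0, "distribution": 0, "total": 0})
    for lapses, interval_days in cards:
        group = stats[lapses]
        group["load"] += 1.0 / interval_days  # expected reviews per day for this card
        group["distribution"] += 1            # number of cards with this lapse count
        group["total"] += lapses              # lapses accumulated by this group
    return dict(stats)

cards = [(0, 10), (0, 20), (2, 4), (2, 4)]
stats = lapse_stats(cards)
for lapses in sorted(stats):
    print(lapses, stats[lapses])
```

Swapping the lapse count for the repetition count gives the repetitions graph in the same way.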

bold terrace Apr 12, 2025, 3:13 PM

#

Speaking of which, I made some progress on the percentile x-axis 🙂

#

It's really nice because now each bar represents 5% of your cards

#

and you see the total load ratio it represents

#

in my case, my last 5% represent 10% of my load

#

But the 5% between 60% and 65% represents only 4%

lapis hearth Apr 12, 2025, 3:16 PM

#

bold terrace Speaking of which, I made some progress on the percentile x-axis 🙂

Is this your addon

bold terrace Apr 12, 2025, 3:18 PM

#

no it's the same but it's still in a feature branch

robust hill Apr 12, 2025, 3:25 PM

#

new name dropped for him

unique salmon Apr 12, 2025, 4:00 PM

#

lapis hearth Holy-

2.7 mil
@polar maple explain why it won't be implemented in Anki

#

(aside from Dae saying "FSRS is good enough", which is quite likely)

polar maple Apr 12, 2025, 4:06 PM

#

unique salmon 2.7 mil @polar maple explain why it won't be implemented in Anki

the model size isn't a big issue, i could make a smaller version

#

the problem is probably syncing issues which i don't understand fully

polar maple Apr 12, 2025, 4:07 PM

#

bold terrace I mean, I think algorithm should be as "pure" of any external alteration, while ...

100%

unique salmon Apr 12, 2025, 4:09 PM

#

polar maple the problem is probably syncing issues which i don't understand fully

What about scheduling? I assume it's completely impossible with RWKV-P, but possible with RWKV that uses an average of three forgetting curves

cursive badge Apr 12, 2025, 4:10 PM

#

polar maple the problem is probably syncing issues which i don't understand fully

Does your NN save some kind of state in the cards? I thought it just looked at the entire revlog each time it did scheduling.

polar maple Apr 12, 2025, 4:10 PM

#

unique salmon Nope, not according to this

iirc this is fixed decay=0.5 vs fixed decay=0.2

polar maple Apr 12, 2025, 4:11 PM

#

unique salmon What about scheduling? I assume it's completely impossible with RWKV-P, but poss...

yeah RWKV does use forgetting curves so scheduling is possible, now it is an average of 128 exponential forgetting curves

unique salmon Apr 12, 2025, 4:12 PM

#

polar maple yeah RWKV does use forgetting curves so scheduling is possible, now it is an ave...

128 exponential forgetting curves
why

#

Surely there are no benefits beyond 2-4 curves

polar maple Apr 12, 2025, 4:12 PM

#

cursive badge Does your NN save some kind of state in the cards? I thought it just looked at t...

it's a recurrent nn that keeps a hidden state for each card, note, preset, deck, and a global state, on each line of the revlog the corresponding states get updated

polar maple Apr 12, 2025, 4:12 PM

#

unique salmon Surely there are no benefits beyond 2-4 curves

for theoretical interest

#

because now we can maybe later on interpret it as a probability distribution over stabilities
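
A rough sketch of why an average of exponential forgetting curves still allows scheduling: the averaged curve decreases monotonically in time, so the interval for a given DR can be found numerically. The curve form `0.9 ** (t / S)`, the equal weights, and both function names are assumptions for illustration, not the network's actual output layer.

```python
def mixture_retrievability(t, stabilities):
    """Average of exponential curves; each one reaches 90% at t == its stability (assumed form)."""
    return sum(0.9 ** (t / s) for s in stabilities) / len(stabilities)

def interval_for_retention(stabilities, desired_retention):
    """Bisection search for the t where the averaged curve crosses desired_retention (0 < DR < 1)."""
    lo, hi = 0.0, 1.0
    # Grow the upper bound until predicted retention falls below the target.
    while mixture_retrievability(hi, stabilities) > desired_retention:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if mixture_retrievability(mid, stabilities) > desired_retention:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(interval_for_retention([1.0, 10.0, 100.0], 0.9))
```

Read this way, the list of stabilities acts like samples from a distribution over S, which is the interpretation hinted at above.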

unique salmon Apr 12, 2025, 4:13 PM

#

Oh, that's interesting

cursive badge Apr 12, 2025, 4:19 PM

#

polar maple it's a recurrent nn that keeps a hidden state for each card, note, preset, deck,...

Isn't that state just regenerated from scratch at runtime as you pass it the revlog? (i.e. not persisted in the Anki DB anywhere) I'm confused how it could create sync problems.

polar maple Apr 12, 2025, 4:22 PM

#

cursive badge Isn't that state just regenerated from scratch at runtime as you pass it the rev...

it would need to be stored to improve cpu performance

#

also for cpu performance i expect maybe around 200 rows of the revlog / second, which is enough in the amortized sense imo but there could be other problems that i'm not aware of

cursive badge Apr 12, 2025, 4:24 PM

#

polar maple it would need to be stored to improve cpu performance

Oh, I didn't realise it would be slow enough to be significant. I guess the issue is any cached state would become invalid when you merge non-linear revlogs.

#

Hmmm. That's a tricky one. My first thought is you could save "snapshots" at each sync so you only have to reset to the oldest common point, but that doesn't solve it completely. You could always have a rogue device that hasn't been synced for a while that forces you to go really far back in time.

lapis hearth Apr 12, 2025, 4:29 PM

#

Is it even possible for Anki to have a neural network? Does it work like FSRS, easy to run on basic consumer-grade laptops?

unique salmon Apr 12, 2025, 4:29 PM

#

An example of a user with decay=-0.028 that Jarrett shared
That negative slope 🤣
FSRS's predictions are anti-correlated with his retention - the lower the value of R that FSRS predicts, the higher the user's retention

unique salmon Apr 12, 2025, 4:30 PM

#

lapis hearth Is it even possible for Anki to even have a neural network. Does it work like FS...

According to Alex, yes

cursive badge Apr 12, 2025, 4:30 PM

#

lapis hearth Is it even possible for Anki to even have a neural network. Does it work like FS...

It depends on the complexity. The bigger problem is it will also have to run on quite old phones.

unique salmon Apr 12, 2025, 4:31 PM

#

Distillation time! Just train a 10x smaller net on the big "teacher" net's predictions

lapis hearth Apr 12, 2025, 4:31 PM

#

cursive badge It depends on the complexity. The bigger problem is it will also have to run on ...

You can make it a compatibility issue, no upgrade to newer versions unless you have better hardware

#

But that is alien to the concept of Anki

#

And you would exclude a lot of people

#

People who live in poorer countries

cursive badge Apr 12, 2025, 4:32 PM

#

You would be surprised what you can get running on phones though. You can even run LLMs that give vaguely sensible output on phones now.

clever cargo Apr 12, 2025, 4:33 PM

#

someone's going to provide anki as a service in that case (i guess that's already ankiweb xD)

lapis hearth Apr 12, 2025, 4:33 PM

#

cursive badge You would be surprised what you *can* get running on phones though. You can even...

Yes, phones now have almost reached the limit of innovation.

#

Is it because of the millions of params that this NN is tough on a device

clever cargo Apr 12, 2025, 4:34 PM

#

lapis hearth But that is alien to the concept of Anki

already happens a lot though, being held back by addons or ankidroid's minimum supported android version being raised

#

even qt5 being dropped

lapis hearth Apr 12, 2025, 4:35 PM

#

clever cargo already happens a lot though, being held back by addons or ankidroid's minimum s...

Alright so that precedent is already there

polar maple Apr 12, 2025, 4:35 PM

#

cursive badge Hmmm. That's a tricky one. My first thought is you could save "snapshots" at eac...

i could try to train a version of the nn later on that is more robust to the order of the revlog, like maybe i randomly drop out certain chunks of the revlog, feed them out of order, etc, so in theory some of this problem could be mitigated. When there is a sync conflict, just drop one of the states and keep the more up-to-date one, but idk how much this would affect performance yet

lapis hearth Apr 12, 2025, 4:35 PM

#

@polar maple would you want to personally use a NN on your own Anki cards

polar maple Apr 12, 2025, 4:36 PM

#

lapis hearth @polar maple would you want to personally use a NN on your own Anki car...

if it were implemented rn? yeah

lapis hearth Apr 12, 2025, 4:37 PM

#

polar maple if it implemented rn? yeah

I have heard neural nets do some weird crap thanks to @unique salmon

#

If it is safe enough, I am all for it

polar maple Apr 12, 2025, 4:37 PM

#

not wrong but we'll have to try it to see

unique salmon Apr 12, 2025, 4:37 PM

#

Anyway, can we all just collectively convince Jarrett to clamp decay to (-0.15, -0.7) or (-0.15, -0.8)?

lapis hearth Apr 12, 2025, 4:37 PM

#

unique salmon Anyway, can we all just collectively convince Jarrett to clamp decay to (-0.15, ...

what gives

polar maple Apr 12, 2025, 4:38 PM

#

unique salmon Anyway, can we all just collectively convince Jarrett to clamp decay to (-0.15, ...

keep the internal memory model at (-0.01, ) maybe since it does seem to improve the metrics, but when scheduling clamp it

lapis hearth Apr 12, 2025, 4:38 PM

#

He said he would add optimizable decay didnt he

unique salmon Apr 12, 2025, 4:38 PM

#

lapis hearth I have heard neural nets do some weird crap thanks to @unique salmon

It's just that with something like FSRS it's easy to ensure certain behavior, like the Hard interval never being greater than Good interval, Again always decreasing the interval and never increasing, etc.
it's much harder to do that with neural nets

unique salmon Apr 12, 2025, 4:38 PM

#

polar maple keep the internal memory model at (-0.01, ) maybe since it does seem to improve ...

https://tenor.com/view/the-office-no-angry-steve-carell-michael-scott-gif-5606969


unique salmon Apr 12, 2025, 4:38 PM

#

lapis hearth what gives

Do you want your first interval to be 10,000,000,000 days?

#

At 70% DR

lapis hearth Apr 12, 2025, 4:39 PM

#

unique salmon It's just that with something like FSRS it's easy to ensure certain behavior, li...

f no

polar maple Apr 12, 2025, 4:39 PM

#

unique salmon Do you want your first interval to be 10,000,000,000 days?

sure if it's the truth

lapis hearth Apr 12, 2025, 4:39 PM

#

But only the weak would choose DR at 70%

unique salmon Apr 12, 2025, 4:39 PM

#

polar maple sure if it's the truth

polar maple Apr 12, 2025, 4:39 PM

#

if i input the 100 most common english words to anki and i want a 99% DR then i also expect an infinite interval 🤣

clever cargo Apr 12, 2025, 4:40 PM

#

there's no short-term memory model, and there's no long-term dementia-or-death model either apparently

polar maple Apr 12, 2025, 4:40 PM

#

anki now predicts your death

lapis hearth Apr 12, 2025, 4:40 PM

#

polar maple anki now predicts your death

If it does some neural shit then yeah

polar maple Apr 12, 2025, 4:40 PM

#

unique salmon

a good scheduler easily falls out of a good memory model

#

so we should get a good memory model first

unique salmon Apr 12, 2025, 4:41 PM

#

CIVIL WAR

polar maple Apr 12, 2025, 4:42 PM

#

that's what separates FSRS and SM-2 in the first place

unique salmon Apr 12, 2025, 4:42 PM

#

https://tenor.com/view/peeporiot-peeporiot-havi-gif-23057486


polar maple Apr 12, 2025, 4:42 PM

#

FSRS predicts R, SM-2 doesn't

#

so FSRS claims superiority

unique salmon Apr 12, 2025, 4:43 PM

#

Sure, but at some point making R more accurate by a fraction of a percent at the cost of user experience is just a terrible trade-off

cursive badge Apr 12, 2025, 4:43 PM

#

polar maple anki now predicts your death

It turns out the only way to get a perfect scheduler is to first invent an oracle algorithm that perfectly simulates the future. All the "this is the day you die" stuff is just a bonus. ;p

unique salmon Apr 12, 2025, 4:44 PM

#

And decay between 0 and -0.1 is exactly such a case

bold terrace Apr 12, 2025, 4:45 PM

#

Still drafty, but if people are interested in checking their avg load by 5% quantile

📎 searchStatsExtended.ankiaddon

#

sum load sry

polar maple Apr 12, 2025, 4:45 PM

#

unique salmon Sure, but at some point making R more accurate by a fraction of a percent at the...

so we have the scheduler be a layer on top of the memory model that makes the user experience nicer

unique salmon Apr 12, 2025, 4:48 PM

#

How? Please no "We predict R using one value of decay but use a different value for scheduling"

#

Capping max. interval? Then all intervals will just be equal to the max. interval

#

Capping the relative increase between two consecutive intervals? Same issue, though probably better in practice because it's harder for the user to spot

#

Maybe some combination of capping both interval lengths AND the relative increase. But then that could lead to TR not being equal to DR

#

Decay close to 0 just introduces intervals that are way too insane

polar maple Apr 12, 2025, 4:51 PM

#

unique salmon How? Please no "We predict R using one value of decay but use a different value ...

why not?

unique salmon Apr 12, 2025, 4:51 PM

#

https://github.com/open-spaced-repetition/fsrs-optimizer/pull/169#issuecomment-2798834253

GitHub

Feat/FSRS-6 by L-M-Sherlock · Pull Request #169 · open-spaced-rep...

candidate for FSRS-6
Log Loss: 0.3273 -> 0.3257 (-0.0016)
RMSE(bins): 0.0518 -> 0.0510 (-1.5%)
Model: FSRS-5-dev
Total number of users: 9999
Total number of reviews: 349923850
Weighte...

#

S=1 day
Decay=-0.01, the first interval at DR=80% is around 130 000 days, the first interval at DR=70% is around 10^11 days
Decay=-0.025, the first interval at DR=80% is around 120 days, the first interval at DR=70% is around 25 000 days
Decay=-0.1, the first interval at DR=80% is around 4.5 days, the first interval at DR=70% is around 18 days
Decay=-0.15, the first interval at DR=80% is around 3.4 days, the first interval at DR=70% is around 9.5 days
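
The figures above can be roughly reproduced with a short sketch, assuming the FSRS power forgetting curve R(t, S) = (1 + factor * t / S) ^ decay with factor chosen so that R(S, S) = 90%. The function names are illustrative, not the optimizer's API.

```python
def forgetting_curve(t, stability, decay):
    """Predicted retrievability after t days (decay is negative, as in this thread)."""
    factor = 0.9 ** (1.0 / decay) - 1.0  # makes R equal 0.9 when t == stability
    return (1.0 + factor * t / stability) ** decay

def next_interval(stability, decay, desired_retention):
    """Invert the forgetting curve: days until R drops to desired_retention."""
    factor = 0.9 ** (1.0 / decay) - 1.0
    return stability / factor * (desired_retention ** (1.0 / decay) - 1.0)

for decay in (-0.01, -0.025, -0.1, -0.15):
    i80 = next_interval(1.0, decay, 0.80)
    i70 = next_interval(1.0, decay, 0.70)
    print(f"decay={decay}: DR=80% -> {i80:,.1f} days, DR=70% -> {i70:,.1f} days")
```

The same function with S=100, decay=-0.01 and DR=95% gives an interval of roughly half a day, which is why a tiny decay is a problem at high DR as well as at low DR.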

polar maple Apr 12, 2025, 4:52 PM

#

remember that predicted R does affect S updates so it is in our best interest to have it be as accurate as possible

hasty fractal Apr 12, 2025, 4:52 PM

#

polar maple sure if it's the truth

💀 tbh

#

truth doesn't matter if people don't use anki and it's not like there's a global state that'll help us

polar maple Apr 12, 2025, 4:53 PM

#

hasty fractal truth doesn't matter if people don't use anki and it's not like there's a global...

it'll be fine, just have sane defaults in the scheduler layer

#

i just want the memory model and the scheduler to have separate responsibilities

#

don't lie in the memory model to get good scheduling, keep them separate

lapis hearth Apr 12, 2025, 4:54 PM

#

polar maple i just want the memory model and the scheduler to have separate responsibilities

Are you actually going to go ahead with the Neural Net thing. It seems to have some sort of short term memory model inside it

unique salmon Apr 12, 2025, 4:55 PM

#

unique salmon > S=1 day > Decay=-0.01, the first interval at DR=80% is around 130 000 days, th...

S=1 day
Decay=-0.01, the first interval at DR=80% is around 130 000 days, the first interval at DR=70% is around 10^11 days
Decay=-0.025, the first interval at DR=80% is around 120 days, the first interval at DR=70% is around 25 000 days
Decay=-0.1, the first interval at DR=80% is around 4.5 days, the first interval at DR=70% is around 18 days
Decay=-0.15, the first interval at DR=80% is around 3.4 days, the first interval at DR=70% is around 9.5 days

Let's just vote based on this
@bold terrace @hasty fractal @polar maple @lapis hearth @cursive badge @cosmic hedge
I want to choose the limit of the "decay" parameter in the upcoming FSRS-6. The closer it is to 0, the longer the intervals at DR<90%. I want you guys to vote on what the limit should be based on these examples

hasty fractal Apr 12, 2025, 4:55 PM

#

polar maple i just want the memory model and the scheduler to have separate responsibilities

oh yea, then it probably doesn't matter? develop a value-alignment scheduler that cares about user experience to filter out anything crazy.

bold terrace Apr 12, 2025, 4:55 PM

#

unique salmon Capping max. interval? Then all intervals will just be equal to the max. interva...

TBF this would not be such a big problem. A 10k card at 1y interval is like 27/day
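
The arithmetic behind that estimate, as a tiny sketch. It assumes every card has already reached the capped interval and that reviews are spread evenly across days; the function name is made up for illustration.

```python
def steady_state_reviews_per_day(card_count, max_interval_days):
    """Each card comes back once per max_interval_days, so the daily load is the ratio."""
    return card_count / max_interval_days

print(round(steady_state_reviews_per_day(10_000, 365), 1))  # 27.4
```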

polar maple Apr 12, 2025, 4:56 PM

#

lapis hearth Are you actually going to go ahead with the Neural Net thing. It seems to have s...

no it's too much work for now, i'd have to find the motivation and this would be for a fork since there is a low chance this would get into the main anki

polar maple Apr 12, 2025, 4:56 PM

#

hasty fractal oh yea, then it probably doesn't matter? develop a value-alignment scheduler tha...

exactly

polar maple Apr 12, 2025, 4:57 PM

#

unique salmon > S=1 day > Decay=-0.01, the first interval at DR=80% is around 130 000 days, th...

you cannot just ignore the idea of keeping the scheduler as a separate layer

#

how about 0.01 internally and 0.1 externally?

hasty fractal Apr 12, 2025, 4:57 PM

#

unique salmon > S=1 day > Decay=-0.01, the first interval at DR=80% is around 130 000 days, th...

I vote on Alex's idea.

unique salmon Apr 12, 2025, 4:58 PM

#

polar maple how about 0.01 internally and 0.1 externally?

I've said before - I HIGHLY doubt that other parameters are "decay-agnostic" aka that they converge to the same values regardless of the choice of decay, so for any decay all other parameters will remain the same

#

If you use a different value of decay, the other parameters will be sub-optimal

lapis hearth Apr 12, 2025, 4:59 PM

#

unique salmon Apr 12, 2025, 5:00 PM

#

You messed up the description of Decay =-0.0025 (and it's -0.025 btw), but oh well

lapis hearth Apr 12, 2025, 5:00 PM

#

unique salmon I've said before - I **HIGHLY** doubt that other parameters are "decay-agnostic"...

I don't think we should have an influence on it ourselves, because then you would allow an arbitrary factor into it. That no such absurd intervals appear should come from within the model

polar maple Apr 12, 2025, 5:01 PM

#

unique salmon I've said before - I **HIGHLY** doubt that other parameters are "decay-agnostic"...

yeah but 0.1 wouldn't be used as a true decay in this case, it would just be a way to make intervals shorter. Any way that makes intervals shorter would work here

unique salmon Apr 12, 2025, 5:03 PM

#

lapis hearth I dont think we should have an influence on it ourselves, then you would allow a...

Btw, with decay of -0.01 for a card with S=100 days, at 95% DR the first interval would be half a day
You would love it 🤣

#

Nothing like reviewing every card every day, lel

polar maple Apr 12, 2025, 5:04 PM

#

lapis hearth I dont think we should have an influence on it ourselves, then you would allow a...

there is a conflict when a memory model tries to model R exactly but we schedule according to DR. If I want to study common english words, my R will never reach 0.9 so the intervals will be really long. But I want to keep the memory model pure rather than having it make wrong predictions on purpose, rather, i'd let a scheduler layer do the dirty work

cursive badge Apr 12, 2025, 5:06 PM

#

I abstain because I've not been following the conversation close enough to make an informed choice.
Also isn't it suspected that at a certain stability memories enter another domain where they are effectively permanent. Hence that other algorithm that begins with S that Jarrett worked on.

polar maple Apr 12, 2025, 5:06 PM

#

unique salmon Btw, with decay of -0.01 for a card with S=100 days, at 95% DR the first interva...

it could be accurate, if a user adds 9 known cards for every 1 card that they actually need to learn, this would be the sort of forgetting curve that you would expect

cursive badge Apr 12, 2025, 5:07 PM

#

Some things might just be so easy to remember that they one-shot into permanent status and the tiny decays are just silly ways to try to model that.

lapis hearth Apr 12, 2025, 5:09 PM

#

polar maple it could be accurate, if a user adds 9 known cards for every 1 card that they ac...

This is the thing. You don't know whether this decay value is right or not, however absurd it looks. Only the algorithm knows. The question asked is not a valid question

polar maple Apr 12, 2025, 5:09 PM

#

@unique salmon btw how did you implement the regularization for decay?

unique salmon Apr 12, 2025, 5:10 PM

#

polar maple @unique salmon btw how did you implement the regularization for decay?

https://github.com/open-spaced-repetition/srs-benchmark/blob/3ff84018756d1a2b098ef4724cc7818cbbf48cbb/other.py#L994

https://github.com/open-spaced-repetition/srs-benchmark/blob/3ff84018756d1a2b098ef4724cc7818cbbf48cbb/other.py#L1031


#

I have an extremely dumb and janky idea:

1. Optimize parameters
2. If decay >-0.1 (aka <0.1 in absolute terms), optimize them again with decay=-0.1
3. Keep both sets of parameters
4. Use the first set with very small decay to schedule intervals
5. Use the second set as a "sanity check": the intervals given by the first set of parameters AT ANY DR should not be shorter than with the second set at DR=99% AND they also should not be longer than with the second set at 70%

So when Anki calculates the interval length for a card, it checks this:
interval(params_2, S, 99%) <= interval(params_1, S, users_DR) <= interval(params_2, S, 70%)
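
The check above can be sketched as a clamp. This is only an illustration under simplifying assumptions: each parameter set is reduced to its decay value, both sets share the same S as in the inequality, and `interval` uses the FSRS power forgetting curve. `sanity_checked_interval` is a hypothetical name, not an Anki function.

```python
def interval(stability, decay, retention):
    """Days until retrievability drops to `retention` (decay is negative)."""
    factor = 0.9 ** (1.0 / decay) - 1.0
    return stability / factor * (retention ** (1.0 / decay) - 1.0)

def sanity_checked_interval(stability, decay_1, decay_2, desired_retention):
    """Schedule with decay_1, but keep the result inside the band given by decay_2."""
    lower = interval(stability, decay_2, 0.99)  # never shorter than set 2 at DR=99%
    upper = interval(stability, decay_2, 0.70)  # never longer than set 2 at DR=70%
    raw = interval(stability, decay_1, desired_retention)
    return min(max(raw, lower), upper)

# decay_1 = -0.01 (optimized), decay_2 = -0.1 (re-optimized with the clamp)
print(sanity_checked_interval(1.0, -0.01, -0.1, 0.70))
print(sanity_checked_interval(100.0, -0.01, -0.1, 0.95))
```

In a real implementation the two parameter sets would most likely produce different S values for the same card, so the band would have to be computed from the second set's own S.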

polar maple Apr 12, 2025, 5:11 PM

#

cursive badge Some things might just be so easy to remember that they one-shot into permanent ...

if the resulting forgetting curve accurately models R, i don't see the problem?

#

just add a scheduling layer to play nice to human values and output a smaller interval

polar maple Apr 12, 2025, 5:12 PM

#

lapis hearth This is the thing. You dont know where this decay value or not however absurd it...

not sure what you mean

lapis hearth Apr 12, 2025, 5:13 PM

#

polar maple not sure what you mean

What I mean is an interval could look very absurd to you when it is actually the truth. But then you want to end up choosing some decay value which makes intervals look better to the eye.

I am saying it should not come to this

polar maple Apr 12, 2025, 5:15 PM

#

lapis hearth What I mean is an interval could look very absurd to you when it is actually the...

well that's what expertium wants, 0.01 decay could be closer to the truth but we have a prior that says 0.1 looks better to us

#

@unique salmon can you try with a much higher std?

unique salmon Apr 12, 2025, 5:16 PM

#

lapis hearth What I mean is an interval could look very absurd to you when it is actually the...

Now I know where you belong on this scale

polar maple Apr 12, 2025, 5:16 PM

#

increase std to a very high value until you can see the decays are stuck at 0.2, then decrease it a bit to let it learn near the neighbourhood of [-0.1, etc]

unique salmon Apr 12, 2025, 5:16 PM

#

polar maple @unique salmon can you try with a much higher std?

You mean lower? Higher = less regularization
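
To illustrate why a higher std means less regularization, here is a generic Gaussian-prior penalty. This is an assumption about the general shape of such a term, not a copy of the code in the linked `other.py`; the prior mean of -0.2 and the function name are placeholders.

```python
def decay_penalty(decay, prior_mean=-0.2, prior_std=0.1):
    """Quadratic penalty added to the loss; it shrinks as prior_std grows."""
    return (decay - prior_mean) ** 2 / (2.0 * prior_std ** 2)

for std in (0.05, 0.1, 0.5):
    print(std, decay_penalty(-0.01, prior_std=std))
```

With a small std, moving decay away from the prior mean is expensive, so optimized values stay close to it. With a large std the penalty is nearly flat and decay can drift toward 0.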

cursive badge Apr 12, 2025, 5:16 PM

#

polar maple if the resulting forgetting curve accurately models R, i don't see the problem?

I mean at a certain point is it even worth trying to model R any more because it has entered another domain where external interference matters more than time since you saw it last. You can just stop scheduling it and retire the card as "memorized". It is up to the user to then decide if/when they want to cram the retired cards.

polar maple Apr 12, 2025, 5:16 PM

#

unique salmon You mean lower? Higher = less regularization

yeah lower

robust hill Apr 12, 2025, 5:16 PM

#

I'm on both sides at once

polar maple Apr 12, 2025, 5:17 PM

#

cursive badge I mean at a certain point is it even worth trying to model R any more because it...

in order to know the latter, you need to model R accurately. Only after we have modelled R can we make these decisions that you want

lapis hearth Apr 12, 2025, 5:18 PM

#

unique salmon Now I know where you belong on this scale

Man you are all for whipping people who complain about having FSRS schedule them long intervals 😭

cosmic hedge Apr 12, 2025, 5:20 PM

#

unique salmon > S=1 day > Decay=-0.01, the first interval at DR=80% is around 130 000 days, th...

https://github.com/open-spaced-repetition/fsrs-optimizer/pull/169#issuecomment-2796547056
https://github.com/open-spaced-repetition/fsrs-optimizer/pull/169#issuecomment-2798453145
what are metrics for if not deciding the answers to things like this?


cursive badge Apr 12, 2025, 5:21 PM

#

polar maple in order to know the latter, you need to model R accurately. Only after we have ...

But you get into this weird place where if your optimisation finds decay to be tiny for a preset do you even bother ever modelling R. You can just retire cards as soon as they graduate their learning steps.

unique salmon Apr 12, 2025, 5:22 PM

#

I wonder what's the ratio of "time spent arguing about the limits of decay" versus "time spent coding FSRS-6"

cursive badge Apr 12, 2025, 5:22 PM

#

At that point R is kind of a lie anyway because we are operating outside of the bounds where it is valid 🤷‍♂️

polar maple Apr 12, 2025, 5:23 PM

#

cursive badge At that point it R kind of a lie anyway because we are operating outside of the ...

R is never a lie unless the user is lying about their answer buttons

cosmic hedge Apr 12, 2025, 5:23 PM

#

unique salmon I wonder what's the ratio of "time spent arguing about the limits of decay" vers...

wouldn't that just be arguing 1-0 coding for everyone but Jarrett XD

cursive badge Apr 12, 2025, 5:23 PM

#

unique salmon I wonder what's the ratio of "time spent arguing about the limits of decay" vers...

Don't make us get Jake back in here to bully you into learning rust ;p

unique salmon Apr 12, 2025, 5:23 PM

#

cosmic hedge woulden't that just be `arguing 1-0 coding` for everyone but Jarret XD

Hm, yeah

cosmic hedge Apr 12, 2025, 5:24 PM

#

unique salmon Hm, yeah

given that ratio, I'm inclined to trust Jarrett's judgement on things like clamping decay.

unique salmon Apr 12, 2025, 5:25 PM

#

Well, it wouldn't be zero for Jarrett, since me and him have been arguing

polar maple Apr 12, 2025, 5:25 PM

#

cursive badge But you get into this weird place where if your optimisation finds decay to be t...

yeah this is a win, i don't want to study cards that don't need to be studied. but my point was that to know this in the first place, you need to accurately model memory and if a very low decay is what is required then we should be using that

unique salmon Apr 12, 2025, 5:25 PM

#

cosmic hedge given that ratio, I'm inclined to trust Jarrett's judgement on things like clamp...

Do you like it when your first interval is 273 million years?

#

I'm not even joking, first interval for S=1 at decay=-0.01 is like 10^11 days at DR=70%

cosmic hedge Apr 12, 2025, 5:26 PM

#

if I'd remember that card in 273 million years then yes

unique salmon Apr 12, 2025, 5:29 PM

#

cosmic hedge if id remember that card in 273 million years then yes

#

I don't think Jarrett has uploaded a file with metrics for all 10k users with optimizable decay, has he?

#

Actually, no, I think we need something a little different - find all users with decay >-0.1 and re-run the optimization for them with decay clamped to -0.1, and see how much worse the metrics become

cursive badge Apr 12, 2025, 5:32 PM

#

polar maple yeah this is a win, i don't want to study cards that don't need to be studied. b...

I agree it is a win. I'm not saying any of this is necessarily bad. I'm just saying at a certain point maybe we accept that the decay value is so off the scale it is silly and the modelled R is not really something we should pay attention to any more. External factors are going to be much more important than our interval in determining if we remember something.

polar maple Apr 12, 2025, 5:33 PM

#

cursive badge I agree it is a win. I'm not saying any of this is necessarily bad. I'm just say...

yep. So we still accurately model R with a low decay as this is what would tell us that we have this problem in the first place, and then let a scheduler deal with making the user experience right

polar maple Apr 12, 2025, 5:34 PM

#

unique salmon I don't think Jarrett has uploaded a file with metrics for all 10k users with op...

https://raw.githubusercontent.com/open-spaced-repetition/srs-benchmark/0636c1d8479d97a87770c1d51a0f30ec41062ba2/result/FSRS-6-recency.jsonl

#

i count 20 parameters so there is prob decay

unique salmon Apr 12, 2025, 5:35 PM

#

unique salmon Actually, no, I think we need something a little different - find all users with...

Guess we wait for Jarrett to do this

bold terrace Apr 12, 2025, 5:36 PM

#

unique salmon I'm not even joking, first interval for S=1 at decay=-0.01 is like 10^11 days at...

How many people have a decay of 0.01 ?

#

Also, for DR between 90% and 100%, decay of 0.01 will make it learn almost every day

#

So all that risk of million-year intervals can be easily controlled by DR

#

If someone has a 10-million-year interval at DR=70% but only a few days at DR=90%, I don't think it's a big issue

#

It's just that yes, workload compared to DR won't be that easy to map anymore, but that's normal

unique salmon Apr 12, 2025, 5:38 PM

#

bold terrace How many people have a decay of 0.01 ?

Based on the scientific method of "bro look at the image", I'd say 20% of users have decay between 0 and (-)0.1, and 0.5-1.0% have decay of around (-)0.01

unique salmon Apr 12, 2025, 5:39 PM

#

bold terrace So, all those risk of millions of year of interval, can be easily controlled by ...

Reviewing every day is also a "risk"

#

It defeats the purpose of Anki

bold terrace Apr 12, 2025, 5:39 PM

#

Maybe but that's why the fact it's controlled by DR is fine

#

To be honest, it depends a lot on the person's approach, but I think with a dynamic decay everyone can now be represented, so it's a big win

#

The people who never want to review a card again if they have at least a 70% chance of recalling it forever? That's a win
The people who want to review their cards endlessly with a DR of 99% and an aggressive decay? That's a win
People who want sensible intervals and control their workload with DR? That's a win

unique salmon Apr 12, 2025, 5:47 PM

#

Alright, on Github I told Jarrett that if he wants to, he can re-run FSRS-6 with opt. decay clamped to (-0.1, -0.8) or (-0.15, -0.8) and check how much worse metrics become

bold terrace Apr 12, 2025, 5:52 PM

#

Does anyone know how to get the revlog in CSV format for the optimizer? https://github.com/open-spaced-repetition/fsrs-optimizer


#

SELECT cid as card_id, id as review_time, ease as review_rating, type as review_state, time as review_duration FROM revlog ?
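
One way to turn that query into a CSV file, as a sketch using only the Python standard library. It assumes you run it against a copy of the collection file (an SQLite database) while Anki is closed, and that the optimizer accepts the column names from the query above. `export_revlog` and the file paths are placeholders; the `ORDER BY` is added here only to make the output deterministic.

```python
import csv
import sqlite3

QUERY = """
SELECT cid  AS card_id,
       id   AS review_time,
       ease AS review_rating,
       type AS review_state,
       time AS review_duration
FROM revlog
ORDER BY cid, id
"""

def export_revlog(collection_path, csv_path):
    """Write the revlog table of an Anki collection (SQLite file) to a CSV file."""
    conn = sqlite3.connect(collection_path)
    try:
        cursor = conn.execute(QUERY)
        header = [column[0] for column in cursor.description]
        with open(csv_path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(cursor)
    finally:
        conn.close()

# Example (paths are placeholders):
# export_revlog("collection_copy.anki2", "revlog.csv")
```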

unique salmon Apr 12, 2025, 5:59 PM

#

Anki nerds arguing whether some parameter in the Poopen-Farten algorithm should be 0.1234 or 0.1235

#

https://tenor.com/view/fighting-gif-25431986


bold terrace Apr 12, 2025, 6:32 PM

#

40 min later the optimizer runs 😂

#

0.2924

#

I guess that would be my new decay

#

For my normal D deck the result would be
"w": [0.1687, 1.1435, 3.1934, 20.4036, 7.2316, 0.5491, 2.0316, 0.0686, 1.3334, 0.1155, 0.8393, 1.8538, 0.1024, 0.3336, 2.3554, 0.1919, 3.0933, 0.7447, 0.3726, 0.079, 0.1328],

My current log loss is 0.3530 and RMSE 3.23; the optimizer now tells me:

Loss before training: 0.3686
Loss after training: 0.3654
Last rating = all
R-squared: 0.8470
MAE: 0.0077
ICI: 0.0064
E50: 0.0043
E90: 0.0154
EMax: 0.1520
RMSE(bins): 0.0257
AUC: 0.6197

#

RMSE from 3.23 to 0.0257 seems like a violent upgrade

#

Seems my decay for that deck is a .13 instead of the previous 0.20

#

Let's see on the hard now

#

"w": [0.0104, 0.0222, 0.0743, 0.0617, 7.766, 0.2282, 2.4887, 0.0302, 0.9422, 0.2648, 0.4128, 1.8164, 0.1254, 0.2906, 2.2589, 0.2292, 2.9629, 0.6093, 0.1445, 0.1923, 0.3794],

#

current logloss : 0.4395, RMSE:4.42%

Loss before training: 0.7136
Loss after training: 0.6013
Last rating = all
R-squared: 0.9718
MAE: 0.0177
ICI: 0.0129
E50: 0.0103
E90: 0.0233
EMax: 0.0565
RMSE(bins): 0.0470
AUC: 0.6616

Not much gain on that one, it seems it's even worse 🤔

unique salmon Apr 12, 2025, 7:00 PM

#

It's weird that there is such a big discrepancy in log-loss. Something's off

bold terrace Apr 12, 2025, 7:00 PM

#

The 0.13 and 0.379 decays might make sense though: for the first deck, I highly doubt my retention would drop below 50-60% even if I didn't review the cards for multiple months
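
To put numbers on that intuition, here is a small sketch of a power forgetting curve with an adjustable decay. It assumes the curve has the form R(t, S) = (1 + factor * t / S) ^ (-decay), with factor chosen so that R is 90% when t equals S, and with decay written as a positive number like in the parameter lists above. The stability of 30 days and the 180-day gap are made-up inputs, not values from either deck.

```python
def retrievability(t_days, stability, decay):
    """Power forgetting curve; returns 0.9 when t_days == stability."""
    # factor is derived from decay so that the curve passes through
    # R = 0.9 at t = S for any positive decay (assumed FSRS-6 form).
    factor = 0.9 ** (-1.0 / decay) - 1.0
    return (1.0 + factor * t_days / stability) ** (-decay)


if __name__ == "__main__":
    # Hypothetical card: stability of 30 days, not reviewed for 180 days.
    for decay in (0.1328, 0.3794):
        r = retrievability(t_days=180, stability=30, decay=decay)
        print(f"decay={decay}: predicted R after 180 days = {r:.3f}")
```

With the same stability, the smaller decay gives a flatter tail, so predicted R stays higher once the card is overdue.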

unique salmon Apr 12, 2025, 7:00 PM

#

bold terrace RMSE from 3.23 to 0.0257 seems like a violent upgrade

It's from 3.23% to 2.57%

bold terrace Apr 12, 2025, 7:00 PM

#

Yes sorry

#

but still violent

#

in the good sense

unique salmon Apr 12, 2025, 7:02 PM

#

bold terrace current logloss : 0.4395, RMSE:4.42% Loss before training: 0.7136 Loss after...

current logloss : 0.4395, RMSE:4.42%
Loss before training: 0.7136
Loss after training: 0.6013

yeah idk man, something's really off. I cannot think of any explanation why log-loss is so different that doesn't involve at least one of the following 2:

Anki/google Colab include/exclude different cards
One of the optimizers is bugged

bold terrace Apr 12, 2025, 7:02 PM

#

#

the left one is for my hard deck

#

right one for the normal/easy

unique salmon Apr 12, 2025, 7:03 PM

#

Jesus that is dogshit calibration

#

On both

polar maple Apr 12, 2025, 7:03 PM

#

bold terrace

is this with the optimizable decay?

bold terrace Apr 12, 2025, 7:03 PM

#

laughcry

#

Yes

polar maple Apr 12, 2025, 7:04 PM

#

lol

bold terrace Apr 12, 2025, 7:04 PM

#

I got ~.12 decay for my normal D deck, and ~.37 for my hard one

#

Thing is, I only review things at my DR

unique salmon Apr 12, 2025, 7:04 PM

#

For your hard one there's almost no correlation between predicted retention and actual retention

bold terrace Apr 12, 2025, 7:04 PM

#

Soooo I guess the "actual R" for everything outside 90% is dogshit 😄

polar maple Apr 12, 2025, 7:05 PM

#

for the hard deck was it from taking lower D cards from the normal deck?

bold terrace Apr 12, 2025, 7:05 PM

#

polar maple for the hard deck was it from taking lower D cards from the normal deck?

High D

robust hill Apr 12, 2025, 7:05 PM

#

its like

bold terrace Apr 12, 2025, 7:05 PM

#

the spike at ~97% D

#

robust hill Apr 12, 2025, 7:06 PM

#

every day the topic of discussion is changed so much

bold terrace Apr 12, 2025, 7:06 PM

#

my Retention for the hard above

#

The one for the normal

#

I don't have to complain to be honest

#

But yeah those graphs are funky

polar maple Apr 12, 2025, 7:06 PM

#

ok so i guess since D is closely related to the lapse ratio, it is already going to be flattened to be a certain R

bold terrace Apr 12, 2025, 7:07 PM

#

Yep

#

That's why I also think some clustering could be interesting

#

there are really different profiles of card/review history inside the same deck

unique salmon Apr 12, 2025, 7:09 PM

#

robust hill every day the topic of discussion is changed so much

Oh, right, you haven't participated in The Civil War of Decay
Here: #1282005522513530952 message

robust hill Apr 12, 2025, 7:17 PM

#

yk what i vote for .001 decay

bold terrace Apr 12, 2025, 7:17 PM

#

Last test, I'll run on both

#

Paste this into your scheduling code
{
    // Generated, Optimized anki deck settings
    "deckName": "revlog.yomitan.both",// PLEASE CHANGE THIS TO THE DECKS PROPER NAME
    "w": [0.0564, 0.3174, 2.3289, 17.0138, 6.991, 0.8772, 2.3117, 0.001, 1.1084, 0.1602, 0.6028, 1.7299, 0.122, 0.2495, 2.1961, 0.1854, 3.1603, 0.7786, 0.3116, 0.1502, 0.3762],
    "requestRetention": 0.7,
    "maximumInterval": 36500,
},

Loss before training: 0.5829
Loss after training: 0.5331
Last rating = all
R-squared: 0.9567
MAE: 0.0181
ICI: 0.0109
E50: 0.0087
E90: 0.0280
EMax: 0.0573
RMSE(bins): 0.0332
AUC: 0.6626

From logloss .4225 and RMSE .0344

#

Dogshit version 2

#

😄

unique salmon Apr 12, 2025, 7:23 PM

#

Congrats, now there is negative correlation between FSRS predictions and real retention

polar maple Apr 12, 2025, 7:25 PM

#

on fsrs-optimizer does it use a train/test split?

#

i don't see how the calibration can be so bad if it is trained and evaluated on the same data

bold terrace Apr 12, 2025, 7:25 PM

#

To be fair I have almost no card with Predicted R under 80%
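
A simplified sketch of what a calibration plot computes, to show why sparse bins look so wild: reviews are grouped by predicted R, and each bin's mean prediction is compared with the observed recall rate. This is not the optimizer's RMSE(bins) metric, which as far as I know bins reviews differently; the bin count, function name and data below are made up for illustration.

```python
from collections import defaultdict


def calibration_table(predictions, outcomes, n_bins=10):
    """Return [(bin_index, count, mean_predicted, observed_recall), ...]."""
    bins = defaultdict(list)
    for predicted, recalled in zip(predictions, outcomes):
        # A prediction of exactly 1.0 is placed in the top bin.
        index = min(int(predicted * n_bins), n_bins - 1)
        bins[index].append((predicted, recalled))

    table = []
    for index in sorted(bins):
        pairs = bins[index]
        count = len(pairs)
        mean_predicted = sum(p for p, _ in pairs) / count
        observed_recall = sum(y for _, y in pairs) / count
        table.append((index, count, mean_predicted, observed_recall))
    return table


if __name__ == "__main__":
    # Invented data: most reviews sit near 90% predicted R, one lone outlier.
    predicted = [0.95, 0.92, 0.91, 0.45]
    recalled = [1, 1, 0, 1]
    for row in calibration_table(predicted, recalled):
        print(row)
```

A bin that holds a single review can only show 0% or 100% actual recall, so bins far from the DR say very little when almost no reviews land in them.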

unique salmon Apr 12, 2025, 7:27 PM

#

polar maple on fsrs-optimizer does it use a train/test split?

https://github.com/open-spaced-repetition/fsrs-optimizer/blob/5bc00d74dc6d09af8b657171ba9ba5b66bd8175f/src/fsrs_optimizer/fsrs_optimizer.py#L1232
It can, but it's set to False by default

unique salmon Apr 12, 2025, 7:36 PM

#

bold terrace Paste this into your scheduling code { // Generated, Optimized anki deck...

Maybe you excluded suspended cards in Anki, but not in the optimizer?

#

In Anki suspended cards are excluded by default, but in the google colab optimizer it's the opposite

#

That could explain the difference in log-loss

#

Btw, this is the hardest deck I have and it has reasonable calibration

#

Not great, but at least somewhat reasonable

bold terrace Apr 12, 2025, 7:41 PM

#

The distribution of your predicted R also looks better

#

But for example I can't really have reviews with predicted R of 0.2-0.6

#

My DR was at 80-90% and I never skip a day, so having a 40% is quite unlikely

unique salmon Apr 12, 2025, 7:47 PM

#

bold terrace Dogshit version 2

Sure, but on your image you can see bins where predicted R is not 90%

bold terrace Apr 12, 2025, 7:48 PM

#

Sure Sure

#

With my Filtered Deck I also think I'm able to really squeeze the predicted R close to the DR

#

which can also explain that distribution

unique salmon Apr 12, 2025, 7:50 PM

#

I really want you to try not using filtered decks in whatever version of Anki will have FSRS-6 + fine-tuned LB
Fine-tuned LB is guaranteed to make it into the next release, idk about FSRS-6

bold terrace Apr 12, 2025, 7:56 PM

#

Well my workflow is quite simple, I have one Filtered Deck for R<DR, and I keep checking the ratio it represents

Interestingly, moving from 85% to 90% DR made the number of items scheduled by the Filtered Deck lower than before

#

I've also had higher and higher stability these past weeks, so I think that also plays a role

#

The future avg predicted R is also closer to DR, thus limiting the need for those Filtered Reviews

#

Still, this is without LB

#

So I guess it's the fuzz that still pushes the due dates a bit further than it should

unique salmon Apr 12, 2025, 8:03 PM

#

I increased the weight of interval lengths in the fuzz formula, making it more likely to schedule cards earlier

#

So in the next Anki release LB will be better

#

https://github.com/ankitects/anki/pull/3864

GitHub

Fine-tune load balancer by Expertium · Pull Request #3864 · ankit...

Copying my Discord comment:
Alright, I've set up the optimization loop (using a Bayesian optimizer) to optimize these powers that are used in the load balancer's weight formula:
(1 ...

bold terrace Apr 12, 2025, 8:04 PM

#

Does it also affect the fuzz or only the LB ?

unique salmon Apr 12, 2025, 8:04 PM

#

...unless the simulations are very inaccurate

unique salmon Apr 12, 2025, 8:04 PM

#

bold terrace Does it also affect the fuzz or only the LB ?

I'm not sure what you mean, considering that fuzz = LB

#

Just different names

#

LB is more appropriate

#

LB is just "fuzz that chooses the random interval in a less random way"

bold terrace Apr 12, 2025, 8:14 PM

#

unique salmon https://github.com/ankitects/anki/pull/3864

Nice !

bold terrace Apr 12, 2025, 8:15 PM

#

unique salmon I'm not sure what you mean, considering that fuzz = LB

From what I remember from @ashen light, even if I disable the LB with mw.col._set_enable_load_balancer(False), the due date is still shifted by the fuzz

unique salmon Apr 12, 2025, 8:15 PM

#

Ah, idk

#

I don't know how that command works

bold terrace Apr 12, 2025, 8:15 PM

#

Apparently the fuzz has been in Anki for years now

unique salmon Apr 12, 2025, 8:15 PM

#

Yes

#

LB is new, fancier fuzz

#

I didn't know you can disable LB but still have the old fuzz

#

You might be the only person on the planet using it 🤣

bold terrace Apr 12, 2025, 8:17 PM

#

Probably 😛

bold terrace Apr 12, 2025, 9:20 PM

#

Load by Lapse 20-quantile 🙂 Definitely a bit more gradual than Load by Reps 20-quantile

ashen light Apr 12, 2025, 10:13 PM

#

unique salmon You might be the only person on the planet using it 🤣

I can think of one other person 🍃

bold terrace Apr 12, 2025, 10:16 PM

#

I said so many times flagging a leech based on lapse was dumb

#

But now I realize it was my statement that was dumb

#

I checked another deck for another language, same tendency

quasi shadow Apr 13, 2025, 8:26 AM

#

unique salmon I don't think Jarrett has uploaded a file with metrics for all 10k users with op...

https://github.com/open-spaced-repetition/srs-benchmark/blob/Expt/trainable-forgetting-decay/result/FSRS-6.jsonl

quasi shadow Apr 13, 2025, 8:30 AM

#

unique salmon Alright, on Github I told Jarrett that if he wants to, he can re-run FSRS-6 with...

I'm benchmarking it

quasi shadow Apr 13, 2025, 9:07 AM

#

The preliminary result:

#

Model: FSRS-6-dev
Total number of users: 844
Total number of reviews: 27826685
Weighted average by reviews:
FSRS-6-dev LogLoss (mean±std): 0.3346±0.1594
FSRS-6-dev RMSE(bins) (mean±std): 0.0491±0.0330
FSRS-6-dev AUC (mean±std): 0.7109±0.0790

Weighted average by log(reviews):
FSRS-6-dev LogLoss (mean±std): 0.3557±0.1665
FSRS-6-dev RMSE(bins) (mean±std): 0.0652±0.0432
FSRS-6-dev AUC (mean±std): 0.7056±0.0874

Weighted average by users:
FSRS-6-dev LogLoss (mean±std): 0.3583±0.1680
FSRS-6-dev RMSE(bins) (mean±std): 0.0675±0.0444
FSRS-6-dev AUC (mean±std): 0.7048±0.0895

parameters: [0.20255, 1.1585, 2.8436, 15.9828, 6.96915, 0.562, 2.2429, 0.00835, 1.51745, 0.11915, 1.0329, 1.7994, 0.11795, 0.2945, 2.28385, 0.21265, 3.00505, 0.7968, 0.29115, 0.14205, 0.204]

Model: FSRS-6
Total number of users: 844
Total number of reviews: 27826685
Weighted average by reviews:
FSRS-6 LogLoss (mean±std): 0.3342±0.1593
FSRS-6 RMSE(bins) (mean±std): 0.0486±0.0327
FSRS-6 AUC (mean±std): 0.7103±0.0806

Weighted average by log(reviews):
FSRS-6 LogLoss (mean±std): 0.3552±0.1667
FSRS-6 RMSE(bins) (mean±std): 0.0646±0.0430
FSRS-6 AUC (mean±std): 0.7050±0.0885

Weighted average by users:
FSRS-6 LogLoss (mean±std): 0.3578±0.1682
FSRS-6 RMSE(bins) (mean±std): 0.0669±0.0441
FSRS-6 AUC (mean±std): 0.7042±0.0906

parameters: [0.19025, 1.1416, 2.84035, 16.0223, 6.96865, 0.56225, 2.24175, 0.00775, 1.52485, 0.11935, 1.0378, 1.79665, 0.11955, 0.2907, 2.27985, 0.2125, 3.00505, 0.81515, 0.28365, 0.13125, 0.2077]

#

The clipper I apply to FSRS-6-dev is w[20] = w[20].clamp(0.15, 0.8).

bold terrace Apr 13, 2025, 9:08 AM

#

IMO I find it too arbitrary to clamp just based on what we feel should be right or not

quasi shadow Apr 13, 2025, 9:08 AM

#

It's ~1% worse than (0.01, 1.0)

bold terrace Apr 13, 2025, 9:08 AM

#

If we really want to clamp, we could always use the 2nd/98th percentile of the training set

bold terrace Apr 13, 2025, 9:09 AM

#

quasi shadow It's ~1% worse than (0.01, 1.0)

1% relative only ?

#

Feels random but quite good actually 😅

quasi shadow Apr 13, 2025, 9:10 AM

#

2% percentile of decay values: 0.0347
98% percentile of decay values: 0.7270

bold terrace Apr 13, 2025, 9:11 AM

#

At least we can assume that 96% of people will fit in that clamp 🙂

#

And the 4% we can just ask them to reflect on how they use the algorithm haha

quasi shadow Apr 13, 2025, 9:32 AM

#

If I use (0.1, 0.8) as the clipper:

Model: FSRS-6-dev
Total number of users: 876
Total number of reviews: 28673715
Weighted average by reviews:
FSRS-6-dev LogLoss (mean±std): 0.3339±0.1604
FSRS-6-dev RMSE(bins) (mean±std): 0.0486±0.0325
FSRS-6-dev AUC (mean±std): 0.7101±0.0785

Weighted average by log(reviews):
FSRS-6-dev LogLoss (mean±std): 0.3557±0.1667
FSRS-6-dev RMSE(bins) (mean±std): 0.0646±0.0426
FSRS-6-dev AUC (mean±std): 0.7053±0.0869

Weighted average by users:
FSRS-6-dev LogLoss (mean±std): 0.3582±0.1681
FSRS-6-dev RMSE(bins) (mean±std): 0.0669±0.0437
FSRS-6-dev AUC (mean±std): 0.7046±0.0891

parameters: [0.19415, 1.13795, 2.8374, 15.98545, 6.9694, 0.56155, 2.2378, 0.00775, 1.51735, 0.11995, 1.0336, 1.799, 0.1187, 0.29145, 2.28435, 0.2106, 3.0051, 0.81215, 0.28495, 0.1352, 0.2056]

Model: FSRS-6
Total number of users: 876
Total number of reviews: 28673715
Weighted average by reviews:
FSRS-6 LogLoss (mean±std): 0.3338±0.1605
FSRS-6 RMSE(bins) (mean±std): 0.0485±0.0325
FSRS-6 AUC (mean±std): 0.7091±0.0804

Weighted average by log(reviews):
FSRS-6 LogLoss (mean±std): 0.3556±0.1670
FSRS-6 RMSE(bins) (mean±std): 0.0645±0.0427
FSRS-6 AUC (mean±std): 0.7046±0.0881

Weighted average by users:
FSRS-6 LogLoss (mean±std): 0.3581±0.1684
FSRS-6 RMSE(bins) (mean±std): 0.0668±0.0438
FSRS-6 AUC (mean±std): 0.7039±0.0902

parameters: [0.1933, 1.1416, 2.84035, 16.0035, 6.9689, 0.5619, 2.2396, 0.0077, 1.5194, 0.1196, 1.03675, 1.79805, 0.1194, 0.2913, 2.28145, 0.2105, 3.0053, 0.8154, 0.2847, 0.1302, 0.2079]

#

It's only 0.2% worse.

#

OK, let's use it.

#

@polar maple @unique salmon The Civil War of Decay has its ending!

#

😂 This week, I have run a dozen benchmarks.

bold terrace Apr 13, 2025, 9:48 AM

#

Computer goes brrr

#

Btw I played a bit with fsrs-optimizer yesterday. I tried to run it on my "normal D" deck (~ low lapse), my "high D" deck (higher lapse count), and on both combined. I got decays of 0.1328, 0.3794, and 0.3762 for both combined.

Over these past weeks I've had more and more of a feeling that behind a user, or even a deck, there might be multiple populations of cards/reviews.

Right now that is somewhat handled by D, but since we can see that even the decay can be very different depending on which population we're in, wouldn't it make sense to try to cluster the reviews and have different sets of parameters for different populations?

quasi shadow Apr 13, 2025, 9:58 AM

#

bold terrace Btw I played a bit with fsrs-optimizer yesterday, I tried to run it on my "norma...

Due to its limitations, FSRS can only distinguish different cards/reviews based on the history of ratings and elapsed days.

#

So, the heterogeneity is still very high.

bold terrace Apr 13, 2025, 10:01 AM

#

Sure, but look at D and how for many people it is a proxy for "Lapse" (which can be inferred from the reviews alone)

Also, isn't it possible to run a first optimization, then cluster the cards based on Difficulty and run a 2nd-layer optimization on each cluster?

#

(But I agree that then, implementing that in Anki would be difficult, having parameters not really based on deck but attached to cards, based on a population-id..)

unique salmon Apr 13, 2025, 10:31 AM

#

quasi shadow @polar maple @unique salmon The Civil War of Decay has its endin...

https://tenor.com/view/hai-gif-10567033442484308038

unique salmon Apr 13, 2025, 10:34 AM

#

bold terrace At least we can assume that 96% of people will fit in that clamp 🙂

No, you misunderstood
96% of people fit between 0.0347 and 0.7270, but it doesn't mean that 96% of people fit between 0.1 and 0.8

#

@quasi shadow how many users (%) have decay between 0.1 and 0.8?

#

I assume something like 80%?

quasi shadow Apr 13, 2025, 10:37 AM

#

Number of users with decay between 0.1 and 0.8: 8074
Percentage of users with decay between 0.1 and 0.8: 80.75%
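
A rough sketch of the three things discussed here: percentile-based bounds, clamping a fitted decay, and counting how many users already fall inside a given range. The decay values below are invented for illustration and the helper names are hypothetical; this is not the benchmark code, which applies `clamp` to the parameter tensor directly.

```python
import statistics


def percentile_bounds(decays, lower_pct=2, upper_pct=98):
    """Return the (lower, upper) percentile cut points of the decay values."""
    # With n=100 there are 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = statistics.quantiles(decays, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def clamp(value, low, high):
    """Limit value to the closed range [low, high]."""
    return max(low, min(high, value))


def share_within(decays, low, high):
    """Fraction of decay values that already lie inside [low, high]."""
    return sum(low <= d <= high for d in decays) / len(decays)


if __name__ == "__main__":
    # Invented per-user decay values, NOT taken from the benchmark.
    sample_decays = [0.02, 0.08, 0.13, 0.2, 0.25, 0.38, 0.6, 0.95]
    low, high = percentile_bounds(sample_decays)
    print(f"2nd/98th percentile bounds: ({low:.4f}, {high:.4f})")
    print("share inside (0.1, 0.8):", share_within(sample_decays, 0.1, 0.8))
    print("clamped decays:", [clamp(d, 0.1, 0.8) for d in sample_decays])
```

The two questions are separate: percentile bounds cover a fixed share of users by construction, while a hand-picked range like (0.1, 0.8) covers whatever share the data happens to give.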

bold terrace Apr 13, 2025, 10:42 AM

#

unique salmon No, you misunderstood 96% of people fit between 0.0347 and 0.7270, but it doesn'...

Yeah indeed but I sent this message when we were talking about [0.0347, 0.7270]

#

But if [.1, .8] is only 0.2% worse ... I mean ... being in that 20% (1-80.75%) is probably fine

#

We could argue that for those 20% the predictions will be worse than they could be, but I guess they'll already be way better than before

unique salmon Apr 13, 2025, 11:11 AM

#

http://www.incompleteideas.net/IncIdeas/BitterLesson.html
TLDR: forget about carefully crafted rule-based models that utilize human knowledge, just use general-purpose models AND LOTS OF COMPUTE and get better results
Chess? Just use a lot of compute and a general-purpose model
Go? Just use a lot of compute and a general-purpose model
Image recognition and speech recognition? Just use a lot of compute and a general-purpose model

I like this article because right now we have a crystal clear example of it: a neural net outperforming the carefully crafted FSRS with its simple formulas based on our understanding of human memory. If all we want is maximum predictive accuracy, making a giant neural net and just taking advantage of more compute would be a better approach

quasi shadow Apr 13, 2025, 11:18 AM

#

unique salmon http://www.incompleteideas.net/IncIdeas/BitterLesson.html TLDR: forget about car...

😄 if you want to run the general-purpose model on your device, please buy a dozen RTX 5090s.

unique salmon Apr 13, 2025, 11:20 AM

#

quasi shadow 😄 if you want to run the general-purpose model in your device, please buy a doz...

Lol
Well, according to Alex, I only need one CPU to run his model

quasi shadow Apr 13, 2025, 11:21 AM

#

so why not implement it in Anki?

unique salmon Apr 13, 2025, 11:21 AM

#

Ask Alex

quasi shadow Apr 13, 2025, 11:23 AM

#

@polar maple is there any problem with implementing it in Anki?

cursive badge Apr 13, 2025, 11:43 AM

#

quasi shadow @polar maple is there any problem to implement it in Anki?

I don't know that it is the only problem, but from the discussion I had with them yesterday speed / sync is one problem.
It doesn't run fast enough to just give it the entire revlog each time you start Anki (~200 reviews/s) so you need to cache the NN internal state.
Caching the internal state causes sync problems when you want to merge non-linear revlogs.

robust hill Apr 13, 2025, 11:45 AM

#

we must carb load jessie

#

to increase our retention

quasi shadow Apr 13, 2025, 11:47 AM

#

derpy So FSRS will survive.

lapis hearth Apr 13, 2025, 12:03 PM

#

unique salmon http://www.incompleteideas.net/IncIdeas/BitterLesson.html TLDR: forget about car...

Wasn't there a time when you or someone else wanted to see, in numbers, the maximal accuracy that FSRS could ever potentially achieve❓ Isn't this what neural nets are showing, that there is still a considerable amount to improve upon, in whichever way you like

unique salmon Apr 13, 2025, 12:19 PM

#

lapis hearth Wasnt there a time you or someone else wanted to see what maximal accuracy that ...

Yes, hopefully me and Alex will get to the whole "estimate the limits of accuracy on the 10k dataset" thing, but he seems to not be super interested in that
Anyway, the point is that FSRS will (almost certainly) never outperform big neural nets

lapis hearth Apr 13, 2025, 12:21 PM

#

unique salmon Yes, hopefully me and Alex will get to the whole "estimate the limits of accurac...

So in which direction is Anki heading right now

#

FSRS or neural nets

unique salmon Apr 13, 2025, 12:21 PM

#

FSRS

slim hollow Apr 13, 2025, 12:40 PM

#

the NNs currently used are super small (thousands of parameters); if you talk about big neural nets, they go into billions of parameters

unique salmon Apr 13, 2025, 12:41 PM

#

The largest neural net in the srs-benchmark repo has 9k params, but Alex has another with 2.7 million params that he hasn't released yet. It blows all other algorithms out of the water, according to his preliminary tests

cursive badge Apr 13, 2025, 12:44 PM

#

llama 4 behemoth apparently has 2 trillion parameters 😮

slim hollow Apr 13, 2025, 12:45 PM

#

a 2.7m network should be able to run on cpu at ok speed depending on architecture, but anki can currently run on pretty much anything

bold terrace Apr 13, 2025, 12:46 PM

#

Don't know. Sure, any kind of algorithm that is able to "learn by itself" is impressive, but I've also noticed how difficult it can be to actually make them better without becoming hugely inefficient in terms of energy, and how their black-box aspect makes it difficult as a dev to build a good feedback loop with them (train them, test them, improve their weaknesses...)

#

But it's still a very useful tool in the toolkit

unique salmon Apr 13, 2025, 12:48 PM

#

bold terrace Don't know, sure any kind of algorithm that is able to "learn by itself" is impr...

And they still end up being better. For example, is there anything even remotely close to ChatGPT (or any other modern LLM) that doesn't rely on deep learning and instead has rules entirely specified by the developers? Nope.

#

Same goes for image generation

#

And image recognition too

slim hollow Apr 13, 2025, 12:49 PM

#

what NN are good at is noticing patterns that can be non linear which is hard to achieve if you want to model things like FSRS

bold terrace Apr 13, 2025, 12:49 PM

#

unique salmon And they still end up being better. For example, is there anything even remotely...

Question is, what kind of app is GPT able to completely replace though?

unique salmon Apr 13, 2025, 12:50 PM

#

That's a strange question. I was talking about generating realistic, human-like text

#

You are asking a completely different question

#

Oh, btw, this Veritasium video is also about the "bitter lesson", even if he doesn't say it that way
https://youtu.be/P_fHJIYENdI
Neural nets outperformed algorithms made by expert biologists

YouTube

Veritasium

The Most Useful Thing AI Has Ever Done

bold terrace Apr 13, 2025, 12:52 PM

#

I mean, I've worked in software development for the past 10 years, went the computer science route, did a few projects on AI and a project on computer vision, and while of course AI is a super super super great tool, I still regularly felt the 2 limitations I explained above

bold terrace Apr 13, 2025, 12:53 PM

#

unique salmon That's a strange question. I was talking about generating realistic, human-like ...

You said they still end up being better, implying better than anything else

#

Which is not necessarily true

#

For problems fundamentally related to probability, with different layers of fact-checking the results, it can be immensely useful

cursive badge Apr 13, 2025, 12:54 PM

#

bold terrace Question is, what kind of app GPT is able to completely replaces though ?

ELIZA ;p

bold terrace Apr 13, 2025, 12:55 PM

#

But for problems requiring a very specific solution, it falls a bit short.

#

I even find it very funny how the most common example of a job that could be replaced by AI is software development, when in fact I think it's probably all the other jobs in this industry (project management, analyst, manager...) that could more easily be replaced

unique salmon Apr 13, 2025, 12:55 PM

#

bold terrace You said they still end up being better, implying better than anything else

Better than anything else for a specific task, yes. Text generation, image recognition, image generation, speech recognition, speech generation, protein folding, chess, go, spaced repetition even

#

We don't have an AI that can do all of that and more and replace all humans...yet 🙂

lapis hearth Apr 13, 2025, 12:56 PM

#

unique salmon Better than anything else for a specific task, yes. Text generation, image recog...

so why not use it for anki then

unique salmon Apr 13, 2025, 12:57 PM

#

lapis hearth so why not use it for anki then

@polar maple you should write a "Why My Neural Net Won't Be Used In Anki" blog post 🤣

lapis hearth Apr 13, 2025, 12:57 PM

#

Yes

cursive badge Apr 13, 2025, 12:57 PM

#

unique salmon @polar maple you should write a "Why My Neural Net Won't Be Used In Ank...

That sounds boring. Make an LLM do it ;p

lapis hearth Apr 13, 2025, 12:57 PM

#

Not necessarily Alex's

bold terrace Apr 13, 2025, 12:59 PM

#

But hey, I already spent 3-4y in that industry with everyone explaining how blockchain would change absolutely everything in society

#

I guess the next 2-3y will be AI hype laughcry

cursive badge Apr 13, 2025, 12:59 PM

#

Blockchain was just entirely dumb from the start though.

#

LLMs at least do something useful.

bold terrace Apr 13, 2025, 1:00 PM

#

Well, it's still a tool in the toolkit, but the problems it could solve were indeed a bit too specific for it to really be a broad revolution

unique salmon Apr 13, 2025, 1:01 PM

#

bold terrace I guess the next 2-3y will be AI hype laughcry

Oh come on. I get that people love saying "X is a bubble" and "X is just a trend, it's gotta end", but this is AI we're talking about. The only way it won't be a revolutionary technology is if achieving general intelligence is - for whatever reason - so incredibly difficult that it will take 1000+ years and in the meantime we will just have ChatGPT-7-Pro or something

cursive badge Apr 13, 2025, 1:03 PM

#

Blockchain has such vanishingly narrow use cases. Even its big initial example of "decentralised money" never really worked, it was far too unstable to be used as cash. It was so weird seeing so many people trying to shoehorn it into completely irrelevant things.

unique salmon Apr 13, 2025, 1:04 PM

#

And for the record, I don't think that making AGI is going to take a 1000 years

#

Or even 100 for that matter

#

100 starting from today, I mean

bold terrace Apr 13, 2025, 1:05 PM

#

Why do people associate LLMs with AGI though

#

We're talking AI-hype, LLM replacing humans

#

you talk about AGI

#

I mean, doing prolog was considered AI at some point

#

When I did AI, I did Alpha-Beta/Min-Max

#

The current AI stuff was called Datamining in my classes back then

#

AGI has never had anything to do with all those things

unique salmon Apr 13, 2025, 1:10 PM

#

If you think modern LLMs will never be generally intelligent (fair enough, btw, I won't argue with that), do you envision the future like this?

1) LLMs plateau around 2028-2030 when the current "just throw in more compute" paradigm runs out of gas as it's physically impossible to build bigger datacenters and produce more chips to train larger models + the entire Internet is used for training, so there is no more unused training data
2) Instead of the paradigm shifting to something else, all of the progress just dies out and then there are no interesting news about AI for decades

#

Because I ABSOLUTELY do not think that number 2 will happen

bold terrace Apr 13, 2025, 1:13 PM

#

I'm definitely more on option 1. AI/LLMs will continue to exist and will continue to solve very, very interesting problems in a way nothing else can right now, but unfortunately most startups based on them will die out, and big companies will find something else to promote to get investors excited

cursive badge Apr 13, 2025, 1:13 PM

#

It might not be no interesting news, but there could be a gap before anything new that gets people widely excited comes out.

bold terrace Apr 13, 2025, 1:13 PM

#

Porn Industry might be an exception though

#

laughcry

#

I mean, Zuck spent I don't know how many millions on VR

cursive badge Apr 13, 2025, 1:14 PM

#

I was really excited about AlphaGo. I don't think normal people were 😂

unique salmon Apr 13, 2025, 1:14 PM

#

bold terrace I'm definitely more on option 1. AI/LLM will continue to exist, will continue to...

2 comes after 1, it's not "either or", it's "first, then second"

bold terrace Apr 13, 2025, 1:14 PM

#

2 doesn't have to come after 1

#

I don't see why the progress would have to die

unique salmon Apr 13, 2025, 1:15 PM

#

I thought you would say "1 and 2"

#

Like, I thought you would say "Yes, this is what I imagine"

unique salmon Apr 13, 2025, 1:15 PM

#

bold terrace I don't see why the progress would have to die

Then why are we arguing again? 🤔

#

I thought you would say "Yes, I imagine that the progress will halt for decades"

cursive badge Apr 13, 2025, 1:16 PM

#

unique salmon Then why are we arguing again? 🤔

This is the internet you know. It's what people do.

bold terrace Apr 13, 2025, 1:17 PM

#

unique salmon Then why are we arguing again? 🤔

Maybe we both used "AI hype" to refer to different things. I'm more criticizing how the IT industry is completely crazy right now about anything related to AI

#

recently it's the whole "vibe coding" that everyone talks about

cursive badge Apr 13, 2025, 1:19 PM

#

My biggest personal gripe is AI hype made all the GPUs cost silly money 😦

bold terrace Apr 13, 2025, 1:21 PM

#

cursive badge My biggest personal gripe is AI hype made all the GPUs cost silly money 😦

You know what ? Sometimes I wonder if all that hype is not really just to make people buy GPUs, especially devs haha

#

When I see that nvidia box at $3000, even as a skeptic about AI and coding, I almost pull the trigger laughcry

#

My Github Copilot right now goes "Enable/Disable" every 10min, it's kinda maniac

#

So when my CEO says "AI will replace devs" I'm like "I WISH IT WOULD"

#

I mean, it would save me time

#

But I guess he has better insights than poor me haha

cursive badge Apr 13, 2025, 1:23 PM

#

bold terrace You know what ? Sometimes I wonder if all that hype is not really just to make p...

Can the next fad please involve something that does not require GPUs. I want to upgrade my 1080ti at some point before I die.

bold terrace Apr 13, 2025, 1:23 PM

#

lol ! I bought a 4070 Ti S one year ago

#

I'm happy now when I see the 5070 is basically a worse model

#

I also saw on reddit a lot of people are very very disappointed with the latest Llama scout

#

maybe you'll be able to buy a new GPU soon ;D

unique salmon Apr 13, 2025, 1:26 PM

#

We can already make AI that talks in a human-like way and, by some metrics, even outperforms humans. For example, frontier LLMs definitely know more simple facts like "When was Shakespeare born?" than the average person, and are better at solving math problems than the average person. So can we get to general intelligence via more compute, more training data, and some incremental improvements to the Transformer architecture? Or do we need some special sauce?

In the straight-line-goes-brrrr world we just need to work on how we train AI, scale it up even more, and tweak the Transformer architecture. And then we get AGI.

In the secret-sauce world, ChatGPT-7.5.5-Pro-Ulta will be better at answering PhD-level questions than any human alive, yet will be unemployable as a software engineer, let alone as a movie director or a CEO. And things will remain that way for who knows how long.

So the crux is: how straightforward is the path from the current AI (which, again, is in some sense already superhuman, compared to an average Joe) to AGI that is actually undeniably superhuman at everything?

bold terrace Apr 13, 2025, 1:29 PM

#

Replace "Bruno" by "AGI" laughcry

#

But #off-topic anyway I guess

quiet saddle Apr 13, 2025, 2:04 PM

#

quiet saddle I have a question about the parameters I use for FSRS: When I switched to FSRS I...

Just to follow up on that question I asked yesterday: since I have a specific tag for the cards I suspend because they became leeches, and I'm pretty sure they're just difficult cards and not badly designed cards, I had the option to add those specific cards to the FSRS optimization field. I did that, and rescheduled all cards, and that added ~1000 cards to my backlog.

Today I followed my usual procedure to reduce that backlog, reviewing by decreasing retrievability. It's a little too soon to draw conclusions, but I was surprised by the feeling that many of those cards were on the "edge of being forgotten". Also the scheduler is less optimistic with new cards introduced today, which seems good since it's difficult to judge the difficulty of a card with only one review.

#

So, I'm even more convinced that the usual advice "suspend the leeches" and "don't use the suspended cards for the optimizer" are good advice in isolation but don't go well together.

robust hill Apr 13, 2025, 2:24 PM

#

interesting

#

i have never thought about this

#

what if i just make it so fsrs doesnt use leeches for the optimizer

bold terrace Apr 13, 2025, 2:25 PM

#

Yeah very interesting, I also wondered about that

robust hill Apr 13, 2025, 2:25 PM

#

well i already kinda split the leeches out in all my decks

#

probs shouldve left a control deck

bold terrace Apr 13, 2025, 2:25 PM

#

To optimize WITH suspended, you change that to preset:"..." then ?

unique salmon Apr 13, 2025, 2:25 PM

#

bold terrace To optimize WITH suspended, you change that to preset:"..." then ?

Just preset:"Vocabulary"

bold terrace Apr 13, 2025, 2:25 PM

#

OKok

#

Something I also wonder a bit about: when a card was suspended and you now want to give it a new try, ideally you'd like to reset it, but after resetting it I'm not entirely sure whether the past revlog will be used or not

#

the counter in the browse view says 0 reps, 0 lapses

#

but in the history you still see the reviews

#

so not sure how they get taken into account or not

robust hill Apr 13, 2025, 2:27 PM

#

i am a bit confused

#

how do i get an extra 2,000 reviews

unique salmon Apr 13, 2025, 2:27 PM

#

bold terrace Apr 13, 2025, 2:27 PM

#

the suspended I guess no ?

bold terrace Apr 13, 2025, 2:28 PM

#

unique salmon

😅

robust hill Apr 13, 2025, 2:28 PM

#

alright number adds up

#

you are right

#

very interesting scheme i have here

bold terrace Apr 13, 2025, 2:29 PM

#

True that the default might be "Let's consider the suspended with those"

#

Sure you don't review them anymore, but they're still part of how well (or not) you review things

robust hill Apr 13, 2025, 2:30 PM

#

very interesting

unique salmon Apr 13, 2025, 2:30 PM

#

@quasi shadow once again: how does FSRS work with "Reset"? Does it only use the info after the card has been reset?
I promise I will make a card so that I don't ask again 🤣

robust hill Apr 13, 2025, 2:30 PM

#

why due so far away

#

if so difficult

#

dr is 92%

bold terrace Apr 13, 2025, 2:30 PM

#

Gimme a sec to think the shortest way to explain it laughcry

#

Basically, most of the time :

D : Lapse in disguise
D : goes up, D never goes down.
Splitting that deck into 2, one with "High D" and one with "Normal D" could benefit your parameters

#

For example, my "normal D" has very good logloss/rmse with even default FSRS parameters

robust hill Apr 13, 2025, 2:32 PM

#

based on that screenshot how should i divide it

#

💀

#

i got like 3 sky highers and 4 wide ones

#

this is language learning

#

yk i probably need to divide them depending on the back to front

#

my brains gonna kaboom

bold terrace Apr 13, 2025, 2:33 PM

#

What I did was search for prop:d>0.80 and play with the threshold until I had a relatively good chunk in both groups (like half and half), and until it was clear that NO cards with prop:d>0.80 had lapses under X (5, 6...)

#

At the end I did prop:d>0.9 in my case

#

but looking at you I think the 0.80 -> 1 might make more sense

#

Now in my "normal D" I have a lapse threshold of 6-7, and in my "high D" at 12-14. If I reach the first, I tag them and weekly I move them to the High D

#

in High D, at 12-14, it's auto-suspend
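The two-deck rule described above can be sketched in plain Python. This is only an illustration of the workflow, not Anki's API: the `Card` class, the deck names, and the `triage` function are hypothetical, and the thresholds are the lower ends of the ranges quoted in the messages (6 and 12 lapses).

```python
from dataclasses import dataclass

# Hypothetical thresholds taken from the messages above; tune for your own collection.
NORMAL_LAPSE_LIMIT = 6    # "normal D" deck: tag and move at 6-7 lapses
HIGH_D_LAPSE_LIMIT = 12   # "high D" deck: auto-suspend at 12-14 lapses


@dataclass
class Card:
    deck: str            # "normal" or "high_d" (illustrative names)
    lapses: int
    suspended: bool = False


def triage(card: Card) -> Card:
    """Apply the two-deck rule to one card and return the updated card."""
    if card.deck == "normal" and card.lapses >= NORMAL_LAPSE_LIMIT:
        # Too many lapses for the normal deck: move it to the high-D deck.
        return Card(deck="high_d", lapses=card.lapses, suspended=card.suspended)
    if card.deck == "high_d" and card.lapses >= HIGH_D_LAPSE_LIMIT:
        # Too many lapses even for the high-D deck: suspend it.
        return Card(deck="high_d", lapses=card.lapses, suspended=True)
    return card
```

In a real collection the move would be done by tagging and a weekly manual deck change, as described; the sketch only makes the thresholds explicit.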

quiet saddle Apr 13, 2025, 2:35 PM

#

I'm not sure I understand what you're doing with your decks @robust hill and @bold terrace, are you making subdecks depending on the difficulty of the cards?

bold terrace Apr 13, 2025, 2:36 PM

#

BTW it's still draft but you can use my new graphs to check how bad the workload of your high lapses/reps are

📎 searchStatsExtended.ankiaddon

bold terrace Apr 13, 2025, 2:36 PM

#

quiet saddle I'm not sure I understand what you're doing with your decks @robust hill...

Yeah basically I split my main deck into 2

#

The previous one became my "normal difficulty", and the new one gets all the difficult ones

#

The RMSE and workload of the normal one was hugely improved

quiet saddle Apr 13, 2025, 2:37 PM

#

but what happens for new cards then?

bold terrace Apr 13, 2025, 2:37 PM

#

New cards still go in the normal D

#

after 5-6 lapses they go in the difficult one

#

(as they would in the previous deck)

quiet saddle Apr 13, 2025, 2:38 PM

#

so FSRS will be too optimistic for new cards, I don't see this as an improvement

bold terrace Apr 13, 2025, 2:38 PM

#

Well, workload wise it's been a blessing

#

and my R is still at 90%

quasi shadow Apr 13, 2025, 2:39 PM

#

unique salmon @quasi shadow once again: how does FSRS work with "Reset"? Does it only...

It only uses the reviews after the reset.

robust hill Apr 13, 2025, 2:39 PM

#

quiet saddle I'm not sure I understand what you're doing with your decks @robust hill...

yeah

bold terrace Apr 13, 2025, 2:39 PM

#

quasi shadow It only uses the reviews after the reset.

Thanks for confirmation !

robust hill Apr 13, 2025, 2:39 PM

#

maybe it wont work for some decks

#

but i have a finished deck

#

i have a deck with 1156 cards, no longer new

bold terrace Apr 13, 2025, 2:39 PM

#

In fact there are not many options other than:

- Be too optimistic about new cards
- Be too pessimistic about new cards
- Be a bit of both

unique salmon Apr 13, 2025, 2:40 PM

#

quasi shadow It only uses the reviews after the reset.

I need some more specific info
Imagine a card with a history like this:
L R R | L R R
where L - "Learn", R - "Review" and | means "This is where the reset happened"

Does FSRS only use the second half for optimization?
Does FSRS only use the second half for scheduling?

quiet saddle Apr 13, 2025, 2:41 PM

#

bold terrace In fact there is no many other options than : - Be too optimistic about new card...

I'm also using Anki for language learning, and I prefer the scheduler to be a bit pessimistic for the first reviews, then of course I expect it to adjust depending on how well the reviews went

bold terrace Apr 13, 2025, 2:41 PM

#

quiet saddle I'm also using Anki for language learning, and I prefer the scheduler to be a bi...

Sure, but in a split-deck scenario you then need a discriminant to know when to move it to the easy deck

#

The Low->High D is quite clear

quasi shadow Apr 13, 2025, 2:42 PM

#

unique salmon I need some more specific info Imagine a card with a history like this: L R R | ...

They are the same. The only difference is how to deal with cards with incomplete review history.

unique salmon Apr 13, 2025, 2:42 PM

#

quasi shadow They are the same. The only difference is how to deal with cards with incomplete...

So yes and yes to both?

quasi shadow Apr 13, 2025, 2:43 PM

#

Yes
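The answer above ("only the reviews after the reset", for both optimization and scheduling) can be pictured with a tiny sketch. The list-of-strings history and the `|` reset marker are hypothetical notation borrowed from the question, not how Anki stores its revlog, and the handling of cards with incomplete history mentioned above is not modeled.

```python
from typing import List

RESET = "|"  # hypothetical marker for "the card was reset here"


def usable_history(history: List[str]) -> List[str]:
    """Return only the entries after the last reset marker."""
    if RESET not in history:
        return list(history)
    # Index of the last occurrence of the reset marker.
    last_reset = len(history) - 1 - history[::-1].index(RESET)
    return history[last_reset + 1:]


# History from the question: L R R | L R R
print(usable_history(["L", "R", "R", "|", "L", "R", "R"]))  # ['L', 'R', 'R']
```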

quiet saddle Apr 13, 2025, 2:43 PM

#

bold terrace Sure but in a split-deck scenario then you need a discriminant to know when to m...

I'm not convinced by this kind of split-deck at all 🙂 I split decks depending on how I want to review the cards, or how "essential" the card is (so I can change the parameters), but splitting the deck by difficulty feels like doing the scheduler's job manually to me 🙂

robust hill Apr 13, 2025, 2:43 PM

#

which language are you learning

quiet saddle Apr 13, 2025, 2:44 PM

#

Korean

robust hill Apr 13, 2025, 2:44 PM

#

i see

#

not sure if my advice would work

#

but at the moment i am splitting my decks in 2 ways

bold terrace Apr 13, 2025, 2:44 PM

#

quiet saddle I'm not conviced by this kind of split-deck at all 🙂 I split decks depending of...

That's fine ! And indeed it is doing a bit of the job that the scheduler should be able to do 🙂

robust hill Apr 13, 2025, 2:44 PM

#

I am learning Greek, so
1 deck with options that covers anything where the question is in English and I have to say it in Greek
another deck with options that covers the reverse, so the question is in Greek and I have to say it in English

bold terrace Apr 13, 2025, 2:44 PM

#

The main initial motivation in my case was the fact I realized the "average stability by repetition" was a purely decreasing function

#

The more I reviewed cards, the shorter their intervals seemed to be

#

So having more and more workload didn't really result in better stability

quiet saddle Apr 13, 2025, 2:45 PM

#

also I'm probably on the ADHD spectrum, so I need the scheduler to be a little bit pessimistic 😄

bold terrace Apr 13, 2025, 2:45 PM

#

Just the same stability with higher workload

#

Conversely, the cards with long intervals all had at most 1-2 lapses

robust hill Apr 13, 2025, 2:45 PM

#

quiet saddle also I'm probably on the ADHD spectrum, so I _need_ the scheduler to be a little...

me too

bold terrace Apr 13, 2025, 2:45 PM

#

and a very small number of reps, something like 10-20

#

So while mass-repetition feels like "you don't allow it to be forgotten", I was in fact hiding the fact that I wasn't really helping them build higher stability

robust hill Apr 13, 2025, 2:46 PM

#

4 deck options i am making

bold terrace Apr 13, 2025, 2:46 PM

#

Question being: why didn't they ? A lot of different factors, but repping them every day is not what will help

robust hill Apr 13, 2025, 2:46 PM

#

English -> Greek
English -> Greek - leeches
Greek -> English
Greek -> English - Leeches

#

🔥

quiet saddle Apr 13, 2025, 2:51 PM

#

robust hill English -> Greek English -> Greek - leeches Greek -> English Greek -> English - ...

I have:

Vocabulary: simple words, audio+written word in Korean -> French definition / audio+hint if needed -> writing + French definition
Sentences or collocations: various note types including cloze, dictation, French->Korean for basic greetings, ..
This second deck has 3 levels of priority: Essential, Normal, Optional

Leeches stay in their decks, suspended until I see them again in the wild and/or decide to try to learn them again

bold terrace Apr 13, 2025, 2:53 PM

#

French is your mother tongue ? It's mine

quiet saddle Apr 13, 2025, 2:53 PM

#

bold terrace French is your mothertongue ? It's mine

amusant 🙂

bold terrace Apr 13, 2025, 2:55 PM

#

Front back for me 🙂 The reading is not shown in the preview, but I have to type it on the front of the card and it highlights mistakes on the back

quiet saddle Apr 13, 2025, 2:56 PM

#

bold terrace Front back for me 🙂 The reading is not shown in the preview, but I have to type...

maybe we could switch to the #language-learning channel?

bold terrace Apr 13, 2025, 2:56 PM

#

I removed every single thing from the front because otherwise my brain would memorize words by silly things like the sentence shown, the color of the hint, etc etc

unique salmon Apr 13, 2025, 3:44 PM

#

polar maple also for cpu performance i expect maybe around 200 rows of the revlog / second, ...

I'm curious how much more performant it would be if you distilled it into a smaller model + used int8 for inference instead of fp16/fp32. You could probably make it 10x faster, or even more

lapis hearth Apr 13, 2025, 4:59 PM

#

lapis hearth

Poll: "Vote you neeks"

Winning answer: Decay -0.01 --> 130000 days at DR = 80% (2 of 3 votes)

robust hill Apr 13, 2025, 5:25 PM

#

this is not a good voting strategy

#

i voted for .001

unique salmon Apr 13, 2025, 6:15 PM

#

Whatever, Jarrett already agreed to make it -0.1

#

So now nobody will ever get a first interval of a million years, yay!
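For context, the poll figure can be reproduced with a short sketch. It assumes the power forgetting curve used in FSRS-4.5/FSRS-5, R(t, S) = (1 + factor * t / S)^decay with factor chosen so that R = 0.9 at t = S, and simply treats decay as a variable; whether FSRS-6 uses exactly this form is an assumption here. Stability S = 1 day is also an assumption.

```python
def factor(decay: float) -> float:
    """Chosen so that R equals 0.9 when elapsed time t equals stability S."""
    return 0.9 ** (1.0 / decay) - 1.0


def retrievability(t: float, s: float, decay: float) -> float:
    """Power forgetting curve with an adjustable (negative) decay."""
    return (1.0 + factor(decay) * t / s) ** decay


def interval(s: float, desired_retention: float, decay: float) -> float:
    """Days until retrievability drops to the desired retention."""
    return s / factor(decay) * (desired_retention ** (1.0 / decay) - 1.0)


# Interval at DR = 80% for S = 1 day, for a few decay values.
for decay in (-0.5, -0.1, -0.01):
    print(decay, round(interval(1.0, 0.8, decay), 2))
```

Under these assumptions, decay = -0.01 gives an interval of roughly 130,000 days at DR = 80%, while decay = -0.1 gives between 4 and 5 days, which is why a bound of -0.1 rules out absurd first intervals.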

polar maple Apr 13, 2025, 7:02 PM

#

unique salmon <@142448513622605824> you should write a "Why My Neural Net Won't Be Used In Ank...

rwkv could be problematic for syncing issues but i can't rule out a small nn like LSTM or an even smaller version that works similarly to how FSRS works rn, occasionally the user presses optimize to update the nn weights, if there is a syncing issue just reoptimize from scratch

#

but idk how cpu friendly the training would be for a small nn, would need to investigate

polar maple Apr 13, 2025, 7:03 PM

#

unique salmon I'm curious how much more performant it would be if you distilled it into a smal...

idk how to distill and don't wanna learn rn

unique salmon Apr 13, 2025, 7:12 PM

#

polar maple rwkv could be problematic for syncing issues but i can't rule out a small nn lik...

RWKV would be cooler though - more accurate and no need to optimize if you pretrain it on a large dataset

cursive badge Apr 13, 2025, 7:24 PM

#

At 200 reviews/s RWKV would take ~6.5 mins to process my collection if it had to discard its cache. That's kind of a blocker if sync invalidating the cache cannot be worked around.
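As a quick check of the arithmetic above: the two quoted figures imply a collection of about 78,000 reviews. The helper below is a hypothetical convenience for plugging in another collection size at the same assumed throughput.

```python
REVIEWS_PER_SECOND = 200   # throughput figure quoted above
MINUTES = 6.5              # processing time quoted above

implied_reviews = REVIEWS_PER_SECOND * MINUTES * 60
print(int(implied_reviews))  # 78000


def minutes_to_process(review_count: int, rate: float = REVIEWS_PER_SECOND) -> float:
    """Minutes needed to reprocess a revlog of the given size from scratch."""
    return review_count / rate / 60
```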

polar maple Apr 13, 2025, 7:28 PM

#

cursive badge At 200 reviews/s RWKV would take ~6.5 mins to process my collection if it had to...

there are possible workarounds such as training RWKV on only the last 2k reviews for a reasonable optimization time, perhaps it would still have a decent performance

#

or as i mentioned before, training a robust version of the nn by mangling the revlogs that it is trained on

polar maple Apr 13, 2025, 7:29 PM

#

unique salmon RWKV would be cooler though - more accurate and no need to optimize if you pretr...

yea but for now a LSTM-like model would be way easier to implement, everything is mostly in place in anki already

cursive badge Apr 13, 2025, 7:31 PM

#

👍
I meant in response to Expertium that a small loss in accuracy may be worth it for a large gain in convenience.

unique salmon Apr 13, 2025, 7:33 PM

#

The more I think about it, the more I think it's actually very desirable

We can make R more accurate

We won't have to show parameters, which means one less thing for users to worry about

We can support proper same-day scheduling instead of the current mess

We can throw in new input features, like time of the day, workload, etc. Not just interval lengths and grades

We can remove "Optimize", which means even less stuff for users to worry about

robust hill Apr 13, 2025, 7:42 PM

#

NO

#

do NOT remove optimize

cursive badge Apr 13, 2025, 7:44 PM

#

robust hill do NOT remove optimize

They mean if FSRS is superseded by a NN that does not need optimisation.

robust hill Apr 13, 2025, 7:44 PM

#

oh

ashen light Apr 13, 2025, 8:17 PM

#

I would still like the optimize button to be there, if only as a placebo

bold terrace Apr 13, 2025, 8:19 PM

#

Yeaaaaah I mean

#

People are worrying because they feel they're losing control with FSRS

#

It's not the existence of parameters that stresses people, it's the fact that they don't understand them

#

so giving them 9000 params ?

#

See how people freak out about hard misuse

#

let's now make them stress about NN thinking on saturday every retention is -10% because they were drunk 2 saturdays in a row laughcry

#

So while I would be super super excited to try that NN

#

I don't think they will be less stressed lol

unique salmon Apr 13, 2025, 8:20 PM

#

Hard misuse would still be a problem btw

#

It can't be solved "inside" the algorithm

bold terrace Apr 13, 2025, 8:21 PM

#

but but but but

#

If NN has no rule about "Hard" being a good answer

unique salmon Apr 13, 2025, 8:21 PM

#

Though, I imagine that for people who misuse Hard a NN would still do better than FSRS

bold terrace Apr 13, 2025, 8:21 PM

#

it could theoretically infer that for some users, Hard might result in a reduced stability ?

#

just like an Again

#

You could almost create new buttons and have your own rules about them ;D

#

"Don't remember" "Misspelled" "Confused"

unique salmon Apr 13, 2025, 8:22 PM

#

bold terrace If NN has no rule about "Hard" being a good answer

It's about the training and how the loss is calculated. Again = 0, Hard/Good/Easy = 1. That is not something that the algorithm can change, it's something the algorithm learns from
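The labeling described above can be made concrete with a small sketch. It assumes the loss is binary cross-entropy (log loss) on predicted retrievability, and uses Anki's grade numbering (1 = Again, 2 = Hard, 3 = Good, 4 = Easy); the function names are illustrative.

```python
import math

# Anki grades: 1 = Again, 2 = Hard, 3 = Good, 4 = Easy.
def label(grade: int) -> int:
    """Again counts as a failure (0); Hard, Good and Easy count as a success (1)."""
    return 0 if grade == 1 else 1


def log_loss(predicted_r, grades) -> float:
    """Mean binary cross-entropy between predicted recall probability and labels."""
    eps = 1e-12  # keep log() away from 0
    total = 0.0
    for p, g in zip(predicted_r, grades):
        y = label(g)
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(grades)
```

Because the label is fixed before training, a user who presses Hard after forgetting is feeding the model a "success" that was really a failure, and no change inside the model can undo that.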

bold terrace Apr 13, 2025, 8:22 PM

#

ah yeah

unique salmon Apr 13, 2025, 8:23 PM

#

If Hard=1 but the user uses it as if it was 0, welp...

#

Actually, now I'm really curious if a NN would be better than FSRS in that case
Then again, how do you determine what is "better" if you can't tell corrupted labels apart from good ones?

bold terrace Apr 13, 2025, 8:23 PM

#

I mean

#

sometimes I watch for a few seconds too long a video of Sabrina Carpenter

#

And then Facebook decides I'm her biggest fan

#

and I should get every single ad about her

ashen light Apr 13, 2025, 8:24 PM

#

I will continue to advocate for an "almost" button that is just the again button (maybe +5 minutes on the relearn step so people feel like it's different)

bold terrace Apr 13, 2025, 8:24 PM

#

But then it realizes I'm in fact a bit more inclined to watch videos about videogames, so it adapts again

#

Personally I use hard/good/easy based on speed of good answer

#

quick quick good answer, 1-2s -> easy

#

3-4 -> good

#

7 hard

#

who cares about 5-6

unique salmon Apr 13, 2025, 8:25 PM

#

Ok, after thinking about it some more, I have no idea whether we would even be able to tell whether FSRS or a NN is better for people who misuse Hard, if we can't un-corrupt the labels aka if we can't confidently say "Here Hard=1 and here Hard=0"

bold terrace Apr 13, 2025, 8:26 PM

#

Well at least it's the goal

#

In practice it's Good-Good-Good-Oshit

#

IMO who cares about misuse

#

it will fix itself with time

unique salmon Apr 13, 2025, 8:26 PM

#

bold terrace it will fix itself with time

Huh?

ashen light Apr 13, 2025, 8:26 PM

#

bold terrace sometimes I watch for a few seconds too long a video of Sabrina Carpenter

man on youtube any vid that I want to watch but also don't want destroying my recs I watch in a private window

bold terrace Apr 13, 2025, 8:27 PM

#

I think I might have misused hard for maybe ~500 reviews when I started Anki

#

I stopped worrying after the 50k one

bold terrace Apr 13, 2025, 8:27 PM

#

ashen light man on youtube any vid that I want to watch but also don't want destroying my re...

Got a problem against Sabrina Carpenter ?

ashen light Apr 13, 2025, 8:27 PM

#

I have no idea who that is

bold terrace Apr 13, 2025, 8:27 PM

#

Lucky guy

#

I didn't either, and now even fortnite ads are about her

#

she's like the female equivalent of bieber

south lodge Apr 13, 2025, 8:28 PM

#

unique salmon Ok, after thinking about it some more, I have no idea whether we would even be a...

It would help if the evaluation criteria (for success of the NN) were not also the input (with a time offset) (Later edit: nah, the issue is that the data changes with user error, ie non-stationarity)

unique salmon Apr 13, 2025, 8:28 PM

#

If you mean that people will hear about it from other people and be like "Oh, wow, I didn't know this was a problem, thanks my dude, you saved my Anki life!", I'm afraid that this will be the minority, and most people who misuse Hard will keep misusing it

#

Not everyone browses r/Anki or watches youtube videos about Anki or whatever

ashen light Apr 13, 2025, 8:28 PM

#

people don't want to fail so they need an almost-fail button

unique salmon Apr 13, 2025, 8:28 PM

#

south lodge It would help if the evaluation criteria (for success of the NN) were not also t...

?

bold terrace Apr 13, 2025, 8:28 PM

#

unique salmon Not everyone browses r/Anki or watched youtube videos about Anki or whatever

Let's make a poll about that

#

The most biased poll of all your history

bold terrace Apr 13, 2025, 8:29 PM

#

ashen light people don't want to fail so they need an almost-fail button

"I failed but it's not my fault"

unique salmon Apr 13, 2025, 8:29 PM

#

Let's ask people on r/Anki whether they browse r/Anki

ashen light Apr 13, 2025, 8:29 PM

#

placebo buttons are important I think

bold terrace Apr 13, 2025, 8:30 PM

#

ashen light placebo buttons are important I think

The "Show Less" on Sabrina Carpenter videos

#

TBH

#

most normal people use Anki for 2 days and then never use it anymore

ashen light Apr 13, 2025, 8:30 PM

#

yeah see everyone has those buttons

bold terrace Apr 13, 2025, 8:31 PM

#

I know 3 colleagues who tried Anki. Never for more than 2 days

ashen light Apr 13, 2025, 8:31 PM

#

bold terrace most normal people use Anki for 2 days and then never use it anymore

yeah this is the most real take

bold terrace Apr 13, 2025, 8:31 PM

#

Main reason has nothing to do with scheduling, it's the clunky UI

#

They were amazed I had images, sound, I could type answers ...

ashen light Apr 13, 2025, 8:31 PM

#

like I have a friend who married some chinese girl and got anki (of his own motivation) to learn chinese and spent maybe like a week on it before giving up

bold terrace Apr 13, 2025, 8:31 PM

#

I mean, this is the first image you get from google image

unique salmon Apr 13, 2025, 8:31 PM

#

Except for that one reddit guy who quit Anki because of interval lengths

bold terrace Apr 13, 2025, 8:32 PM

#

I see that screenshot

#

First thing I wonder is if I need a compatibility mode for windows 95 to run it

#

or some kind of DOS emulator

unique salmon Apr 13, 2025, 8:32 PM

#

Lol
The article with that image is from 2015

ashen light Apr 13, 2025, 8:33 PM

#

unique salmon Except for that one reddit guy who quit Anki because of interval lengths

reddit self-selects for a specific type, 99% of people who quit anki aren't going to go make a reddit post about it

bold terrace Apr 13, 2025, 8:33 PM

#

ashen light reddit self-selects for a specific type, 99% of people who quit anki aren't goin...

Yeah reddit is a very very very tiny echo chamber

ashen light Apr 13, 2025, 8:33 PM

#

honestly, any anki reddit complaint should be treated as an outlier

unique salmon Apr 13, 2025, 8:33 PM

#

Did Anki really look like this in 2015? Or is the article using an even older screenshot?

ashen light Apr 13, 2025, 8:33 PM

#

yeah I think it looked like that back then

bold terrace Apr 13, 2025, 8:34 PM

#

The best-looking search tool that ever existed

#

People on that page are sure not to download any of those decks, for fear of getting russian malware

#

Look how user friendly it is to tweak your own cards

#

Even as a dev it took me 2-3 months before I dared to try doing something myself

#

I mean even creating cards ...

#

"Manage Note Type" > "Create Field"

#

So when we're talking "the average user", we should not imagine an "average normal human being"

cursive badge Apr 13, 2025, 8:45 PM

#

Dae is interested in making a sharable template system so it might be easier one day. He wants to do it after Svelte migration so who knows when it will actually be started on though.

unique salmon Apr 13, 2025, 8:46 PM

#

Oh, yeah, and apparently we're not actually getting a two-button mode any time soon

south lodge Apr 13, 2025, 8:46 PM

#

unique salmon ?

I guess what I'm thinking is something like:
Wouldn't it be nice if:

- Users had a slider to decide their own interval length when grading
- A dataset of many such gradings for specific cards existed
- The model only needed to exist for those specific cards

unique salmon Apr 13, 2025, 8:47 PM

#

south lodge I guess what I'm thinking is something like: Wouldn't it be nice if: - Users h...

Users had a slider to decide their own interval length when grading
That defeats the point of having a scheduling algorithm

south lodge Apr 13, 2025, 8:47 PM

#

That would be for the training input, the later implementation would be just a ~~'okay I saw the card'~~ 'next'

unique salmon Apr 13, 2025, 8:48 PM

#

I still don't see how this would be beneficial

#

And it sounds impractical as heck

#

Ideally, we want to make it so that users don't have to think about intervals at all

south lodge Apr 13, 2025, 8:49 PM

#

Yes (the end user would have only one button, next)

cursive badge Apr 13, 2025, 8:50 PM

#

People are probably terrible at guessing intervals.

south lodge Apr 13, 2025, 8:50 PM

#

True

unique salmon Apr 13, 2025, 8:50 PM

#

Training a model to predict what intervals the user wants is an interesting idea, but most people probably want constant intervals and/or very short intervals

#

I imagine if you took a bunch of people and asked them to do this for a year, most of them would end up with intervals that are either constant or grow very slowly

cursive badge Apr 13, 2025, 8:51 PM

#

I did previously think it might be interesting to let people grade cards on more than one axis to give the scheduler more info, but it would be terribly impractical and bad UX.

unique salmon Apr 13, 2025, 8:52 PM

#

Though, it would be interesting just for research purposes, to see what intervals people like

#

Maybe it could help to pick better default FSRS parameters and a better default value of desired retention

south lodge Apr 13, 2025, 9:06 PM

#

cursive badge I did previously think it might be interesting to let people grade cards on more...

Might how long a card takes between front and final grading be a useful datapoint? It exists without any UX overhead. (Or total time spent on a card per that session, same idea.)

unique salmon Apr 13, 2025, 9:07 PM

#

south lodge Might how long a card takes between front and final grading be a useful datapoin...

It could be used, yeah

cursive badge Apr 13, 2025, 9:14 PM

#

Anki currently records the time from first showing the card to you giving it a grade. I think ideally we would also have time to first input (for typed answers) and time to answer/card flip.

lapis hearth Apr 14, 2025, 7:07 AM

#

Hey guys

#

Dont know if this is the appropriate time to mention this

#

When is it time that the R=100 for learning cards be changed

#

Should this not be also changed with FSRS 6

quasi shadow Apr 14, 2025, 7:13 AM

#

R=100?

#

Do you mean the R column in the card browser?

lapis hearth Apr 14, 2025, 7:54 AM

#

quasi shadow R=100?

Yes, R as in the Retrievability column

#

R is 100% every time for learning cards no matter what

#

which is simply not accurate

#

I know that Anki did it in the earlier days as a simplifying method

#

There was no need for Anki to register it because there was no FSRS back then

#

But with the new decay thing, would it not be worth a shot to try and see if it is advantageous

quasi shadow Apr 14, 2025, 7:58 AM

#

lapis hearth But with the new decay thing, would it not be worth the short to try and see if ...

Nope

#

The decay thing is still a long-term thing.

lapis hearth Apr 14, 2025, 7:59 AM

#

FeelsBadAnki

quasi shadow Apr 14, 2025, 7:59 AM

#

The short-term memory model is still inaccurate.

lapis hearth Apr 14, 2025, 7:59 AM

#

Oh well....here is to waiting for however long...

lapis hearth Apr 14, 2025, 8:00 AM

#

quasi shadow The short-term memory model is still inaccurate.

When are you releasing FSRS 6 (totally not excited as a golden retriever)

quasi shadow Apr 14, 2025, 8:01 AM

#

lapis hearth When are you releasing FSRS 6 (totally not excited as a golden retriever)

Maybe this week.

#

I have caught a dozen bugs since last weekend.

unique salmon Apr 14, 2025, 8:53 AM

#

lapis hearth When are you releasing FSRS 6 (totally not excited as a golden retriever)

Oh, and Jarrett decided to reduce minimum S by 10 times after all
Then again, without a short-term memory model, it's not an improvement

lapis hearth Apr 14, 2025, 8:57 AM

#

unique salmon Oh, and Jarrett decided to reduce minimum S by 10 times after all Then again, wi...

It is a bandage-over-crack makeshift solution

#

I am aware

#

But how long would it take to find a short-term memory model

#

Weeks, Fortnights, Months, Quartals, Tertials, Years..

#

Idk

unique salmon Apr 14, 2025, 9:00 AM

#

Once I'm done with experimenting with D, I'll see if I can use a neural net for this

unique salmon Apr 14, 2025, 11:17 AM

#

Actually, nvm. The idea with neural D didn't work. I was getting really shitty results and tried a bunch of things, but nothing worked. And on top of getting shit results, now I'm also getting errors sometimes

unique salmon Apr 14, 2025, 11:27 AM

#

unique salmon Once I'm done with experimenting with D, I'll see if I can use a neural net for ...

Idk why I said this, I guess I was hoping that I would fix the errors, but nope

lapis hearth Apr 14, 2025, 11:36 AM

#

Does this mean you will start experimenting with NN on short term memory

unique salmon Apr 14, 2025, 11:37 AM

#

I could, but meh

#

It's probably not gonna work

bold terrace Apr 14, 2025, 11:42 AM

#

unique salmon Actually, nvm. The idea with neural D didn't work. I was getting really shitty r...

What was the procedure ? You give the NN a shitton of revlog, you ask it to give you the related D for each card, but what about D increase/decrease based on hard/good/easy ? What was the approach

unique salmon Apr 14, 2025, 11:44 AM

#

bold terrace What was the procedure ? You give the NN a shitton of revlog, you ask him to giv...

I tried two architectures:

1) I give it the grade, last D (squished between 0 and 1) and R as input, and it predicts new D (also squished between 0 and 1)
2) I give it the grade, last D (unlimited, from -inf to inf) and R as input, and it predicts the difference between new D (unlimited) and last D (unlimited), I add the difference to last D to obtain new D and then I squish it for it to be used in the S formulas

The first one was shit, the second one was shit AND was throwing errors sometimes for some reason
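The data flow of the second architecture can be sketched as follows. This is not the code from the experiment: the squashing function is assumed to be a sigmoid, and `toy_delta_net` is a made-up placeholder rule standing in for the trained network, used only so the update loop can run.

```python
import math
from typing import Callable


def squish(x: float) -> float:
    """Map an unbounded value into (0, 1); a sigmoid is assumed here."""
    return 1.0 / (1.0 + math.exp(-x))


# Stand-in for the trained network: any function (grade, last_d, r) -> delta.
DeltaNet = Callable[[int, float, float], float]


def toy_delta_net(grade: int, last_d: float, r: float) -> float:
    # Placeholder linear rule, NOT a trained model: lower grades raise D more.
    return 0.5 * (3 - grade) * (1.0 - r)


def next_d_unbounded(net: DeltaNet, grade: int, last_d: float, r: float) -> float:
    """New unbounded D = last unbounded D + predicted difference."""
    return last_d + net(grade, last_d, r)


d = 0.0  # unbounded difficulty state carried between reviews
for grade, r in [(3, 0.9), (1, 0.7), (2, 0.85)]:
    d = next_d_unbounded(toy_delta_net, grade, d, r)
    # Only the squished value would be passed on to the stability formulas.
    print(grade, round(d, 3), round(squish(d), 3))
```

Keeping the state unbounded and squashing only at the point of use means the network output never has to respect the (0, 1) range itself.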

bold terrace Apr 14, 2025, 12:12 PM

#

I see ! Thanks for the explanation

polar maple Apr 14, 2025, 4:42 PM

#

unique salmon I tried two architectures: 1) I give it the grade, last D (squished between 0 an...

have you done any overfitting tests?

#

freeze fsrs params, make only the D nn learn, overfit on a small amount of data

#

alternatively you should check that with the nn, you achieve a lower training loss than what FSRS-5 achieves

unique salmon Apr 14, 2025, 4:59 PM

#

polar maple have you done any overfitting tests?

Nope
I made a comment here: https://github.com/open-spaced-repetition/srs-benchmark/pull/199#issuecomment-2801388093
Feel free to try it out, or try other approaches

GitHub

implement FSRS5D in other.py by L-M-Sherlock · Pull Request #199 ...

close #198
Edit: I tried several hyper-parameters and structures of the network, none of which performed better than FSRS-5.

polar maple Apr 14, 2025, 5:03 PM

#

at least print out the training loss in the Trainer class and make sure that the nn is training for long enough so that the training loss from the nn is lower than for FSRS-5

unique salmon Apr 14, 2025, 5:26 PM

#

@quasi shadow https://github.com/open-spaced-repetition/srs-benchmark/blob/main/plots/w[11].png
I'm worried about w11, it seems like it barely changes. According to this graph, all values are close to the default value, which is close to 2. You could say "Well, maybe that's just a really good default value and this distribution is just very narrow", but I'm not so sure.
For example, I'm testing a new D function, and I changed the default value of w11 to 10, and I get this (based on 132 users so far):
5th percentile of w[11]=9.790
95th percentile of w[11]=10.020
This means that this parameter barely changes

Maybe it's just that my implementation is flawed, but I suggest you do some tests:

1. Try different default values of w11, like 3-5 values, with any version of FSRS you want
2. See how it affects the final distribution of w11 across all users
3. If the distribution of values of w11 ALWAYS ends up being very narrow and centered around the default value, even if you change the default value by at least a factor of 2, then we have a problem
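
The test described above could be scripted roughly as follows. This is a sketch under assumptions: the function names, the 10% width threshold, and the synthetic numbers are all made up for illustration, and the real optimized values would have to come from benchmark runs with different `init_w` settings.

```python
def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (data[upper] - data[lower]) * (pos - lower)


def tracks_default(fits_by_default, rel_width=0.1):
    """Return True if, for EVERY default, the 5th-95th percentile band of the
    optimized values is narrow (relative to the default) and contains the default.

    rel_width is an arbitrary illustrative threshold, not a value from the chat.
    """
    for default, values in fits_by_default.items():
        low, high = percentile(values, 5), percentile(values, 95)
        narrow = (high - low) <= rel_width * abs(default)
        centered = low <= default <= high
        if not (narrow and centered):
            return False
    return True


# Synthetic illustration only: 21 offsets spanning +/- 2%.
OFFSETS = [i * 0.002 - 0.02 for i in range(21)]

# Worrying case: optimized values follow whatever default was chosen.
STICKY = {d: [d * (1 + o) for o in OFFSETS] for d in (1.0, 3.0, 5.0)}

# Healthy case: optimized values land near 2 regardless of the default.
HEALTHY = {d: [2.0 * (1 + o) for o in OFFSETS] for d in (1.0, 3.0, 5.0)}

print(tracks_default(STICKY), tracks_default(HEALTHY))
```

A `True` result for real data would match the "we have a problem" case: the parameter stays wherever it was initialized instead of being driven by the users' review histories.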

GitHub

srs-benchmark/plots/w[11].png at main · open-spaced-repetition/srs...

A benchmark for spaced repetition schedulers/algorithms - open-spaced-repetition/srs-benchmark

quasi shadow Apr 14, 2025, 5:37 PM

#

unique salmon @quasi shadow https://github.com/open-spaced-repetition/srs-benchmark/bl...

#

It's the difference between optimizing without and with L2 regularization.
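
One way to see why L2 regularization could narrow a parameter's distribution is a one-parameter toy model. It assumes the penalty pulls the parameter toward its default value, which the chat does not spell out, and it uses a made-up quadratic data loss rather than the real FSRS loss. The function name and arguments are hypothetical.

```python
def regularized_optimum(data_optimum, default, curvature, strength):
    """Minimizer of  curvature * (w - data_optimum)**2 + strength * (w - default)**2.

    Setting the derivative to zero gives a weighted average of the value the
    data prefers and the default, weighted by curvature and penalty strength.
    """
    if curvature + strength <= 0:
        raise ValueError("curvature + strength must be positive")
    return (curvature * data_optimum + strength * default) / (curvature + strength)


# Toy numbers: the data would prefer w = 4, the default is 2.
no_penalty = regularized_optimum(4.0, 2.0, curvature=1.0, strength=0.0)
equal_pull = regularized_optimum(4.0, 2.0, curvature=1.0, strength=1.0)
flat_loss = regularized_optimum(4.0, 2.0, curvature=0.01, strength=1.0)
print(no_penalty, equal_pull, flat_loss)
```

In this toy model, a parameter whose data loss is nearly flat (small `curvature`) ends up almost exactly at the default once any penalty is applied, so its values across users would cluster tightly around whatever default was chosen.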

unique salmon Apr 14, 2025, 5:38 PM

#

quasi shadow

Please try different default values, for example, 1, 2 and 4, and plot the resulting distributions

quasi shadow Apr 14, 2025, 5:38 PM

#

I'm running the benchmark.

unique salmon Apr 14, 2025, 5:38 PM

#

Well, we already have approximately 2, so try 1, 3 and 4, or 1, 3 and 5

quasi shadow Apr 14, 2025, 5:39 PM

#

So my device is not available now.

#

Could you test it?

#

You can modify the init_w in other.py.

unique salmon Apr 14, 2025, 5:39 PM

#

I'm running my own stuff 😅
I'll add it to ideas here: https://github.com/orgs/open-spaced-repetition/discussions/36

GitHub

My ideas/recommendations to Jarrett, gathered in one place · open-...

The purpose of this discussion is for me to link to issues/PRs that are related to FSRS and where I have something to say and don't want Jarrett to forget about it/miss my comment. VERY IMPORTA...

unique salmon Apr 14, 2025, 5:40 PM

#

quasi shadow You can modify the init_w in other.py.

That's how I got this

For example, I'm testing a new D function, and I changed the default value of w11 to 10, and I get this (based on 132 users so far):
5th percentile of w[11]=9.790
95th percentile of w[11]=10.020

quasi shadow Apr 14, 2025, 5:40 PM

#

OK, maybe in two or three days

#

#

The distribution of w[14] is also very narrow, isn't it?

unique salmon Apr 14, 2025, 5:48 PM

#

quasi shadow

So I calculated the coefficient of variation, defined as std(x)/mean(x)
https://en.wikipedia.org/wiki/Coefficient_of_variation
I took the absolute value of the mean just because. I did it for FSRS-5 with an extra parameter for decay
Coef. of variation of w[0]=4.752
Coef. of variation of w[1]=2.254
Coef. of variation of w[2]=1.997
Coef. of variation of w[3]=1.066
Coef. of variation of w[4]=0.051
Coef. of variation of w[5]=0.441
Coef. of variation of w[6]=0.291
Coef. of variation of w[7]=1.333
Coef. of variation of w[8]=0.228
Coef. of variation of w[9]=0.952
Coef. of variation of w[10]=0.306
Coef. of variation of w[11]=0.100
Coef. of variation of w[12]=0.349
Coef. of variation of w[13]=0.420
Coef. of variation of w[14]=0.172
Coef. of variation of w[15]=0.805
Coef. of variation of w[16]=0.230
Coef. of variation of w[17]=0.445
Coef. of variation of w[18]=0.620
Coef. of variation of w[19]=0.387

w[4], w[11] and w[14] have the lowest coefficient of variation
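
For reference, a minimal sketch of the calculation described above: std(x) divided by the absolute value of mean(x). Whether the original used the population or the sample standard deviation is not stated, so the population version here is an assumption. The dictionary holds the values reported in the message, and `least_variable` is a hypothetical helper for ranking them.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """std(x) / |mean(x)|, using the population standard deviation (an assumption)."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / abs(m)


# Coefficients of variation as reported in the chat
# (FSRS-5 with an extra parameter for decay).
REPORTED_CV = {
    "w[0]": 4.752, "w[1]": 2.254, "w[2]": 1.997, "w[3]": 1.066,
    "w[4]": 0.051, "w[5]": 0.441, "w[6]": 0.291, "w[7]": 1.333,
    "w[8]": 0.228, "w[9]": 0.952, "w[10]": 0.306, "w[11]": 0.100,
    "w[12]": 0.349, "w[13]": 0.420, "w[14]": 0.172, "w[15]": 0.805,
    "w[16]": 0.230, "w[17]": 0.445, "w[18]": 0.620, "w[19]": 0.387,
}


def least_variable(cv_by_name, k=3):
    """Names of the k parameters with the smallest coefficient of variation."""
    return sorted(cv_by_name, key=cv_by_name.get)[:k]


print(least_variable(REPORTED_CV))
```

Taking the absolute value of the mean keeps the ratio non-negative for parameters whose mean is below zero, so values stay comparable across parameters.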

Coefficient of variation

In probability theory and statistics, the coefficient of variation (CV), also known as normalized root-mean-square deviation (NRMSD), percent RMS, and relative standard deviation (RSD), is a standardized measure of dispersion of a probability distribution or frequency distribution. It is defined as the ratio of the standard deviation

...

#

This isn't inherently bad, but again, if their distributions end up being centered around the default value even when you change it by a factor of 2-4, that suggests that something is wrong with optimization

quasi shadow Apr 15, 2025, 6:17 AM

#

#

@unique salmon I cannot find the post where you asked for it

#

Now the hist only shows values between 2%-ile and 98%-ile.
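
A rough sketch of what trimming a histogram to the 2nd-98th percentile range might look like before plotting. This is not the benchmark's plotting code. The function name is hypothetical and the percentile uses simple linear interpolation.

```python
def clip_to_percentiles(values, low_q=2.0, high_q=98.0):
    """Keep only the values lying between the low_q-th and high_q-th percentiles."""
    data = sorted(values)

    def pct(q):
        # Linear interpolation between the two nearest order statistics.
        pos = (len(data) - 1) * q / 100
        lower = int(pos)
        upper = min(lower + 1, len(data) - 1)
        return data[lower] + (data[upper] - data[lower]) * (pos - lower)

    low, high = pct(low_q), pct(high_q)
    return [v for v in values if low <= v <= high]


# Synthetic example: 100 ordinary values plus one extreme outlier.
SAMPLE = list(range(100)) + [10_000]
KEPT = clip_to_percentiles(SAMPLE)
print(min(KEPT), max(KEPT), len(KEPT))
```

Dropping the extreme tails keeps a few outliers from stretching the x-axis so far that the bulk of the distribution collapses into a single bar.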

unique salmon Apr 15, 2025, 8:50 AM

#

quasi shadow Now the hist only shows values between 2%-ile and 98%-ile.

...that's a unique way of saying "98th percentile" 🤣

quasi shadow Apr 15, 2025, 8:57 AM

#

😂 OK, I'll correct it.

#

I learnt it from https://danluu.com/p95-skill/

#

Btw, it's an interesting post about improving performance.

cosmic hedge Apr 15, 2025, 9:11 AM

#

quasi shadow I learnt it from https://danluu.com/p95-skill/

I was wondering where you manage to find articles like that? Do people send them to you or do you hunt them down?

quasi shadow Apr 15, 2025, 9:12 AM

#

cosmic hedge I was wondering where you manage to find articles like that? Do people send them...

I found it here: https://gwern.net/note/competence

Ordinary Incompetence

Incompetence is the norm; most people who engage in a task (even when incentivized for performance or engaging in it for countless hours) may still be making basic errors which could be remedied with coaching or deliberate practice.

#

Do you know Gwern?

cosmic hedge Apr 15, 2025, 9:12 AM

#

Nope who is he?

quasi shadow Apr 15, 2025, 9:12 AM

#

https://gwern.net/spaced-repetition

Spaced Repetition for Efficient Learning

Efficient memorization using the spacing effect: literature review of widespread applicability, tips on use & what it’s good for.

#

He is the author of the best literature review of spaced repetition.

#

(16 years ago)

#

https://www.lesswrong.com/w/spaced-repetition

Spaced Repetition - LessWrong

Spaced Repetition is a technique for long-term retention of learned material where instead of attempting to memorize by ‘cramming’, memorization can be done far more efficiently by instead spacing out each review, with increasing durations as one learns the item, with the scheduling done by software.
