FSRS Megathread | Anki | Page 3

unique salmon Feb 28, 2025, 10:33 AM

#

Yeah, I feel like we might as well merge them

#

They use the same underlying code anyway

#

CMRR is just less realistic for...no reason

#

Instead of using its own "spherical in vacuum" config, CMRR should copy the simulator config

cosmic hedge Feb 28, 2025, 10:35 AM

#

unique salmon Instead of using its own "spherical in vacuum" config, CMRR should copy the simu...

Do you think it might be worth moving it to the simulator modal just so people can mess around with the settings for CMRR?

unique salmon Feb 28, 2025, 10:35 AM

#

cosmic hedge Do you think it might be worth moving it to the simulator modal just so people c...

Yeah

cosmic hedge Feb 28, 2025, 10:37 AM

#

unique salmon Yeah

Ok I suggested that before i thought about where to put it 😂. Simulator modal is going to get even more crowded if we're not careful.

unique salmon Feb 28, 2025, 11:00 AM

#

cosmic hedge Ok I suggested that before i thought about where to put it 😂. Simulator modal i...

Just add a single "Compute Minimum Recommended Retention" button

#

If it copies the configuration of the simulator, then nothing else is needed

cosmic hedge Feb 28, 2025, 11:05 AM

#

unique salmon Just add a single "Compute Minimum Recommended Retention" button

Do you think we should leave the old one where it is as well to avoid confusing people?

unique salmon Feb 28, 2025, 11:58 AM

#

cosmic hedge Do you think we should leave the old one where it is as well to avoid confusing ...

No, then we would have two buttons. Now THAT would be confusing

#

Just move it where the simulator is

bold terrace Feb 28, 2025, 2:47 PM

#

@cosmic hedge, do you think it would be possible to have a graph of Average Stability over Time? Right now we have the "Review Interval Time Machine", but it's not that practical for seeing whether the trend goes left or right, and the time machine doesn't seem to affect Anki's default graph

#

For example here I'm at average stability 1.23

#

I think last week I was around 1.05 or something

#

Average and median would be even better 🥲 But I don't want to ask too much

unique salmon Feb 28, 2025, 2:48 PM

#

I actually wanted estimated total knowledge over time, but seems like it won't be implemented natively

#

Only in an add-on

bold terrace Feb 28, 2025, 2:48 PM

#

Ah yes indeed in the addon it's quite nice

#

Sometimes I feel the default/native graphs are a bit "Let's give you some graphs" without much thought about what kind of interpretation you can make

#

It's nice to have graphs that "answer questions"

#

If AVG stability over time is not possible (I don't think it's stored in the revlog, so it would have to be recomputed each time), there's always the option of doing AVG Interval over time, but that would not really be ideal since DR fluctuation could change it while having the same stability

cursive badge Feb 28, 2025, 3:09 PM

#

On the topic of graphs, I found something interesting/weird when I make the bins really small on my SxR heatmap

#

I've not taken any days off since starting this deck, so it's not just a block of days where I did no reviews

bold terrace Feb 28, 2025, 3:15 PM

#

cursive badge On the topic of graphs, I found something interesting/weird when I make the bins...

I'm not entirely sure why days off would impact this? Stability is something that won't change until you review the card again, so having gaps in stability might just mean that, based on your params, some stability values are not really achievable.

bold terrace Feb 28, 2025, 3:17 PM

#

cursive badge On the topic of graphs, I found something interesting/weird when I make the bins...

For example, with default params there aren't many rating sequences that would lead to those stabilities:

cursive badge Feb 28, 2025, 3:18 PM

#

bold terrace I'm not entirely sure why days off would impact this? Stability is something that ...

If some stabilities are just not possible wouldn't it be a straight horizontal line on my chart?
I mentioned days off because you can see it has a shape similar to how cards decay each day.

bold terrace Feb 28, 2025, 3:19 PM

#

Indeed

cursive badge Feb 28, 2025, 3:23 PM

#

Maybe it had to do with a param change, or I had a period of struggling with cards so I never got new cards with those stabilities for a bit. It's a bit mysterious.

bold terrace Feb 28, 2025, 3:27 PM

#

I don't know if it's related, but I see something strange with Stability in Card Info:

<= 30d, I see the 30 days

1.03days, I see Month (days)

30 1.03 days, I see Month WITHOUT days aside

#

But it's probably just graphical

#

Like 1.03 is rounded to 30d, so the UI thinks there's no point showing (30d)

cosmic hedge Feb 28, 2025, 3:28 PM

#

#1282005522513530952 message should be easy enough

bold terrace Feb 28, 2025, 3:28 PM

#

And it doesn't really explain why your hole also goes up with higher stability for higher R

cosmic hedge Feb 28, 2025, 3:29 PM

#

well "easy enough" as in i plan to XD

bold terrace Feb 28, 2025, 3:29 PM

#

You would have to recompute it ?

#

For every review state ?

cosmic hedge Feb 28, 2025, 3:29 PM

#

well yes but the memorised graph already does that

bold terrace Feb 28, 2025, 3:29 PM

#

ah ok 🙂

#

Would be super nice, I'm really wondering how my stability evolves over time

hasty fractal Feb 28, 2025, 3:50 PM

#

cosmic hedge Do you think we should leave the old one where it is as well to avoid confusing ...

IMO, leave a small-sized (grayed out?) text like "CMRR has been moved to simulator"

unique salmon Feb 28, 2025, 5:39 PM

#

hasty fractal IMO, leave a small-sized (grayed out?) text like "CMRR has been moved to simulat...

That would be annoying and confusing for new users

hasty fractal Feb 28, 2025, 5:54 PM

#

That would be removed later ofc

#

let's just have it for one version

#

for smooth transition

polar maple Feb 28, 2025, 11:18 PM

#

#

#

I made a new simple baseline for algorithms that can adapt on the spot immediately after each review

#

its similar to an exponential weighted average except i use backpropagation on the log loss

#

i hope this shows why RWKV-P could achieve such strong results
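A minimal sketch of the kind of baseline described above, assuming a single logit that is nudged by one gradient step on the log loss right after each review. The function name, learning rate, and starting logit are hypothetical; the actual MOVING-AVG code in srs-benchmark may differ.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def moving_avg_log_loss(outcomes, lr=0.1, init_logit=0.0):
    """Predict each review just before it happens, then update on its outcome.

    outcomes: sequence of 1 (pass) / 0 (fail) in revlog order.
    Returns (mean log loss, final predicted probability of a pass).
    lr and init_logit are illustrative assumptions, not benchmark values.
    """
    if not outcomes:
        raise ValueError("need at least one review")
    logit = init_logit
    total = 0.0
    for y in outcomes:
        p = sigmoid(logit)
        p_safe = min(max(p, 1e-7), 1 - 1e-7)  # avoid log(0)
        total -= y * math.log(p_safe) + (1 - y) * math.log(1 - p_safe)
        # For a sigmoid output, d(log loss)/d(logit) = p - y.
        logit -= lr * (p - y)
    return total / len(outcomes), sigmoid(logit)


loss, p_next = moving_avg_log_loss([1, 1, 0, 1, 1, 1, 0, 1])
```

Because the update happens after every single review, the predictor adapts immediately to streaks, which is the information a 5-way split cannot use.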

unique salmon Feb 28, 2025, 11:28 PM

#

polar maple i hope this shows why RWKV-P could achieve such strong results

You're gonna need to explain it for mere mortals 😅

polar maple Feb 28, 2025, 11:32 PM

#

its just that the other algorithms in the benchmark are missing out on a ton of information

#

look at AVG, it optimizes on the same 5-way split as FSRS

#

FSRS is so much stronger than AVG bc it uses an actual memory model

#

now if MOVING-AVG already surpasses FSRS, what happens when we add a memory model on top of it? that's what RWKV-P represents

unique salmon Feb 28, 2025, 11:37 PM

#

That still doesn't explain the difference in optimization and testing and whatnot

#

Like, I still don't get why the moving average is better

polar maple Feb 28, 2025, 11:39 PM

#

unique salmon That still doesn't explain the difference in optimization and testing and whatno...

tbh its the other way around, you'd have to explain to the general audience what this 5-way split is and why we use it since the more natural benchmark is an n-way split

polar maple Feb 28, 2025, 11:42 PM

#

unique salmon Like, I still don't get why the moving average is better

it's somewhat fake performance, occasionally certain users may decide that they're done for the day and just pass the remaining cards

unique salmon Feb 28, 2025, 11:42 PM

#

polar maple tbh its the other way around, you'd have to explain to the general audience what...

...but what is n?

polar maple Feb 28, 2025, 11:42 PM

#

unique salmon ...but what is n?

the size of the revlog

unique salmon Feb 28, 2025, 11:43 PM

#

Oh, you mean optimizing after every review

#

Just call it that 🤣

polar maple Feb 28, 2025, 11:43 PM

#

exactly, i've said that

unique salmon Feb 28, 2025, 11:43 PM

#

Just say "Moving average optimizes after every review"

polar maple Feb 28, 2025, 11:43 PM

#

i said something 99% similar to that

#

lol

hasty fractal Feb 28, 2025, 11:45 PM

#

unique salmon You're gonna need to explain it for mere mortals 😅

bro has mastered talking in memes

bold terrace Mar 1, 2025, 12:10 PM

#

polar maple i said something 99% similar to that

But then isn't it closer to FSRS with recency, just that you're really super aggressive on recency?

#

I mean, then it does not really show that it's better than FSRS, just that FSRS should maybe be more aggressive with how recency is weighted

unique salmon Mar 1, 2025, 12:23 PM

#

bold terrace But then isn't it closer to FSRS with recency, just that you're really super a...

Not quite
FSRS-5 recency doesn't optimize after every review, it just weighs reviews by recency. It's still "optimize on all reviews once". Alex's moving average optimizes after every single review

#

We could benchmark optimizing FSRS after every single review, but it would take an eternity

#

Unless we cut down the dataset by a factor of 100 or something

bold terrace Mar 1, 2025, 12:24 PM

#

I mean I hit the optimize button every day and it didn't change for the past 60 days

#

If every optimization changes the params, that's a bit strange, no? Or is recency weighted much more aggressively?

unique salmon Mar 1, 2025, 12:27 PM

#

bold terrace I mean I hit the optimize button every day and it didn't change for the past 60 ...

Sometimes new RMSE is worse than before, in which case the old parameters are kept

bold terrace Mar 1, 2025, 12:28 PM

#

But in this new way of doing things, you would still select the new parameters even if the new RMSE is higher?

unique salmon Mar 1, 2025, 12:28 PM

#

@polar maple

bold terrace Mar 1, 2025, 12:32 PM

#

unique salmon Sometimes new RMSE is worse than before, in which case the old parameters are ke...

But question to you also: what explains that after an optimization you get a worse RMSE? Local minimum vs. global minimum?

unique salmon Mar 1, 2025, 12:38 PM

#

bold terrace But question to you also: what explains that after an optimization you get a worse...

The optimizer being stochastic, I guess

polar maple Mar 1, 2025, 4:04 PM

#

bold terrace But then isn't it closer to FSRS with recency, just that you're really super a...

a problem is just how FSRS is benchmarked. It only optimizes 5 times per each user rather than after every review (due to computational constraints, since we have 10k users to benchmark), whereas my moving average is not limited by such computational constraints, so i might as well let it optimize after each and every review

#

another thing is that the moving average does not try to predict the outcome of reviews ahead of time which would be important for scheduling purposes, it only tries to predict the outcome of a review immediately before it happens

#

FSRS mostly does do ahead of time predictions in the benchmark but due to how it was implemented, it can update its predictions 5 times

bold terrace Mar 1, 2025, 4:13 PM

#

polar maple a problem is just how FSRS is benchmarked. It only optimizes 5 times per each us...

But then is it even comparable ? I mean, if FSRS could achieve better result in the simulation with more optimization being done, we're not really comparing the precision of prediction/scheduling with the same constraints ?

Also, for an end user like me, even by triggering a re-optimize every day, my parameters stayed the same for the past 2 months. It's more in those situations that it would be interesting to see if the moving avg really performs better, no? Then you would really compare the performance of both algos on a comparable basis

#

Otherwise it would mean we're trying to create the best algorithm for the simulator constraints, and not really for people that will actually use it.

polar maple Mar 1, 2025, 4:15 PM

#

bold terrace But then is it even comparable ? I mean, if FSRS could achieve better result in ...

that's right, i'd like to see either a smaller dataset for the benchmark where FSRS can be optimized more often or a version of FSRS that can update more often on the fly

polar maple Mar 1, 2025, 4:16 PM

#

bold terrace Otherwise it would mean we're trying to create the best algorithm for the simula...

many users already don't optimize often or at all, it is not necessarily unrepresentative

#

a possible way to optimize FSRS on the spot is to do something like a gradient step over the last 50 reviews

#

this way it's efficient enough to be done after every review
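A toy sketch of the "gradient step over the last 50 reviews" idea, under heavy assumptions: instead of FSRS's full parameter set, a single hypothetical scalar `theta` rescales each card's stability, and the gradient is taken numerically. The curve constants are the FSRS-5 ones (decay -0.5, factor 19/81); everything else here is illustrative.

```python
import math
from collections import deque

DECAY = -0.5
FACTOR = 19 / 81  # makes R equal 0.9 when t == S


def retrievability(t, s):
    return (1 + FACTOR * t / s) ** DECAY


def window_log_loss(theta, window):
    """Mean log loss over (delta_t, stability, outcome) tuples in the window."""
    total = 0.0
    for t, s, y in window:
        p = retrievability(t, s * math.exp(theta))  # theta rescales stability
        p = min(max(p, 1e-7), 1 - 1e-7)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(window)


def online_step(theta, window, lr=0.5, eps=1e-4):
    """One gradient step on the window; central difference, so no autograd needed."""
    grad = (window_log_loss(theta + eps, window)
            - window_log_loss(theta - eps, window)) / (2 * eps)
    return theta - lr * grad


window = deque(maxlen=50)  # only the 50 most recent reviews are kept
theta = 0.0
for review in [(10.0, 10.0, 0)] * 60:  # 60 failed reviews at t == S
    window.append(review)
    theta = online_step(theta, window)
```

Each step touches at most 50 reviews, so the cost per review stays constant no matter how long the revlog gets.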

hasty fractal Mar 1, 2025, 4:36 PM

#

polar maple many users already don't optimize often or at all, it is not necessarily unrepre...

well, hopefully we have auto-optimisation in the future.

#

but still, people who care about benchmarks and numbers are optimising regularly so it's only fair that we try to see how well that performs (too).

unique salmon Mar 1, 2025, 5:19 PM

#

polar maple this way it's efficient enough to be done after every review

Make a PR plz

polar maple Mar 1, 2025, 5:26 PM

#

unique salmon Make a PR plz

still working on RWKV and some other stuff

sonic forge Mar 1, 2025, 6:48 PM

#

Auto-optimization was discussed in the past. It is not an option for people who need to tweak the generated parameters after optimization. So if you want to implement it, there should be a toggle (Enable Auto-optimization).

#

Basically, with auto-optimization always enabled, the user can't preserve tweaked weights: each optimization produces new weights that need to be tweaked again, so you end up in a cycle where the tweaks never stick.

unique salmon Mar 1, 2025, 6:58 PM

#

Well, Alex has an interesting idea for optimizing parameters, so maybe you won't have issues in the future

#

I mean, maybe you won't need manual tweaking

polar maple Mar 1, 2025, 7:11 PM

#

unique salmon Well, Alex has an interesting idea for optimizing parameters, so maybe you won't...

actually im not too interested in it anymore

unique salmon Mar 1, 2025, 7:15 PM

#

polar maple actually im not too interested in it anymore

Why?

polar maple Mar 1, 2025, 7:23 PM

#

i'll explain what the idea was here since it was discussed in dms, the idea is to make an RWKV model predict FSRS params on the spot after every review for potentially better params and also for speed (RWKV can run efficiently on a cpu)

but this would still be strictly worse than just letting RWKV do card predictions directly; imo there is no world where RWKV would be introduced into Anki as a way to predict FSRS params, rather than just doing the scheduling itself

now the other part of the idea is to see how far FSRS formulas can be pushed to the limits but we already know the result of this in a certain sense, if you let FSRS look at the answers by optimizing on the test set, you still get a model that does slightly worse than LSTM

#

and this idea would take a while to implement so its pretty much a waste of time

#

copying from a previous message, with RWKV added:

I wanted to see how expressive FSRS's formulas are, so I decided to train FSRS on the test set that it would be evaluated on (same 5-way split), and I did the same thing for LSTM

Total users: 100
Number of reviews: 2097825
LSTM (cheat): LogLoss (mean±std): 0.3303±0.1598
RWKV (normal): LogLoss mean: 0.3429
LSTM (normal): LogLoss (mean±std): 0.3546±0.1668
FSRS (cheat): LogLoss (mean±std): 0.3550±0.1698
FSRS (normal): LogLoss (mean±std): 0.3743±0.1767

unique salmon Mar 1, 2025, 7:36 PM

#

polar maple i'll explain what the idea was here since it was discussed in dms, the idea is t...

Man...

#

Oh well

#

We could still try my teacher-student idea, though

#

Or you could find a way to make sure that RWKV can do scheduling properly and doesn't do anything weird, like p(recall) increasing as time passes or the interval for Good being longer than the interval for Easy, that kind of stuff

#

And then we could implement it in Anki instead of FSRS

#

Jarrett wouldn't be happy though 🤣

polar maple Mar 1, 2025, 7:39 PM

#

RWKV (non-P) predicts monotonically decreasing forgetting curves so the first part is automatically satisfied

#

the second part could be a layer on top by the scheduler

#

i'm still hoping to find some simple insights as to where FSRS goes wrong, maybe small changes to FSRS can largely close the gap

unique salmon Mar 1, 2025, 7:40 PM

#

We would have to remove S-related stats, I assume? Since it has 3-4 different S values

polar maple Mar 1, 2025, 7:41 PM

#

you can still compute an S as the x where p(x) = 0.9
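A small sketch of reading an S off any monotonically decreasing forgetting curve by solving p(x) = 0.9 with bisection. The function name and the example curve are hypothetical; the only assumption about the model is that p decreases with elapsed time and starts above the target.

```python
import math


def stability_from_curve(p, target=0.9, max_days=36500.0, tol=1e-6):
    """Return x (in days) where p(x) == target, for monotonically decreasing p.

    p: callable mapping elapsed days to predicted recall probability.
    If the curve never drops to the target within max_days, max_days is returned.
    """
    lo, hi = 0.0, 1.0
    while p(hi) > target:  # grow the bracket until it contains the crossing
        hi *= 2
        if hi > max_days:
            return max_days
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Example with a made-up exponential curve standing in for a model's output.
s = stability_from_curve(lambda x: math.exp(-x / 20))
```

As noted in the replies that follow, a number obtained this way is just a point on the curve and would not carry the meaning S has inside FSRS.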

unique salmon Mar 1, 2025, 7:41 PM

#

Would that be meaningful for RWKV, though?

polar maple Mar 1, 2025, 7:41 PM

#

nope

unique salmon Mar 1, 2025, 7:41 PM

#

welp

#

Then just remove S stats

unique salmon Mar 1, 2025, 7:42 PM

#

polar maple i'm still hoping to find some simple insights as to where FSRS goes wrong, maybe...

Probably not that much

#

To give some numbers, I would be very surprised if FSRS's RMSE can get below 4%

#

Maybe if you make some massive changes to D, idk

#

If we really, and I mean REALLY want every last bit of predictive accuracy, we're going to need a neural net, I'm certain

polar maple Mar 1, 2025, 7:48 PM

#

i forgot to mention, RWKV as trained right now also does same-day review predictions

#

i only filter those out when finding the stats for the benchmark

unique salmon Mar 1, 2025, 7:50 PM

#

I wonder what Jarrett's and other people's reactions would be like if we announced "Oh, yeah, remember FSRS? We're not going to be using that anymore, we're going full black-box neural net now"

polar maple Mar 1, 2025, 7:51 PM

#

a nn is undesirable

#

lets try to improve FSRS first

unique salmon Mar 1, 2025, 7:51 PM

#

https://expertium.github.io/Algorithm.html
Btw, have you read either my article or the FSRS wiki's "Algorithm" entry?

Expertium’s Blog

A technical explanation of FSRS

Spaced repetition stuff

polar maple Mar 1, 2025, 7:51 PM

#

personally i wouldn't mind a nn, but people want to customize all sorts of things and it would be impossible to customize RWKV beyond setting a desired retention

unique salmon Mar 1, 2025, 7:52 PM

#

polar maple personally i wouldn't mind a nn, but people want to customize all sorts of thing...

Desired Retention Is All You Need

#

(unironically)

polar maple Mar 1, 2025, 7:53 PM

#

unique salmon https://expertium.github.io/Algorithm.html Btw, have you read either my article ...

i never read the specifics of FSRS-5, i did read Jarrett's paper that i assume is similar to FSRS though. I should probably finally learn what FSRS does lol

polar maple Mar 1, 2025, 8:10 PM

#

unique salmon To give some numbers, I would be very surprised if FSRS's RMSE can get below 4%

since RWKV has a similar RMSE to LSTM but a much better log loss, maybe the largest difference is just the shape of the forgetting curve. GRU-P was forced to use the same forgetting curve shape as FSRS

#

actually one of the first experiments i did on srs-benchmark was to make GRU-P learn the decay exponent and as expected it gave much better results than a fixed -0.5
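For reference, a sketch of a power forgetting curve where the decay exponent is a free parameter instead of the fixed -0.5. The factor is chosen so that R = 0.9 when t = S for any decay, which reproduces FSRS-5's 19/81 at decay = -0.5. How the exponent was actually learned in that experiment is not shown here; in a real model it would be a trainable parameter.

```python
def forgetting_curve(t, s, decay=-0.5):
    """Power forgetting curve with R(s, s) == 0.9 for any negative decay.

    t: days since the last review, s: stability in days.
    """
    factor = 0.9 ** (1 / decay) - 1  # equals 19/81 when decay == -0.5
    return (1 + factor * t / s) ** decay


# Same stability, different shapes: all curves cross 0.9 at t == s,
# but a decay closer to 0 gives a heavier tail afterwards.
for d in (-0.2, -0.5, -1.0):
    print(d, round(forgetting_curve(20.0, 10.0, d), 3))
```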

unique salmon Mar 1, 2025, 8:12 PM

#

polar maple since RWKV has a similar RMSE to LSTM but a much better log loss, maybe the larg...

You mean standard GRU? GRU-P doesn't have a fixed curve

polar maple Mar 1, 2025, 8:12 PM

#

yeah standard GRU, oops

#

ok nvm GRU-P doesn't use a forgetting curve so my whole point there is bogus, forget it

unique salmon Mar 1, 2025, 8:18 PM

#

I'm not sure how you intend to use RWKV to improve FSRS. I mean, you can see that it's more accurate, and you can even look at specific cases where difference is especially large, but that doesn't tell you what a good formula should look like

polar maple Mar 1, 2025, 8:22 PM

#

unique salmon I'm not sure how you intend to use RWKV to improve FSRS. I mean, you can see tha...

the goal is to find systematic errors in FSRS. RMSE is actually a decent metric of this

#

RMSE (bins) only measures the bias of each bin, it doesn't really care about the specifics of individual predictions

#

information is lost when taking averages
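A simplified sketch of a binned RMSE, to show why it only captures per-bin bias. As an assumption for brevity, reviews are binned by predicted probability here; the benchmark's own RMSE (bins) may group reviews differently. The function name and bin count are hypothetical.

```python
import math
from collections import defaultdict


def rmse_bins(predictions, outcomes, n_bins=20):
    """Weighted RMSE between mean prediction and mean outcome within each bin."""
    bins = defaultdict(lambda: [0.0, 0.0, 0])  # sum of p, sum of y, count
    for p, y in zip(predictions, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)
        b = bins[idx]
        b[0] += p
        b[1] += y
        b[2] += 1
    total = sum(b[2] for b in bins.values())
    squared = sum(b[2] * (b[0] / b[2] - b[1] / b[2]) ** 2 for b in bins.values())
    return math.sqrt(squared / total)


# Ten reviews all predicted at 0.9, nine of them passed: the bin is unbiased,
# so the metric is ~0 even though no individual prediction was "right".
unbiased = rmse_bins([0.9] * 10, [1] * 9 + [0])
```

Log loss, by contrast, scores every prediction individually, which is why two models can have similar RMSE but clearly different log loss.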

unique salmon Mar 1, 2025, 8:23 PM

#

Still, it doesn't tell you what a new formula for FSRS looks like

#

How do you go from "For these reviews RMSE is particularly big" to "This is the new formula for FSRS"?

polar maple Mar 1, 2025, 8:24 PM

#

i dont understand why you wonder that, this is just a normal part of data analysis, we just need to fit some curves

#

jarrett this whole time makes many plots of random things

#

i've made LSTM vs FSRS curve plots

#

make plot, make prediction, test it

cursive badge Mar 1, 2025, 8:46 PM

#

unique salmon Desired Retention Is All You Need

Is it though? A NN might learn something more complex and schedule things at seemingly random R, but be more effective than a static or monotonically increasing DR.

#

For all we know, there may even be windows in which it is better to review. If you miss a window, it might be better to wait for another one than just do backlog as soon as possible.

unique salmon Mar 1, 2025, 8:49 PM

#

cursive badge Is it though? A NN might learn something more complex and schedule things at se...

We can add a "Dynamic desired retention" toggle, so that the user can switch manual control on/off
I meant that I can't think of anything that is both unrelated to desired retention and equally important

unique salmon Mar 1, 2025, 8:49 PM

#

cursive badge For all we know, there may even be windows in which it is better to review. If y...

That's not possible with a monotonic forgetting curve

#

And I will sooner eat chalk than believe in a non-monotonic forgetting curve

cursive badge Mar 1, 2025, 8:51 PM

#

unique salmon That's not possible with a monotonic forgetting curve

That's what I mean. We make that assumption, but is it strictly true? If I don't do my reviews in the morning when I am awake, is it better to do them tired or wait until I am fresh the next day?

#

It could be all wiggly, not a nice slope.

unique salmon Mar 1, 2025, 8:52 PM

#

That's a different question though, since now you're adding another variable - how tired you are

cursive badge Mar 1, 2025, 8:52 PM

#

It's just really hard to capture the "true retrievability function"

unique salmon Mar 1, 2025, 8:53 PM

#

Anyway, if we can make a NN that doesn't do anything weird with intervals, I'm on board with replacing FSRS, as long as no functionality is lost (minus D and S graphs, but whatever, that's not important)

#

I'm more worried about things like interval(Hard) < interval(Again) or the next interval being x100 times longer than the last one

#

Well, tbf, both are solvable

#

Just add some extra scheduling rules on top of the NN
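One way such a rule layer could look, as a rough sketch: take whatever intervals the model proposes, force Again <= Hard <= Good <= Easy, and cap growth relative to the previous interval. The function name, the 10x growth cap, and the 1-day floor are all hypothetical choices, not anything from Anki or RWKV.

```python
def constrain_intervals(raw, last_interval, max_growth=10.0, min_interval=1.0):
    """Post-process model-proposed intervals (in days) with simple sanity rules.

    raw: dict with keys "again", "hard", "good", "easy".
    Each interval is raised to at least the previous button's interval and
    capped at last_interval * max_growth.
    """
    cap = max(last_interval * max_growth, min_interval)
    fixed = {}
    floor = min_interval
    for name in ("again", "hard", "good", "easy"):
        value = min(max(raw[name], floor), cap)
        fixed[name] = value
        floor = value  # the next button may not be shorter than this one
    return fixed


proposed = {"again": 5, "hard": 3, "good": 4000, "easy": 20}
print(constrain_intervals(proposed, last_interval=10))
```

This version resolves an ordering violation by lengthening the later button; shortening the earlier one instead would be an equally valid design choice.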

unique salmon Mar 1, 2025, 8:58 PM

#

polar maple a nn is undesirable

The more I think about it, the more I think it's actually very desirable

1) We can make R more accurate
2) We won't have to show parameters, which means one less thing for users to worry about
3) We can support proper same-day scheduling instead of the current mess
4) We can throw in new input variables, like time of the day, workload, etc. Not just interval lengths and grades
5) We can remove "Optimize", which means even less stuff for users to worry about

#

That's a whopping bonanza of advantages

cursive badge Mar 1, 2025, 9:01 PM

#

unique salmon I'm more worried about things like interval(Hard) < interval(Again) or the next ...

I could see a situation where that might be desirable. Sometimes I accidentally reinforce an incorrect memory. You might want a quite long Again interval to deliberately forget a bit and make the memory more malleable.

unique salmon Mar 1, 2025, 9:02 PM

#

cursive badge I could see a situation where that might be desirable. Sometimes I accidentally ...

It would be confusing, since Again is supposed to be "Fail" and Hard is supposed to be "Pass", so Again > Hard would be weird

#

Everyone is already used to Again < Hard < Good < Easy, and it makes sense intuitively

cursive badge Mar 1, 2025, 9:03 PM

#

Sometimes intuition is wrong 🤷‍♂️

unique salmon Mar 1, 2025, 9:03 PM

#

Sorry if that sounds mean, but no, it just means that the way you use answer buttons makes no sense

#

And you need to change your rating habits

cursive badge Mar 1, 2025, 9:06 PM

#

I'm not saying I'm doing any weird manual adjustments to my rating right now. I'm just saying in some circumstances it may actually be more effective in terms of learning/study time to give a bigger interval to something you forgot, than something you remembered but struggled with.

#

I could also be completely wrong. I'm just saying I would not completely discard it as a possibility.

unique salmon Mar 1, 2025, 9:07 PM

#

Even if it's better in some fringe cases, I still doubt that the gains from it are worth making buttons unintuitive

unique salmon Mar 1, 2025, 9:08 PM

#

unique salmon The more I think about it, the more I think it's actually very desirable 1) We c...

Anyway @polar maple thoughts on this?

cursive badge Mar 1, 2025, 9:11 PM

#

I mean if we are being radical you could argue that an ideal scheduling algorithm might not statically schedule a card at the time you review it because its retrievability could be affected by cards you review later.

polar maple Mar 1, 2025, 9:11 PM

#

unique salmon Anyway @polar maple thoughts on this?

2) we could hide it right now if you want
3) short term scheduling is still difficult, i don't think desired retention is the right metric for short term scheduling
5) auto optimize is not even allowed for FSRS even if you could easily do something like the mini gradient step idea

polar maple Mar 1, 2025, 9:11 PM

#

cursive badge I mean if we are being radical you could argue that an ideal scheduling algorith...

yes, RWKV-P takes the entire sequence as input

#

RWKV uses the entire sequence as well but it cannot reschedule cards. i can work on such rescheduling later

#

but RWKV-P represents the limit of what's possible when you use most available information

unique salmon Mar 1, 2025, 9:15 PM

#

polar maple 2) we could hide it right now if you want 3) short term scheduling is still diff...

I meant that since RWKV would be pre-trained on the 10k dataset, there would be no need to optimize it

polar maple Mar 1, 2025, 9:18 PM

#

RWKV right now maintains a hidden state at the card level, the siblings level, the deck level, the preset level, and the global level. the way it updates these is kind of equivalent to if we had auto-optimize in FSRS and I think this was unwanted due to syncing issues

unique salmon Mar 1, 2025, 9:18 PM

#

Ah

#

Dang

cosmic hedge Mar 1, 2025, 11:25 PM

#

b-w matrix should be under the memorised graph now if you update SSE

#

hope i didn't screw it up somehow 🙏

cursive badge Mar 1, 2025, 11:39 PM

#

cosmic hedge b-w matrix should be under the memorised graph now if you update SSE

Where exactly is it meant to be? I might be dumb, but I cannot see it.
SSE v1.10.0 from source. I've tried both Anki 25.02 and Anki git main

cosmic hedge Mar 1, 2025, 11:42 PM

#

cursive badge Where exactly is it meant to be? I might be dumb, but I cannot see it. SSE v1.10...

Under a dropdown under the memorised graph

cursive badge Mar 1, 2025, 11:42 PM

#

cosmic hedge Under a dropdown under the memorised graph

On Anki git main?

cosmic hedge Mar 1, 2025, 11:43 PM

#

cursive badge On Anki git `main`?

Nope in search stats extended

cursive badge Mar 1, 2025, 11:43 PM

#

I mean Anki git main + SSE v1.10.0

cosmic hedge Mar 1, 2025, 11:44 PM

#

cursive badge I mean Anki git `main` + SSE v1.10.0

Go to the memorised graph and it should be under there.

#

After you've run it

cursive badge Mar 1, 2025, 11:45 PM

#

cosmic hedge After you've run it

cosmic hedge Mar 1, 2025, 11:45 PM

#

cursive badge

In search stats extended, not the simulator. The past memorised.

cursive badge Mar 1, 2025, 11:48 PM

#

cosmic hedge In search stats extended, not the simulator. The past memorised.

Ok, it's there. I was just confused ~~when you said simulator earlier~~ because I am dumb, and no other reason. ;p

cosmic hedge Mar 1, 2025, 11:49 PM

#

cursive badge Ok, it's there. I was just confused ~~when you said simulator earlier~~ because ...

I've edited it now no one has to know 🤫

cursive badge Mar 1, 2025, 11:54 PM

#

@cosmic hedge the curse continues. The Y axis is really blurry for me on the B-W matrix. The Y axis is still fine for me on the SR Heatmap though.

#

Maybe it's something to do with the viewBox starting at x=-40 instead of 0?

cosmic hedge Mar 1, 2025, 11:57 PM

#

cursive badge Maybe it's something to do with the viewBox starting at x=-40 instead of 0?

Nope that works fine for the other graphs i think

#

If thats the problem I'd rather not reprogram it 🥲

#

Google didn't work, chatgpt didnt work. I'm stumped.

cursive badge Mar 2, 2025, 12:04 AM

#

cosmic hedge Google didn't work, chatgpt didnt work. I'm stumped.

Intriguing. Try setting `opacity="1"` on the `<g>` for that axis.

#

It miraculously unblurs for me.

cosmic hedge Mar 2, 2025, 12:15 AM

#

cursive badge Intriguing. Try setting `opacity="1"` on the`<g>` for that axis.

???????
I guess I'll just make those axes not translucent then 😂

cursive badge Mar 2, 2025, 12:20 AM

#

cosmic hedge ??????? I guess I'll just make those axes not translucent then 😂

The only sensible thing I can think of is somewhere there is a globally scoped bit of CSS with filter: blur() that matched your axis. I have no idea where it would be or why though.
That or just a really weird bug in chromium.

cosmic hedge Mar 2, 2025, 12:25 AM

#

nope no filter: blur()

cursive badge Mar 2, 2025, 12:28 AM

#

The raw (opacity="0.5") SVG renders just fine in Firefox, so it is something specific to Anki. It's just really mysterious.

quasi shadow Mar 2, 2025, 4:25 AM

#

unique salmon I wonder what Jarrett's and other people's reactions would be like if we announc...

It's great because it's also open-source.😎 Then I can try to use it in my work.

spring adder Mar 2, 2025, 4:33 AM

#

quasi shadow It's great because it's also open-source.😎 Then I can try to use it in my work.

https://tenor.com/view/mujikcboro-seriymujik-gif-24361533


quasi shadow Mar 2, 2025, 7:41 AM

#

polar maple now if MOVING-AVG already surpasses FSRS, what happens when we add a memory mode...

What's RWKV-P?

#

Does it consider all reviews of the collection when it predicts the P(recall) of the next review?

polar maple Mar 2, 2025, 7:52 AM

#

quasi shadow Does it consider all reviews of the collection when it predicts the P(recall) of...

if you iterate over the revlog, RWKV-P predicts the outcome of the row that we are looking at and it has as input all the previous rows

#

btw i have uploaded MOVING-AVG here
https://github.com/1DWalker/srs-benchmark/blob/705bb5084a12f402722c14ea1d02b07d1ce135cf/other.py#L2646

quasi shadow Mar 2, 2025, 7:55 AM

#

Is it possible to draw the forgetting curve for a given card with RWKV-P?

polar maple Mar 2, 2025, 8:00 AM

#

quasi shadow Is it possible to draw the forgetting curve for a given card with RWKV-P?

nope. i have a version of RWKV that does curve prediction but i haven't trained it to adapt its prediction over time; in theory I could've made RWKV-P predict a curve but it would still be equivalent to just predicting a P directly since it knows delta_t.

If I want to train a model that can update its curve predictions in a reasonable manner, I would likely have to sample a random point between the last review and the current review as a point to update the prediction. if we represent this time interval as [0, 1] then RWKV is directly at 0 and RWKV-P is at 1. So RWKV-P not predicting curves is not necessarily important, it just says what kind of performance is possible at the right endpoint of the interval

#

also another reasonable time is to make RWKV predict a curve at the beginning of the day before the reviews start, to lessen the impact of a user's mental state affecting a day's reviews

quasi shadow Mar 2, 2025, 8:03 AM

#

polar maple nope. i have a version RWKV that does curve prediction but i haven't trained it ...

Fine. What about the simulator? Could we use RWKV-P as the simulation environment?

polar maple Mar 2, 2025, 8:06 AM

#

quasi shadow Fine. What about the simulator? Could we use RWKV-P as the simulation environmen...

probably not well. the RWKV models rn use global-level information, akin to if we optimize FSRS params after every review. The simulation does not support this sort of thing and it would be very out of distribution if we try to mock it, i think

#

but we can always go back to the LSTM nn for the simulation environment

#

LSTM works at a per-card level so it would work well

#

i could also train a RWKV model that locks in its internal parameters and then works as a per-card model afterwards

#

the only thing LSTM makes difficult right now is that it uses the duration of review as an input. I could just remove that feature and then we would have something that would work for the simulator right away

quasi shadow Mar 2, 2025, 8:32 AM

#

Seems like the RWKV-P could capture the impact of interactions among cards.

#

Because it uses the entire sequence as input.

#

If we change the order of previous reviews of other cards before the next review, RWKV-P will give a different prediction, right?

polar maple Mar 2, 2025, 8:37 AM

#

quasi shadow If we change the order of previous reviews of other cards before the next review...

yes

#

this was the goal of such a model, to use as much information as possible

#

the only information that I reject is the parent_id of the deck

#

rwkv uses note, deck, presets

quasi shadow Mar 2, 2025, 8:40 AM

#

Could you draw the calibration graph?

#

I wonder the distribution of p of RWKV-P.

polar maple Mar 2, 2025, 8:43 AM

#

quasi shadow Could you draw the calibration graph?

sure, do you have any code for it that you used for fsrs, please link it

quasi shadow Mar 2, 2025, 8:45 AM

#

polar maple sure, do you have any code for it that you used for fsrs, please link it

https://github.com/open-spaced-repetition/srs-benchmark/blob/4701f339f4cd07d9199dfa2e367a167e3aeafd79/other.py#L3114C9-L3114C19

GitHub

srs-benchmark/other.py at 4701f339f4cd07d9199dfa2e367a167e3aeafd79 ...

A benchmark for spaced repetition schedulers/algorithms - open-spaced-repetition/srs-benchmark
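
For anyone following along, the idea behind a calibration graph can be sketched roughly like this. It is a minimal stand-alone illustration, not the code from the linked file: `calibration_bins` and the equal-width binning are assumptions made here for the example.

```python
from collections import defaultdict

def calibration_bins(predictions, outcomes, n_bins=10):
    """Group predictions into equal-width bins (an assumed binning scheme).

    Returns one (mean predicted R, observed recall rate, count) tuple per
    non-empty bin, ordered from the lowest bin to the highest.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for p, y in zip(predictions, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        sums[b][0] += p
        sums[b][1] += y
        sums[b][2] += 1
    return [(sp / n, sy / n, n) for b, (sp, sy, n) in sorted(sums.items())]

# Made-up predictions and outcomes (1 = recalled, 0 = forgotten).
rows = calibration_bins([0.95, 0.92, 0.55, 0.51], [1, 1, 1, 0])
for mean_pred, observed, count in rows:
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

A well-calibrated model has the predicted and observed columns close to each other in every bin.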

polar maple Mar 2, 2025, 8:49 AM

#

thanks

bold terrace Mar 2, 2025, 10:47 AM

#

polar maple if you iterate over the revlog, RWKV-P predicts the outcome of the row that we a...

Maybe a dumb question, but since you have a finite number of weights in neural networks, does it mean that with too many cards in the revlog, the model would get worse at predicting new rows based on the previous ones, if the number of previous rows is so huge that the number of weights doesn't allow it to map everything?

#

Personally I find the approach very nice if it's doable on every user's computer ... I mean, having each Anki card in its own "little bubble of stability/difficulty/retrievability" without considering the others always felt like a limitation of all current schedulers. Having different cards potentially being dynamically linked is very very nice

unique salmon Mar 2, 2025, 11:35 AM

#

quasi shadow It's great because it's also open-source.😎 Then I can try to use it in my work.

Based

grizzled cedar Mar 2, 2025, 1:15 PM

#

Hi guys, I'm an FSRS noob. I have about ~1000 reviews and I've literally just activated FSRS and optimised it. I've set my learning steps to '10m 15m.'

So, with that context, my question is as follows:
I used to have 'easy decks.' These are collations of information I find easy to recall (to the point where I'm pressing easy for pretty much every card). I use Anki in preparation for my final exams in October so I make these decks just to ensure I remember all this content by the time that rolls around.
With the old SM-2, I simply just increased the graduating and easy intervals.

So, as you've probably guessed, my question is how can I do this on FSRS.

Side-note: my RMSE seems to be a 5.62%. Is this an issue? The guide says a lower value is better, and showcases a 2.03%.

Thanks in advance

unique salmon Mar 2, 2025, 1:17 PM

#

grizzled cedar Hi guys, I'm an FSRS noob. I have about ~1000 reviews and I've literally just ac...

If you want to increase/decrease interval lengths, adjust desired retention. Higher DR = shorter intervals
Also, I recommend reading the manual: https://docs.ankiweb.net/deck-options.html#fsrs

#

Don't worry about RMSE btw

#

So yeah, desired retention is the "lever" that you pull to steer FSRS

grizzled cedar Mar 2, 2025, 1:20 PM

#

unique salmon If you want to increase/decrease interval lengths, adjust desired retention. Hig...

DR should be according to how well you want to recall the card, right? I still want to recall these easy decks well, so will changing it lower affect anything other than the interval length?

#

ahhh nevermind, i get it now

#

thanks for your help!

unique salmon Mar 2, 2025, 1:27 PM

#

grizzled cedar DR should be according to how well you want to recall the card, right? I still w...

You'll be able to recall fewer cards when they are due

#

Desired retention is like "How many cards do you want to be able to recall when Anki shows them to you?"
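
To make the "higher DR = shorter intervals" point concrete, here is a small sketch. It assumes the FSRS-4.5/FSRS-5 power forgetting curve with `DECAY = -0.5` and `FACTOR = 19/81`, and it ignores fuzz, load balancing, rounding and the min/max interval limits that Anki applies on top. The function names are made up for the example.

```python
DECAY = -0.5
FACTOR = 19 / 81  # chosen so that retrievability is 0.9 when t equals S

def retrievability(t_days, stability):
    """Predicted probability of recall after t_days (assumed FSRS-4.5/5 curve)."""
    return (1 + FACTOR * t_days / stability) ** DECAY

def next_interval(stability, desired_retention):
    """Days until predicted retrievability drops to desired_retention."""
    return stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)

# Same card (S = 10 days), three different desired retention settings.
for dr in (0.80, 0.90, 0.95):
    print(dr, round(next_interval(10.0, dr), 1))
```

With these assumptions a desired retention of 0.90 gives an interval equal to the stability, lower values stretch the interval and higher values shrink it.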

grizzled cedar Mar 2, 2025, 1:28 PM

#

yea lol it seems so obvious now

polar maple Mar 2, 2025, 6:59 PM

#

bold terrace Maybe a dumb question, but since you have a finite number of weights in neural n...

yeah it's possible, just like how LLMs don't do well with very long contexts. But Anki reviews has nowhere near the complexity of human language so it might not apply here. Instead i guess that RWKV just aggregates statistics and having more and more reviews would only benefit more

#

but for random interactions like "on weekend X the user learnt some new cards that were particularly difficult", RWKV would struggle on this more

bold terrace Mar 2, 2025, 9:03 PM

#

Still working on it, but plotting Average Stability over Time gives quite good insights ... For those past 90 days, I stopped adding words, and I increased my DR bit by bit, so my review count doesn't drop that much, but I still wanted to see if my stability was increasing or not.

#

It's also insightful because it shows that even if your R is not dropping much, and your Memorised curve is growing quickly, adding more and more words still tends to lower the Card Stability (And maybe not just because of new words, but also older cards being replaced in memory)

#

Ofc, If R is similar, and Work Load is bigger, then you can infer that stability is probably dropping, but it's a bit more direct like this

cursive badge Mar 2, 2025, 9:09 PM

#

bold terrace It's also insightful because it shows that even if your R is not dropping much, ...

Maybe an "only mature cards" or "only cards first reviewed before ..." option would be interesting if you wanted to see how new cards affect overall stability.

bold terrace Mar 2, 2025, 9:11 PM

#

That's a nice idea indeed. Having each vertical bar with a lighter and darker green if the stability contributions comes from mature or young

#

I need to refine the graph a bit too, right now I use the "buckets" of stability, so if a card has stability 1.2, it counts as a "1". It doesn't cause too many issues because the default Anki plugin gives me an average of 1.27 months and mine 36.5 (so I guess 1.27 month = 30*1.27 = 38.1d)

#

Ideally I'd like also to add the median because it can be quite different, I have a median stability of 20d while my average is 38d

Very easy cards bend the avg ...
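
A quick illustration of both effects with made-up numbers (none of these values come from a real collection): flooring stabilities into integer buckets shifts the mean only slightly, while a single very easy card pulls the mean far above the median.

```python
import math
import statistics

# Made-up stabilities in days; the last card is a "very easy" outlier.
stabilities = [1.2, 3.7, 8.4, 20.0, 20.9, 45.3, 210.6]

mean_exact = statistics.mean(stabilities)
# Bucketed version: a card with stability 1.2 counts as 1, and so on.
mean_bucketed = statistics.mean([math.floor(s) for s in stabilities])
median = statistics.median(stabilities)

print(f"mean (exact)    {mean_exact:.1f}d")
print(f"mean (bucketed) {mean_bucketed:.1f}d")
print(f"median          {median:.1f}d")
```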

#

(Those are the default graph right now, which is nice but doesn't give you a sense of evolution)

cursive badge Mar 2, 2025, 9:19 PM

#

@bold terrace BTW Luc merged my SR Heatmap and released an update to SSE yesterday if you hadn't noticed.

#

I advise Log(S) otherwise all the small S cards get clumped together.

bold terrace Mar 2, 2025, 9:21 PM

#

cursive badge @bold terrace BTW Luc merged my SR Heatmap and released an update to SSE...

I saw it indeed 🙂 Wanted to share earlier a screenshot of mine just to show the same kind of holes appearing at some places 🙂

#

Allowed me to detect that I have a huge bunch of reviews waiting for me in May-June 🙂 Those ~80S 94-98% S

polar maple Mar 3, 2025, 6:20 AM

#

@quasi shadow

#

And i also made these where i aggregate users 1 to 500

quasi shadow Mar 3, 2025, 6:32 AM

#

I participate in a new repo today: https://github.com/asukaminato0721/visual_novel_recommendation_engine
😎

quasi shadow Mar 3, 2025, 6:33 AM

#

polar maple @quasi shadow

Wow, RWKV-P calibrates perfectly.

quasi shadow Mar 3, 2025, 6:34 AM

#

polar maple @quasi shadow

I wonder why the MOVING-AVG performs so well even if it doesn't have a model about memory.

polar maple Mar 3, 2025, 6:36 AM

#

quasi shadow I wonder why the MOVING-AVG performs so well even if it doesn't have a model abo...

same, but i expected it to do well from a calibration standpoint since i thought it would only predict a narrow range of values and it would exploit the naive RMSE pre-bins formula

#

but unexpectedly it predicts a wide range of probabilities like the other algorithms

#

so its not cheating in that way

bold terrace Mar 3, 2025, 8:23 PM

#

For those who want to already test the Card Stability over Time, I have a PR and a local build

PR : https://github.com/Luc-Mcgrady/Anki-Search-Stats-Extended/pull/32
local build : (attachment)

I'd like to do the median, color-code based on how the avg is coming from young/mature later on, but I won't have time until probably next week-end

📎 searchStatsExtended.ankiaddon

polar maple Mar 3, 2025, 9:12 PM

#

@quasi shadow bad news for SSP-MMC, if you give fixed DR the ability to make the last interval shorter to minimize effort to reach the arbitrary "3 years stability is treated as infinite stability" rule, then the gap is closed

unique salmon Mar 3, 2025, 9:50 PM

#

polar maple @quasi shadow bad news for SSP-MMC, if you give fixed DR the ability to ...

is this with RWKV?

polar maple Mar 3, 2025, 10:07 PM

#

unique salmon is this with RWKV?

no this is with the FSRS memory model

#

but if SSP-MMC-FSRS cannot do better than a fixed DR then theres no point in adding a different memory model

quasi shadow Mar 4, 2025, 2:21 AM

#

polar maple @quasi shadow bad news for SSP-MMC, if you give fixed DR the ability to ...

What if we count the number of cards which reach the target stability at the end of simulation?

#

I guess SSP-MMC is still the best.

#

Btw, I've realized that the optimization goal of SSP-MMC is not equivalent to maximizing the retention at the end of the simulation.

polar maple Mar 4, 2025, 2:28 AM

#

quasi shadow What if we count the number of cards which reach the target stability at the end...

for this version i set memorized_cnt_per_day[today] = (card_table[col["stability"]] > s_max).sum() which i think should be counting what you want

quasi shadow Mar 4, 2025, 2:33 AM

#

ANKIPOGGERS Weird. Maybe the eps is not small enough to find the optimal policy?

polar maple Mar 4, 2025, 2:35 AM

#

i think if for the cost you only included cards that reached the target then SSP-MMC would do better

#

otherwise, maybe SSP-MMC is keeping many cards at low R and these cards dont reach the target

#

did i write the right expression to only include the costs for cards that reached the right target?

reached_target = card_table[col["stability"]] > s_max
memorized_cnt_per_day[today] = reached_target.sum()
cost_per_day[today] = card_table[col["cost"]][reached_target & (true_review | true_learn)].sum()

#

there has to be a mistake, otherwise the fixed IVL doesn't make sense

#

i see, i think i was just looking at the cost for the last review that brings the card to the target stability

quasi shadow Mar 4, 2025, 2:57 AM

#

cost_per_day[today] = card_table[col["cost"]][reached_target & (true_review | true_learn)].sum()
Yeah, this line has some problems.

#

We need to add a new col to the card table to record the total cost per card.
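
Roughly what that could look like, as a plain-Python sketch rather than a patch for the simulator. The dict-based card table, `record_review` and the `S_MAX` value are stand-ins invented for this example. The point is that the cost accumulates on the card at every review, and only at the end is it summed over the cards that reached the target.

```python
S_MAX = 1095.0  # assumed target stability in days ("3 years")

# Stand-in for the card table, with the proposed per-card total cost column.
cards = [
    {"stability": 1200.0, "total_cost": 0.0},
    {"stability": 300.0, "total_cost": 0.0},
    {"stability": 1500.0, "total_cost": 0.0},
    {"stability": 50.0, "total_cost": 0.0},
]

def record_review(card, cost_seconds):
    """Accumulate the cost of one review (or learning step) on the card."""
    card["total_cost"] += cost_seconds

# Two simulated days of reviews, given as (card index, cost in seconds).
for idx, cost in [(0, 8.0), (1, 12.0), (0, 6.0), (2, 9.0), (3, 20.0)]:
    record_review(cards[idx], cost)

reached_target = [c for c in cards if c["stability"] > S_MAX]
memorized_cnt = len(reached_target)
cost_of_memorized = sum(c["total_cost"] for c in reached_target)
print(memorized_cnt, cost_of_memorized)
```

This counts every review a card needed on its way to the target, not only the last review that pushed it over the threshold.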

#

Btw, Expertium mentioned you here: https://github.com/open-spaced-repetition/fsrs-optimizer/pull/166

GitHub

Added confidence intervals to the calibration graph, disabled Lowes...

@1DWalker I believe this is what you wanted

polar maple Mar 4, 2025, 3:04 AM

#

also "knowledge per minute" is slightly inaccurate since it's also multiplied by the number of days in the simulation

#

it makes no sense that you would learn 863 items per minute

ashen light Mar 4, 2025, 6:34 AM

#

hey guys I just had a cursed idea: if one of the blockers for fsrs auto-optimize is people who have custom modifications to their params, can we just add a filter hook for fsrs param optimization so an addon can do the modification for them?

cursive badge Mar 4, 2025, 6:51 AM

#

ashen light hey guys I just had a cursed idea: if one of the blockers for fsrs auto-optimize...

A hook could be useful for automated twiddling.
I was under the impression that Dae's main worry was about there possibly being lots of sync conflicts caused by concurrent optimisation on different devices.

#

Clobbering hand crafted params can be simply prevented with an Auto/Manual toggle.

quasi shadow Mar 4, 2025, 8:02 AM

#

https://github.com/ankitects/anki/pull/3840

GitHub

Feat/grade now by L-M-Sherlock · Pull Request #3840 · ankitects/anki

This PR adds a new "Grade Now" feature to the browser, allowing users to grade cards directly without going through the review process. This is useful for quickly adjusting card s...

#

We will have native "Grade Now".

hasty fractal Mar 4, 2025, 10:13 AM

#

jarrett is the best thing that happened to us recently

hasty fractal Mar 4, 2025, 10:32 AM

#

folks https://forums.ankiweb.net/t/bug-retrievability-in-browser-doesnt-match-retrievability-in-stats-histogram/56547?u=sorata

#

some bug with retrievability

quasi shadow Mar 4, 2025, 10:45 AM

#

I have replied just now.

sonic forge Mar 4, 2025, 11:09 AM

#

ashen light hey guys I just had a cursed idea: if one of the blockers for fsrs auto-optimize...

It is a horrible idea. C'mon, Just add the Toggle Switch that enables Auto-optimization and grays out the weight input field (makes it inactive), so that user can not change values in it, but still can copy the weights.
Also, as rossgb mentioned, the main concern with Auto-optimization is sync conflicts.

#

Auto-optimization will not make the scheduler any better, it is copium.

lapis hearth Mar 4, 2025, 11:37 AM

#

I was previously for auto optimisation but now since the 0 problem at params w18 w19 has not yet been resolved, I am reluctant. I don't want it to go back to having zeros there. This is still an issue with just some makeshift bandaid put on it.

cosmic hedge Mar 4, 2025, 11:46 AM

#

I'm guessing someone might have solved this for you in the meantime. but I'm pretty sure you might just be looking at your RMSE instead of your log loss (what FSRS optimizes for)

#

with the defaults your log loss goes up both times

#

I'm not very well acquainted with your problem sorry 🤷‍♂️

unique salmon Mar 4, 2025, 11:59 AM

#

polar maple also "knowledge per minute" is slightly inaccurate since its also multipled by t...

Yeah, we need something more like total knowledge divided by average time per card, not per day

hasty fractal Mar 4, 2025, 12:03 PM

#

opinions on "true retention" versus "retention rate"?

#

I think true retention is a horrible name

#

it's just some historical relic, don't see any reason for naming the stat as such

#

the name is more confusing in some other languages which don't have the habit of adding weird adjectives to nouns to make a cool terminology

unique salmon Mar 4, 2025, 12:43 PM

#

lapis hearth I was previously for auto optimisation but now since the 0 problem at params w18...

@quasi shadow I believe you said you will look into setting the last two params to some small values instead of zeros

unique salmon Mar 4, 2025, 12:59 PM

#

quasi shadow https://github.com/ankitects/anki/pull/3840

So does it change the interval? Because if it does, the UI should reflect it

quasi shadow Mar 4, 2025, 1:40 PM

#

Yeah, I will test my idea this week.

quasi shadow Mar 4, 2025, 1:40 PM

#

unique salmon So does it change the interval? Because if it does, the UI should reflect it

It has the same effect as a normal review.

unique salmon Mar 4, 2025, 3:01 PM

#

quasi shadow It has the same effect as a normal review.

The UI doesn't show intervals

quasi shadow Mar 4, 2025, 3:08 PM

#

unique salmon The UI doesn't show intervals

You can see the interval in the browser after you grade it. And it’s very easy to undo this action.

#

And it’s impossible to show the interval if you grade a bunch of cards.

sonic forge Mar 4, 2025, 3:11 PM

#

quasi shadow It has the same effect as a normal review.

Can you explain how it interacts with scheduler/load_balancer
Here it creates a new state: https://github.com/ankitects/anki/blob/eb1ed140223aca5fec34a2b6b821a9a93a5bf30c/rslib/src/scheduler/reviews.rs#L150
(let states = col.get_scheduling_states(card_id)?;) which contains all scheduler/load_balancer logic in the next_states methods. Then that new state goes to new_state: Some(new_state.into()), with new_state selected by grade rating and with new interval
Then this new state goes to revlog_partial:

        let revlog_partial = updater.apply_study_state(current_state, answer.new_state)?;
        self.add_partial_revlog(revlog_partial, usn, answer)?;

But where does this new state/interval become the new due for the card?
In the fn maybe_requeue_learning_card the entry is created with card.due

let entry = LearningQueueEntry {
            due: TimestampSecs(card.due as i64),
            id: card.id,
            mtime: card.mtime,
        };

But when exactly does the new state/interval get to card.due?

quasi shadow Mar 4, 2025, 3:15 PM

#

It’s hard to explain. The way I understand the code is inserting a lot of print into everywhere, doing a review and checking the log.

sonic forge Mar 4, 2025, 3:16 PM

#

Ah, it is likely to be in the fn apply_review_state: https://github.com/ankitects/anki/blob/63c2a09ef6760890c03be4bd83f613c03c512d1f/rslib/src/scheduler/answering/review.rs#L12

#

fn apply_<insert_state_here>_state, to be precise

unique salmon Mar 4, 2025, 3:18 PM

#

quasi shadow It’s hard to explain. The way I understand the code is inserting a lot of `print...

Same

#

Just debug by inserting print() everywhere 🤣

sonic forge Mar 4, 2025, 3:28 PM

#

quasi shadow It’s hard to explain. The way I understand the code is inserting a lot of `print...

Classic 😄
I don't have access to my working (for development) machine at the moment, so I am reviewing the code directly in the GitHub 💀
I was checking the PR for disabled load_balancer support. It seems to be ok, because scheduling logic happens in the (let states = col.get_scheduling_states(card_id)?;)
And after that new entry is just saved in the queue

quasi shadow Mar 4, 2025, 4:55 PM

#

polar maple same, but i expect it to do well in a calibration standpoint since i thought it ...

I have an idea. What if we use the p of moving average and the forgetting curve to calculate the “stability”?

#

I wonder how the stability of a given card predicted by moving average changes over reviews.

#

Does it follow an intuitive memory pattern?

polar maple Mar 4, 2025, 7:51 PM

#

quasi shadow I have an idea. What if we use the p of moving average and the forgetting curve ...

I don't think there's anything interesting to find, it really is just looking at the average of the recent reviews. I uploaded the raw file for the first 100 users, you can get a sense of how quickly/slowly the moving average changes to new reviews
https://raw.githubusercontent.com/1DWalker/srs-benchmark/refs/heads/moving-avg/MOVING-AVG_raw.jsonl
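
As a rough illustration of "just looking at the average of the recent reviews", here is a minimal exponential moving average predictor. It is only a sketch: `alpha`, `initial` and the exponential form are assumptions made for the example, and the actual MOVING-AVG in the benchmark may be implemented differently.

```python
def moving_average_predictions(outcomes, alpha=0.1, initial=0.9):
    """Predict each review from a running average of the earlier outcomes.

    outcomes: 1 (recalled) or 0 (forgotten), in revlog order across all cards.
    The prediction for a row is made before that row's outcome is seen.
    """
    estimate = initial
    predictions = []
    for y in outcomes:
        predictions.append(estimate)        # predict first
        estimate += alpha * (y - estimate)  # then update with the outcome
    return predictions

# A short streak of successes followed by a streak of failures.
preds = moving_average_predictions([1, 1, 0, 0, 0, 0, 1], alpha=0.5)
print([round(p, 3) for p in preds])
```

No memory model is involved. The prediction only tracks how the user has been answering lately, which is why long streaks of passes or fails are easy for it to follow.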

lapis hearth Mar 4, 2025, 9:28 PM

#

unique salmon @quasi shadow I believe you said you will look into setting the last two...

Yes this problem is extremely annoying. I thought Optimize should always keep params with better RMSE and not change it to worse

#

After optimizing (notice the 0 at w18 and 19)

cosmic hedge Mar 4, 2025, 9:49 PM

#

lapis hearth Yes this problem is extremely annoying. I thought Optimize should always keep pa...

but (forgive me for repeating myself) isn't the log loss the one that matters?
that at least explains to me why the last 2 values would optimize to 0.
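
For reference, log loss on a set of predictions can be computed like this. It is a generic sketch of the metric and not the optimizer's code; the `eps` clamp is an assumption added to avoid `log(0)`.

```python
import math

def log_loss(predictions, outcomes, eps=1e-7):
    """Mean binary cross-entropy between predicted R and actual outcomes."""
    total = 0.0
    for p, y in zip(predictions, outcomes):
        p = min(max(p, eps), 1 - eps)  # keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(outcomes)

# Made-up example: confident correct predictions score better (lower).
print(log_loss([0.9, 0.9], [1, 1]))
print(log_loss([0.6, 0.6], [1, 1]))
```

Since the two metrics measure different things, a parameter set with lower log loss can still show a higher RMSE.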

bold terrace Mar 4, 2025, 10:12 PM

#

cosmic hedge but (forgive me for repeating myself) isn't the log loss the one that matters? t...

Normally there is a check that if the new RMSE is higher it won't use the new parameters, if I recall correctly

#

But maybe it has changed

#

There was some discussion about log loss being a better measurement

unique salmon Mar 4, 2025, 10:16 PM

#

bold terrace But maybe it has changed

It's a new thing. If the last two parameters result in a situation where, after all same-day reviews following a lapse, the next interval is longer than before, the last two params are set to zero.
For example: the card had an interval of 10 days -> you forgot the card and pressed Again -> you did a same-day review and pressed Good -> you did a same-day review and pressed Good -> the next interval is 15 days

#

However, setting to 0 is overkill. There should always exist small non-zero values that don't cause this issue. Hopefully, Jarrett will work on it

bold terrace Mar 4, 2025, 10:39 PM

#

@cursive badge @cosmic hedge , I've tried something for the Card Stability over Time with Young/Mature contribution ...

Basically, it's not that the Stability of Young is 5.79 and Mature 30.48 in this example, but it represents the ratio of the young average and the ratio of the mature average to represent the average.

For example : Let say your Young AVG is 10, and Mature AVG is 10. Your Total AVG would still be 10, but since they have both a ratio of 1/1 of contribution, you would have YOUNG Contribution (Ratio) = 5, MATURE Contribution (Ratio) = 5, Stability Total = 10

#

It's a bit confusing at first but it can be very insightful to see if your stability is driven by young or mature card

#

But at the same time, I'm not sure how much info it brings, since in general, younger cards will have low stability anyway, so unless you have 2x more young than mature cards, young avg stability should not really matter much

#

Wait, nevermind ... since it's an AVG, the amount of young card won't change anything .......

#

So I think the Young/Mature split is just useless blobSweat

cosmic hedge Mar 4, 2025, 11:09 PM

#

bold terrace So I think the Young/Mature split is just useless blobSweat

I'm just impressed you managed to pull it off with my crap code-base 😅

polar maple Mar 4, 2025, 11:57 PM

#

RWKV curves tend to drop off quickly at the start. I believe this is from a failure to encode the card in memory properly. RWKV was trained to also predict same-day reviews so these could also be from needing to anticipate failed re-learning steps while on the other hand FSRS just assumes that some relearning steps have already happened

#

Regarding the asymptote behavior, we know from the aggregate calibration graphs that FSRS does underpredict for low R, so this suggests that more likely than not RWKV is correct here. RWKV has near-perfect calibration

quasi shadow Mar 5, 2025, 3:48 AM

#

polar maple

How many params does the RWKV forgetting curve use?

polar maple Mar 5, 2025, 3:50 AM

#

quasi shadow How many params does the RWKV forgetting curve use?

it uses a weighted sum of 4 power curves

def forgetting_curve(self, w, s, d, label_elapsed_seconds):
    # Elapsed time is clamped to at least 1 second.
    t = torch.max(torch.tensor(1.0), label_elapsed_seconds)
    return 1e-5 + (1 - 2 * 1e-5) * torch.sum(w * (1 + t / (1e-7 + s)) ** -d, dim=-1)

some of those numbers are for numerical stability

#

w: [0.01882943883538246, 0.10095643252134323, 0.4705328941345215, 0.4096812903881073], s: [2.923821449279785, 77058.578125, 32219614.0, 1296332928.0], d: [0.4262007474899292, 0.010785482823848724, 2.2498745918273926, 0.988442599773407]
w: [0.03417219966650009, 0.13965930044651031, 0.4500780999660492, 0.37609046697616577], s: [1.9724270105361938, 63864.07421875, 28334582.0, 950819136.0], d: [0.3487342596054077, 0.011508260853588581, 1.5801490545272827, 1.6873809099197388]
w: [0.004207565449178219, 0.24725909531116486, 0.34103667736053467, 0.40749669075012207], s: [5.856897830963135, 811459.875, 29277376.0, 7455286272.0], d: [0.36546364426612854, 0.020167982205748558, 3.5514843463897705, 0.28269582986831665]
w: [0.01884218119084835, 0.17774465680122375, 0.41338032484054565, 0.3900328278541565], s: [11.130600929260254, 97529.4140625, 25254084.0, 1843096832.0], d: [0.36693474650382996, 0.01851370558142662, 1.8968570232391357, 1.2613049745559692]
w: [0.015502666123211384, 0.2253720462322235, 0.25992316007614136, 0.4992022216320038], s: [7.5156378746032715, 88570.5625, 16159777.0, 4799276032.0], d: [0.3775652050971985, 0.017523538321256638, 2.102546215057373, 0.41862088441848755]
here is a sample of these values, each row corresponds to the first review in a user's revlog that has ~30 days stability. the stabilities are in seconds

#

so to me it seems that w[0] is the immediate dropoff in the curve and w[3] is the asymptote

#

w[1] and w[2] control the main shape of the curve
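
To get a feel for these numbers, the same formula can be evaluated in plain Python on the first sample row above. This is only a sketch for illustration: `mixture_forgetting_curve` is a hypothetical scalar rewrite of the torch method, not code from the model.

```python
def mixture_forgetting_curve(w, s, d, elapsed_seconds):
    """Weighted sum of power curves, mirroring the torch version above."""
    t = max(1.0, elapsed_seconds)  # elapsed time is clamped to >= 1 second
    total = sum(
        wi * (1 + t / (1e-7 + si)) ** -di
        for wi, si, di in zip(w, s, d)
    )
    return 1e-5 + (1 - 2e-5) * total

# First sample row from the message above (stabilities in seconds).
w = [0.01882943883538246, 0.10095643252134323, 0.4705328941345215, 0.4096812903881073]
s = [2.923821449279785, 77058.578125, 32219614.0, 1296332928.0]
d = [0.4262007474899292, 0.010785482823848724, 2.2498745918273926, 0.988442599773407]

DAY = 86400
r_1d = mixture_forgetting_curve(w, s, d, 1 * DAY)
r_30d = mixture_forgetting_curve(w, s, d, 30 * DAY)
print(f"R after 1 day:   {r_1d:.3f}")
print(f"R after 30 days: {r_30d:.3f}")
```

The value at 30 days comes out near 0.9, which matches the row being picked for having about 30 days of stability.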

quasi shadow Mar 5, 2025, 4:15 AM

#

Btw, I added moving average into my metric comparison.

#

https://github.com/open-spaced-repetition/spaced-repetition-algorithm-metric/blob/main/metrics_research.ipynb

#

It doesn't perform well in random sampling data.

#

So, MOVING-AVG really learnt something from the data from real users...?

#

Otherwise, it cannot calibrate so well as that.

polar maple Mar 5, 2025, 4:53 AM

#

quasi shadow It doesn't perform well in random sampling data.

lol i also tried this to see if MOVING-AVG was somehow doing well by definition

polar maple Mar 5, 2025, 4:54 AM

#

quasi shadow So, MOVING-AVG really learnt something from the data from real users...?

i think that somehow user's momentum or mood matters a lot

#

that, or the underlying scheduler (i assume SM-2) is able to consistently schedule similar cards together

#

and MOVING-AVG just becomes a calibration step

#

but i doubt sm-2 is any good so idk

#

https://pastebin.com/AAgBMbHK
i made this before, i think it is for user 58 for which RWKV/LSTM/FSRS does horribly at 0.65+ log loss but RWKV-P/MOVING-AVG does well at 0.33/0.42

ahead refers to RWKV, it has to predict the outcome of the review ahead of time, right after the previous review of this card. imm refers to RWKV-P which predicts the outcome of the review immediately before it happens

you can see how this user has long strings of 0.0s or 1.0s. At the end of the file you can see the imm column creep up and up. MOVING-AVG would also be able to exploit this behavior. So, the performance of RWKV-P and MOVING-AVG is fake in this sense, i don't really think this kind of knowledge is useful for a scheduler

#

actually it just depends on if the user is truthful or not. If the user is truthful and there are long strings of 0s or 1s then the scheduler should be able to adapt on the fly

#

otherwise if the user is not truthful then yeah it is fake performance

#

some users just want to pass all the remaining cards to get their day over with

lapis hearth Mar 5, 2025, 7:26 AM

#

unique salmon It's a new thing. If the last two parameters result in a situation where after a...

And what is soooooo problematic about this behaviour that it is suddenly decided that when that happens, the last 2 should be set to 0

quasi shadow Mar 5, 2025, 8:14 AM

#

unique salmon @quasi shadow I believe you said you will look into setting the last two...

I cannot find the issue about it.

#

Could you open one for that?

#

I have drafted up a PR: https://github.com/open-spaced-repetition/fsrs-rs/pull/297

GitHub

Feat/consider num_relearning_steps when clamp params by L-M-Sherloc...

unique salmon Mar 5, 2025, 8:33 AM

#

quasi shadow I cannot find the issue about it.

I don't think there is an issue specifically about it, just a comment somewhere

quasi shadow Mar 5, 2025, 8:34 AM

#

unique salmon I don't think there is an issue specifically about it, just a comment somewhere

I hope you elaborate the current problem in a new issue or comment it below the PR.

unique salmon Mar 5, 2025, 8:46 AM

#

quasi shadow I hope you elaborate the current problem in a new issue or comment it below the ...

Done

bold terrace Mar 5, 2025, 12:25 PM

#

cosmic hedge I'm just impressed you managed to pull it off with my crap code-base 😅

Your code is not crappy at all 🙂 And for now it's me that is adding a lot of plain-flat-logic when it should be refactored a bit haha. But I like to keep it that way until I'm happy with the result. I was thinking, maybe the ratio should not be a ratio of the average but a simple ratio of young/mature dividing the avg stability, so you can see the "volume" of reviews potentially impacting how the stability evolves, I'll try that a bit later

#

I was even thinking earlier, how clean and even predictable the avg stability increase is, I'm wondering if that increase is not driven by sheer repetition volume, which means more repetition, even though it might not be "optimal" (in a review/time optimal way), might be how you build stability more quickly

#

Which would then be another justification of why higher retention than the theoretical optimal one (in terms of knowledge/review) can be a good thing

#

Because the "Memorised" is just a view of "how much words you can have right at a certain point of time". But it does not take in consideration "for how long you will be able to keep them memorised", where stability is exactly that

#

So an optimal scheduling, might more often than not, be not only related to how much you memorized, but how high you were able to build stability.

#

Now the question is, how much to sacrifice one for the other ? The R*S, R*log(S), ....

#

But avg stability and DR are not even sufficient to really determine this. Indeed, the number of new/day, also impact the rate of Increase and even decrease of avg stability over time

#

For example, in my case, I was able to recover my "old avg stability" ~29d, when I stopped adding new words after around 30d of stopping adding new cards

#

Which can be explained partly by the volume of very-low stability cards that had to build over, but still, it's still a long time to recover a stability that was not that crazy in the first place

lapis hearth Mar 5, 2025, 2:06 PM

#

unique salmon I don't think there is an issue specifically about it, just a comment somewhere

Isn't there a way to just not make it happen in the first place? Why should there ever be a need for small values instead of 0s? I am reviewing my cards just fine with default values there instead of 0s

unique salmon Mar 5, 2025, 2:09 PM

#

lapis hearth Isnt there a way to not just make it happen in the first place. Why should there...

I've explained this 3 times already

Basically, it detects whether using same-day reviews to adjust memory stability could result in a situation where your next interval after a lapse is longer than before. For example: 10 days -> you press "Again" -> you have an insane number of re-learning steps -> you do them, S increases -> next interval is 15 days
If the optimizer detects that your re-learning steps and parameters would result in that kind of problem, the optimizer will run for the second time, but the last two parameters will be "frozen", meaning that same-day reviews will have no impact on S

So if something like 10 days -> Again -> (your re-learning steps) -> 15 days can happen, the last two params will be set to 0

For example: the card had an interval of 10 days -> you forgot the card and pressed Again -> you did a same-day review and pressed Good -> you did a same-day review and pressed Good -> the next interval is 15 days
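A minimal sketch of the kind of check described above, assuming the same-day stability update has the FSRS-5 form `S' = S * exp(w17 * (G - 3 + w18))`, with `w17` and `w18` standing for the last two parameters (counting from 0). The function names and the numbers are hypothetical, and this is not the optimizer's actual code, only an illustration of why freezing those two parameters removes the effect.

```python
import math

def same_day_stability(s, grade, w17, w18):
    """Same-day stability update, assuming the FSRS-5 form
    S' = S * exp(w17 * (grade - 3 + w18))."""
    return s * math.exp(w17 * (grade - 3 + w18))

def relearning_overshoots(s_before_lapse, s_after_lapse, relearning_grades, w17, w18):
    """True if the same-day re-learning reviews push stability above its
    pre-lapse value (the "10 days -> Again -> 15 days" situation)."""
    s = s_after_lapse
    for grade in relearning_grades:
        s = same_day_stability(s, grade, w17, w18)
    return s > s_before_lapse

# Hypothetical numbers: S = 10 before the lapse, S = 8 right after "Again",
# then two same-day reviews answered Good (grade 3).
print(relearning_overshoots(10.0, 8.0, [3, 3], w17=0.5, w18=0.6))  # True
# With the last two parameters frozen at 0, same-day reviews leave S unchanged.
print(relearning_overshoots(10.0, 8.0, [3, 3], w17=0.0, w18=0.0))  # False
```

With both parameters at 0 the exponent is always 0, so every same-day review multiplies S by 1 and the post-lapse interval cannot exceed the pre-lapse one through re-learning steps alone.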

hasty fractal Mar 5, 2025, 2:11 PM

#

lapis hearth Isnt there a way to not just make it happen in the first place. Why should there...

bro 🤣

lapis hearth Mar 5, 2025, 2:12 PM

#

unique salmon I've explained this 3 times already > Basically, it detects whether using same-...

Yes that part I get that.

hasty fractal Mar 5, 2025, 2:12 PM

#

I call it the pass-fail-pass-fail trap

lapis hearth Mar 5, 2025, 2:12 PM

#

I am saying that wasn't a problem beforehand

#

What made it into a problem. It was working just fine

#

It doesn't help like at all

unique salmon Mar 5, 2025, 2:26 PM

#

lapis hearth I am saying that wasn't a problem beforehand

It was a problem for as long as FSRS-5 exists

lapis hearth Mar 5, 2025, 2:34 PM

#

unique salmon It was a problem for as long as FSRS-5 exists

But in what way is it a problem exactly. FSRS 5 was working just fine and still is

#

Now with 0 being practically set every time I optimize at w18, w19, FSRS 5 is basically switched off

cosmic hedge Mar 5, 2025, 2:53 PM

#

bold terrace But avg stability and DR are not even sufficient to really determine this. Indee...

somehow my new cards actually increased my mean stability 😂 (probably because my initial stabilities are 31.3 for good and 100 for easy)

bold terrace Mar 5, 2025, 2:56 PM

#

cosmic hedge somehow my new cards actually increased my mean stability 😂 (probably because m...

The kind of situation where the median can help a bit 🙂

#

But it's funny that your AVG stability is around 60 but Good are 30 and Easy 100

#

Would mean you would fail a lot of those after a few reviews

#

and strange that then FSRS optimizer doesn't learn from it and make the initial stability lower

#

Your DR is at how much ?

cosmic hedge Mar 5, 2025, 2:58 PM

#

yeah it is the initial stabilities I think

#

83% (my dr brother)

bold terrace Mar 5, 2025, 2:59 PM

#

Strange strange that it's so high

#

Because if you really do succeed them 80% of the time

#

those stabilities would get even higher

#

so the average being at 60d feels super low

#

https://open-spaced-repetition.github.io/anki_fsrs_visualizer/?w=0.0549,0.2229,1.8246,16.9145,7.3544,0.6494,2.1151,0.0018,1.1016,0.1495,0.6955,1.8386,0.2018,0.2194,2.5660,0.2329,3.9725,0.6409,0.9903&m=0.90

#

Can you test those sequences with your parameters (and desired retention)?

#

1331333
1313333
3333133
4313333

#

cosmic hedge Mar 5, 2025, 3:01 PM

#

i think it's to do with the fact that I re-did all my cards a while ago (hence all the re-introduced (I massively regret not just making new notes))

#

so some of the info i already know

bold terrace Mar 5, 2025, 3:02 PM

#

yes that could explain

cosmic hedge Mar 5, 2025, 3:02 PM

#

bold terrace https://open-spaced-repetition.github.io/anki_fsrs_visualizer/?w=0.0549,0.2229,1...

here

bold terrace Mar 5, 2025, 3:03 PM

#

Yeaaaah got it

#

Basically the model learnt that if you know it already, well, you might as well not review it anymore

#

Personally I also sometimes create cards for what I already know, but when I review them I make sure to press Good for the ones "I just knew based on inference" and Easy for "the ones I know very well from before"

cosmic hedge Mar 5, 2025, 3:04 PM

#

yeah i've only ever failed 26 cards i initially did good on

bold terrace Mar 5, 2025, 3:04 PM

#

And I also have a very large stability for first-easy

cosmic hedge Mar 5, 2025, 3:05 PM

#

which makes sense because if going in i knew it who cares

bold terrace Mar 5, 2025, 3:06 PM

#

I know with time my parameters evolved to make it less optimistic

#

I think once you'll fail some of those it will adapt

#

but if you don't after such a long time, it's OK I guess

cosmic hedge Mar 5, 2025, 3:08 PM

#

I'm going to assume when I start encountering the cards I don't know from the start I'm just going to hit "again"

#

so Idk if it will affect it too much

bold terrace Mar 5, 2025, 3:09 PM

#

It's just strange I have a better stability for Good than you

#

But I don't have such big initial stabilities

#

If you want to test, I did the split Young/Mature, but be aware that it's looking at stability >= 21, not interval so the ratio will be different than the one presented by anki

📎 searchStatsExtended.ankiaddon

#

I think it would be better if Young/Mature was computed on stability and not interval

#

21d is "nothing" with a DR to 70% compared to 90%
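To put a number on that remark, here is a rough sketch assuming the FSRS-4.5/FSRS-5 forgetting curve `R(t, S) = (1 + FACTOR * t / S) ** DECAY` with `DECAY = -0.5` and `FACTOR = 19/81`. The helper names are made up for illustration. It converts between an interval and the stability it implies at a given desired retention.

```python
DECAY = -0.5
FACTOR = 19 / 81  # chosen so that R = 0.9 when the elapsed time equals S

def interval_for(stability, desired_retention):
    """Interval (days) at which predicted R drops to desired_retention."""
    return stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)

def stability_for(interval, desired_retention):
    """Stability implied by an interval scheduled at desired_retention."""
    return interval * FACTOR / (desired_retention ** (1 / DECAY) - 1)

for dr in (0.9, 0.7):
    print(f"DR={dr:.0%}: a 21d interval implies S of about {stability_for(21, dr):.1f}d")
```

Under these assumptions a 21-day interval corresponds to a stability of about 21 days at DR=90% but only about 4.7 days at DR=70%, which is why an interval-based Young/Mature cutoff means different things at different DRs.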

#

(also I did not implement it for median, so only avg with this build)

#

(Hop quick implementation for the contribution ratio with median)

📎 searchStatsExtended.ankiaddon

#

I updated the PR with this build https://github.com/Luc-Mcgrady/Anki-Search-Stats-Extended/pull/32

lapis hearth Mar 5, 2025, 3:47 PM

#

What is this

#

Is this the new helper addon

bold terrace Mar 5, 2025, 3:49 PM

#

No it's the search stats extended from the almighty @cosmic hedge

hasty fractal Mar 5, 2025, 4:22 PM

#

lapis hearth But in what way is it a problem exactly. FSRS 5 was working just fine and still ...

it was a problem for my deck. here's a look into the issue:

I have a new card. I learn it. ivl = 3d.
I fail it after 3d. I relearn the card. ivl = 5d.
I fail it after 5d. I relearn the card. ivl = 10d.
I fail it after 10d. I relearn it. ivl = 14d.
I fail it after 14d. I relearn it. I see next ivl is 21d. I open discord and start spamming Expertium's DMs with a long rant. Then there is an issue opened in the repo. The Jarrett solves it. Yay! Happy ending.

#

I unconsciously wrote "The Jarrett" 🤣

robust hill Mar 5, 2025, 4:31 PM

#

bold terrace I think it would be better if Young/Mature was computed on stability and not int...

honestly agreed

unique salmon Mar 5, 2025, 4:42 PM

#

hasty fractal it was a problem for my deck. here's a look into the issue: * I have a new card...

That's not the same though

#

Your example has nothing to do with same-day reviews, whereas the "set last two parameters to 0" thingy is specifically to deal with same-day reviews

#

Unless by "relearn" you mean "it goes through a bunch of re-learning steps", in which case yeah

hasty fractal Mar 5, 2025, 4:52 PM

#

unique salmon Unless by "relearn" you mean "it goes through a bunch of re-learning steps", in ...

what other "relearn" do you know of?

#

please enlighten me

#

(bruh there's only one relearn)

unique salmon Mar 5, 2025, 5:00 PM

#

hasty fractal what other "relearn" do you know of?

Idk, I was thinking of something like what Sound plots

robust hill Mar 5, 2025, 6:15 PM

#

am i winning chat 93% dr

cursive badge Mar 5, 2025, 6:18 PM

#

robust hill

Woooo! Ross' graph mentioned ;p

#

I assume those really low R are suspended cards? You might want to set a custom search at the top to deck:current -is:suspended for nicer bins.

bold terrace Mar 5, 2025, 6:29 PM

#

hasty fractal it was a problem for my deck. here's a look into the issue: * I have a new card...

You mean you fail and right away the interval is longer than the previous one ? What’s your params ?

hasty fractal Mar 5, 2025, 6:29 PM

#

After going through the steps...

bold terrace Mar 5, 2025, 6:30 PM

#

Because if that relearn was multiple steps, at least in your case you always go to higher stability, which is already nice

hasty fractal Mar 5, 2025, 6:30 PM

#

The issue is solved though in the current ver.

bold terrace Mar 5, 2025, 6:30 PM

#

Ah ok !

robust hill Mar 5, 2025, 6:32 PM

#

cursive badge I assume those really low R are suspended cards? You might want to set a custom ...

lets find out

bold terrace Mar 5, 2025, 6:32 PM

#

By fixed you mean what happens now ? You fail and the interval is not reduced ?

robust hill Mar 5, 2025, 6:32 PM

#

cosmic hedge Mar 5, 2025, 6:33 PM

#

robust hill

Ross made it search if you click the squares so that's a quick way to see if you want 😄

hasty fractal Mar 5, 2025, 6:34 PM

#

bold terrace By fixed you mean what happens now ? You fail and the interval is not reduced ?

The next interval is reduced.

hasty fractal Mar 5, 2025, 6:35 PM

#

bold terrace Because if that release was multiple steps, at least in your case you always go ...

we should probably think of it as a miscalculation on FSRS's part. plus, the changes made actually improved the metrics for my deck (although it got slightly worse for the 20k dataset).

robust hill Mar 5, 2025, 6:36 PM

#

cosmic hedge Ross made it search if you click the squares so that's a quick way to see if you...

i see

#

for my language learning

#

so what does this mean

#

am i winning

bold terrace Mar 5, 2025, 6:39 PM

#

hasty fractal we should probably think of it as a miscalculation on FSRS's part. plus, the cha...

I see! Because indeed, depending on params, a failure can sometimes be drastic. In my case, for example, it can easily be a factor of 18-30

#

But it's not a bug in a sense that indeed, many cards in my deck behave like this

#

In my case, forgetting is a "very bad" incident 😄

hasty fractal Mar 5, 2025, 6:41 PM

#

brother I have no idea what are you talking about but more importantly, I think u have no idea what I'm talking about either

bold terrace Mar 5, 2025, 6:42 PM

#

hasty fractal it was a problem for my deck. here's a look into the issue: * I have a new card...

Maybe haha. I think you were complaining that you had to redo a lot of relearning steps after a fail

#

relearning not in the Anki sense of the term, but in the sense of "a lot of reviews"

hasty fractal Mar 5, 2025, 6:44 PM

#

bold terrace Maybe haha. I think you were complaining that you had to redo a lot of relearnin...

Not really. I had two steps for relearning. Making a relearning card go through them with two good ratings meant the stability got higher than it already was. Which meant now I had to recall a card 13 days later that I already just failed with a 7 day interval.

hasty fractal Mar 5, 2025, 6:45 PM

#

bold terrace relearning not in a anki term, but in a "lot of reviews"

ye

#

if the interval keeps increasing after every relearning session, no way I'll ever pass such a card.

quasi shadow Mar 6, 2025, 4:16 AM

#

I made a presentation to a group of researchers at Cognitive Computational Neuroscience two days ago. Now I know a fun fact: they didn't do any research about "long-term memory" in the sense that Anki users would understand. In their term, the scope of "long-term memory" is several minutes to hours.🤣

#

😅 Their long-term memory research is my short-term memory research.

spring adder Mar 6, 2025, 5:04 AM

#

quasi shadow I made a presentation to a group of researchers at Cognitive Computational Neuro...

I thought my stability was low, but it turns out my memory is long as fuck boi. 😎

hasty fractal Mar 6, 2025, 5:26 AM

#

quasi shadow I made a presentation to a group of researchers at Cognitive Computational Neuro...

what was the presentation about?

quasi shadow Mar 6, 2025, 5:41 AM

#

hasty fractal what was the presentation about?

My papers.

hasty fractal Mar 6, 2025, 5:43 AM

#

quasi shadow My papers.

how did they respond? were they interested, or they were like "nah our research is good. LTM is a few hours"

quasi shadow Mar 6, 2025, 5:46 AM

#

Their professor is interested. The graduate students feel alien with my papers’ topic.

hasty fractal Mar 6, 2025, 5:47 AM

#

urgh, they gotta use anki

#

it's perhaps more interesting then

quasi shadow Mar 6, 2025, 7:09 AM

#

Anki is not very popular in China.

hasty fractal Mar 6, 2025, 9:18 AM

#

yea, not here either

#

people are kinda stuck with coaching/school and stupid traditional methods

#

guess it's the same with China too

bold terrace Mar 6, 2025, 10:40 AM

#

I mean, Anki is not that ground breaking in the first place. Anki without the full suite of addon/integration is quite ... bland. I have a few colleagues using Anki, when I discussed with them about it, they were surprised how rich my cards were, because they just did the "Anki the normal way" and they get extremely bland cards, at a very high human cost

#

As I was saying in the yomitan discord, it really feels sometimes like Open Source software is tools made by devs for devs

#

Which you can embrace, and it gives us all the shiny things (like FSRS is doing with the simulator, parameter optimizers, etc.)

#

Or you can try to streamline into "One way of using it" (duolingo-like, you boot up anki, you import a deck, you review, no integration whatsoever, no knowledge about what the scheduler is doing, etc)

#

For example, my wife is a math teacher and we already discussed Anki but it's quite clear it's not something that would really appeal to her students or even her

bold terrace Mar 6, 2025, 3:14 PM

#

@unique salmon / @quasi shadow : Shouldn't difficulty be a bit more reactive to Good/Easy reviews? It almost seems like the "Ease Hell" is even more pronounced with FSRS. But maybe it's perfectly normal if the prediction is better like this?

#

Feels like you need 3 Easy to compensate one Again, and ... infinite number of Good 😄

#

Wild idea but please don't kill me too quickly: what if Difficulty were outside the realm of optimization, and more like an adjustment variable on a card-level basis?

#

For example, if a card doesn't match the DR, the Difficulty would adjust to that, to bend the model for it ?

unique salmon Mar 6, 2025, 3:27 PM

#

That's the weird part about difficulty - it works better like this for some incomprehensible reason. And making reversion to lower D more aggressive makes metrics worse

unique salmon Mar 6, 2025, 3:27 PM

#

bold terrace Wild idea but please don't kill me too quickly : What if, Difficulty would be ou...

How exactly?

bold terrace Mar 6, 2025, 3:34 PM

#

unique salmon How exactly?

Let's try to build an example :

Your DR is 90%. You do 10 reviews, and you fail 3 of them. It means, your actual percentage of retention for that card is 70%. Which means, you have a delta of 20 over a margin of 30, so a 66% difficulty. (Harder than expected)
Your DR is 90%. You do 10 reviews, you fail 1. The delta is 0, the difficulty penalty is thus 0.
Your DR is 90%, you do 10, you fail 0. The delta is -10 over a margin of 10, so you have a -100% difficulty rating. (That word is easier than expected)

Then the question is, how to bend the stability based on that difficulty rating. I don't know 😄
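The three cases above can be written out as a small sketch. The message does not say where the "margin of 30" below DR comes from, so it is kept as an explicit assumed parameter (`lower_margin`), and the margin above DR is taken as `1 - DR` as in the third case. The function name is hypothetical and none of this is part of FSRS.

```python
def difficulty_rating(desired_retention, reviews, fails, lower_margin=0.30):
    """Signed 'harder/easier than expected' score for one card.

    Positive: the card is failed more often than DR predicts.
    Negative: the card is failed less often than DR predicts.
    lower_margin is the assumed width of the 'harder' side (30 points in
    the example); the 'easier' side is bounded by 1 - desired_retention.
    """
    actual_retention = (reviews - fails) / reviews
    delta = desired_retention - actual_retention
    if delta >= 0:
        return delta / lower_margin
    return delta / (1 - desired_retention)

print(round(difficulty_rating(0.90, 10, 3), 2))  # 3 fails: about 0.67, harder than expected
print(round(difficulty_rating(0.90, 10, 1), 2))  # 1 fail: 0.0, exactly as expected
print(round(difficulty_rating(0.90, 10, 0), 2))  # 0 fails: -1.0, easier than expected
```

How such a score would then bend stability is left open, exactly as in the message.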

#

Thus, FSRS optimizes your "Average" Forgetting Curve, and Difficulty plays the role of the "Case-by-Case adjustment variable"

#

(As intended)

#

The benefit is, instead of moving D at every review, it would be adjusted based on the full history of that card. If over 50 reviews you have a 60% success rate on that card instead of your expected DR, something is smelly, right ?

#

Of course, refinements like a moving average have to be considered

unique salmon Mar 6, 2025, 3:40 PM

#

@quasi shadow I don't think this is compatible with how FSRS works, but maybe you have something to say anyway

slim hollow Mar 6, 2025, 4:10 PM

#

max D means you are answering thoughtfully, min D means you are cheating and you should be pressing easy; pretty much D in current form is not a parameter that humans would interpret as difficulty

unique salmon Mar 6, 2025, 4:32 PM

#

Maybe I should go back to experimenting with adding R to the D formula, though last time I did that the metrics didn't budge even a bit

bold terrace Mar 6, 2025, 4:52 PM

#

I mean, this is my data : https://docs.google.com/spreadsheets/d/1Eysl4bocAg9KD3YpVCjR28ACkvpMa3D8eMP2x6fWieE/edit?usp=sharing
For each card, I just counted the number of successes and the number of errors. I didn't do anything to filter out the Good after a fail on the same day, so of course it won't exactly match "True" Retention.
But still, we see the distribution of success rate looks like a normal distribution.
So you would more or less expect difficulty to follow something like that, with some cards a bit more problematic, and some a bit less

Google Docs

Success Rate per Card

#

Now the standard deviation is 7.8% ... So indeed, maybe even with no Difficulty handling, you get something good enough, and since the RMSE/logloss seems already quite good (logloss 0.49 and RMSE 3.360), I guess "it won't change much"

#

But, I think right now nothing is really changing much, so maybe handling those outliers could help

bold terrace Mar 6, 2025, 5:14 PM

#

I read your blog here @unique salmon https://expertium.github.io/Algorithm.html

If I understand correctly from the "Changes in FSRS-5", Difficulty is still not based on R but only on G, right? Maybe that could be useful then, since a "Fail" doesn't necessarily mean the card was more difficult than previously, right?

I mean, if we go back to my example, for DR=90%, a fail every 20 reviews should even be a sign that the card is easier than expected. So the Delta D would have to take R and DR into account, right?

Expertium’s Blog

A technical explanation of FSRS

Spaced repetition stuff

unique salmon Mar 6, 2025, 5:15 PM

#

bold terrace I read your blog here @unique salmon https://expertium.github.io/Algorit...

Theoretically, yes. I'm just saying that it doesn't seem to matter in practice. But it could also be that my implementation is bad

#

We can't take DR into account, just R, btw

#

DR doesn't exist in the training data

bold terrace Mar 6, 2025, 5:18 PM

#

Hmm indeed

#

And I guess even with Training Data coming from Anki user, the DR is not stored anywhere

#

I think it could be useful, because when you think about it, that R is relative to other cards

#

Without DR, we can always look at R of a card, and the mean R of the dataset

#

Would not help if different decks have different DRs though ....

quasi shadow Mar 6, 2025, 5:23 PM

#

As I said before, the difficulty is just a mean value for a distribution.
https://l-m-sherlock.notion.site/Personal-spaced-repetition-systems-cannot-eliminate-heterogeneity-135c250163a180e09d3dd605fc095e5e

Jarrett Ye's Notes on Notion

Personal spaced repetition systems cannot eliminate heterogeneity |...

An individual can only engage in a single review for a specific card at any given moment. It's impossible to conduct multiple reviews under identical conditions without parallel universes. This irreproducibility makes precise measurement of memory states impossible.

quasi shadow Mar 6, 2025, 5:27 PM

#

unique salmon That's the weird part about difficulty - it works better like this for some inco...

I guess leech is more common than the feeling of “ease hell”.

#

Difficult cards are unlikely to become easy in most cases.

#

https://github.com/open-spaced-repetition/srs-benchmark/blob/main/plots/w[7].png

GitHub

srs-benchmark/plots/w[7].png at main · open-spaced-repetition/srs-b...

A benchmark for spaced repetition schedulers/algorithms - open-spaced-repetition/srs-benchmark

quasi shadow Mar 6, 2025, 5:31 PM

#

quasi shadow https://github.com/open-spaced-repetition/srs-benchmark/blob/main/plots/w%5B7%5D...

The distribution of param of mean reversion also supports this statement.

bold terrace Mar 6, 2025, 5:47 PM

#

quasi shadow I guess leech is more common than the feeling of “ease hell”.

I think leeches might express themselves differently (Let say DR=80%) :

Leeches : Your Stability won't go very high, even though your DR might be respected with great accuracy. Basically, you HAVE those 80% perfectly predicted, but with still very low stability.
"Difficult" card : You seem to not be able to reach the DR, doing only 60-70% R instead of the Predicted R, and their stability will thus even be lower.

Then of course, saying that "Leeches are then also difficult" is also a valid way of expressing Difficulty.

Maybe the problem with "Difficulty" is how loosely defined it is (Is it related to stability ? inability to respect the DR ? Inability of having a stability converging, and having it all over the place ? etc etc)

#

Maybe Difficulty is then just a word we should stop using and instead refining it into different evaluation of why a card is not "satisfactory" 😅

unique salmon Mar 6, 2025, 5:50 PM

#

In SuperMemo D is actually defined differently, based on "missed expectations" - difference between R and the real review outcome (with smoothing and shit, since the outcome is binary). I've tried that, but it didn't improve FSRS. I could try some more, but considering how many attempts at improving D have failed, I don't feel like doing it anymore, since 99% of the time my ideas don't work.
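For readers unfamiliar with the idea, a "missed expectations" signal with smoothing could look roughly like the sketch below. This is not SuperMemo's actual formula and not what was tried in FSRS. The exponential moving average, the `alpha` value and the function name are all assumptions made for illustration.

```python
def missed_expectation_ema(history, alpha=0.2):
    """Smoothed difference between predicted R and the binary outcome.

    history: iterable of (predicted_r, recalled) pairs, oldest first.
    Returns a value in [-1, 1]: positive means the card was forgotten
    more often than predicted, negative means less often.
    """
    ema = 0.0
    for predicted_r, recalled in history:
        outcome = 1.0 if recalled else 0.0
        miss = predicted_r - outcome
        ema = alpha * miss + (1 - alpha) * ema
    return ema

# A card that keeps being forgotten despite R = 0.9 drifts positive,
# one that is always recalled at R = 0.9 drifts slightly negative.
print(missed_expectation_ema([(0.9, False)] * 5))
print(missed_expectation_ema([(0.9, True)] * 5))
```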

quasi shadow Mar 7, 2025, 7:09 AM

#

https://psycnet.apa.org/manuscript/2018-42340-002.pdf

#

😂I found more traditional models about the memory.

#

As in P&A, however, PPE does not require a successful retrieval attempt to receive these gains

#

OK, they both are shit.😅

quasi shadow Mar 7, 2025, 8:27 AM

#

https://x.com/muqiuse/status/1897904708615901483

sawa (@muqiuse) on X

For anyone using the FSRS algorithm in Anki, I'd strongly advise against it because of multiple issues:

- Inescapable Ease Hell (default w[7] value is 0.0046, rendering mean reversion useless)
- Optimizing with bad learning habits will actually result in a HIGHER workload

#

Any thoughts?

#

Maybe we need to introduce something like momentum into the formula.

cosmic hedge Mar 7, 2025, 8:47 AM

#

quasi shadow https://x.com/muqiuse/status/1897904708615901483

As good an excuse as any to post my "difficulty time machine"

#

I always just blame my high difficulty cards on my card design, figured fsrs recognises them as "doomed".

slim hollow Mar 7, 2025, 9:31 AM

#

with how currently difficulty is used in formulas:

user uses 2 buttons / content is normal or hard will lead to most card going into 10D
user uses 2 buttons / content is easy will lead to most card going as low as possible so ~5D
These are the most common patterns and many people will fall into 1st tier which looks like difficulty hell, but really isn't as the variable is misnomer

bold terrace Mar 7, 2025, 9:31 AM

#

quasi shadow https://psycnet.apa.org/manuscript/2018-42340-002.pdf

When human tries to even rationalize words in some weird way

hasty fractal Mar 7, 2025, 9:33 AM

#

bold terrace When human tries to even rationalize words in some weird way

is that saying longer words are easier?

#

I guess if it's something like: 〇〇学園筆記試験過去問題集

bold terrace Mar 7, 2025, 9:34 AM

#

quasi shadow Maybe we need to introduce something like momentum into the formula.

What do you mean by momentum ? The previous R observed on a card-by-card basis or something else ?

bold terrace Mar 7, 2025, 9:35 AM

#

hasty fractal is that saying longer words are easier?

No it seems they count char in the romaji

hasty fractal Mar 7, 2025, 9:36 AM

#

link ?

bold terrace Mar 7, 2025, 9:36 AM

#

Scroll up 🙂

slim hollow Mar 7, 2025, 9:38 AM

#

you can rescale the D in FSRS from 0-10 to 0-1 and it brings much more natural distribution but this doesn't improve the prediction

bold terrace Mar 7, 2025, 9:42 AM

#

I might get crucified for saying that but maybe D should be more like a post-processing on top of the FSRS equation more than part of the FSRS equation.
Take a look at GPU, they'll often have different layers to be performant and precise instead of trying to have only one shader doing all the work

#

Difficulty could be like that: a variable not necessarily part of the optimized equation, but something that adjusts the prediction in real time based on actual specific feedback

#

I mean, FSRS is already quite precise, RMSE around 3-5% for most of us. So sometimes it might just be a matter of slightly shifting the prediction on a lower side, or higher side, to get a perfect match

#

If I take my own distribution of Success Rate (R for one Card over the whole revlog), sure, most cards are within an acceptable range (.70-.75), but for all those that are around [.50,.65], they clearly deviate from the distribution and while it might not have a big RMSE impact, being able to detect them and adjust their Stability would help them be more centered

#

Because of course, if you optimize D behaviour for the whole set, it'll be optimized to have an average effect helping the whole logloss/RMSE minimization

#

But, is it really what you want ? Or instead, would you like D to be, a compensating variable on a card-by-card basis, quicker to react to what's actually happening right now (instead of what was planned in the training model)

quasi shadow Mar 7, 2025, 9:51 AM

#

bold terrace What do you mean by momentum ? The previous R observed on a card-by-card basis o...

I mean, something like Straight Reward.

#

https://kuroahna.github.io/anki_srs_kai/guide/easeReward.html#default-configuration

Ease Reward - Anki SRS Kai

hasty fractal Mar 7, 2025, 9:53 AM

#

IMHO we should focus on having dynamically selected DRs

#

the predictions are already pretty good

#

well, I did see that ssp didn't interest anyone

quasi shadow Mar 7, 2025, 9:58 AM

#

https://github.com/open-spaced-repetition/fsrs-optimizer/pull/167

GitHub

Expt/straight reward by L-M-Sherlock · Pull Request #167 · open-spa...

Inspired by https://kuroahna.github.io/anki_srs_kai/guide/easeReward.html#algorithm
I added two extra params:

w[19] is the step ease reward
w[20] is the minimum consecutive successful reviews requ...

bold terrace Mar 7, 2025, 9:59 AM

#

quasi shadow I mean, something like Straight Reward.

Yup I see the idea. On the specific example of SRS Kai, it still means the ease "reward" would only be computed on a grade-level, instead of an "history-level".
I'm not sure, for example, that an "Again" should always lead to a loss of ease. If you press Again exactly as often as predicted by your DR, to me, your card has a neutral difficulty.

quasi shadow Mar 7, 2025, 9:59 AM

#

I imitate Straight Reward in this PR.

bold terrace Mar 7, 2025, 10:02 AM

#

Tell me if I read it wrong in the code, but isn't it only trying to compensate for cards with R>DR ?
Also, since you give a reward for a succession of good reviews, it means someone with DR=95% might get a lot of rewards, when in fact doing a longer chain of successes is not necessarily him outperforming the prediction?

#

I really do think what would lead someone to have a bonus/malus reward should be his actual performance (Retention) based on expected prediction (Desired Retention)

#

Doing 10 "Good" in a row doesn't mean you're really that good if your DR was 99%

#

Doing 1 fail every 4 reviews is outperforming if you had your DR at 50%

#

Now in terms of "momentum", the question would then be : How that actual R should be computed ? The full history ? Only the last portion ? Excluding Same-Day Reviews ? Basically, how to define a Good average to compute actual R (moving average, filtered, global ...)

#

You see, it can change quite quickly, but those phases are still somewhat pronounced in my experience

#

But the problem is, to compute that, you'd need to store the "Predicted R" and then, for each averaging window, compare the average predicted R with the actual observed R
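A minimal sketch of that windowed comparison, assuming each review is stored as a `(predicted_r, recalled)` pair. The function name, the window size and the choice of a simple sliding window are hypothetical. Same-day reviews would have to be filtered out beforehand if they should not count.

```python
def rolling_retention_gap(reviews, window=5):
    """Observed minus predicted retention over a sliding window.

    reviews: list of (predicted_r, recalled) pairs, oldest first.
    Returns one gap per full window; positive means the card is doing
    better than predicted in that stretch, negative means worse.
    """
    gaps = []
    for end in range(window, len(reviews) + 1):
        chunk = reviews[end - window:end]
        predicted = sum(p for p, _ in chunk) / window
        observed = sum(1 for _, recalled in chunk if recalled) / window
        gaps.append(observed - predicted)
    return gaps

# Hypothetical card: five successes followed by five failures, all at R = 0.9.
history = [(0.9, True)] * 5 + [(0.9, False)] * 5
print([round(g, 2) for g in rolling_retention_gap(history)])
```

A shorter window reacts faster to the kind of phases described above but is noisier; a full-history average is the limit of a very long window.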

#

In my example, you see that D more or less works, since the yellow phase was at 99%, the red at 100%, the green at 96-97%

quasi shadow Mar 7, 2025, 10:13 AM

#

bold terrace Tell me if I read it wrong in the code, but isn't it only trying to compensate f...

For difficult cards, the interval is short, so the user could reach a high streak in a month.

#

(if the cards turn out to be easy)

#

It's not related to DR-stuff. I just want to test the idea of Straight Reward.

bold terrace Mar 7, 2025, 10:14 AM

#

Sure, fine !

#

I'm doing a quick experiment with the visualizer :
I do a succession of Good/Fail, I plot the Difficulty, and I alter the Desired Retention

#

Increasing/Decreasing DR doesn't change how D move when I enter a 1 or a 3

#

Here, I failed enough times to go to ~97% D, which seems to be my "neutral point".
I do 9 "Good" and only then, I go back to the last Difficulty of 95%

#

No matter the Desired Retention, it stays the same

#

Which means : D is somewhat "locked" to measure my performance based on a DR~90% around 95-99% D
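The observation is consistent with the difficulty update as I understand it from the public FSRS-5 description: D moves with the grade, is damped near 10, and reverts toward `D0(4)` with weight `w[7]`. Desired retention does not appear anywhere. The sketch below is written under that assumption (including the linear damping term) and uses the parameter values from the visualizer link shared earlier in the thread purely as an illustration.

```python
import math

def init_difficulty(grade, w4, w5):
    """Initial difficulty D0(grade), clamped to [1, 10]."""
    return min(max(w4 - math.exp(w5 * (grade - 1)) + 1, 1.0), 10.0)

def next_difficulty(d, grade, w4, w5, w6, w7):
    """One difficulty update; note there is no desired-retention argument."""
    delta = -w6 * (grade - 3)                 # Again raises D, Easy lowers it, Good is neutral
    damped = d + delta * (10 - d) / 9         # linear damping: smaller steps near D = 10
    reverted = w7 * init_difficulty(4, w4, w5) + (1 - w7) * damped  # mean reversion
    return min(max(reverted, 1.0), 10.0)

# w[4]..w[7] copied from the visualizer URL above (illustrative only).
w4, w5, w6, w7 = 7.3544, 0.6494, 2.1151, 0.0018

d = next_difficulty(9.7, 1, w4, w5, w6, w7)   # one Again
goods = 0
while d > 9.7:                                # count the Goods needed to undo it
    d = next_difficulty(d, 3, w4, w5, w6, w7)
    goods += 1
print(goods)
```

With a mean-reversion weight this small, a Good only pulls D down through the `w7` term, which is why a single Again takes many Goods to undo regardless of the DR setting.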

quasi shadow Mar 7, 2025, 10:18 AM

#

Wow, the PR does improve the distribution of difficulty a little in my collection.

#

But the main metrics don't become better than before.

bold terrace Mar 7, 2025, 10:23 AM

#

Yeaah and I'd also be cautious and look at what those cards with D<90% are

#

in my case, it's a lot of very very very young card in terms of review number

#

This is mine for my main deck

#

If I check the 65-70D, I get this list of card :

#

Remark how they all have 0 lapses

#

BUT, having lapses should be perfectly fine in a model that predict your 80-90% DR

#

You should lapse on 10-20% of your reviews

#

Over 1350 cards with prop:lapses>3, I have only 2 with prop:d < 0.9

#

I have 180 cards with prop:lapses<3 and prop:reps>20, and 46 of them have difficulty >89%

#

Hmmmmmm

#

I'm doing
deck:current prop:lapses<=4 prop:reps>20
deck:current prop:lapses<=5 prop:reps>25
deck:current prop:lapses<=6 prop:reps>30

To find the "not too bad one".

#

The one with indeed, a number of lapse around ~ (1-DR) compared to my number of reps

#

And they all seems to be with difficulty 90-100%.

#

Which means D might be working as intended, but it's just that, yeah, plotting it with bars of 10% width won't give much info

#

After all, isn't it normal that Stability/Difficulty have the same kind of curve ?

#

Sorry for the monologue but I realize I might have had false expectations 😂

#

Still, I think the current D won't work for DR too low. Your malus with Again answer will completely erase all D bonus you got with "Goods", since D variation is not related to DR

unique salmon Mar 7, 2025, 10:37 AM

#

I've tried the idea with streaks before, so I will be very surprised if it improves metrics.

cosmic hedge Mar 7, 2025, 10:43 AM

#

bold terrace Still, I think the current D won't work for DR too low. Your malus with Again an...

probably a stupid idea but what if instead of again resetting a streak, it multiplied it by (1-R) or something idk
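Spelled out, that suggestion might look like the sketch below. It is only one reading of the message: the function name and the choice to add 1 on a success are assumptions, and nothing like this is claimed to exist in FSRS or in the Straight Reward PR.

```python
def update_streak(streak, recalled, predicted_r):
    """Streak that decays on failure instead of resetting to zero.

    A failure at high predicted R (a surprising lapse) wipes out most of
    the streak; a failure at low predicted R (an expected lapse) keeps
    more of it.
    """
    if recalled:
        return streak + 1.0
    return streak * (1.0 - predicted_r)

print(update_streak(10.0, False, 0.5))            # 5.0, an expected lapse costs half the streak
print(round(update_streak(10.0, False, 0.9), 2))  # 1.0, a surprising lapse costs most of it
```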

bold terrace Mar 7, 2025, 10:43 AM

#

I know everyone dislikes those perfect sequences of 3 then 1 fail, but I think it's quite insightful here: a perfect sequence at 80% DR and at 90% DR

#

To me ... It seems... actually pretty good

#

The very non intuitive part is the fact that for lower DR, your D will be higher

#

Because since D variation is DR-independent, you get "screwed" a bit more

#

BUT, there are many good points :

Lapse after lapse, you'll have shorter cycle, which might help you not fail the next one as fast as the previous
Yeah, everything is still clamped up to 90-100%, but it's not like reviews just get "ignored".
If the D impact is small, maybe it's just because in my case there's not that much variation in it. Also, it's just a value; maybe a small variation might lead to a bigger stability change

#

Which is the case, since Difficulty being different between cycles is probably what explains why each new cycle goes to a lower Stability than the previous one, until a point where it stagnates

#

Soooo ... Maybe we just want D to be super pretty, super centered around 50% D... but maybe we should trust the optimizer 😂

#

If I change w[4] to 99%, to start right away at 99% Difficulty, we can see the shape is completely different, and it will converge to a lower Difficulty with time and cycles

#

Sooooooooo, yeah, maybe the biggest culprit is the difficulty scale, NOT the difficulty itself

quasi shadow Mar 7, 2025, 10:49 AM

#

So... is this case real?

bold terrace Mar 7, 2025, 10:49 AM

#

quasi shadow So... is this case real?

This is absolutely my case yes

quasi shadow Mar 7, 2025, 10:50 AM

#

I think we should find a method to detect them and draw the calibration graph on them.

#

If the calibration is poor, we may find a systematic weakness of FSRS.

#

Then we can try to fix it via systematic methods.

bold terrace Mar 7, 2025, 10:51 AM

#

Anomaly detection 😄 ?

#

to be fair, I think it might be very simple to detect though. For example, how fast stability grew on those

unique salmon Mar 7, 2025, 10:52 AM

#

I've also tried making f(D) mostly linear, but switch to a power function for extremely hard cards. That didn't help either.
But I guess I'll try adding R to D and report the results

quasi shadow Mar 7, 2025, 10:52 AM

#

bold terrace to be fair, I think it might be very simple to detect though. For example, how f...

Could you provide a dataset?

bold terrace Mar 7, 2025, 10:53 AM

#

like w[2] (good initial stability) / w[0] (again initial stability), or very very low w[0]/w[1]

bold terrace Mar 7, 2025, 10:53 AM

#

quasi shadow Could you provide a dataset?

i'll DM you 🙂

slim hollow Mar 7, 2025, 10:59 AM

#

quasi shadow If the calibration is poor, we may find a systematic weakness of FSRS.

This issue is mostly about user perception: they expect difficulty to lower with time and the distribution of difficulty to be centered at ~50%. It doesn't matter if the prediction and optimization are best as is if the user feels that it looks wrong 😔

bold terrace Mar 7, 2025, 11:20 AM

#

slim hollow This issue is mostly about the user perception where they expect difficulty to l...

Sure but then it’s something that can be improved not by changing the scheduling but by improving how people can interpret and plot that value.

For example, everyone knows that looking at Card Stability with the “all” toggle will also lead to something very abrupt

#

So adding graph defaults and options that reflect the same kind of observation for difficulty would clear up the misconception

#

Right now that graph is a kind of “all” one and in comparison to Card Stability one it is probably even more clear and well distributed

#

I think by filtering out the very young cards and doing some kind of zoom on 80-100% difficulty, people would have a better sense of their difficulty distribution

#

It’s something we can try to implement in the stats plugin before proposing a PR in Anki itself

#

I still do believe that ideally D should have the same “neutral point” between different DR but it’s not a game breaker at all, it’s something that can be explained in legends, docs, blogs

slim hollow Mar 7, 2025, 12:33 PM

#

also some food for thought: instead of blending easy with the current response in mean reversion, add additional parameters for reversion so that easy/good/hard each have their own optimizable parameter for reduction

vital apex Mar 7, 2025, 12:35 PM

#

quasi shadow So... is this case real?

in the case of studying Japanese, this is a pretty large amount of people. they dive into Japanese and use a core vocabulary deck of the 2000 most common words for example, and then it's natural as a complete beginner they won't know any words

bold terrace Mar 7, 2025, 1:00 PM

#

Yes, and more "easy" cards (with D < 80%) only really happen when you start having words you can somewhat infer from others, which is absolutely not the case for your first 1-3K words

#

To me, difficulty in this case represents more how atomic/disconnected the cards are. (Pure) Kanji might have an even more abrupt curve than mine

hasty fractal Mar 7, 2025, 1:48 PM

#

bold terrace Right now that graph is a kind of “all” one and in comparison to Card Stability ...

toggle button: "show full scale" with the default state being turned off.

#

wdys

lapis hearth Mar 7, 2025, 2:39 PM

#

unique salmon I've explained this 3 times already > Basically, it detects whether using same-...

Are we going to still gloss over this?

unique salmon Mar 7, 2025, 2:41 PM

#

lapis hearth Are we going to still gloss over this?

?

lapis hearth Mar 7, 2025, 2:43 PM

#

I mean pretend that it is not a problem that literally every other deck is rendering fsrs 5 useless post optimizing

#

(without manually resetting values for 100s of decks, which is the furthest thing from practical)

unique salmon Mar 7, 2025, 2:44 PM

#

Jarrett is working on a fix

lapis hearth Mar 7, 2025, 2:44 PM

#

Yes because my RMSE has definitely worsened. It has gotten from 4% to just shy of - %

#

6%

hasty fractal Mar 7, 2025, 2:44 PM

#

lapis hearth (without manually resetting values for 100s of decks which is furthest from be...

have u tried a lower number of steps?

#

and lower time

lapis hearth Mar 7, 2025, 2:45 PM

#

It literally just gives me back the default value

#

Before that update (whichever one it was), that was absolutely not the case

unique salmon Mar 7, 2025, 2:46 PM

#

hasty fractal and lower time

Only the total number of steps matters

hasty fractal Mar 7, 2025, 2:47 PM

#

ah...

#

k

#

time isn't taken into account

lapis hearth Mar 7, 2025, 2:47 PM

#

And as I was reviewing, I became aware of a very noticeable drop in my reviewing performance prior to that update, which is odd, since my reviewing habits have been consistent for the past 2.5-3 years in Anki

hasty fractal Mar 7, 2025, 2:47 PM

#

jarrett said he'll try something

#

so

unique salmon Mar 7, 2025, 3:18 PM

#

@quasi shadow am I misunderstanding this code or does it only count "Good" and "Easy" for the streak?

#

This one is also strange - are you counting "Hard" as fail?

#

The initial value after the first review should be
new_streak = torch.where(
    X[:, 1] > 1,
    torch.ones_like(state[:, 2]),
    torch.zeros_like(state[:, 2]),
)
And then for all other reviews it should be
new_streak = torch.where(
    X[:, 1] > 1,
    state[:, 2] + 1,
    torch.zeros_like(state[:, 2]),
)

#

Oh, I see, Straight Reward was made by a madman who counts "Hard" as "not success" but at the same time not as "fail". So in Straight Reward:
Easy = success
Good = success
Hard = not success
Again = fail
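The difference between the two conventions can be illustrated with a minimal plain-Python sketch (not the optimizer's torch code). It assumes ratings are encoded as Again=1, Hard=2, Good=3, Easy=4, consistent with the `X[:, 1] > 1` test above; the function name `streaks` and the `min_success` parameter are hypothetical and only serve the illustration.

```python
from typing import Iterable, List

# Rating encoding assumed from the `X[:, 1] > 1` check in the thread.
AGAIN, HARD, GOOD, EASY = 1, 2, 3, 4


def streaks(ratings: Iterable[int], min_success: int = HARD) -> List[int]:
    """Return the running success streak after each review.

    A rating >= min_success extends the streak by one;
    any lower rating resets the streak to 0.
    """
    out: List[int] = []
    streak = 0
    for rating in ratings:
        streak = streak + 1 if rating >= min_success else 0
        out.append(streak)
    return out


history = [GOOD, HARD, GOOD, AGAIN, EASY]
print(streaks(history))                    # Hard counts as a success
print(streaks(history, min_success=GOOD))  # only Good and Easy count
```

With `min_success=HARD` the first review yields 1 on success and 0 on Again, which matches the initial-value rule quoted above.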

#

So I guess you faithfully imitated that, which is not a good idea

#

i'm sorry what

#

Why on Earth do you need RELU here?

#

Oh, that's just the world's weirdest way to do "add a reward if the streak is >= some parameter, don't add anything otherwise"
Still don't get why you use leaky RELU though

#

Instead of the regular one

#

Using torch.clamp(new_streak - self.w[20], min=0) instead of RELU would be a lot clearer, btw. Just to make the code more readable

unique salmon Mar 7, 2025, 3:40 PM

#

slim hollow also a food for thought, instead of blending easy with current response in mean ...

Some time ago me and Jarrett agreed to stick to "2% relative improvement per parameter" rule. In other words, if you tweak FSRS and add new parameters, logloss and/or RMSE have to decrease by at least 2% (relatively) per each new parameter. This is just to avoid adding a crapton of new parameters and bloating FSRS for extremely marginal improvements. And I really doubt that what you suggested would be anywhere near 2%

Trust me, D is just that much of a bitch 🤣

#

D doesn't care what you or me think makes sense

#

Even something really obvious, like using optimizable parameters instead of Again=1, Hard=2, Good=3, Easy=4 doesn't do jack shit to improve metrics

#

Btw, I'm currently benchmarking some stuff related to using R in D, will share the results later. I'll try a very simple approach first that doesn't even require adding new parameters, and then I'll try redesigning D entirely

#

I bet 20$ neither will help

bold terrace Mar 7, 2025, 4:01 PM

#

@cosmic hedge / @cursive badge , I'm tinkering about "Lapses" and trying to extract something useful from them. I came up with an "Avg Repetitions / Lapse" metric that could help detect whether cards with more lapses no longer respect the DR (because they are so difficult that you lapse on them more often than expected)

I came up with this (see attach)

I have the feeling the second view is more useful, since you see how many repetitions you manage within each "lapse". In my case I can see that the more lapses I have, the lower the average retention (expressed here in repetitions/lapse)

Any hot ideas before I create a PR so we can improve it a bit more with time ?

#

Maybe a "Lapse Ratings" would be best suited, similarly to

hasty fractal Mar 7, 2025, 4:11 PM

#

bold terrace @cosmic hedge / @cursive badge , I'm tinkering about "Lapses" and...

not really something useful but

Screenshot_2025-03-07-21-41-03-29_572064f74bd5f9fa804b05334aa4f912.jpg

#

hope ross doesn't mind the ss

bold terrace Mar 7, 2025, 4:12 PM

#

That's a good idea! But I think it would have to be implemented as something more than just a few computations/ratios based on card metrics

#

I don't know much about anomaly detection either

#

But I see the point

quasi shadow Mar 7, 2025, 4:13 PM

#

unique salmon @quasi shadow am I misunderstanding this code or does it only count "Goo...

#

I followed this guide.

unique salmon Mar 7, 2025, 4:14 PM

#

unique salmon Oh, I see, Straight Reward was made by a madman who counts "Hard" as "not succes...

So as I said here

quasi shadow Mar 7, 2025, 4:15 PM

#

unique salmon Why on Earth do you need RELU here?

Without leaky_relu, if new_streak is less than self.w[20], the gradient is always zero for it.

unique salmon Mar 7, 2025, 4:16 PM

#

Ah, ok

quasi shadow Mar 7, 2025, 4:17 PM

#

It's used to address issues of vanishing gradients.
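The vanishing-gradient point can be checked by hand. Below is a small sketch in plain Python (written without torch so the derivatives are explicit); `threshold` stands in for `self.w[20]`, and the slope of 0.01 is the default `negative_slope` of torch's leaky ReLU. The function names are hypothetical.

```python
from typing import Tuple


def relu_grad(x: float) -> float:
    """Derivative of max(x, 0) with respect to x (taken as 0 at x == 0)."""
    return 1.0 if x > 0 else 0.0


def leaky_relu_grad(x: float, negative_slope: float = 0.01) -> float:
    """Derivative of leaky ReLU: 1 for x > 0, negative_slope otherwise."""
    return 1.0 if x > 0 else negative_slope


def threshold_grads(streak: float, threshold: float) -> Tuple[float, float]:
    """Gradient of activation(streak - threshold) with respect to threshold.

    By the chain rule this is -activation'(streak - threshold).
    Returns (gradient with ReLU, gradient with leaky ReLU).
    """
    x = streak - threshold
    return (-relu_grad(x), -leaky_relu_grad(x))


print(threshold_grads(streak=2, threshold=5))  # streak below the threshold
print(threshold_grads(streak=7, threshold=5))  # streak above the threshold
```

Below the threshold, the plain ReLU passes no gradient to the threshold parameter at all, while the leaky version still passes a small one, so the optimizer can keep moving the threshold.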

bold terrace Mar 7, 2025, 4:37 PM

#

bold terrace @cosmic hedge / @cursive badge , I'm tinkering about "Lapses" and...

I created the PR here. I merged both graphs into one, with a toggle "Divide Avg Repetition by Lapse"

https://github.com/Luc-Mcgrady/Anki-Search-Stats-Extended/pull/33

I'm attaching a local build for those who want to try it (it also includes the Stability over Time graph)

📎 searchStatsExtended.ankiaddon

GitHub

Feature/average repetition by lapse by JSchoreels · Pull Request #3...

Note : This include right now the changes of #32. So we might need to merge that one first, but at least I don't have conflict between both in this one :)
I also added an option in the conf...

#

cosmic hedge Mar 7, 2025, 4:58 PM

#

bold terrace I created the PR here, I merged both graph in one with a toggle "Divide Avg Repe...

I'm going to have to rebase this aren't I 😅

#

I'll hold off on the "make it redder!!!" till you're done XD

#

I also wanted that bar default feature for a while so feel free to default it to "bar" because people tend to like the bars more

polar maple Mar 7, 2025, 5:02 PM

#

quasi shadow So... is this case real?

could be one of those cases where MOVING-AVG does better than FSRS, but it's a real problem if it persists even after FSRS is optimized on these newer reviews

bold terrace Mar 7, 2025, 5:04 PM

#

cosmic hedge I'm going to have to rebase this aren't I 😅

Normally merging should be fine since the PR points to main 🙂

#

The average stability is already good enough I think, I'm working right now on making it more precise (for ex, using stability = 2.3 instead of 2 from the bin value)

#

buuut it won't change much

polar maple Mar 7, 2025, 5:05 PM

#

bold terrace If I take my own distribution of Success Rate (R for one Card over the whole rev...

even if FSRS is perfect wouldn't you still expect a healthy spread of values due to randomness? So it's hard to tell from just this

cosmic hedge Mar 7, 2025, 5:07 PM

#

bold terrace Normally merging should be fine since the PR point to the main 🙂

yeah but i squash merge so i think that turns out poorly (?)

unique salmon Mar 7, 2025, 5:09 PM

#

polar maple even if FSRS is perfect wouldn't you still expect a healthy spread of values due...

To some degree, yes. You can easily calculate the probability that a card will be failed n times in a row at DR=x%, it's just (1-x)^n.
For example, at DR=90%, the probability of failing a card 3 times in a row =(1-0.9)^3=0.1%. So if you have 1000 cards and DR=90%, on average one card will be failed 3 times in a row
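The arithmetic is easy to reproduce. This is a minimal sketch assuming every review is an independent trial whose success probability equals the desired retention exactly, and (for the expected count) that each card gets exactly n reviews; the function names are hypothetical.

```python
def p_fail_streak(desired_retention: float, n: int) -> float:
    """Probability of n failures in a row at a given desired retention.

    Assumes independent reviews, each succeeding with probability
    equal to desired_retention.
    """
    return (1.0 - desired_retention) ** n


def expected_cards_with_streak(num_cards: int, desired_retention: float, n: int) -> float:
    """Expected number of cards that fail all n of their reviews."""
    return num_cards * p_fail_streak(desired_retention, n)


print(p_fail_streak(0.9, 3))                     # about 0.001, i.e. 0.1%
print(expected_cards_with_streak(1000, 0.9, 3))  # about 1 card
```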

#

(unless I'm bad at math)

polar maple Mar 7, 2025, 5:11 PM

#

unique salmon To some degree, yes. You can easily calculate the probability that a card will b...

pretty much just look at the binomial distribution (divided by n) for what the spread is expected to be

unique salmon Mar 7, 2025, 5:11 PM

#

Maybe we can find leeches like that, actually. If a card has been failed n times in a row at some DR level, we can calculate how likely that is. And if it's super unlikely (for example, <0.1%), then it's tagged as a leech

unique salmon Mar 7, 2025, 5:11 PM

#

polar maple pretty much just look at the binomial distribution (divided by n) for what the s...

And yeah, we can do binomial distribution shenanigans to extend this to failures that didn't happen in a row

#

It's just that math is simpler if all failures are one after the other

#

This gets problematic if DR changes, though

#

Since Anki doesn't store it anywhere

#

It doesn't store DR at the time of the review in Card Info

#

And I'm not sure if binomial distribution math even works if p(success) changes

bold terrace Mar 7, 2025, 5:16 PM

#

cosmic hedge yeah but i squash merge so i think that turns out poorly (?)

Yep, never use rebase or anything using rebase on a shared branch 😄

#

It might sound/look prettier, but it's rewriting history: different commit IDs, it's a mess

unique salmon Mar 7, 2025, 5:16 PM

#

bold terrace Mar 7, 2025, 5:16 PM

#

It's even called the "Golden Rule of Git" https://www.atlassian.com/git/tutorials/merging-vs-rebasing#the-golden-rule-of-rebasing

polar maple Mar 7, 2025, 5:18 PM

#

unique salmon And I'm not sure if binomial distribution math even works if p(success) changes

if the user changes DR then you just add up the different distributions per DR

unique salmon Mar 7, 2025, 5:19 PM

#

Btw, any card that has been failed 6 times in a row would be considered a leech at any DR that is used in Anki, if we use 0.1% as a cutoff

#

BINOM.DIST(0;6;0.7;TRUE) = 0.073% (Excel)

#

0 successes, 6 trials, p=70%
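The same cutoff idea can be turned into a tiny helper that reports how long a failure streak has to be before it drops under the cutoff. This is only a sketch under the assumptions already used in the thread (independent reviews, true recall probability equal to DR); `min_leech_streak` is a hypothetical name, not an Anki function.

```python
def min_leech_streak(desired_retention: float, cutoff: float = 0.001) -> int:
    """Smallest n such that (1 - DR)^n is strictly below the cutoff."""
    if not 0.0 < desired_retention < 1.0:
        raise ValueError("desired_retention must be strictly between 0 and 1")
    if not 0.0 < cutoff < 1.0:
        raise ValueError("cutoff must be strictly between 0 and 1")
    p_fail = 1.0 - desired_retention
    n = 1
    while p_fail ** n >= cutoff:
        n += 1
    return n


# 0.70 is taken as the lowest DR, following the message above.
print(min_leech_streak(0.70))
print(min_leech_streak(0.95))
```

Because the failure probability shrinks as DR grows, the streak length needed at DR=70% is an upper bound for every higher DR.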

unique salmon Mar 7, 2025, 5:20 PM

#

polar maple if the user changes DR then you just add up the different distributions per DR

Can you elaborate on how you would calculate it?

polar maple Mar 7, 2025, 5:22 PM

#

the user does 150 reviews at 90% DR.
the user does 20 reviews at 70% DR
the expected distribution is (Binomial(150, 90%) + Binomial(20, 70%)) / (150 + 20)

unique salmon Mar 7, 2025, 5:23 PM

#

No, I mean, calculate the probability that a card has been failed k times out of n reviews at different DRs

polar maple Mar 7, 2025, 5:27 PM

#

unique salmon No, I mean, calculate the probability that a card has been failed k times out of...

can compute this by iterating over the exact number of failures at each specific DR
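One way to carry out that iteration is the standard Poisson binomial recursion: process the reviews one at a time and keep the distribution of the failure count so far. The sketch below is an illustration, not code from FSRS or Anki. It assumes the DR in effect at each review is known (which, as noted above, Anki does not store) and that the true success probability equals that DR.

```python
from typing import List, Sequence


def failure_count_distribution(success_probs: Sequence[float]) -> List[float]:
    """Return dist where dist[k] is the probability of exactly k failures.

    success_probs holds one success probability per review, so reviews
    done at different DRs can simply be mixed in one list.
    """
    dist = [1.0]  # before any review: 0 failures with probability 1
    for p in success_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * p                # success: count unchanged
            nxt[k + 1] += mass * (1.0 - p)    # failure: count goes up by one
        dist = nxt
    return dist


def p_at_least_k_failures(success_probs: Sequence[float], k: int) -> float:
    """Probability of k or more failures across all the given reviews."""
    return sum(failure_count_distribution(success_probs)[k:])


# Hypothetical card: 5 reviews at DR=90%, then 3 reviews at DR=70%.
reviews = [0.9] * 5 + [0.7] * 3
print(p_at_least_k_failures(reviews, 4))
```

When all probabilities are equal this reduces to the ordinary binomial distribution, so the 6-failures-at-70% figure above is recovered as a special case.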

polar maple Mar 7, 2025, 5:39 PM

#

bold terrace Yup I see the idea. On the specific example of SRS Kai, it still means the ease ...

the models work by averages over a distribution of possible cards. The forgetting curve is just an average of individual forgetting curves drawn from this distribution, and success/failures should in theory update on these individual cards rather than the average. For example, maybe the model expects a user to learn 1 hard card for every 9 easy cards. Then even after just 3 successes for a new card, the model can update its belief; it's highly likely that this card is one of those easy cards and can schedule a longer interval accordingly. So yeah it's reasonable for ease to update after every review
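The "1 hard card for every 9 easy cards" example is a Bayesian update, and a short sketch makes the effect visible. The per-review success probabilities for easy and hard cards below are invented purely for illustration; only the 9:1 prior comes from the message above.

```python
def posterior_easy(
    prior_easy: float,
    p_success_easy: float,
    p_success_hard: float,
    successes: int,
) -> float:
    """Posterior probability that a card is 'easy' after a run of successes.

    Bayes' rule with two card types and independent reviews.
    """
    like_easy = prior_easy * p_success_easy ** successes
    like_hard = (1.0 - prior_easy) * p_success_hard ** successes
    return like_easy / (like_easy + like_hard)


# Prior: 9 easy cards for every hard one. Success rates are hypothetical.
for n in range(4):
    print(n, posterior_easy(0.9, p_success_easy=0.95, p_success_hard=0.6, successes=n))
```

Each success shifts belief further toward the easy type, which is the sense in which a model can justify longer intervals after only a few reviews.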

#

RWKV curves at a fixed stability show how much this underlying hidden distribution can affect the average distribution

#

hasty fractal Mar 7, 2025, 5:39 PM

#

shouldn't this become a different channel already at this point. we will have more interaction from people this way. who knows some random passer-by might get interested.

#

(deleted and reposted cuz didn't want to destroy Alex's message)

#

mods noticed me :blushed:

bold terrace Mar 7, 2025, 7:01 PM

#

polar maple the models work by averages over a distribution of possible cards. The forgettin...

I'm not arguing about updating after every review, I'm saying that the overall performance of that specific card might be interesting to take into account (more than just the previous grade)

polar maple Mar 7, 2025, 7:08 PM

#

on another note, we must prioritize using log loss especially when developing models to account for average performance. e.g. if a card with 0.6 overall retention was just unlucky, we don't want to overcompensate when the 0.7 predictions were already correct. I've shown before that RMSE (bins) encourages overcompensation for mistakes so we cannot use it to measure improvement benefits for this

polar maple Mar 7, 2025, 7:10 PM

#

unique salmon Some time ago me and Jarrett agreed to stick to "2% relative improvement per par...

time to come up with an equivalent for log loss?

unique salmon Mar 7, 2025, 7:13 PM

#

polar maple time to come up with an equivalent for log loss?

Sure. 0.02% improvement 🤣

#

Well, if we're being realistic, 0.001 improvement in absolute terms, I guess. Like from 0.327 to 0.326

#

So 0.001 improvement in logloss per parameter

#

Maybe 0.002, if we're being conservative

polar maple Mar 7, 2025, 7:17 PM

#

if you compare GRU-P-short to FSRS's RMSE difference and try to linearly map that to log loss then you get something like 0.00075 per 1% improvement

#

so i guess 0.0015 per 2% RMSE

unique salmon Mar 7, 2025, 7:18 PM

#

Alright, let's say 0.0015 absolute improvement in logloss per new parameter
@quasi shadow new rule just dropped 🤣
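As a sketch, the rule agreed on here can be written as a one-line check. The function name and the choice to scale the threshold linearly with the number of new parameters are assumptions made for the illustration; the 0.0015 figure is the one from the message above.

```python
def passes_parameter_rule(
    old_logloss: float,
    new_logloss: float,
    new_params: int,
    min_gain_per_param: float = 0.0015,
) -> bool:
    """True if the absolute log loss improvement justifies the new parameters."""
    if new_params <= 0:
        raise ValueError("new_params must be a positive integer")
    gain = old_logloss - new_logloss
    return gain >= min_gain_per_param * new_params


print(passes_parameter_rule(0.327, 0.325, new_params=1))  # gain of 0.002
print(passes_parameter_rule(0.327, 0.326, new_params=1))  # gain of 0.001
```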

bold terrace Mar 7, 2025, 7:20 PM

#

Having said that ... If GRU-WHATEVER allows you to translate your DR=80% model to DR=70% without a big loss of prediction precision ...

#

... 😄

#

Would open a lot of doors for future algorithms 🙂

unique salmon Mar 7, 2025, 7:21 PM

#

bold terrace Having said that ... If GRU-WHATEVER allow you to translate your DR=80% model to...

Alex's RWKV would actually have a ton of advantages over FSRS

The more I think about it, the more I think it's actually very desirable

We can make R more accurate

We won't have to show parameters, which means one less thing for users to worry about

We can support proper same-day scheduling instead of the current mess

We can throw in new input features, like time of the day, workload, etc. Not just interval lengths and grades

We can remove "Optimize", which means even less stuff for users to worry about
RWKV would be pre-trained on 10k users, so it wouldn't need further optimization

#

In that sense, it would be more like ChatGPT - it Just Works™ out of the box

polar maple Mar 7, 2025, 7:23 PM

#

reminder that it would still be optimizing, but it would just be optimizing on the spot

#

well it would be doing the equivalent to optimizing on the spot is what i should say

#

you wouldnt say that an llm is optimizing on the spot

#

and we don't know how well it translates DR=80% to DR=70%, the only thing we have is faith in its better log loss

bold terrace Mar 7, 2025, 7:25 PM

#

Is it possible to test such a DR=80% into a DR=70% ?

#

I mean, evaluating how good an algorithm is at that

#

I never actually really understood how we actually test if a prediction is correct or not

#

I mean, you see the user entered "Again" when you predicted a 60% retention, how do you know whether it's a good prediction or not?

unique salmon Mar 7, 2025, 7:28 PM

#

bold terrace I mean, you see the user entered "Again" when you predicted a 60% retention, how...

If it's "Again", predicted value should be as close to 0% as possible
If it's not "Again", predicted value should be as close to 100% as possible

And you use some math function to calculate the "distance" between the real review outcome (0 or 1) and the predicted value

#

And then the optimizer tweaks parameters to minimize that "distance"

#

If the algorithm always outputs 100% for every non-Again and always outputs 0% for every Again, then it will have a "distance" of 0, since predictions are exactly equal to real data
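For a concrete picture of that "distance", here is a minimal sketch of log loss (binary cross-entropy), one of the metrics mentioned in this thread. Outcomes are 1 for any non-Again grade and 0 for Again; the clipping constant `eps` is an implementation detail added here to avoid taking log(0).

```python
import math
from typing import Sequence


def log_loss(outcomes: Sequence[int], predictions: Sequence[float], eps: float = 1e-15) -> float:
    """Mean binary cross-entropy between 0/1 outcomes and predicted probabilities."""
    if not outcomes or len(outcomes) != len(predictions):
        raise ValueError("outcomes and predictions must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(outcomes, predictions):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(outcomes)


# An "Again" (0) when 60% recall was predicted costs more than when 10% was predicted.
print(log_loss([0], [0.6]))
print(log_loss([0], [0.1]))
# Perfect predictions give a loss of (practically) zero.
print(log_loss([1, 0], [1.0, 0.0]))
```

A single review cannot say whether "60%" was right; the loss only becomes informative when averaged over many reviews.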

bold terrace Mar 7, 2025, 7:32 PM

#

Ok thanks ! I got it now 🙂 Couldn't guess that it's minimizing the cost between the prediction at again and 0%

polar maple Mar 7, 2025, 7:32 PM

#

bold terrace Is it possible to test such a DR=80% into a DR=70% ?

some things to try:

1) find gaps in the review history. if someone took a 1 week break then you would expect the retentions to be lower and we can measure the performance of the curve this way. This could be how a 80% DR gets shifted to 70% DR. If someone took a break for the year we can measure the far end of the curve.
2) trust that the underlying reviews have a high variance in retention. this is probably true since I expect the underlying scheduler to have been SM-2. To illustrate this, if a perfect scheduler produced the data at 90% DR then the data is not enlightening at all, the perfect model would in turn just predict 90% all the time with no curve. But if the underlying scheduler that produced the data sucks then there will be plenty of data to work with already.

#

but i havent tried 1) yet and 2) is just an assumption

bold terrace Mar 7, 2025, 7:34 PM

#

polar maple some things to try: 1) find gaps in the review history. if someone took a 1 week...

Oh wait, I think there is a miscommunication on my part. What I mean is:

FSRS is good in my case at predicting my 80% DR. Very, very good. But if I ask FSRS to predict my 70%, in fact I'll be at 60%.

#

The forgetting curve doesn't really match how fast I forget between 80 and 70

#

It does predict 80 very well, but it's too optimistic for 70, and too pessimistic for 90

#

I do my review by descending retrievability, in general, I have around ~95-98% for all my DR=90%

#

~60% for all my DR=70%

unique salmon Mar 7, 2025, 7:35 PM

#

We actually found that different curves are better at different retentions. And now we have no idea what to do with that information 😅

polar maple Mar 7, 2025, 7:36 PM

#

bold terrace Oh wait I think there is a miscommunication from my part, what I mean is : FSRS...

i think i'm talking about the same/similar thing. When you set a lower DR in FSRS it just multiplies all intervals by a constant factor, and to measure how well this does using the current data we can find gaps in the review history to simulate this effect
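The "constant factor" can be made explicit. The sketch below assumes the power forgetting curve of FSRS-4.5 and FSRS-5, R(t, S) = (1 + FACTOR * t / S)^DECAY with DECAY = -0.5 and FACTOR = 19/81; if your FSRS version uses a different or trainable decay, treat these constants as placeholders.

```python
DECAY = -0.5
FACTOR = 19 / 81  # chosen so that retrievability is 0.9 when t == S


def retrievability(t: float, stability: float) -> float:
    """Predicted recall probability after t days for a card with the given stability."""
    return (1.0 + FACTOR * t / stability) ** DECAY


def interval(desired_retention: float, stability: float) -> float:
    """Days until retrievability falls to desired_retention (the curve inverted)."""
    return stability / FACTOR * (desired_retention ** (1.0 / DECAY) - 1.0)


def interval_multiplier(desired_retention: float) -> float:
    """Interval as a multiple of stability; it depends only on the DR."""
    return interval(desired_retention, 1.0)


for dr in (0.9, 0.8, 0.7):
    print(dr, interval_multiplier(dr))
```

Because the multiplier does not depend on stability, lowering DR stretches every card's interval by the same ratio, which is why the shape of the curve between 90% and 70% matters so much for the mismatch described above.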

#

the best way of course is to run an actual experiment with users but we don't have the resources for that

bold terrace Mar 7, 2025, 7:37 PM

#

Oh OK now I get the gap idea

#

But if your training set is with people that used SM2, don't you have "by default" gaps ? Since for example, some might have a too high ease factor and it creates gap that FSRS would have filled with reviews ?

unique salmon Mar 7, 2025, 7:40 PM

#

unique salmon We actually found that different curves are better at different retentions. And ...

Btw, we have no clue whether:

We are misunderstanding our own data
This is a quirk of the optimizer
This is a fact about the forgetting curve: there is no universal shape, the shape has to depend on retention

polar maple Mar 7, 2025, 7:41 PM

#

bold terrace But if your training set is with people that used SM2, don't you have "by defaul...

yeah that's related to point 2), if the underlying scheduler that produced the data produces enough varied data then we don't have a problem at all

polar maple Mar 7, 2025, 7:42 PM

#

unique salmon We actually found that different curves are better at different retentions. And ...

if we assume that the underlying schedule was SM-2 and not people choosing a lower DR with FSRS then a lower average retention speaks more as to the difficulty of the cards and the distribution of cards

#

and RWKV curves suggests that this might matter for the shape of the curve

bold terrace Mar 7, 2025, 7:44 PM

#

polar maple and RWKV curves suggests that this might matter for the shape of the curve

Now that you mention it, it's true that the whole "Forgetting Curve drops faster in the beginning and slower at the end" might just not be true for some knowledge

#

It's not impossible that for some cases, you get more something like this

#

Typically, knowledge that "holds together" through 2-3 mnemonics, which can hold together in the short term but suddenly drops once they have all dropped

unique salmon Mar 7, 2025, 7:46 PM

#

We could try passing D into the forgetting curve directly instead of using it only to calculate S. But that would screw up a lot of things, especially how we calculate S0 and the interpretation of S itself

polar maple Mar 7, 2025, 7:47 PM

#

interpret S as the point where retention is 0.9 and all problems are solved

polar maple Mar 7, 2025, 7:47 PM

#

bold terrace It's not impossible that for some cases, you get more something like this

i could try to come up with formulas that could allow this behaviour, but currently the nn models all cannot possibly replicate such a forgetting curve

unique salmon Mar 7, 2025, 7:48 PM

#

I mean this
We would no longer be able to do this since the forgetting curve would also depend on D, so there is another dimension now

polar maple Mar 7, 2025, 7:49 PM

#

bold terrace Now that you mention it, it's true that the whole "Forgetting Curve drops faster...

but i also don't think this is true, but possible in the sense that a user might continuously review their new cards outside of Anki in a way that keeps the memory strength high

polar maple Mar 7, 2025, 7:49 PM

#

unique salmon I mean this We would no longer be able to do this since the forgetting curve wou...

can change that to an average instead

unique salmon Mar 7, 2025, 7:49 PM

#

unique salmon I mean this We would no longer be able to do this since the forgetting curve wou...

The interval here would also depend on D is what I'm trying to say

#

The main issue is S0, btw

#

Since it requires choosing a fixed shape of the curve to estimate

polar maple Mar 7, 2025, 7:51 PM

#

bold terrace Oh wait I think there is a miscommunication from my part, what I mean is : FSRS...

i wonder if this is due to mental fatigue as well. It seems that RWKV could in fact give longer intervals than FSRS would if you set the DR to be very low, but in practice i'm sure that you will run out of mental strength first. Mental energy might be very important as is seen by how MOVING-AVG is better than FSRS for more users than not

bold terrace Mar 7, 2025, 7:52 PM

#

polar maple i wonder if this is due to mental fatigue as well. It seems that RWKV could in f...

Another theory I had is that by reducing DR while keeping the same load, you start to crank up the new cards/day, which creates a lot of interfering knowledge, which is also part of how you "forget things" (in this case, why you get them wrong)

#

I mean, take this example:
You remember 駅 as "Station". Now, you have also 訳, as "Reason". Next time you see 駅, you hesitate between "Reason" and "Station", and you get it wrong.

#

Can we consider you have forgotten the word ?

#

Or should we instead just say you got it wrong

#

forgotten != wrong, my point is

#

In this case, you could have a "long stable knowledge" that becomes a "very short stability one", not because of how you forget, but because you oversimplified the knowledge in your brain by never encountering similar items

unique salmon Mar 7, 2025, 7:55 PM

#

bold terrace forgotten != wrong, my point is

You know, I tend to say that there is no such thing as "overthinking", but you are very good at convincing me that there is...

bold terrace Mar 7, 2025, 7:56 PM

#

unique salmon You know, I tend to say that there is no such thing as "overthinking", but you a...

I mean, in #memes I would agree, but here I think we're not here for the memes haha

#

And even if we can't do anything about it, it's still something interesting, because it shows that if you see in practice that the "Forgotten Curve" is a lie, it might just "seem like it", simply because "Forgotten" is not super well defined

polar maple Mar 7, 2025, 7:57 PM

#

bold terrace Another theory I had was the fact that, by reducing DR, and keeping the same loa...

this reminds me that i want to test RWKV for this sort of behavior. Give a new card and look at its forgetting curve. If I add a couple of new cards, will the first card's forgetting curve significantly change? I'm sure that it will

bold terrace Mar 7, 2025, 7:58 PM

#

And maybe one day, the successor of Anki, instead of having "Again/Hard/Good/Easy", could have "Forgot/Confused/Recalled/Slow Recall"

unique salmon Mar 7, 2025, 7:59 PM

#

bold terrace And maybe one day, the successor of Anki, instead of having "Again/Hard/Good/Ea...

I'm sure users will find it intuitive to use and there will be no misunderstandings whatsoever

bold terrace Mar 7, 2025, 8:00 PM

#

unique salmon I'm sure users will find it intuitive to use and there will be no misunderstandi...

I think as long as the user keeps a consistent behaviour, it should be fine in this case

#

For example, I use "Easy" as "I already knew it", and FSRS adapted its value quite well to it

#

The issue is that you still have to mark an option as "good or bad", like "Hard" that is for some people "good" and others "bad"

#

Or you have something so flexible like neural network that it will compute for you if "Hard" was used as an Again or a Good 😄

unique salmon Mar 7, 2025, 8:03 PM

#

It would be cool if we didn't have to treat the grades as binary when calculating the loss, but idk how to do that

#

We could make a neural net that outputs 4 different probabilities (one for each button), but idk if it would be advantageous in any way, and I anticipate all sorts of issues

polar maple Mar 7, 2025, 8:05 PM

#

RWKV-P predicts 4 probabilities, i believe it should help with improving gradient information

#

but idk how you would do this for a forgetting curve
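To show what "predicting 4 probabilities" means in terms of the loss, here is a generic sketch: a softmax over four scores, scored with categorical cross-entropy. It is not RWKV-P's actual architecture or code; the logits are made-up numbers and the grade order Again, Hard, Good, Easy is an assumption.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn arbitrary scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_log_loss(probs: Sequence[float], rating: int) -> float:
    """Cross-entropy for one review; rating is 1=Again, 2=Hard, 3=Good, 4=Easy."""
    return -math.log(probs[rating - 1])


# Hypothetical model outputs for Again, Hard, Good, Easy.
logits = [-1.0, 0.0, 2.0, 0.5]
probs = softmax(logits)
p_recall = 1.0 - probs[0]  # the usual binary quantity, recovered from 4 classes

print(probs)
print(p_recall)
print(categorical_log_loss(probs, rating=3))
```

Every grade now contributes its own gradient signal, while the binary recall probability is still available as 1 - p(Again).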

unique salmon Mar 7, 2025, 8:05 PM

#

Wait, really?

#

Huh

polar maple Mar 7, 2025, 8:06 PM

#

yeah i'm even considering predicting the duration of the review

unique salmon Mar 7, 2025, 8:06 PM

#

polar maple but idk how you would do this for a forgetting curve

Yeah, that's the thing

#

It's not that predicting multiple probabilities is impossible, it's that there is no way to combine that with the "forgetting curve" approach

bold terrace Mar 7, 2025, 8:08 PM

#

To be fair right now, for me a simple linear regression would be a good enough forgetting curve hahahaha

#

I don't play with DR anymore, PTSD from last time

#

I do increment it 1% every month more or less

polar maple Mar 7, 2025, 8:10 PM

#

polar maple RWKV-P predicts 4 probabilities, i believe it should help with improving gradien...

https://arxiv.org/abs/1707.06887
the same idea is used in RL

arXiv.org

A Distributional Perspective on Reinforcement Learning

In this paper we argue for the fundamental importance of the value distribution: the distribution of the random return received by a reinforcement learning agent. This is in contrast to the common approach to reinforcement learning which models the expectation of this return, or value. Although there is an established body of literature studying...

unique salmon Mar 7, 2025, 8:10 PM

#

What if we have 3 forgetting curves - one for Hard, one for Good, and one for Easy? 🤣
And p(Again) is just 1 - p(Hard) - p(Good) - p(Easy)

#

Idk how we would use that Frankenstein monstrosity for scheduling, though

#

Oh, and their sum must always add up to 1, which is problematic

polar maple Mar 7, 2025, 8:12 PM

#

if we make certain assumptions then it should be doable

bold terrace Mar 7, 2025, 8:12 PM

#

Yeaah ! .01 RMSE earned ! New params day baby !
😄

polar maple Mar 7, 2025, 8:13 PM

#

nice, you will learn 0.01% faster now

bold terrace Mar 7, 2025, 8:13 PM

#

Well, it seems that in the past a "Hard" first was similar to an "Again" first

#

Now a Hard first is actually better!

unique salmon Mar 7, 2025, 8:13 PM

#

Anki users tweaking 13456890 settings to learn flashcards 0.01% faster be like

#

(I am an Anki user)

bold terrace Mar 7, 2025, 8:14 PM

#

I mean, we're geeks

#

I won't start to feel guilty for it at almost 34

#

(I had to double check, I thought I was already 34)

#

(I know my RMSE better than my age)

unique salmon Mar 7, 2025, 8:15 PM

#

lol

#

Too bad no amount of Anki and parameter tweaking can help me get a gf

#

Well, I guess theoretically I could make my own dating app with my own algorithm, but that is not going to happen

bold terrace Mar 7, 2025, 8:17 PM

#

We've been together for 10 years now, she's a math teacher, and I can tell you this: even she doesn't care in the slightest about Anki or FSRS

#

It's made for the geeks, for the geeks

#

We should just celebrate it together

polar maple Mar 7, 2025, 8:17 PM

#

unique salmon What if we have 3 forgetting curves - one for Hard, one for Good, and one for Ea...

give weights to hard, good, easy, such that the sum of the weight is <= 1

unique salmon Mar 7, 2025, 8:20 PM

#

And make them learnable on a per-user basis

#

Hmmm
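
One way to read the weights idea, as a rough sketch rather than anything FSRS actually does: give Hard, Good and Easy each their own forgetting curve, scale each curve by a non-negative weight, and require the weights to sum to at most 1. Since every curve stays in [0, 1], the three probabilities can never sum to more than 1, and p(Again) is whatever is left. The function name `grade_probabilities`, the weights, the stabilities and the fixed decay of -0.5 are all assumptions made up for illustration.

```python
def forgetting_curve(t, s, decay=-0.5):
    """Power forgetting curve with R(t = s) = 0.9; the decay value is an assumption."""
    factor = 0.9 ** (1 / decay) - 1
    return (1 + factor * t / s) ** decay


def grade_probabilities(t, stabilities, weights):
    """Return p(Hard), p(Good), p(Easy) and p(Again) after t days."""
    if any(w < 0 for w in weights.values()) or sum(weights.values()) > 1:
        raise ValueError("weights must be non-negative and sum to at most 1")
    # Each grade gets its own curve, scaled by its weight.
    probs = {g: weights[g] * forgetting_curve(t, stabilities[g]) for g in weights}
    # p(Again) is the remainder, so the four values sum to 1 by construction.
    probs["again"] = 1 - sum(probs.values())
    return probs


# Hypothetical per-user values; in the idea above these would be learned.
weights = {"hard": 0.2, "good": 0.6, "easy": 0.15}
stabilities = {"hard": 3.0, "good": 10.0, "easy": 30.0}  # days

p = grade_probabilities(5.0, stabilities, weights)
for grade, value in p.items():
    print(f"p({grade}) = {value:.3f}")
```

This only shows that the sum-to-1 constraint can be satisfied; it says nothing about how such curves would be used for scheduling.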

unique salmon Mar 7, 2025, 8:20 PM

#

bold terrace We've been together for 10 years now, she's a math teacher, and I can tell you this:...

Me who has never kissed at the age of 28

#

https://tenor.com/view/troll-trollge-reverse-troll-sad-troll-reverse-gif-19858090

Tenor

#

https://tenor.com/view/ryan-gosling-sad-sad-gosling-blade-runner-snow-ryan-gosling-blade-runner-sad-gif-10329809086636681181

Tenor

ashen light Mar 7, 2025, 8:24 PM

#

just find an anki deck for that topic

ashen light Mar 7, 2025, 8:24 PM

#

bold terrace And maybe one day, the succcessor of Anki, instead of having "Again/Hard/Good/Ea...

gonna make my own anki, with blackjack and hookers

ashen light Mar 7, 2025, 8:32 PM

#

unique salmon Too bad no amount of Anki and parameter tweaking can help me get a gf

alternatively, maybe tweaking the variables less would help 🍃

bold terrace Mar 7, 2025, 8:40 PM

#

ashen light alternatively, maybe tweaking the variables less would help 🍃

IMO (and I'll send just one message about this, otherwise this will go from FSRS to "#date-advice" very quickly), being a geek/overthinker is not that much of an issue, and as long as you build a "healthy sense of self-confidence", it's always great to see passionate people (as long as they are also able to open up to others' passions lol). I always add the "healthy sense of" because otherwise you might land on r/TheRedPill, wearing long coats and stuff 😂.

ashen light Mar 7, 2025, 8:44 PM

#

maybe what he needs most is a long coat

unique salmon Mar 7, 2025, 8:47 PM

#

unique salmon https://tenor.com/view/ryan-gosling-sad-sad-gosling-blade-runner-snow-ryan-gosli...

Unironically this coat would go hard af

bold terrace Mar 7, 2025, 8:48 PM

#

Yeaaah but (OK let's go off-topic a bit :D), you CAN FEEL it when people identify with those exterior things. It's almost like you see someone from Peaky Blinders coming at you

#

And the light, and the posture, and ryan gosling

ashen light Mar 7, 2025, 8:48 PM

#

the problem is you'd probably look like a dork wearing that 🍃

bold terrace Mar 7, 2025, 8:48 PM

#

It's like those chinese clothes on Amazon model pictures vs on a random russian in the review section

unique salmon Mar 7, 2025, 8:48 PM

#

kek

unique salmon Mar 7, 2025, 8:49 PM

#

ashen light the problem is you'd probably look like a dork wearing that 🍃

Fair enough. I'm not Ryan Gosling

ashen light Mar 7, 2025, 8:50 PM

#

none of us are

bold terrace Mar 7, 2025, 8:51 PM

#

Do a bit of gym, but don't build your whole identity around it
Try to wear nice clothes, but don't become a parody of Karl Lagerfeld
Try to be positive and nice, but don't be a boot-licker

ashen light Mar 7, 2025, 8:51 PM

#

become a monk and join a monastery

bold terrace Mar 7, 2025, 8:51 PM

#

https://www.youtube.com/watch?v=Mqfwbf3X8SA&ab_channel=LynyrdSkynyrdVEVO

YouTube

LynyrdSkynyrdVEVO

Lynyrd Skynyrd - Simple Man - Live At The Florida Theatre / 2015 (O...

#

Be a simple man

#

Well, maybe not like them

#

But you get the idea

bold terrace Mar 7, 2025, 8:53 PM

#

ashen light * become a monk and join a monastery

"Have fun between boys"

#

"But not with the little ones"

unique salmon Mar 7, 2025, 8:53 PM

#

Wait for AI to become advanced enough that I can have an AI gf

bold terrace Mar 7, 2025, 8:53 PM

#

"Or to change church every year, you'll be doomed"

ashen light Mar 7, 2025, 8:53 PM

#

unique salmon - Wait for AI to become advanced enough that I can have an AI gf

you can do that today

bold terrace Mar 7, 2025, 8:54 PM

#

I might have found on some private trackers some first AI porn

#

It might have been quite interesting

#

When I see posts on LinkedIn about how AI will change the world, I just think about how much they were talking about VR during Covid

#

So many VR Headsets sold for "VR Experiences beyond imagination"

#

With more installs of DeoVR than Metaverse

bold terrace Mar 7, 2025, 9:56 PM

#

The "fsrs4anki_scheduler.js", is it a file from the playbook or the anki github ? can't find it somehow

unique salmon Mar 7, 2025, 9:58 PM

#

bold terrace The "fsrs4anki_scheduler.js", is it a file from the playbook or the anki github ...

https://github.com/open-spaced-repetition/fsrs4anki/blob/main/fsrs4anki_scheduler.js

GitHub

fsrs4anki/fsrs4anki_scheduler.js at main · open-spaced-repetition/f...

A modern Anki custom scheduling based on Free Spaced Repetition Scheduler algorithm - open-spaced-repetition/fsrs4anki

bold terrace Mar 7, 2025, 9:59 PM

#

Ok, the step 2.2 gave me something compatible with anki though

#

so probably not necessary anymore to do that step

#

and just copy-paste them into deck options

bold terrace Mar 7, 2025, 11:43 PM

#

Difficulty with 1% granularity

#

Let's be honest ... It doesn't really bring much much more value at all 😂

cosmic hedge Mar 7, 2025, 11:47 PM

#

bold terrace Let's be honest ... It doesn't really bring much much more value at all 😂

Guess it does show that they're not all just 100%

cosmic hedge Mar 7, 2025, 11:50 PM

#

cosmic hedge As good an excuse as any to post my "difficulty time machine"

I'm doing a difficulty time machine, but because of ts-fsrs it's not 100% accurate

bold terrace Mar 7, 2025, 11:50 PM

#

Telepathy haha

#

I was thinking "Maybe a Time machine would be better"

#

Doing an average over time, except if you reaaaally zoom on 90-100, you won't see anything

cosmic hedge Mar 7, 2025, 11:51 PM

#

Did you try it?

bold terrace Mar 7, 2025, 11:51 PM

#

no

cosmic hedge Mar 7, 2025, 11:52 PM

#

If you look at mine you can see difficulty really starts going up

#

Maybe I should add an "average difficulty%" just at the bottom instead of as a separate graph

bold terrace Mar 7, 2025, 11:55 PM

#

I kinda like being able to see trend

#

Especially like stability when you see very little steps compounding

#

Btw my draft : https://github.com/Luc-Mcgrady/Anki-Search-Stats-Extended/commit/a0b28fce51f2d859d16f841b2f7ee3ca77a046fd

GitHub

Draft of Difficulty viewer · Luc-Mcgrady/Anki-Search-Stats-Extended...

#

But it should be elsewhere

#

I started as a Pie but then I was like let's use a ScrollBar to configure bins=100

cosmic hedge Mar 7, 2025, 11:58 PM

#

While I was doing the pies I remember the d3 tutorial had something like "never use pie charts" in it

#

But hey they make the addon look a little less like the bar-chart-fest that it really is XD

bold terrace Mar 8, 2025, 12:00 AM

#

Yup 🙂

#

Anyway, enough for today

#

No big eureka for this time

#

I think D is probably something better left unseen

#

lol

#

I mean Difficulty right

#

By the way, Github Copilot is atrocious from time to time for JS ...

#

Full full hallucinations

unique salmon Mar 8, 2025, 12:12 AM

#

Wait, I have a genius idea
@quasi shadow
Right now we estimate S0 like this:
`def loss(stability):
    y_pred = self.forgetting_curve(delta_t, stability)
    logloss = sum(
        -(recall * np.log(y_pred) + (1 - recall) * np.log(1 - y_pred))
        * count
    )
    l1 = np.abs(stability - init_s0) / 16 if not SECS_IVL else 0
    return logloss + l1

res = minimize(
    loss,
    x0=init_s0,
    bounds=((S_MIN, INIT_S_MAX),),
    options={"maxiter": int(sum(count))},
)`

If we want to make decay depend on D, we could either just assume some fixed value of D...or estimate it from the data!

The formula for converting D into decay is very simple. decay=-0.1×D. That's it. When D=1, decay=-0.1, the curve is flat. When D=10, decay=-1.0, the curve is steep.
The modified minimize function should look like this:
`res = minimize(
    loss,
    x0=[init_s0, init_d0],
    bounds=((S_MIN, INIT_S_MAX), (1, 10)),
    options={"maxiter": int(sum(count))},
)`
Now it will fit both S0 and D0 rather than just S0. So now we have a way to estimate D0 (kind of) directly from the data.
In pretrain's loss you can do this:
`def loss(params):
    stability, difficulty = params[0], params[1]
    y_pred = self.forgetting_curve(delta_t, stability, difficulty)`
In the forgetting curve itself you can do this:
`def forgetting_curve(self, t, s, d):
    DECAY = -0.1 * d
    FACTOR = 0.9 ** (1 / DECAY) - 1
    return (1 + FACTOR * t / s) ** DECAY`
Then you remove D from the formulas of S and pass D into the forgetting curve instead.

So now we have a flexible curve that can adapt to difficult material. Now we aren't just adapting S for difficult material, we are adapting the curve itself.
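
A minimal sketch of the joint fit described above, assuming decay = -0.1 × D and the 90%-at-t=S convention from the snippets. To keep it dependency-free it swaps `scipy.optimize.minimize` for a coarse grid search and drops the L1 penalty; the data is synthetic, and `fit_s0_d0`, `log_loss` and the grids are hypothetical names and values, not part of the optimizer.

```python
import math


def forgetting_curve(t, s, d):
    """Power forgetting curve whose decay depends on difficulty d in [1, 10]."""
    decay = -0.1 * d
    factor = 0.9 ** (1 / decay) - 1  # chosen so that R(t = s) = 0.9 for any d
    return (1 + factor * t / s) ** decay


def log_loss(s, d, delta_t, recall, count):
    """Count-weighted binary cross-entropy between observed recall and the curve."""
    total = 0.0
    for t, r, n in zip(delta_t, recall, count):
        p = min(max(forgetting_curve(t, s, d), 1e-9), 1 - 1e-9)
        total -= n * (r * math.log(p) + (1 - r) * math.log(1 - p))
    return total


def fit_s0_d0(delta_t, recall, count, s_grid, d_grid):
    """Coarse grid search over (S0, D0); a stand-in for a bounded minimizer."""
    return min(
        ((s, d) for s in s_grid for d in d_grid),
        key=lambda sd: log_loss(sd[0], sd[1], delta_t, recall, count),
    )


# Synthetic recall rates generated from a known curve (S=5, D=4), for illustration only.
delta_t = [1, 2, 4, 8, 16, 32]
recall = [forgetting_curve(t, 5.0, 4.0) for t in delta_t]
count = [100] * len(delta_t)

s_grid = [0.5 * k for k in range(1, 41)]  # 0.5 .. 20.0 days
d_grid = [0.5 * k for k in range(2, 21)]  # 1.0 .. 10.0

s0, d0 = fit_s0_d0(delta_t, recall, count, s_grid, d_grid)
print(f"S0 = {s0}, D0 = {d0}")
```

On this noise-free data the search recovers the generating pair. With real first-review data, S0 and D0 may trade off against each other, so how well D0 can be identified this way is an open question.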

bold terrace Mar 8, 2025, 12:12 AM

#

And then you tell me I overthink 😄

unique salmon Mar 8, 2025, 12:13 AM

#

bold terrace And then you tell me I overthink 😄

I think about technical things that improve FSRS. You think about "forgot != wrong". We are not the same.

#

Usually "we are not the same" is said ironically, but I am being 100% serious

#

You were getting into sorata level "energy and force are non-physical" crap

#

Actually, I'm not sure which one makes me want to say "shut up and calculate" more - your "forgotten != wrong" or sorata's "energy and force are non-physical"
Both make me want to say "Look guys, don't do this. Just don't. Stop. For your own good and for everyone else's good, stick to crunching numbers, please."

#

And before you say "But 'forgotten != wrong' actually makes sense because..." - stop. Sit down. Take a deep breath. Do it three times. Now...numbers. Focus on the numbers. Or do your reviews. Or go outside. Or do anything else but this.

bold terrace Mar 8, 2025, 12:22 AM

#

I was just about to suggest that you take a deep breath ahah. Breathe in, breathe out, everything is fine 🙂

#

You're enough @unique salmon ❤️

ashen light Mar 8, 2025, 12:22 AM

#

unique salmon And before you say "But 'forgotten != wrong' actually makes sense because..." - ...

...same to you?

unique salmon Mar 8, 2025, 12:23 AM

#

ashen light ...same to you?

I'm not the one who says "Force and energy are mystical" or "Forgetting the card and getting the card wrong are different things"

ashen light Mar 8, 2025, 12:23 AM

#

you need to go outside though

unique salmon Mar 8, 2025, 12:23 AM

#

Ok, fair 🤣

ashen light Mar 8, 2025, 12:24 AM

#

I do agree with sound that forgetting something completely and "is it A or B" are two distinct things

bold terrace Mar 8, 2025, 12:24 AM

#

Oh no

ashen light Mar 8, 2025, 12:24 AM

#

like doesn't supermemo have like 3 grades of failure?

bold terrace Mar 8, 2025, 12:24 AM

#

Don't trigger a new chain reaction 🥲

unique salmon Mar 8, 2025, 12:24 AM

#

...I'm going to sleep

bold terrace Mar 8, 2025, 12:24 AM

#

Good 😄

ashen light Mar 8, 2025, 12:25 AM

#

I think Anki needs two types of failure buttons: "how did I even get here" and "in a 50/50 I am wrong"

bold terrace Mar 8, 2025, 12:26 AM

#

Yeaaah but in the end it doesn't matter that much since FSRS will just predict when you were unable to get it right (covering both situations)

#

The main point was to describe why "forgetting" in Anki can sometimes be brutal

tepid spoke Mar 8, 2025, 12:27 AM

#

I have so many bloody cards that are stuck in forever-leech-mode cause I consistently lose the 50/50

ashen light Mar 8, 2025, 12:27 AM

#

me too!

#

hence why I say this is an important distinction!

bold terrace Mar 8, 2025, 12:27 AM

#

Because it's not really forgetting, it's more like suddenly, you realize you learnt something not the "full way"

tepid spoke Mar 8, 2025, 12:27 AM

#

99% of the time for me it's the fault of rendaku

#

Or the random lack thereof

bold terrace Mar 8, 2025, 12:28 AM

#

It's also a danger of doing too many reviews, I think: the more reviews you do, the more only the most discriminating features survive for recognizing something, so if you reviewed only a subset of a learning domain, your brain will have reduced the patterns to something that is NOT sufficient

#

Once again, I remembered 駅 because it was "The R shape on the right"

#

Without even looking at the left after a few reviews

ashen light Mar 8, 2025, 12:29 AM

#

the solution obviously is just to add 20 more cards with that word on it

bold terrace Mar 8, 2025, 12:29 AM

#

But then 訳 comes and now, I built dozens of reviews "recognizing the R shape"

bold terrace Mar 8, 2025, 12:29 AM

#

ashen light the solution obviously is just to add 20 more cards with that word on it

Yes, basically that's when I realized that core decks based on word frequency are not always that smart

tepid spoke Mar 8, 2025, 12:30 AM

#

ironically, just looking at the right side is how most people read :D

#

Cause the right side most of the time denotes the reading, and then you just read the words

bold terrace Mar 8, 2025, 12:30 AM

#

I also found something very interesting: Japanese people are able to recognize kanji even if the center is blurred

#

It's like in their brain, the outside shape of a kanji is sufficient to recognize them

tepid spoke Mar 8, 2025, 12:31 AM

#

It's just generic pattern recognition

#

in the end confusing two kanji is also a non-issue, since you rarely ever read a Kanji in Isolation and out of Context

#

And then you get into "recognizing entire words by their shape" territory

bold terrace Mar 8, 2025, 12:32 AM

#

Yeah sometimes it can be nasty things though

#

社会 vs 会社

#

If your brain learnt one by remembering the association of both

#

when the second comes, you're good to re-learn them a bit

#

That's why sometimes adding more cards can help remembering older ones

#

More rooms for connections I guess

#

And less room for too-simplistic/bad-pattern recognition

tepid spoke Mar 8, 2025, 12:35 AM

#

I can pretty much just read them to their sounds, and then the words are obvious

bold terrace Mar 8, 2025, 12:38 AM

#

You see

#

Basically you build another recognition pattern, based on pronunciation/reading/etc

tepid spoke Mar 8, 2025, 12:38 AM

#

階段 vs. 段階 is a much meaner one imo

bold terrace Mar 8, 2025, 12:38 AM

#

So learning is sometimes a lot of iteration over the same material

#

(But not simple bruteforced iteration, more like remodeling knowledge all the time, until you stabilize it)

tepid spoke Mar 8, 2025, 12:40 AM

#

I feel like some of the words I'm grinding right now I'll never truly stabilize

#

since they're so incredibly rare, I might never see them outside of Anki

bold terrace Mar 8, 2025, 12:40 AM

#

That might be a challenge indeed

ashen light Mar 8, 2025, 12:41 AM

#

the solution obviously is just to add 20 more cards with that word on it

tepid spoke Mar 8, 2025, 12:41 AM

#

Like, the latest levels of WaniKani have started teaching historical names, and Kanji that appear only in that single name

bold terrace Mar 8, 2025, 12:41 AM

#

I know in the first month I had trouble remembering じょうきょう (状況) until ONE TIME, I heard a character say it in an anime, and since then, their voice is associated with that word and its meaning

#

Same with とにかく that I always hesitated between "BTW" and "Anyway"

#

Until that character in Violet Evergarden said it 20 times per episode

tepid spoke Mar 8, 2025, 12:42 AM

#

yeah, having proper connections to the meaning of words is vitally important

#

And Anki can't easily provide that

#

though I do remember words I learned from my Grammar-Deck substantially better than the blank vocabs I learn

#

Cause I learned them in some other context...

bold terrace Mar 8, 2025, 12:43 AM

#

ashen light the solution obviously is just to add 20 more cards with that word on it

I know some people do some contest of Kanji recognition, for those it's a bit difficult because it's really standalone stuff

tepid spoke Mar 8, 2025, 12:43 AM

#

I also now have the reverse problem this causes on FSRS

#

Since I obviously know a lot of words by heart by now, those always get a good rating

#

But since those get optimized together with the random other ones, those get pushed away too quickly now

bold terrace Mar 8, 2025, 12:45 AM

#

aaaah indeed

#

yeah

#

Basically, that's why, even if it might sound a bit philosophical, even if FSRS had a perfect prediction function tomorrow ... it wouldn't really change much about Anki itself

tepid spoke Mar 8, 2025, 12:45 AM

#

So I can either just accept that I will never properly learn those words cause of that, or make my parameters harder and be shown the easy ones way too often

#

Currently opting for the last one, since when I just let it optimize as it pleased, it was actually harmful

bold terrace Mar 8, 2025, 12:46 AM

#

There's a YouTuber who always insists that SRS can be super tempting, but in the long term falls short against more mind-mapping/contextual learning of concepts

#

https://www.youtube.com/@JustinSung

YouTube

Justin Sung

Hey there! I'm Justin Sung, a learning coach (for the last decade), former doctor, top 1% TEDx speaker, education author, and social entrepreneur.💡

I'm also the co-founder and Head of Learning at iCanStudy, where we've pioneered the world's first cognitive retraining program, focusing on self-regulated higher-order learning (i.e. learning to le...

tepid spoke Mar 8, 2025, 12:47 AM

#

SRS is just a helpful tool

bold terrace Mar 8, 2025, 12:47 AM

#

Justin Sung

tepid spoke Mar 8, 2025, 12:47 AM

#

But to actually acquire a language, it's just not enough on its own

bold terrace Mar 8, 2025, 12:47 AM

#

tepid spoke SRS is just a helpful tool

An extremely helpful one ! .... But still just a tool 😄

#

Yup

tepid spoke Mar 8, 2025, 12:47 AM

#

Though tbf, I did learn japanese entirely in Anki to a degree that I could watch native content

#

But that is very much just shoehorned into Anki, and kinda abuses the SRS system a bit

#

as indicated by this

bold terrace Mar 8, 2025, 12:50 AM

#

hehe

#

For my vocab deck though

#

This is the whole jazz

#

But problem is, sure I know a few words, but how sentences are built, meanings, nuances, I just don't learn them well with Anki

tepid spoke Mar 8, 2025, 12:51 AM

#

Well, you can

#

But it doesn't exactly fit into an SRS system :D

bold terrace Mar 8, 2025, 12:52 AM

#

These past weeks I've been dedicating 30-60 min per day to really pausing subs and analyzing, and my overall understanding has really improved

tepid spoke Mar 8, 2025, 12:52 AM

#

That's why my stats are so ridiculous on that deck

bold terrace Mar 8, 2025, 12:52 AM

#

Sure you can !

#

But I think then you're basically trying to recreate the outside world in Anki

tepid spoke Mar 8, 2025, 12:52 AM

#

The JLab deck really works well

bold terrace Mar 8, 2025, 12:52 AM

#

Generating sentences, Generating Audio, randomizing cloze ...

tepid spoke Mar 8, 2025, 12:52 AM

#

but you absolutely must stay on default parameters when using it with FSRS
