#FSRS Megathread


robust hill
#

no they just

#

dont

#

bro i introduced one of my friends to anki

#

and he had no clue deck options existed

#

i literally told him over call so many times

rotund summit
#

peak anki dank

robust hill
#

i just turned desired retention from 93 to 90%

#

then i rescheduled with fsrs helper addon

#

dropped me from 170 to 79

#

then i did it again cause why not

#

it went to 81

#

then 82

#

then no more

tepid spoke
#

Learning steps are the only practical way imo

soft skiff
#

Thank you, this is very helpful. With the help of this, the conclusion can be drawn that even cards above 1k per day are OK?

tepid spoke
#

What good would an algorithm be that tells you you need to learn that card in 2h, that one in 3h, the other one in 1.5h, and maybe that one in 7h?

unique salmon
tepid spoke
#

When you only got one or two timeslots in your day where you can Anki

soft skiff
tepid spoke
#

With learning steps I can configure it to adhere to my schedule

unique salmon
tepid spoke
#

1000 new cards per day is pure insanity oO

unique salmon
#

Tbf, even if we're talking "per 50 years", Woz (the author of that article) seems to imply that motivation plays a larger role than raw brainpower

robust hill
#

lowkey forgot which reschedule is better to do

tepid spoke
#

I also legitimately still wonder how accurate any algorithm can be in reality. Since you inevitably end up encountering and using what you learn elsewhere.

#

And then any forgetting curve the algorithm is assuming gets messed up

soft skiff
unique salmon
tepid spoke
#

Like, I experience that in my deck all the time

#

There are some oddball words I only ever see in Anki

#

And it's usually an Again, unless they're odd enough to be memorable

#

While a lot of others I hear almost every day

unique salmon
#

Btw, once FSRS-6 comes out, splitting material into different decks and presets will be more beneficial, thanks to the shape of the curve adapting

robust hill
#

for retention

#

does load balance change anything

unique salmon
#

nope

tepid spoke
#

Also, today I had 17 "Agains" where I got a Rendaku wrong.

#

Most of those leeches. Because of Rendaku.

#

I need a 5th button just for "Wrong Rendaku" at this point

robust hill
#

to load balance or not to load balance

unique salmon
robust hill
#

so then

#

whats the difference with fsrs helper addon reschedule

#

and normal

unique salmon
#

What version are you using?

robust hill
#

i forgot

#

25.02

unique salmon
#

@quasi shadow

robust hill
#

yea probably

#

i trust this goat

#

heres your anki card from you 1y ago

tepid spoke
#

Wasn't rescheduling via the AddOn superior?

robust hill
#

no idea

#

just tell me whats better so i can do my cards

#

now i am cooked

#

paralyzed

tepid spoke
#

Why is rescheduling this important for that?

robust hill
#

because

#

i have 307 reviews to do

#

and one rescheduling changes that to 160

#

and the other to 85

tepid spoke
#

Just do the 307 then, safest option!

robust hill
#

noooooo

#

i must be most optimal

#

yes master...

#

i will do it

tepid spoke
#

Unless they take you really long

robust hill
#

okay what about for a new deck with clean fsrs options

tepid spoke
#

307 would be a normal day for me, but I only take 10~15 seconds per card

robust hill
#

started it 5 days ago with avg 30 new a day

#

should i reschedule with the normal one

tepid spoke
#

30 new a day leading to 300+ in just 5 days sounds odd

robust hill
#

this is for a different deck

tepid spoke
#

ah, whoops

#

missed that line

robust hill
#

no worries

tepid spoke
#

For a new deck there is very little point rescheduling

#

You're most likely still on default parameters anyway, and have been the whole time

robust hill
#

yea i am

#

thats what im trying to say

tepid spoke
#

So I'd expect it to do nothing

robust hill
#

when i optimize

#

with which should i reschedule

tepid spoke
#

From what I remember, you want to reschedule via the AddOn

#

it does it in a nicer way somehow

#

Forgot the details, but I think the native one puts an entry into the revlog, while the addon actually recalculates stuff

robust hill
#

noooo not the revlog

unique salmon
# tepid spoke I also legitimately still wonder how accurate any algorithm can be in reality. S...

FSRS-5 with a flatter curve on 1000 users
In case you don't know how to read this: ideally, predicted retention should match observed retention for some group of cards. So we put cards into sufficiently many small groups and measure the average retention within each group, as well as the average of FSRS predictions. They should match as closely as possible. If FSRS predicts an average probability of recall of 99% for these cards, you should recall 99% of those cards. If FSRS predicts an average p(recall) of 50% for these cards, you should recall 50% of them.
Orange line - perfect algorithm
Blue curve - FSRS-5 with an extra flat curve. FSRS-6 will be better thanks to an adaptive (not fixed) curve + one more parameter for same-day reviews

So as you can see, on a large dataset FSRS performs very well. Then again, it may perform poorly for some individual user or deck
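A minimal sketch of how a calibration comparison like the one described above can be computed. The equal-width binning and the simulated data are assumptions for illustration; the actual benchmark may group cards differently.

```python
import numpy as np

def calibration_bins(predicted, recalled, n_bins=10):
    """Group reviews by predicted recall probability and compare with outcomes."""
    predicted = np.asarray(predicted, dtype=float)
    recalled = np.asarray(recalled, dtype=float)
    # Equal-width bins over [0, 1] (an assumption, not the benchmark's scheme).
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # np.digitize is 1-based for in-range values; shift to 0-based and clamp 1.0.
    idx = np.clip(np.digitize(predicted, edges) - 1, 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            # (average prediction, observed retention, number of reviews)
            rows.append((predicted[mask].mean(), recalled[mask].mean(), int(mask.sum())))
    return rows

rng = np.random.default_rng(0)
p = rng.uniform(0.5, 1.0, size=20_000)  # simulated predictions
y = rng.random(p.size) < p              # simulated outcomes of a perfectly calibrated model
for mean_pred, observed, count in calibration_bins(p, y):
    print(f"predicted {mean_pred:.3f}  observed {observed:.3f}  n={count}")
```

For a well-calibrated model the first two columns stay close in every bin, which is what the orange "perfect algorithm" line represents.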

soft skiff
#

People often say that studying over 1,000 flashcards a day is really difficult, but is there a way to make it easier? If there is, that would be amazing—it would let us finish learning as fast as possible, which is definitely a good thing.

tepid spoke
#

1000 new ones or just in total?

soft skiff
tepid spoke
#

That still seems like insanity to me

#

At least for me, I'd be so fatigued after not even half of them, that I'd remember jack all

robust hill
#

considering that

#

when i do 1 new card it takes around

#

2.8 reviews

quasi shadow
robust hill
#

so

unique salmon
robust hill
#

1000 new ones a day is around 2800 reviews from just new alone

#

and then take into account the reviews which go up to 8x if i recall correctly from that one rule

#

you would be doing around 10000 ish reviews a day

#

dont know if anyone does that

tepid spoke
#

I'd guess it'd be 1000 new ones once?

robust hill
#

if you had the entire day to anki yea you could do that

tepid spoke
#

1000 new ones a day for extended periods of time would make you run out of hours in a day fast

unique salmon
robust hill
#

11.95s per review on my avg so if i had to learn 1000 new

#

2800 rev times 12s

#

around 9 hours

#

for me to do 1000 new
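The estimate above as a worked calculation, using only the figures quoted in the chat (2.8 reviews per new card, 11.95 s per review). It ignores the later reviews those cards generate.

```python
new_cards_per_day = 1000
reviews_per_new_card = 2.8   # figure quoted in the chat
seconds_per_review = 11.95   # figure quoted in the chat

reviews = new_cards_per_day * reviews_per_new_card
hours = reviews * seconds_per_review / 3600
print(f"{reviews:.0f} reviews, about {hours:.1f} hours")
```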

#

💀

#

close enough

unique salmon
#

I'd love to plot this for 1000 users for all versions of FSRS

#

That would be graph porn

#

It would give us a clear visualization of how much FSRS is wrong, on average

#

Would be nice if it was added to the benchmark code

#

Plotting the averaged calibration graph, I mean

quasi shadow
unique salmon
bold terrace
#

So when you have free time : Do more real-life exposure, don't do more anki

robust hill
#

they are not the same

bold terrace
#

Except if your workload is very very low

robust hill
#

because one rescheduling gives different things than the other

#

but i digress

robust hill
#

its the best way to measure my success

#

instantly

bold terrace
robust hill
#

im all for delayed gratification but if im studying for my exam

quasi shadow
unique salmon
#

I vaguely remember this being discussed

robust hill
#

cooked

bold terrace
#

This is a very, very good point I never thought about, but it illustrates well why a "short term memory model" in terms of "seconds, minutes, hours" might not really mean anything in the first place.

quasi shadow
unique salmon
#

And it's still not 100% consistent?

robust hill
#

i would cram so much new material

#

take a nap

#

and cram more new material

#

then sleep, and it worked pretty effectively
until one day the nap didnt nap and i was on my phone for 4 hours.

bold terrace
# robust hill take a nap

A nap like 10min or 1h30? Because if it messes with your normal sleep schedule, it might backfire a bit, no?

robust hill
#

it would be 30 minutes

quasi shadow
robust hill
#

but it would be no later than 3pm

unique salmon
#

crap, even

#

shit, if I may

robust hill
#

a 90 min nap can be good

#

if you are sleep deprived

quasi shadow
lapis hearth
lapis hearth
#

Why not, when you basically just said FSRS 6 would benefit more from this

unique salmon
#

Unclear how to do it algorithmically + users already can do it themselves based on the content of the cards

lapis hearth
#

Jarrett asked me to open this as an issue on github. I guess this will not come to fruition...

#

It has been a while

soft skiff
#

Hey guys, I came across this thing about tackling knowledge fragmentation called incremental reading. Not super clear on it, but I think it’s some kinda card-making process. Why does this even help with the issue?
Like, say i make cards from a PDF, then use pdf.js to display ‘em in a card interface—wouldn’t that give me the same vibes?

unique salmon
#

Oh, wait, I have the rights to close other people's issues?

unique salmon
#

Permissions are wack

#

I am not worthy of such power

lapis hearth
quasi shadow
#

😎 Leave it open. Maybe some random guys would solve it when I quit.

cursive badge
bold terrace
#

I already said it a few times so sorry if I'm like a broken record, but in Anki I see that the relation between reps and stability is negative: cards with higher reps have lower stability. It might just be a matter of inherent difficulty, but even when I look at cards I rep'ed a lot, it feels like each lapse doesn't necessarily grow that much between cycles

#

They look like this

cursive badge
#

That was kind of how I was trying to detect leeches in my January attempt. I was looking at the largest passed intervals in "pass chains" and seeing how they changed / how much total time you spent studying to reach your last peak.

bold terrace
#

Yep that's a nice idea

cursive badge
#

I still think that something like that needs to be mixed in with the Poisson Binomial stuff to get a good detector.

bold terrace
#

But I think to detect those cards, it might be as simple as asking: "Which cards have you already lapsed 3 times?" cat_sad

#

And I know I was the first one to say it was meaningless to just watch lapses with FSRS, which predicts probability

#

But I think I was terribly wrong lol

cursive badge
#

I was trying to find an "objective" measure of leechness after the fact so you could try training a NN to detect them early.

bold terrace
#

Lapse Distribution, Average Load by Lapse ... it feels as soon as I start to lapse, things don't get better

robust hill
#

its so over..

#

millions must not lapse

bold terrace
#

Yeah I think effort put in this is quite useful, don't get me wrong @cursive badge

#

Because clearly, I think some cards might go from "leechy state" to "healthy state" at some point

#

But I have very, very few examples

#

I search for things like "prop:s>N and prop:reps>M", and it's depressing

#

Over 1789 mature cards (prop:s>20), the 1st percentile (17 cards) with the most reps has 25 reps with 840 cards that have more than 25 reps

cursive badge
#

The perfectionist in me wants Anki to have a full version control system so we can do things like go back in time and see if tweaking card templates/field info affected card leechness.

bold terrace
#

We need a DeLorean

#

That's why I do those searches, I try to see what triggered that switch from a leechy state to a high stability one

#

I'm even wondering if the answer is in the revlog

#

The only 4 matches I have for prop:s>20 prop:reps>28 are:

  • 眠る : I think I failed it a lot until I just got so used to seeing 寝る that I stopped confusing the two, so it's in fact 寝る that stopped me from confusing it again and again
  • 何とか : I think I was never sure if it was なんとか or なにとか, which now sounds awfully off when I say it aloud.
  • 恋 : I confused it with 愛 since they had the same meaning, and later with 窓 because I felt it looked the same, but with time I just recognized those 2 more easily, which led me to recall this one more easily
  • 攻撃 : That one I think increased simply because I had 4-5 cards with 撃 in them, so I cross-reviewed it a lot without realizing (爆撃、砲撃、出撃、直撃....)
#

Sooo it seems what helped the most were OTHER related cards

quasi shadow
#

Because you have the Write right.

bold terrace
quasi shadow
bold terrace
#

Because I think it makes sense: when I look at cards with high stability, the only ones that had lapses are the ones that sure had a few lapses, but then had a long uninterrupted sequence of just good answers

cursive badge
bold terrace
#

ok

cursive badge
#

I seem to come back to leeches about once a month 😅

bold terrace
#

Damn I'm looking into it, out of 1077 cards with prop:s>60, I have 250 that have lapsed at least once, and indeed afterwards they never lapsed again

cursive badge
#

This month is cursed though, so no PoC this month.

bold terrace
#

(I mean, they have their number of lapses, but clustered in the beginning)

#

When they were good to go, they just never failed anymore
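A rough sketch of how "lapsed early, then never failed again" could be checked on a rating history. The function names and the `min_streak` threshold are hypothetical, and ratings use the 1:again, 2:hard, 3:good, 4:easy convention.

```python
def trailing_pass_streak(ratings):
    """Number of consecutive non-Again answers at the end of a rating history."""
    streak = 0
    for r in reversed(ratings):
        if r == 1:  # 1 = Again
            break
        streak += 1
    return streak

def recovered(ratings, min_streak=5):
    """True if the card lapsed at least once but has since passed min_streak times in a row."""
    return 1 in ratings and trailing_pass_streak(ratings) >= min_streak

print(recovered([1, 3, 1, 3, 3, 3, 3, 3]))  # lapses clustered at the start
print(recovered([3, 3, 1, 3, 3]))           # last lapse is still recent
```

This only looks at answer order, not at interval lengths, so it says nothing about how long the streak lasted in days.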

#

Lot of issues IRL ?

#

This weekend I want to finish the graphs I was doing (Workload by Lapse/Repetition 5-percentile), but I was curious to create my first addon from scratch. I might take a look

cursive badge
#

One family member has cancer and had to have invasive surgery. Another had to be rushed to hospital in an emergency and have surgery for other reasons. ☹️

bold terrace
cursive badge
#

In more positive news I did your prop:s>20 prop:reps>28 search and have 303 cards. So some leeches may recover.

bold terrace
#

Trying to brainstorm that idea

#

Don't know if there are things that could contradict the last 2 points

cursive badge
#

I started writing a whole article (about leeches) and then remembered I hate writing and abandoned it 😂

bold terrace
#

Yeah same laughcry

#

But I like to write a few chunks just to gather/reflect/discard

#

"Maybe add a factor to avoid unleeching too fast ? For ex if the new successful interval is 5d instead of 4d."

unique salmon
#

just use my detector man
(it doesn't currently exist in an easy to use form)

bold terrace
#

No it's not really good

unique salmon
#

It sure sounds better than only counting lapses

bold terrace
#

not really

unique salmon
#

yes really

bold terrace
#

It marked as leeches things that lapsed 2-3 times in a row in the first days but never failed for many days afterward

#

it's also focusing on rep probability, not really increasing interval

#

I used it, tweaked it, experimented with it, the things marked were not useful

#

Don't mean that as an offense but it's really not great unfortunately

#

To be fair I was also convinced it could lead to something great

#

I remember being the one to always push for that idea at first

#

so a bit sorry if now I say the opposite, but it's when you use that you realize if the idea was good or not

#

and it was not 😦

cursive badge
#

I feel like it gives you useful info, but it needs to be combined with other things to make it a "leech detector".
At the moment it's just a "something is wonky with the historical FSRS predictions detector" which I do not think is exactly the same thing.

bold terrace
#

And for example if your model overestimates your early retention, a lot of things just get leeched because you got them wrong 2-3 times in a row

ashen light
#

I think the benchmark of "is it better than the current system" is the best way to approach this, otherwise nothing will ever happen because nitpicking is fun for everyone

cursive badge
#

I'm just not convinced it has passed that yet. It can give some odd results.

ashen light
#

I'm not convinced the current system is useful in any capacity

cursive badge
#

It feels silly to spend time refactoring the reviewing code, insert the Poisson Binomial stuff and then not be sure it is doing much better than manually reviewing cards you have lapsed several times.

#

I'm not saying you shouldn't do it if you are convinced. I'm just saying I'm not.

ashen light
#

I'm not convinced that dae would be convinced

#

and so I have not worked on it

polar maple
#

@unique salmon why do we use the median of parameters as the default rather than something like training a set of parameters that minimizes average loss on the combined revlogs of 100 users?

polar maple
#

i think as we add more and more parameters, the median of parameters makes less sense

#

because the parameters can interact in strange ways
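A toy illustration of the two ways to pick defaults being discussed: the median of per-user fits versus one fit on the concatenated data. This is not FSRS. It assumes a one-parameter forgetting curve R(t) = exp(-t / s), made-up users, and a grid search instead of gradient descent.

```python
import numpy as np

def log_loss(s, t, y):
    """Mean binary cross-entropy of the toy curve R(t) = exp(-t / s)."""
    p = np.clip(np.exp(-t / s), 1e-6, 1 - 1e-6)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fit(t, y, grid):
    """Grid search for the s that minimizes log loss on one dataset."""
    return float(grid[np.argmin([log_loss(s, t, y) for s in grid])])

rng = np.random.default_rng(0)
grid = np.linspace(1.0, 60.0, 119)           # candidate values of s, step 0.5
users = []
for true_s in (3.0, 5.0, 8.0, 12.0, 40.0):   # hypothetical users, one outlier
    t = rng.uniform(1.0, 30.0, size=2000)    # days since last review
    y = (rng.random(t.size) < np.exp(-t / true_s)).astype(float)
    users.append((t, y))

# Option 1: fit every user separately, then take the median parameter.
median_s = float(np.median([fit(t, y, grid) for t, y in users]))

# Option 2: concatenate all review logs and fit once.
all_t = np.concatenate([t for t, _ in users])
all_y = np.concatenate([y for _, y in users])
joint_s = fit(all_t, all_y, grid)

loss_median = np.mean([log_loss(median_s, t, y) for t, y in users])
loss_joint = np.mean([log_loss(joint_s, t, y) for t, y in users])
print(f"median s={median_s:.1f}  avg loss={loss_median:.4f}")
print(f"joint  s={joint_s:.1f}  avg loss={loss_joint:.4f}")
```

With one parameter the median is at least a sensible point. With 20+ interacting parameters, taking each median independently can produce a combination that no single user actually has, which is the concern raised here.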

unique salmon
#

Ask Jarrett to do this for some subset of users (100 is too few IMO)

#

I ain't going down the rabbit hole of combining revlogs

polar maple
#

you can just concatenate them

#

but remove the code for recency or precompute the recency weights before concatenation

unique salmon
polar maple
#

you don't need to store it into a file

#

concatenate them in memory and train on them in the same program execution

unique salmon
#

Then it's an EVEN BIGGER pain

polar maple
unique salmon
#

😭

polar maple
#

thats how GRU was trained

#

you might be able to just plug in FSRS and it might work

unique salmon
polar maple
#

i mean like probably 95% of the code is ready but you might need to change a line here or there

#

median of parameters makes some sense for optimization, but we could keep the default parameters separate from the parameters that the optimization starts with

unique salmon
polar maple
#

it's just two different sets of default parameters, not too bad

#

i mentioned that LSTM's default parameters cannot be used without training, this is just a similar idea

#

the set of parameters that optimizes performance after training is not necessarily the same as the parameters that perform best without training

polar maple
#

huh? not sure why such a small change would be overengineering

unique salmon
#

I doubt that the gain in accuracy would be significant, if it exists at all. And it would require adding extra logic in Anki, so you'd have to convince Jarrett

#

Like, for using one set of parameters as default for users and the other as a starting point

#

I expect that the set of parameters that performs better as default is also better as a starting point, btw. Aka I expect that this doesn't work

polar maple
#

i think accuracy might increase, iirc there are users where simply increasing # of epochs would increase performance

#

and even if this is not the case, if we can decrease training time for the same performance then it would still be worth it

#

like -20% training time for the same perf would be great but that's a stretch

polar maple
#

you can optimize directly for initial parameters that do well with whatever LSTM uses

#

for now if we can get a set of parameters from joint optimization, we can check if it then does better or worse after finetuning

#

if it does much better or much worse then we would know that the optimization is sensitive to initial parameters, would be worth exploring more with an actual meta-learning algorithm

unique salmon
#

I'm trying to think when exactly parameters that are the best as a starting point are not the best as default
So imagine a high-dimensional parameter space. The closer the starting point is to the optimal point in the parameter space, the faster optimization will converge. So we want to pick the starting point that is as close to optimal as possible.
At the same time, if two points are close in the parameter space, their respective loss must also be close. Unless there are some insanely sharp valleys such that a small difference in parameters results in a large difference in loss.
So what I'm trying to say is that as long as there are no crazy sharp valleys, I don't see how a closer-in-the-parameter-space starting point would result in worse loss somehow

#

Aka I'm assuming that loss doesn't look like this

#

Ok, here's a better illustration because I suck at explaining it

#

As you get closer in the parameter space, the loss also gets more similar
Hence "Closer to the optimal point in the parameter space" = "Has a lower loss"
Aka "if it's a better starting point for optimization, it will also be better for users in terms of accuracy"

#

Unless the loss landscape looks like a jagged mess

polar maple
#

fsrs now has 20+ parameters its probably quite complex

#

@unique salmon https://arxiv.org/pdf/1803.02999
take a look at "4 Case Study: One-Dimensional Sine Wave Regression" for a toy problem where joint optimization is obviously much worse

unique salmon
#

why tf it takes 1.5 hours to concatenate users...

#

bro
This isn't training, THIS IS JUST CONCATENATION

#

That is tragic

#

it's 20 minutes now, yippie

#

Still seems two-three orders of magnitude slower than what I would expect

cursive badge
#

Wow, is that just for loading the parquet files into memory?

unique salmon
#

yep

#

Actually, wait, will I even have enough RAM for 500 users...

#

oops

#

Guess we'll see

cursive badge
#

I'm sad to say that discord has no good GIFs for "download more RAM" 😢

cosmic hedge
# unique salmon

#1282005522513530952 message do this then group by user id maybe idk

polar maple
cursive badge
polar maple
unique salmon
#

Alright, I'll do it on 10 users. I'm restarting anyway because >150 will make my RAM explode

#

I HATE YOUR JARRETT

#

NOT EVEN ONCE

#

HAVE I MANAGED

#

TO RUN

#

THE

#

FUCKING

#

BENCHMARK

#

CODE

#

ON

#

THE

#

FIRST

#

TRY

#

AAAAAAAAAAAAAAAA

cursive badge
#

Cursed 😂

unique salmon
#

This isn't even modified pretrain, just other.py

#

Like, this is literally unmodified

#

@quasi shadow just add FSRS-6 to pretrain so you can test the idea of optimizing FSRS parameters on a combined revlog to obtain better default parameters, I will have a mental breakdown trying to modify anything in the benchmark code

#

...or even run it at all

polar maple
unique salmon
#

Oh, yeah, I keep forgetting that I need to explicitly specify the full path, otherwise it doesn't work
if DEV_MODE: sys.path.insert(0, os.path.abspath("../fsrs-optimizer/src/fsrs_optimizer/"))
This is the code that works for Jarrett but not for me, so I have to modify it every time
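One possible reason the relative path behaves differently between machines is that `os.path.abspath("../...")` is resolved against the current working directory, not against the script's location. A sketch of a CWD-independent variant follows; the helper name `optimizer_src_dir` and the assumption that the two repositories sit side by side are illustrative, not taken from the benchmark code.

```python
import sys
from pathlib import Path

def optimizer_src_dir(anchor_file):
    """Resolve the optimizer source directory relative to a file, not the CWD."""
    here = Path(anchor_file).resolve().parent
    # Assumed layout: <root>/<benchmark dir>/<script> and <root>/fsrs-optimizer/...
    return here.parent / "fsrs-optimizer" / "src" / "fsrs_optimizer"

DEV_MODE = False  # set to True in a checkout that has the sibling repository
if DEV_MODE:
    sys.path.insert(0, str(optimizer_src_dir(__file__)))

# To see which copy of a module actually got imported, print its __file__
# right after the import, e.g. print(some_module.__file__).
```

Printing the imported module's `__file__` is a quick way to confirm whether the local checkout or an installed package is being used.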

#

bro why

#

I updated the optimizer AND specified the full path

#

I made 100% sure I'm using the latest optimizer code and the benchmark code from the FSRS-6 branch

polar maple
#

maybe add a print to the optimizer code to see if it shows up

unique salmon
#

Memo for Jarrett:
The idea is to optimize FSRS on a combined revlog of a lot of users (I wanted to do 500 btw), get parameters from that and see if they work better than the current median parameters

  1. Pretrain FSRS-6 on 200 or 500 or whatever users
  2. Get the parameters from that
  3. Run FSRS-6 without optimization ("dry run"), with new parameters as defaults
  4. See if log loss/RMSE are better with these parameters than with median parameters
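Step 4 of the memo comes down to scoring two sets of predictions against the same review outcomes. A minimal sketch, with a simple held-out split as suggested later in the thread. The function names and the sample numbers are made up, and the plain RMSE here may differ from the metric the benchmark reports.

```python
import numpy as np

def split_users(user_ids, n_pretrain):
    """First n_pretrain users are used for pretraining, the rest are held out."""
    ids = list(user_ids)
    return ids[:n_pretrain], ids[n_pretrain:]

def log_loss(pred, outcome):
    """Mean binary cross-entropy between predicted recall and actual recall."""
    p = np.clip(np.asarray(pred, dtype=float), 1e-6, 1 - 1e-6)
    y = np.asarray(outcome, dtype=float)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def rmse(pred, outcome):
    """Plain root mean squared error (not binned)."""
    p = np.asarray(pred, dtype=float)
    y = np.asarray(outcome, dtype=float)
    return float(np.sqrt(np.mean((p - y) ** 2)))

def compare(outcome, predictions_by_name):
    """Return {name: (log_loss, rmse)} for each candidate parameter set."""
    return {name: (log_loss(p, outcome), rmse(p, outcome))
            for name, p in predictions_by_name.items()}

# Made-up numbers purely to exercise the functions, not real results.
outcome = [1, 1, 0, 1, 0, 1]
results = compare(outcome, {
    "candidate_a": [0.9, 0.8, 0.4, 0.7, 0.5, 0.9],
    "candidate_b": [0.9, 0.85, 0.3, 0.8, 0.4, 0.9],
})
for name, (ll, err) in results.items():
    print(f"{name:12s} log loss={ll:.4f}  RMSE={err:.4f}")
```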
quasi shadow
#

I don’t know why you still have that error.

#

Please use git to manage your code.

quasi shadow
#

Btw, if you are using VSCode, I can help you debug the code.

quasi shadow
#

added

#

It's worse.

#

I used the parameters pretrained on the first 100 collections as the default parameters.

#

But the algorithm performs worse on the first 100 collections.

polar maple
#

and make sure to test on users that were not pretrained on

polar maple
quasi shadow
#

It has been there.

polar maple
#

but were those default parameters from the median of parameters?

quasi shadow
polar maple
#

the idea is to try to find a better set of default parameters from joint training only for users who never press Optimize

#

it is normal that the set of parameters that performs best for users who never press Optimize, is not necessarily the same as the set of parameters that does well after user-specific optimization

#

this is the whole point of why LSTM uses the Reptile algorithm rather than doing joint optimization like GRU
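For readers unfamiliar with Reptile, here is a toy sketch of its update rule on a one-dimensional problem: adapt to one sampled user with a few gradient steps, then move the shared initialization part of the way toward the adapted value. The quadratic per-user loss and all hyperparameters are assumptions. This is not how the LSTM in the benchmark is trained, and on a toy this simple Reptile and joint optimization land in roughly the same place.

```python
import numpy as np

def inner_sgd(theta, target, steps=5, lr=0.1):
    """A few gradient steps on the toy per-user loss (phi - target)^2."""
    phi = theta
    for _ in range(steps):
        phi = phi - lr * 2.0 * (phi - target)
    return phi

def reptile(targets, theta=0.0, meta_lr=0.5, rounds=200, seed=0):
    """Reptile meta-update: theta <- theta + meta_lr * (phi - theta)."""
    rng = np.random.default_rng(seed)
    for _ in range(rounds):
        target = rng.choice(targets)       # sample one "user"
        phi = inner_sgd(theta, target)     # adapt to that user
        theta = theta + meta_lr * (phi - theta)
    return float(theta)

theta = reptile(np.array([1.0, 2.0, 3.0]))
print(f"meta-learned initialization: {theta:.3f}")
```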

quasi shadow
#

OK, it performs better on users pretrained on.

#

I will evaluate it on users not pretrained on later.

#

Interesting

#

It truly works better.

#

I have excluded the first 100 users from the evaluation.

polar maple
#

nice!

quasi shadow
#

😂 So we will have two sets of parameters.

#

A set of parameters for initial parameters of optimization.

polar maple
#

yep

quasi shadow
#

A set of parameters for users who haven't optimized the parameters.

#

Should I pretrain the parameters with more collections?

polar maple
#

yeah as much as your RAM allows

#

plenty of users use the default parameters

polar maple
# quasi shadow

for the median of parameters method, why is the decay exactly 0.2?

quasi shadow
polar maple
#

icic

bold terrace
#

@polar maple, do you have any resources to recommend to learn a bit about all this?

polar maple
bold terrace
polar maple
#

so it seems that initial parameters might be worth looking into more, I'll try the reptile algorithm on FSRS later

bold terrace
polar maple
#

maybe decrease logloss by 0.001...

bold terrace
#

But it seems it's only the very first step towards building something useful

cosmic hedge
polar maple
#

but depending on how jarrett implemented the test for the joint optimization params, that code might have prevented parameters from properly reaching optimal user parameters

#

since the joint optimization params will have values that are further than the median

#

so maybe you're onto something

bold terrace
#

I also remember that in gradient descent you can have a variable "step size" to converge faster at first (and reduce it when you're getting closer to the optimum)

#

I don't know if it's applied here

polar maple
#

but with the fixed computational budget I think it doesn't reach optimal params for some users

unique salmon
#

Please post the results on all 10k users once it's done. And add it to the readme as FSRS-6 def. param, of course

#

And don't exclude those 100 (or however many, if you plan to pretrain FSRS on more) from evaluation, since for the table in readme we want all algorithms to be evaluated on exactly the same data

#

"As much as your RAM allows", as Alex said

quasi shadow
quasi shadow
#

PID PGRP USER PRI NI VIRT RES S CPU% MEM%▽NLWP TIME+ Command (merged)
45793 45793 jarrettye 17 0 448G 54.9G ? 98.3 55.9 21 1h19:08 python3.9│python pretrain.py --algo FSRS-6

#

It takes ~55GB RAM😅

unique salmon
quasi shadow
#

[0.1251, 0.7461, 1.8218, 9.947, 7.3195, 0.8448, 3.5258, 0.001, 1.9802, 0.156, 0.8337, 0.5343, 0.001, 0.4795, 0.8497, 0.7124, 1.0011, 0.7391, 0.3847, 0.1437, 0.1373]

unique salmon
quasi shadow
#

w[16]?

unique salmon
#

Easy bonus

#

It's 1.0011

quasi shadow
#

yeah

#

it's weird

#

[0.2172, 1.1771, 3.2602, 16.1507, 7.0114, 0.57, 2.0966, 0.0069, 1.5261, 0.112, 1.0178, 1.849, 0.1133, 0.3127, 2.2934, 0.2191, 3.0004, 0.7536, 0.3332, 0.1437, 0.2]

#

If we compare it with the median

#

w[11] and w[14] are significantly lower than the median
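A quick side-by-side of the two parameter lists posted above. It assumes the first list (with w[16] = 1.0011) is the pretrained set and the second is the median set, which is how the surrounding messages read. The 0.5 cutoff for flagging is arbitrary.

```python
pretrained = [0.1251, 0.7461, 1.8218, 9.947, 7.3195, 0.8448, 3.5258, 0.001,
              1.9802, 0.156, 0.8337, 0.5343, 0.001, 0.4795, 0.8497, 0.7124,
              1.0011, 0.7391, 0.3847, 0.1437, 0.1373]
median = [0.2172, 1.1771, 3.2602, 16.1507, 7.0114, 0.57, 2.0966, 0.0069,
          1.5261, 0.112, 1.0178, 1.849, 0.1133, 0.3127, 2.2934, 0.2191,
          3.0004, 0.7536, 0.3332, 0.1437, 0.2]

for i, (p, m) in enumerate(zip(pretrained, median)):
    ratio = p / m
    flag = "  <-- less than half the median" if ratio < 0.5 else ""
    print(f"w[{i:2d}]  pretrained={p:8.4f}  median={m:8.4f}  ratio={ratio:5.2f}{flag}")

# Indices where the pretrained value is less than half the median value.
much_lower = [i for i, (p, m) in enumerate(zip(pretrained, median)) if p / m < 0.5]
print(much_lower)
```

Besides w[11] and w[14], the flagged indices include w[16], and w[19] is identical in both lists.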

#

fine, I will do more research in the next week.

unique salmon
quasi shadow
#

median parameters:

1:again, 2:hard, 3:good, 4:easy

first rating: 1
rating history: (1,3,3),3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,0.0d,1.0d,2.0d,6.0d,16.0d,1.3m,3.1m,7.1m,1.2y
factor history: 0.0,0.0,0.0,0.0,2.00,3.00,2.67,2.50,2.35,2.26,2.14
difficulty history: 0,7.0,7.0,6.9,6.9,6.9,6.9,6.8,6.8,6.8,6.7
stability history: 0,0.2,0.3,0.5,2.3,6.1,16.1,40.0,94.4,211.6,454.2

first rating: 2
rating history: (2,3,3),3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,0.0d,2.0d,6.0d,18.0d,1.6m,4.2m,10.0m,1.9y,4.1y
factor history: 0.0,0.0,0.0,0.0,3.00,3.00,2.72,2.55,2.41,2.29,2.18
difficulty history: 0,6.2,6.2,6.2,6.2,6.1,6.1,6.1,6.1,6.0,6.0
stability history: 0,1.2,1.5,1.8,6.1,17.8,49.0,125.3,301.5,687.9,1496.9

first rating: 3
rating history: (3,3),3,3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,4.0d,14.0d,1.5m,4.5m,1.0y,2.6y,6.3y,14.5y,31.6y
factor history: 0.0,0.0,0.0,3.50,3.21,3.00,2.76,2.57,2.42,2.29,2.18
difficulty history: 0,4.9,4.9,4.9,4.8,4.8,4.8,4.8,4.8,4.8,4.7
stability history: 0,3.3,3.5,13.7,45.2,134.6,371.9,957.4,2316.1,5301.7,11546.9

first rating: 4
rating history: (4),3,3,3,3,3,3,3,3,3,3
interval history: 0.0d,16.0d,2.2m,7.9m,2.1y,6.4y,17.6y,45.2y,100.0y,100.0y,100.0y
factor history: 0.0,0.0,4.06,3.65,3.27,2.99,2.76,2.57,2.21,1.00,1.00
difficulty history: 0,2.5,2.5,2.5,2.5,2.5,2.5,2.5,2.5,2.5,2.5
stability history: 0,16.2,65.4,236.5,775.7,2321.7,6413.6,16500.8,36500.0,36500.0,36500.0
#

pretrained parameters:

1:again, 2:hard, 3:good, 4:easy

first rating: 1
rating history: (1,3,3),3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,0.0d,1.0d,2.0d,6.0d,17.0d,1.4m,3.3m,7.2m,1.2y
factor history: 0.0,0.0,0.0,0.0,2.00,3.00,2.83,2.53,2.33,2.15,2.03
difficulty history: 0,7.3,7.3,7.3,7.3,7.3,7.3,7.3,7.2,7.2,7.2
stability history: 0,0.1,0.2,0.4,2.2,6.5,17.2,43.0,99.5,214.9,436.0

first rating: 2
rating history: (2,3,3),3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,0.0d,1.0d,5.0d,17.0d,1.7m,4.7m,11.6m,2.2y,4.7y
factor history: 0.0,0.0,0.0,0.0,5.00,3.40,3.06,2.71,2.48,2.28,2.13
difficulty history: 0,6.0,6.0,6.0,6.0,6.0,5.9,5.9,5.9,5.9,5.9
stability history: 0,0.7,1.0,1.4,4.7,16.9,51.6,140.9,348.9,796.9,1698.0

first rating: 3
rating history: (3,3),3,3,3,3,3,3,3,3,3
interval history: 0.0d,0.0d,2.0d,12.0d,1.8m,6.6m,1.8y,5.1y,13.1y,31.0y,68.2y
factor history: 0.0,0.0,0.0,6.00,4.42,3.75,3.24,2.87,2.59,2.37,2.20
difficulty history: 0,2.9,2.9,2.9,2.9,2.9,2.9,2.9,2.9,2.8,2.8
stability history: 0,1.8,2.2,11.5,52.9,198.7,644.6,1849.0,4781.2,11324.5,24883.8

first rating: 4
rating history: (4),3,3,3,3,3,3,3,3,3,3
interval history: 0.0d,10.0d,1.8m,7.9m,2.4y,7.6y,21.5y,55.0y,100.0y,100.0y,100.0y
factor history: 0.0,0.0,5.40,4.37,3.69,3.19,2.83,2.55,1.82,1.00,1.00
difficulty history: 0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
stability history: 0,9.9,53.9,236.4,870.4,2776.7,7853.7,20062.7,36500.0,36500.0,36500.0
unique salmon
unique salmon
#

I feel like pretrain is bugged. Easy bonus being almost exactly 1.0 is too sus. And the fact that w[19] is the same

unique salmon
old sedge
#

why can't fsrs find my reviews?

#

all of my decks have a preset (appropriately named FSRS) set to them which has fsrs enabled, and as you can see here one of my add-ons lets me view which preset is applied to where easily

#

sorry i'm very new to fsrs (and anki)

#

am i doing something wrong guys

unique salmon
old sedge
#

ya. quite a few in fact

#

i'm using pre-made decks if that makes sense

unique salmon
#

welp
idk
DM me and send me your collection, if you want

unique salmon
old sedge
#

it's probably this

old sedge
#

hang a sec

#

DM sent.

#

note that i didn't include media in the collection i exported to you

unique salmon
# old sedge DM sent.

All of your reviews were today. You don't have cards that have been reviewed over the course of >1 day

old sedge
#

yea i just started anki

unique salmon
#

This is an edge case I've never considered before 😅
In 1.5 years of FSRS being a thing, you are the first one to run into such a problem

#

Basically, just do more reviews

old sedge
#

i literally just told someone i downloaded anki and i got told "great, now setup fsrs"

#

so i did it XDDDDD

#

how many more reviews mate

unique salmon
#

You'll need 1-2 dozen probably. I mean, for optimization to do anything

#

And I don't mean "review more new cards today"

#

As I said, you need to review cards multiple times (at least 2) over the course of multiple days
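The "at least 2 reviews on different days" condition can be sketched as a check over a review log. The `(card_id, unix_timestamp)` format is hypothetical and days are split at UTC midnight for simplicity, which is not how Anki defines its day boundary.

```python
from collections import defaultdict
from datetime import datetime, timezone

def cards_with_multi_day_history(revlog):
    """Return card ids that were reviewed on at least two distinct days.

    revlog: iterable of (card_id, unix_timestamp_seconds) pairs (hypothetical format).
    """
    days = defaultdict(set)
    for card_id, ts in revlog:
        days[card_id].add(datetime.fromtimestamp(ts, tz=timezone.utc).date())
    return {card_id for card_id, seen in days.items() if len(seen) >= 2}

sample = [
    (1, 0), (1, 3600),        # card 1: two reviews on the same day
    (2, 0), (2, 2 * 86400),   # card 2: reviews two days apart
]
print(cards_with_multi_day_history(sample))
```

A collection where every card looks like card 1 gives the optimizer nothing to measure forgetting with, which matches the situation described here.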

old sedge
#

i'll give it two weeks

unique salmon
#

TLDR: many reviews, many days

old sedge
#

hopefully two weeks of consistent anki reviews should give me enough stuff for fsrs to work with

unique salmon
#

yep

old sedge
#

unfortunately that means i'll have to suffer in the beginning with the archaic scheduling system

unique salmon
#

Nah, just use FSRS with default parameters

#

That's what they are for

old sedge
#

the default parameters are there if i don't input anything in that box, right

unique salmon
#

yes

old sedge
#

right

#

fsrs should be enabled

cursive badge
old sedge
#

what globe icon? 🤔

cursive badge
#

It's also on a few other settings like "Limits start from top"

#

"Desired retention" and "FSRS parameters" are per-preset (they don't have the globe icon) but enabling FSRS is global, you cannot have some presets use FSRS and others use SM2.

old sedge
#

oh understood

severe storm
#

Does FSRS weigh all reviews the same?

unique salmon
#

Or 24.11

severe storm
#

Where can I put suggestions for FSRS

unique salmon
#

I don't remember if recency weighting was added in 24.11 or in 25.02

cursive badge
#

I think it was 25.02

unique salmon
severe storm
#

Forums seem scary

#

I don't know anything about coding etc, so tell me if it's weird/dumb, but this is my idea.
FSRS could maybe be more optimal if it ignores reviews from cards that are the easiest and most difficult.

severe storm
unique salmon
#

Ignoring easy/hard cards isn't a good idea. It would be better to improve the calculation of difficulty, but we don't know how. We've tried a bunch of things and never managed to significantly improve the difficulty formula, beyond very minor tweaks

robust hill
#

if you are sleep deprived/tired

#

is it better to do the reviews and risk lower retention bc of sleep deprivation

#

or to leave them until next day

unique salmon
#

I just chose the latter 🤣

#

Literally rn

#

going to sleep

hasty fractal
#

most of us don't code anyway so dw

robust hill
lapis hearth
#

Where did dae disappear to

quasi shadow
#

Maybe travel.

#

For Easter

unique salmon
lapis hearth
#

I really hope he makes a build soon

#

Guess it was on his schedule that he releases the security update before easter

unique salmon
#

Alex's idea of training FSRS on a combined history from 500 users is good, but then we get really weird parameters

#

I feel like pretrain is bugged. Easy bonus being almost exactly 1.0 is too sus. And the fact that w[19] is the same

lapis hearth
#

The last params also seem quite close as well

#

Isnt this the same problem from before

#

Did this not get fixed

unique salmon
lapis hearth
#

The last 2 params for me always end up the same as default or just turn to 0 whenever i optimize

unique salmon
#

Ah, no, that's a different matter

#

The last two parameters (they won't be the last anymore, but you get what I'm talking about) will not be set to 0 anymore

lapis hearth
#

okay what about them being almost always the same as default then

#

0.5166, 0.6621 I almost know these numbers by heart now

unique salmon
#

Idk about that. But basically, they will have dynamic max. values in a way that prevents them from becoming too large and causing the "the interval after Again and a few learning steps is longer than before Again" issue

unique salmon
quasi shadow
#

Maybe it’s due to the batch size.

#

I use a very large batch size in pretraining.

bold terrace
unique salmon
#

Idk why the hell batch size of all things would cause easy bonus to be 1, but might as well try because why not

glass dune
# lapis hearth The idea of finding the best way to group cards to base a preset on has died dow...

For Automatic Preset Assigning, has anyone tried grouping presets based on similar retention in the past month or so? I have a preset with a desired retention of 85%, with a true retention of 85.3% in the past week for all decks in the preset combined, but one deck has a retention of 94.4% in the past week (102 passes, 6 fails) and another deck has 78.9% in the past week (165 passes, 44 fails).

#

I know some Med students have a shit ton of different decks to review, so perhaps being able to select which parent deck to assign the presets under for all of its subdecks based on similar retention in the past month could be a convenient feature to make sure each individual deck hits the desired retention rate more accurately
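A rough sketch of what grouping by recent retention could look like, using the pass/fail counts quoted above. The deck names, the 5-point band width, and the function names are assumptions for illustration only; this is not an existing Anki or FSRS Helper feature.

```python
import math

def retention(passes: int, fails: int) -> float:
    """True retention over a window: passes / all graded reviews."""
    return passes / (passes + fails)

def group_by_retention(decks, band=0.05):
    """Bucket decks whose recent retention falls in the same band.

    decks: {deck_name: (passes, fails)}; band: bucket width (hypothetical default).
    """
    groups = {}
    for name, (passes, fails) in decks.items():
        bucket = math.floor(retention(passes, fails) / band)
        groups.setdefault(bucket, []).append(name)
    return groups

# Hypothetical deck names; the counts are the ones from the message above.
decks = {"deck_a": (102, 6), "deck_b": (165, 44)}
groups = group_by_retention(decks)
for bucket, names in sorted(groups.items()):
    print(f"{bucket * 5}%-{bucket * 5 + 5}%: {names}")
```

With these counts the two decks land in different bands, so under this scheme they would be candidates for separate presets. A real version would probably also want a minimum review count per deck before trusting the number.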

lapis hearth
#

Grouping cards similar in difficulty into one group

#

other difficulties in other groups

#

Imagine a deck with 90 easy(-ier) cards and 10 hard(er) cards. The preset will be skewed by the easy cards and the intervals scheduled will never be close to appropriate enough for the hard cards

robust hill
#

i havent done it by retention but ive done it by difficulty

lapis hearth
#

So they become lapse prone

#

That was the main premise of my idea

robust hill
#

lets see how my retention has changed actually

#

it seems my mature retention has shot straight down but my young has gone up slightly

#

if we zoom in tho its like

#

but it has dropped bc the amount of mature cards used to be on avg like 20 a day

#

now its like 3 a day

wet nexus
#

I don't think heterogeneity pulls in all the other cards to "compensate." I'd say that FSRS groups together the different card cases that share similar behavior to a certain extent and treats them differently from the rest, as long as there's enough data on these cards.

If, on the other hand, the data on heterogeneous cards is scarce, then this can be problematic, and they will tend to be leeches. Therefore, perhaps if these cases are scarce in a preset, it would be better to move them to another preset. Otherwise, I don't think it's necessary. But of course, I'm no expert.

bold terrace
robust hill
#

what is heterogeneity in this case

bold terrace
#
  • Slightly Less Workload (10%)
  • The low D cards have a Difficulty curve that is more useful than a lapse-proxy
  • The low D have good precision even with low default params
  • The High D have very very low interval so I see them way more
#

It's also quite easy to operate :
Low D : Lapse at 6, tag only

#

High D : Lapse at 12, suspend auto

#

once per week I move the tagged one from Low D to High D

#

once per week, I reset the High D, and I set a "new card/day" count to 1, and for that card, I really focus on improving my understanding of it (outside Anki)

#

Retention has still been quite predictable by FSRS, still at 90%, and today I even hit 96% for the first time

#

The 96% being more about how I treat cards i fail now

#

Spent the whole day coding some algo to analyze the lapses yesterday

#
Card : 1710712242101 Past Lapses : [0, 2, 20, 25] Current Max Interval : 42
Card : 1708207159229 Past Lapses : [0, 1, 0, 3] Current Max Interval : 91
Card : 1723541612090 Past Lapses : [4, 8] Current Max Interval : 28
Card : 1708259347988 Past Lapses : [0, 0, 1, 0, 0, 5, 6, 0, 27, 30, 21] Current Max Interval : 12
Card : 1708787872864 Past Lapses : [1, 28, 9, 3, 27, 15, 0, 1, 14] Current Max Interval : 8
Card : 1716748875647 Past Lapses : [2, 9, 7, 9, 10, 0, 2, 3, 0, 2, 18] Current Max Interval : 0
Card : 1715717287839 Past Lapses : [0, 0, 5, 9, 4, 8, 8, 32, 9, 4] Current Max Interval : 8
Card : 1711230892107 Past Lapses : [42, 38, 6, 2, 5] Current Max Interval : 1
Card : 1708440946044 Past Lapses : [0, 0, 0, 0, 4, 15, 11, 0, 14, 6, 5, 19, 15] Current Max Interval : 9
Card : 1727897994100 Past Lapses : [0, 0, 0, 1, 0, 1, 0, 1, 6, 12, 0, 9, 7, 5] Current Max Interval : 5

For each lapse cycle, I isolate the longest successful interval
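The original script isn't shown, so here is only a minimal sketch of one way to get such a list. It assumes the review history is a chronological list of (elapsed_days, rating) pairs and that rating 1 means Again; the function name is made up.

```python
from typing import List, Tuple

AGAIN = 1  # Anki's rating value for a failed review

def max_interval_per_cycle(reviews: List[Tuple[int, int]]) -> Tuple[List[int], int]:
    """Split a card's history into lapse cycles.

    reviews: (elapsed_days, rating) in chronological order.
    Returns the longest successful interval of every finished cycle
    (a cycle ends with an Again) plus the best of the ongoing cycle.
    """
    finished = []
    best = 0
    for elapsed_days, rating in reviews:
        if rating == AGAIN:
            finished.append(best)  # the cycle ends on a lapse
            best = 0
        else:
            best = max(best, elapsed_days)
    return finished, best

# Made-up history: two lapses, then an ongoing cycle.
history = [(0, 3), (1, 3), (3, 3), (8, 1), (0, 3), (2, 3), (5, 1), (0, 3), (4, 3), (11, 3)]
past, current = max_interval_per_cycle(history)
print(f"Past Lapses : {past} Current Max Interval : {current}")
```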

#

It clearly shows that the ones I lapse the most don't even follow a somewhat-increasing pattern

#

They might succeed at a 28d interval, then fail after a 4d interval, etc etc

#

So while we focus a lot on time stability, I think there is also some stuff to think about in terms of knowledge stability. (The low K Stability would be cards with some kind of "coin flip" errors, while T Stability would be more like "normal memory degradation")

unique salmon
#

We could model it so that a card has some maximum p(recall)<1, but then scheduling at high retentions would be impossible

#

How do you schedule a card with DR=99% if the card's maximum R is 95%?
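To make the problem concrete, here is a sketch that puts a ceiling on the FSRS-4.5/FSRS-5 forgetting curve, R(t) = (1 + 19/81 · t/S)^(-0.5). The `ceiling` parameter is purely hypothetical and not part of FSRS; FSRS-6 also makes the decay trainable, which this ignores.

```python
from typing import Optional

FACTOR = 19 / 81  # FSRS-4.5 / FSRS-5 constants
DECAY = -0.5

def retrievability(t: float, s: float, ceiling: float = 1.0) -> float:
    """Forgetting curve scaled by a hypothetical maximum recall probability."""
    return ceiling * (1 + FACTOR * t / s) ** DECAY

def interval_for(dr: float, s: float, ceiling: float = 1.0) -> Optional[float]:
    """Days until retrievability drops to dr, or None if dr is unreachable."""
    if dr > ceiling:
        return None  # even at t=0 the card is below the desired retention
    return s / FACTOR * ((dr / ceiling) ** (1 / DECAY) - 1)

print(interval_for(0.90, s=10))                # no ceiling: interval equals S
print(interval_for(0.90, s=10, ceiling=0.95))  # ceiling shortens the interval
print(interval_for(0.99, s=10, ceiling=0.95))  # None: DR=99% can't be scheduled
```

With a ceiling of 95%, any desired retention above 95% has no valid interval at all, which is exactly the scheduling problem described above.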

bold terrace
#

Sure but you see, if some knowledge has a terrible K-Stability (Sorry for the made-up name, but will be easier like that), meaning : If reviews are coin-flips more than truly recalling information in a logical way, then no matter the DR, you might fail it everyday, and it might lead your T-Stability (the one FSRS tries to predict) to drop very low, when in fact, you just have a big bunch of very low K-Stability card

#

Would explain why 40-50% of my cards almost never lapse, and when they start to lapse, they can just go to >10 lapse

#

By splitting deck by Low-D/High-D, in fact that's us trying to separate those

#

And I think it's easy to fall into the Low K-Stability trap in Anki, because you can fail a card because you lost your daily coin-flip, but feel confident afterward because you recalled it very easily 5min later... When in fact, you're just compensating for your terrible K-Stability with a very very short-term review.

#

Sorry if it's a bit vague but it's still a bit fresh in my head too 😆

#

What I'm trying to do with my code above is check whether a card behaves the way we'd expect from T-Stability ([0, 2, 20, 25], [0, 0, 5, 9, 4, 8, 8, 32]), or whether it's something super super messy that makes no sense in terms of "recall of stable information" ([1, 28, 9, 3, 27, 15, 0, 1, 14])

#

But it needs to be analyzed outside the realm of FSRS, because by definition, here we're trying to assess things not in terms of probability, but in terms of actual performance

#

Also the leech detector with the Poisson stuff is a bit different: if FSRS predicts 90% recall each time and the user fails at a rate consistent with that 90%, but T-Stability never increases in the long term, the card won't be detected as a leech, even though, if a leech is a "card that doesn't seem to have increasing T-Stability over time", it would be one

#

So it's a bit of a different goal I think

#

But it's still interesting because it shows that T-Stability might not be the only form of Stability

unique salmon
#

...

#

gimme something i can code, man

#

Or something that Jarrett can code

bold terrace
#

Well, what could be interesting is, based on those sequences, to have a function that would detect instability 😄

#

Linear Regression that has slope <0 ?

#

Threshold based on a cost function comparing actual perf to that regression ?
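A quick sketch of the slope idea: an ordinary least-squares fit of "longest successful interval" against lapse-cycle index, flagging cards whose trend is negative. The `is_unstable` name and the zero threshold are assumptions; the sequences are the ones pasted earlier in the chat.

```python
from typing import Sequence

def slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two lapse cycles")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx

def is_unstable(values: Sequence[float], threshold: float = 0.0) -> bool:
    """Flag a card when its per-cycle best interval trends downward."""
    return slope(values) < threshold

for seq in ([0, 2, 20, 25], [42, 38, 6, 2, 5], [1, 28, 9, 3, 27, 15, 0, 1, 14]):
    print(seq, round(slope(seq), 2), is_unstable(seq))
```

A slope alone won't catch a card that is flat at a tiny interval, so a real detector would likely need a second criterion on top of this.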

#

I'll continue next week on it 🙂 Saturday's new ritual 😆

#

My hope is that if we can cluster cards based on previous performance on different "K-Stability" cluster, then FSRS could run better on each of those

#

That K-Stability would be some kind of Difficulty rating in some way

cosmic hedge
#

@unique salmon using the simulate config for CMRR works a small bit I think 🎉

#

haven't tried changing the formula at all but hey maybe I don't need to

#

still seems to be a much smaller value than before though

unique salmon
#

I do recommend doing the integral thing that I described

#

this

#

Remove the part for handling decay=-1, it won't be needed

#

Use an integral over the next 3 or 5 years

unique salmon
cosmic hedge
#

i'd have to change fsrs-rs for that

unique salmon
#

man

#

well, alright

cosmic hedge
#

i guess i could try and dash to do it before fsrs-rs 3.0.0 releases XD

unique salmon
#

In the final implementation it definitely should

#

What about loss aversion? The dumb 2.5 multiplier that should not be used when plotting time?

#

I was thinking of not using it when plotting time, but still using it for CMRR

#

Since for plotting time we want accurate real time

cosmic hedge
#

i might be wrong but i'm pretty sure loss aversion is gone already

unique salmon
#

I don't think so. IIRC it's in fsrs-rs and not in Anki itself, so it should be gone only when the PR with FSRS-6 is merged

cosmic hedge
#

I merged the fsrs-6 pr into the CMRR pr while i was testing it so if its gone with FSRS-6 then it will be there for the FSRS-5 ones and gone for the FSRS-6 ones

unique salmon
#

Ah, ok

cosmic hedge
#

i'd guess its because i'm using a deck thats "done"

#

wait i can just add more cards hold on XD

#

nope that doesnt fix it either

unique salmon
#

welp

#

Maybe Jarrett was right

cosmic hedge
wet plume
#

task

robust hill
#

these parameters

#

are very interesting

#

for my deck with low difficulty

#

fair enough

bold terrace
#
Total Cards reviewed : 3080
Cards That lapsed at least once: 1905 (61.85%)
Cards That had 0 dropping lapse: 1159 (37.63%)
Cards That had 1 dropping lapse: 457 (14.84%)
Cards That had >1 dropping lapse : 289 (9.38%)

(Lapse Distribution : {10: 1, 9: 5, 8: 26, 7: 55, 6: 95, 5: 152, 4: 214, 3: 269, 2: 365, 1: 579, 0: 144})
(Dropping Lapse Distribution : {2: 212, 3: 65, 4: 11, 5: 1, 1: 457, 0: 1159})

Damn damn damn damn

#

Some examples of those "recurring" droppers

Card : 1732830933345 Past Lapses : (8) [5, 4, 2, 7, 4, 3, 5, 3] (now:4)  Drops:5 BiggestDrop:3 Mean:4.12 Median:4.0
Card : 1708383941132 Past Lapses : (8) [1, 22, 7, 13, 6, 28, 11, 7] (now:12)  Drops:4 BiggestDrop:17 Mean:11.88 Median:9.0
Card : 1708787872864 Past Lapses : (8) [1, 28, 9, 3, 27, 15, 1, 14] (now:8)  Drops:4 BiggestDrop:19 Mean:12.25 Median:11.5
Card : 1709156003043 Past Lapses : (7) [3, 15, 10, 8, 3, 52, 9] (now:8)  Drops:4 BiggestDrop:43 Mean:14.29 Median:9
Card : 1711663821086 Past Lapses : (8) [5, 11, 3, 17, 19, 11, 9, 3] (now:7)  Drops:4 BiggestDrop:8 Mean:9.75 Median:10.0
Card : 1711664036084 Past Lapses : (8) [1, 23, 6, 13, 6, 5, 4, 13] (now:11)  Drops:4 BiggestDrop:17 Mean:8.88 Median:6.0
Card : 1711750511210 Past Lapses : (7) [1, 22, 10, 21, 14, 5, 4] (now:8)  Drops:4 BiggestDrop:12 Mean:11.00 Median:10
Card : 1717096940018 Past Lapses : (8) [1, 8, 4, 15, 9, 6, 5, 6] (now:5)  Drops:4 BiggestDrop:6 Mean:6.75 Median:6.0
Card : 1727125905731 Past Lapses : (7) [2, 4, 2, 21, 8, 4, 3] (now:6)  Drops:4 BiggestDrop:13 Mean:6.29 Median:4
Card : 1727810008064 Past Lapses : (5) [15, 9, 8, 5, 2] (now:7)  Drops:4 BiggestDrop:6 Mean:7.80 Median:8
Card : 1729017180622 Past Lapses : (9) [1, 1, 4, 3, 4, 2, 12, 9, 4] (now:6)  Drops:4 BiggestDrop:5 Mean:4.44 Median:4
Card : 1730934966226 Past Lapses : (7) [8, 5, 5, 2, 8, 5, 2] (now:6)  Drops:4 BiggestDrop:3 Mean:5.00 Median:5
Card : 1708463669829 Past Lapses : (8) [4, 8, 3, 4, 8, 5, 29, 17] (now:20)  Drops:3 BiggestDrop:12 Mean:9.75 Median:6.5

It's interesting to see that some could already be detected at a Lapse Count of 5
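The Drops, BiggestDrop, Mean and Median columns in the listing above are consistent with counting every cycle whose best interval is strictly lower than the previous cycle's. That reading is inferred from the pasted numbers, not from the actual script, so take this as a sketch:

```python
from statistics import mean, median
from typing import Dict, Sequence

def drop_stats(intervals: Sequence[int]) -> Dict[str, float]:
    """Summarize how often the per-cycle best interval went down."""
    drops = [prev - cur for prev, cur in zip(intervals, intervals[1:]) if cur < prev]
    return {
        "drops": len(drops),
        "biggest_drop": max(drops, default=0),
        "mean": mean(intervals),
        "median": median(intervals),
    }

# First card from the listing above.
stats = drop_stats([5, 4, 2, 7, 4, 3, 5, 3])
print(f"Drops:{stats['drops']} BiggestDrop:{stats['biggest_drop']} "
      f"Mean:{stats['mean']:.2f} Median:{stats['median']}")
```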

#

If I sort by lapse descending, I have this

Card : 1732919779145 Past Lapses : (10) [1, 2, 4, 6, 6, 3, 1, 1, 1, 7] (now:6)  Drops:2 BiggestDrop:3 Mean:3.20 Median:2.5
Card : 1716748875647 Past Lapses : (9) [2, 9, 7, 9, 10, 2, 3, 2, 18] (now:0)  Drops:3 BiggestDrop:8 Mean:6.89 Median:7
Card : 1729017180622 Past Lapses : (9) [1, 1, 4, 3, 4, 2, 12, 9, 4] (now:6)  Drops:4 BiggestDrop:5 Mean:4.44 Median:4
Card : 1730241729680 Past Lapses : (9) [1, 1, 1, 2, 10, 4, 4, 6, 5] (now:8)  Drops:2 BiggestDrop:6 Mean:3.78 Median:4
Card : 1731449637757 Past Lapses : (9) [2, 5, 4, 5, 9, 2, 4, 3, 6] (now:6)  Drops:3 BiggestDrop:7 Mean:4.44 Median:4
Card : 1732573034249 Past Lapses : (9) [1, 5, 2, 2, 1, 3, 6, 4, 8] (now:6)  Drops:3 BiggestDrop:3 Mean:3.56 Median:3

I have the feeling that I should also sort by "Increase of Interval" even when there aren't many drops

#

Lapsing 10 times with only 2 drops, but never going above 6d interval is pretty bad

#

Or maybe instead of talking about "drops", talking about "non increasing performance through lapse" (so the condition would be strictly increasing perf)

Card : 1719523268011 Past Lapses : (5) [2, 2, 4, 4, 19] (now:7)  Drops:0 BiggestDrop:0 Mean:6.20 Median:4
Card : 1722962900727 Past Lapses : (5) [1, 1, 2, 7, 18] (now:14)  Drops:0 BiggestDrop:0 Mean:5.80 Median:2
#

Or maybe something like "Card with Lapse Cycle #i, have in average, a max interval of #ivl"

#

Hmmm no, nevermind, a card could have a rough start but recover later.

bold terrace
#
Cards That had [2/3, 1.0] failed outperformance ratio lapse: 1174 (61.63%)
Cards That had [1/3, 2/3] failed outperformance ratio lapse: 478 (25.09%)
Cards That had [0.0, 1/3] failed outperformance ratio lapse : 253 (13.28%)

Card : 1708778862604 Past Lapses : (2) [55, 9] (now:11)  Drops:1 BiggestDrop:46 Mean:32.00 Median:32.0 FailedOutperforamnce:1 FailedOutperforamnceRatio:100.00
Card : 1708786226533 Past Lapses : (2) [166, 11] (now:4)  Drops:1 BiggestDrop:155 Mean:88.50 Median:88.5 FailedOutperforamnce:1 FailedOutperforamnceRatio:100.00
Card : 1709075502403 Past Lapses : (2) [52, 20] (now:29)  Drops:1 BiggestDrop:32 Mean:36.00 Median:36.0 FailedOutperforamnce:1 FailedOutperforamnceRatio:100.00
[...]
Card : 1719510273220 Past Lapses : (4) [1, 13, 11, 2] (now:4)  Drops:2 BiggestDrop:9 Mean:6.75 Median:6.5 FailedOutperforamnce:2 FailedOutperforamnceRatio:66.67
Card : 1719525561629 Past Lapses : (4) [2, 2, 12, 15] (now:6)  Drops:0 BiggestDrop:0 Mean:7.75 Median:7.0 FailedOutperforamnce:2 FailedOutperforamnceRatio:66.67
Card : 1723392698570 Past Lapses : (4) [2, 25, 9, 11] (now:0)  Drops:1 BiggestDrop:16 Mean:11.75 Median:10.0 FailedOutperforamnce:2 FailedOutperforamnceRatio:66.67
Card : 1723392771593 Past Lapses : (4) [2, 22, 14, 5] (now:7)  Drops:2 BiggestDrop:9 Mean:10.75 Median:9.5 FailedOutperforamnce:2 FailedOutperforamnceRatio:66.67
[...]
Card : 1730241623530 Past Lapses : (4) [3, 5, 12, 9] (now:8)  Drops:1 BiggestDrop:3 Mean:7.25 Median:7.0 FailedOutperforamnce:1 FailedOutperforamnceRatio:33.33
Card : 1730241741880 Past Lapses : (4) [1, 4, 10, 4] (now:15)  Drops:1 BiggestDrop:6 Mean:4.75 Median:4.0 FailedOutperforamnce:1 FailedOutperforamnceRatio:33.33
Card : 1730495865747 Past Lapses : (4) [1, 4, 1, 2] (now:23)  Drops:1 BiggestDrop:3 Mean:2.00 Median:1.5 FailedOutperforamnce:1 FailedOutperforamnceRatio:33.33
Card : 1730563415299 Past Lapses : (4) [2, 4, 17, 6] (now:8)  Drops:1 BiggestDrop:11 Mean:7.25 Median:5.0 FailedOutperforamnce:1 FailedOutperforamnceRatio:33.33

Failed Outperformance Ratio looks a bit better 🙂

#

Cards with high number of lapse also fall in the [1/3, 2/3] ratio of failed outperforming lapses

Card : 1730066278081 Past Lapses : (8) [2, 2, 5, 5, 10, 4, 2, 12] (now:5)  Drops:2 BiggestDrop:6 Mean:5.25 Median:4.5 FailedOutperforamnce:4 FailedOutperforamnceRatio:57.14
Card : 1731854761426 Past Lapses : (8) [1, 4, 18, 4, 5, 2, 3, 2] (now:4)  Drops:3 BiggestDrop:14 Mean:4.88 Median:3.5 FailedOutperforamnce:4 FailedOutperforamnceRatio:57.14
Card : 1732461461198 Past Lapses : (8) [1, 1, 1, 2, 10, 3, 1, 9] (now:6)  Drops:2 BiggestDrop:7 Mean:3.50 Median:1.5 FailedOutperforamnce:4 FailedOutperforamnceRatio:57.14
Card : 1732919779145 Past Lapses : (10) [1, 2, 4, 6, 6, 3, 1, 1, 1, 7] (now:6)  Drops:2 BiggestDrop:3 Mean:3.20 Median:2.5 FailedOutperforamnce:5 FailedOutperforamnceRatio:55.56
robust hill
#

question

#

if i have 2 cards that is like

#

glucagon does what to cAMP levels
(increase)
insulin does what to cAMP levels
(decrease)

today i got the 1st wrong

#

i said decrease, when it should be increase, now obviously after like 5-10 other cards, i know that since the 1st one is increase now it will be decrease

#

should i make this card wrong because of recency bias? i wouldve gotten it wrong if i didnt have the exposure

ashen light
#

I'd say hit good and move on with your life

robust hill
#

or maybe i should rework that card

ashen light
#

the shame of hitting good will stick in your mind

robust hill
robust hill
#

luc mcgradyyyy

#

i clicked the stats button

bold terrace
#

I kinda like to put one at "Again" and one at "Good" to not keep them in sync forever

#

But really depends

robust hill
bold terrace
#

I did a bit of everything, both Again, one Again/one Good, one Again/one Hard ... didn't really notice much difference

cosmic hedge
robust hill
#

yes

#

its all good now

#

i uhh

#

somehow

#

did not read the 2nd line

#

💀

cosmic hedge
#

its ok 😂

robust hill
#

i blame the anki overlords

cosmic hedge
#

i'm pretty blind myself most of the time

bold terrace
#

Truth is, as IT guys, rebooting when in trouble is our #1 reflex

robust hill
#

yea

bold terrace
#

And if a reboot is not helping, a second one sometimes is

robust hill
#

i had rebooted anki 15 times today

#

so it kinda skipped my mind

#

@bold terrace you ever get an answer thats like half right and half wrong

#

like you say it

#

the card asked me where is this molecule located in the cell

#

most of the answers will be
cytosol or mitochondria, depending on the molecule, this time, i said
"mitosol" 🔥

#

mitosol does not exist.

bold terrace
#

Yeah ...

#

I think that's the kind of good example on how knowledge itself can be not stable

#

And you can be super strict on yourself and mark it as Again even if you guessed it right ...

#

... But what will make your future you not try to guess it again ?

robust hill
#

wasnt even a guess is the worst part

bold terrace
#

When we do Anki as a chore, we just burst through reviews thinking the sheer number of reviews will fix everything

robust hill
#

me rn

bold terrace
#

But then you get people who have 1M reviews at 2s/review on average and ask for a short-term memory model that allows steps <10min

robust hill
#

💀

#

need millisecond steps

bold terrace
#

I spent the last week taking a lot of time on each wrong answer: noting what I answered in a field, noting when I hesitated, taking a few minutes to check examples on the internet, etc etc

#

And I don't know if I'll be able to replicate it, but today's 96% retention instead of 90% seems to be a sign that it was not just time lost

robust hill
#

i mean

#

i am trying to be like this

#

but i think to myself "ok ill start doing this tomorrow" everyday

bold terrace
#

No worries

#

Took me 14-15 months before realizing it too

robust hill
#

at least for now when the answer's wrong, i rewrite it correctly

#

and speak it outloud

#

like physically rewrite it

bold terrace
#

But then you have cards with 200 reviews, 20 lapses, and stability of 5-6d

#

And you start to wonder if you didn't just lose more time doing the hamster wheel instead of doing the things you had to do

bold terrace
#

Even sentences etc

robust hill
#

would be nice

bold terrace
#

Makes me type faster and faster in japanese haha

robust hill
#

but a majority of my cards are not handmade

bold terrace
#

Making a field typeable is quite easy, you just have to add type: before the field name in the tag

#

{{type:answer}} for ex

#

{{type:Front}}

#

For example

robust hill
#

yeah but the thing is

#

this deck im using is very strange

#

genuinely i have no idea why he made it like this

bold terrace
#

You'll hate me

robust hill
#

its so strange

bold terrace
#

But IMO if you wrote your cards you might even get a better retention right on

#

😄

robust hill
#

I know

#

but i cant

#

for this deck

#

im not looking at this deck's retention yet

#

because i started it 5 days ago

#

and its made to some videos, basically, i watch the video for the first go, and then go do the deck for this video

quasi shadow
#

😅 Pretraining FSRS is mysterious.

#

I get three values of w[16] like 3.0004, 1.7976 and 2.4303 in first three collections with pretraining.

#

Once I concat them into one collection and run the pretraining, I get 1.0898.

#

It's smaller than all values of w[16] of the first three collections.

polar maple
#

reminds me of Simpson's paradox
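A toy illustration of that effect with plain least-squares slopes (nothing to do with the FSRS optimizer itself, and the data points are made up): each group has a slope of exactly 1, yet the pooled fit lands far outside the range of the per-group values.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

def fit_slope(points: Sequence[Point]) -> float:
    """Ordinary least-squares slope of y against x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Three made-up "collections": same trend inside each, different offsets.
groups: List[List[Point]] = [
    [(0, 10), (1, 11), (2, 12)],
    [(3, 5), (4, 6), (5, 7)],
    [(6, 0), (7, 1), (8, 2)],
]

per_group = [fit_slope(g) for g in groups]
pooled = fit_slope([p for g in groups for p in g])
print("per group:", per_group)
print("pooled:", round(pooled, 2))
```

Whether something similar explains the w[16] result is an open question; this only shows that a pooled estimate below every individual estimate isn't impossible on its own.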

quasi shadow
#

It also happens on the w[15].

quasi shadow
#

[0.1643, 1.2527, 2.1041, 10.311, 7.0821, 0.8304, 3.478, 0.001, 2.3956, 0.1686, 0.5738, 0.6228, 0.001, 0.4833, 0.8923, 0.7427, 1.0, 0.6286, 0.001, 0.1439, 0.1736]

#

I fixed several bugs, but the pretrained parameters still look weird.

viscid verge
#

I started using FSRS some time ago to increase my retention but that didnt happen. I didnt change the number of new cards a day but the daily reviews are now twice as many as they were before. Should I just turn it off again or will that break something?

unique salmon
viscid verge
#

Yes

unique salmon
#

Do you know what your retention was before?
I suggest downloading Anki 25.02 (if you haven't yet), going to Stats, scrolling all the way to the bottom and screenshotting the True Retention table

viscid verge
#

Im not done with my reviews for today yet so today might still be inaccurate

unique salmon
#

And what's your desired retention?

viscid verge
#

90%

unique salmon
#

Last question: how long ago did you switch to FSRS?

viscid verge
#

Probably 3-4 weeks ago. Before I had around 200-250 reviews, now its 390-450 and still rising

unique salmon
#

welp

viscid verge
#

And I thought 250 were a lot already but this is brutal rn

unique salmon
#

Idk how to help

viscid verge
#

oof

unique salmon
#

Do you use Hard as "fail"?

viscid verge
#

Maybe FSRS just failed me ig

viscid verge
#

I did change one thing tho

unique salmon
#

@quasi shadow I found one of the 0.6% users for whom FSRS works worse than SM-2 🤣

unique salmon
viscid verge
unique salmon
#

I recommend descending instead (review cards that you are least likely to forget first), but it probably doesn't matter unless you have a backlog

viscid verge
#

Hmm thx. I never have backlog so I didnt think it mattered much

slim hollow
#

you have a gap of over 10% between real and desired retention, this will give you a lot of reviews

viscid verge
#

Meaning I should lower my desired retention?

#

until its reached and then increase it again?

slim hollow
#

if you want less reviews then lower to about 80-85%, if you want to try to reach 90% then keep the reviews up and try to "remember better"

viscid verge
#

I obviously want better retention but the reviews are skyrocketing and my retention isnt

#

Like its not changing at all

#

Its not feeling like they will stabilise

slim hollow
#

you fail a lot of mature cards, which is very common for sm2 as it doesn't show them often enough, your mature cards aren't really mature

viscid verge
#

Hmm meaning at some point the increase of reviews has to stop right?

#

But my young retention is also not even close to 90%

unique salmon
#

To see if things get better as more cards use FSRS scheduling

slim hollow
#

young cards can be unstable, I would say you need 1-3 months of no new cards to reach some kind of equilibrium at the fsrs desired retention

viscid verge
#

Damn

#

I sadly cant afford to drop new cards at all so I will probably have to endure it for now FeelsBadAnki thanks for the help nonetheless

slim hollow
#

from the workload perspective: fewer new cards first, then lower desired retention if there are still too many reviews, you can find values for both that work for you

#

since you just switched it will take some time to switch from sm2 intervals to fsrs intervals, you can also reschedule but this will most likely show you thousands of cards to review right now

viscid verge
#

I already rescheduled so It should already be doing that, no?

#

Should I turn on "reschedule cards on change"?

slim hollow
#

yeah, if you switch desired retention in fsrs you can reschedule to decrease the daily load immediately

viscid verge
slim hollow
#

reschedules all cards after optimization, otherwise the change is only done during reviews

viscid verge
#

Ohh so this is the option that will give me thousands of reviews?

#

Also is this something i need to worry about?

slim hollow
#

generally yes, because the bigger retention gap you have and the higher desired retention you set the more reviews will get moved to right now which results in huge backlog

viscid verge
#

Alright, im on break rn so this is probably a smart thing to do right now

slim hollow
viscid verge
#

Ah okay

slim hollow
#

also get fsrs helper addon which has useful options for rescheduling / pushing review forward etc

viscid verge
#

I enabled the reschedule cards on change thingy but if i save and reenter the options menu its disabled again lol

cursive badge
#

I guess to stop you accidentally rescheduling all your cards when you make other changes later.

viscid verge
#

But how do i turn it on now

cursive badge
#

You just turn it on and save your settings. It will reschedule your cards when you save, but go back to "off" when you look at the settings next time.

viscid verge
#

Oooooo

#

My review count for tomorrow didnt change at all though

cursive badge
#

Maybe it didn't do it because your params/DR did not change? I use the FSRS Helper addon to reschedule cards because it is a bit nicer (and does not add junk to your card revlog).

viscid verge
#

Hehe I have unleashed hell doglaugh

#

Now if i dont fix this and just bruteforce it, wont I have thousands of reviews tomorrow as well? ankieyes

cursive badge
#

Not necessarily thousands tomorrow. Most of that will be cards that are overdue according to FSRS (backlog). It may still be higher than you had previously for a while.

viscid verge
#

Alr wish me luck on finishing this today lol

cursive badge
#

If you look at "Future Due" in stats and click "backlog" you will see a lot of cards due in the past.

#

At least that is how it used to work. I'm not sure if we try to do something fancy now to help manage the backlog.

viscid verge
#

Well this is going to be one heck of a day ig

#

I see that the fsrs helper rescheduled my cards to the past meaning i have this small backlog but ill just go with it and hope I dont have to go through the same tomorrow again

#

I mean its just 4k thats only like 10x my usual daily reviews

cursive badge
#

N.B. this is where "retrievability descending" can shine. If you cannot fully complete the backlog you at least should maintain your DR on the ones you can manage (at the cost of worse results on the ones you don't get to). "retrievability ascending" can lead to you doing badly on lots of cards.
Apparently the stats say that "retrievability descending" will lead to better progress on your backlog.

viscid verge
#

Yep I hope it will be manageable

#

Thx for helping

cursive badge
#

👍

slim hollow
#

retrievability descending is also good for morale as you get the cards you remember best first, so it goes fast at first

viscid verge
#

Yeah but the later cards will kill my spirits probably

unique salmon
# quasi shadow

Idk man, try it on only 2 users with similar retention and a similar number of reviews

unique salmon
#

I found IDs of 2 very similar users
'id': (reviews, retention)

'513': (6975, 0.9733333333333334)
'664': (6933, 0.973316024808885)

#

Or these
'43': (44119, 0.8620548969831592)
'186': (43434, 0.8619975134687111)

#

The idea is to try the pretrain code on users with very similar data. If EVEN THEN the issue with parameters arises, then either FSRS is borked or the code that combines multiple revlogs into one giant revlog is borked

unique salmon
#

@quasi shadow I ran my similarity score code on the entire dataset, here are two most similar users:

'449': (5468, 0.8433691236215902, 2.869341265235055, 14.920850261172374, 1.753688261706222)
'2066': (5509, 0.8433691236215902, 2.869341265235055, 14.914320951828207, 1.7668377164849263)

Format: {'id': (ln(reviews), retention, avg. rating, avg. interval length, avg. number of reviews per day, excluding same-day reviews)}

{'7876': (8.61794309451638, 0.843607388627309, 2.8693227091633466, 43.37703435804702, 1.761707550175215)}
{'3350': (8.617762246337932, 0.8433516801853997, 2.8689889918887603, 43.37746427925484, 1.7619502868068833)}

I suggest running pretrain on these two
If even on these guys parameters are weird, then we're screwed
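The actual similarity code isn't shown here, so this is only a guess at the general shape: one feature tuple per user in the format above, compared with a sum of relative differences (lower = more similar). The metric and the third, made-up user are assumptions.

```python
from typing import Dict, Sequence, Tuple

def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of per-feature relative differences; 0 means identical."""
    total = 0.0
    for x, y in zip(a, b):
        scale = (abs(x) + abs(y)) / 2
        if scale > 0:
            total += abs(x - y) / scale
    return total

# (ln(reviews), retention, avg. rating, avg. interval, avg. reviews per day)
users: Dict[str, Tuple[float, ...]] = {
    "7876": (8.61794309451638, 0.843607388627309, 2.8693227091633466,
             43.37703435804702, 1.761707550175215),
    "3350": (8.617762246337932, 0.8433516801853997, 2.8689889918887603,
             43.37746427925484, 1.7619502868068833),
    "hypothetical": (10.0, 0.95, 3.1, 120.0, 0.8),  # made-up contrast user
}

close = similarity(users["7876"], users["3350"])
far = similarity(users["7876"], users["hypothetical"])
print(f"7876 vs 3350: {close:.6f}")
print(f"7876 vs hypothetical: {far:.6f}")
```

Relative differences keep the interval feature (tens of days) from swamping retention (a fraction), which a raw Euclidean distance would not.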

unique salmon
#

Also, here's a .json file with similarity scores (lower = more similar) for every pair of users, just because why not
https://drive.google.com/file/d/1MpGLzZhhwAs_Q3XqSZULC8J_SyZztE4M/view?usp=sharing
Assuming we don't see anything weird in parameters optimized on the two users I mentioned above, we can use this to gradually try pretrain on less and less similar users to see when the problem occurs. Maybe parameters only go haywire if there is a large enough difference between users

unique salmon
#

Actually, I should use ln(n reviews) rather than n reviews. And exclude same-day reviews for calculating the avg. interval. Oh well, time to recalculate this

unique salmon
#

Ok, here are the new two most similar users
Format: {'id': (ln(reviews), retention, avg. rating, avg. interval length, avg. number of reviews per day excluding same-day reviews)}

{'7876': (8.61794309451638, 0.843607388627309, 2.8693227091633466, 43.37703435804702, 1.761707550175215)}
{'3350': (8.617762246337932, 0.8433516801853997, 2.8689889918887603, 43.37746427925484, 1.7619502868068833)}
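
A minimal sketch of how a pairwise similarity score like this could be computed. The actual metric isn't stated in the thread, so this assumes Euclidean distance over z-scored features (lower = more similar). `user_features`, `standardize` and `most_similar_pair` are hypothetical names, and the third user is made up for contrast.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

# Feature tuples in the format quoted above:
# (ln(reviews), retention, avg. rating, avg. interval, avg. reviews per day)
user_features = {
    "7876": (8.61794309451638, 0.843607388627309, 2.8693227091633466,
             43.37703435804702, 1.761707550175215),
    "3350": (8.617762246337932, 0.8433516801853997, 2.8689889918887603,
             43.37746427925484, 1.7619502868068833),
    "demo": (10.5, 0.95, 3.1, 12.0, 40.0),  # made-up user, for contrast only
}

def standardize(features):
    """Z-score each feature column so no single feature dominates the distance."""
    columns = list(zip(*features.values()))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # guard against zero spread
    return {
        uid: tuple((x - m) / s for x, m, s in zip(vec, means, stds))
        for uid, vec in features.items()
    }

def most_similar_pair(features):
    """Return (distance, id_a, id_b) for the closest pair of users."""
    z = standardize(features)
    return min(
        (math.dist(z[a], z[b]), a, b) for a, b in combinations(sorted(z), 2)
    )

distance, a, b = most_similar_pair(user_features)
print(a, b, round(distance, 4))
```

For 10k users this is roughly 50 million pairs, so keeping only the closest few pairs instead of storing every score would likely be easier on memory.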

polar maple
#

ill go take a look

#

or is this just birthday paradox going on lol

unique salmon
#

But I thought "nah, ain't no way" and printed the closest users with non-zero similarity metric

polar maple
#

are there more user pairs like this?

unique salmon
#

you can check the .json file

#

I get MemoryError with pandas when trying to open it 🤣

quasi shadow
#

[0.1317, 1.5555, 2.4396, 11.1197, 4.6111, 0.7629, 2.9693, 0.0013, 1.7683, 0.1634, 0.7567, 0.9032, 0.001, 0.4082, 0.6454, 0.5804, 1.3254, 0.627, 0.038, 0.1525, 0.1665]

#

I pretrained FSRS-6 on the first 500-1000 collections.

#

Now the w[16] is not very close to 1.0.

#

😅 Fine. It's still a mystery.

robust hill
#

if i have a leech

#

i just thought of an idea

#

cant i duplicate it and have 2 cards of that leech

#

sure it might be ineffective if it is a lot of leeches

quasi shadow
#

@polar maple @unique salmon ... The weird parameters have the best performance.

#

We need a better method to optimize the default parameters.

south lodge
#

What [vibe/connection] do those numbers communicate, that they might be weird?

quasi shadow
south lodge
#

I think I'm okay with that. (But what do I know.) (Based on my reading of the discussion that follows, I have no idea.)

polar maple
#

Maybe only a few users regularly use Easy and they are heavily affecting the result

quasi shadow
polar maple
#

yeah looks like there are basically none.. weird!

#

would the results be similar if you completely removed wd?

quasi shadow
#

For example, the avg w[16] in the first 10 collections is ~3.

#

However

#

Wait...

#

I found another bug(

#

I forgot to set the wd

#

😅

polar maple
#

🙏

quasi shadow
#

nope

#

It doesn't matter

polar maple
#

how long does it take to run the training code? Maybe you can binary search to find the first user such that including this user makes w[16] become 1.0 after pretraining. Then we can investigate this user further, but this assumes that the problem is from specific users rather than it being a gradual process.
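
A rough sketch of that binary search, assuming the collapse is monotonic (once the first k users trigger it, every larger prefix does too). `first_bad_prefix` is a hypothetical helper, and `fake_collapses` is a stand-in for "pretrain on users 1..k and check whether w[16] ended up at ~1.0".

```python
def first_bad_prefix(n_users, collapses):
    """Smallest k in [1, n_users] for which collapses(k) is True, or None.

    Only valid if collapses() is monotonic in k. If the drift is gradual
    rather than caused by specific users, the answer is not meaningful.
    """
    if not collapses(n_users):
        return None
    lo, hi = 1, n_users
    while lo < hi:
        mid = (lo + hi) // 2
        if collapses(mid):
            hi = mid          # the culprit is at mid or earlier
        else:
            lo = mid + 1      # the culprit is after mid
    return lo

# Stand-in predicate; a real one would run pretraining on users 1..k.
def fake_collapses(k, culprit=3):
    return k >= culprit

print(first_bad_prefix(1000, fake_collapses))
```

This needs about log2(n) pretraining runs instead of n, which matters if each run is slow.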

quasi shadow
#

wd only works for these models.

#

So...

polar maple
quasi shadow
#

1 user -> 2 users -> 3 users

polar maple
#

🤔

#

are these users 1 2 & 3?

quasi shadow
#

yep

#

pretrain_num = 3
pretrain_users = [i for i in range(1, pretrain_num + 1)]

polar maple
# quasi shadow

maybe you want to confirm this but i just checked and found that user 3 has no 'easy' grades, strange that w[16] is so different between 2 users and 3 users

quasi shadow
#

yeah

polar maple
#

arguably user 3 should have no effect

#

unless i am missing something?

quasi shadow
#

user3 changes other parameters.

polar maple
#

true

south lodge
#

If it's a don't-care term for a specific user, is there any insulation against contaminating the aggregate analysis? (The output to a don't-care input value is undefined, so any instance would be noise added to the larger collection of data, if there isn't a culling/ameliorating method in place, which there might be.)

quasi shadow
#

It's the calibration graph for user 1 & 2 with last rating = easy.

#

It's the calibration graph for user 1, 2 & 3 with last rating = easy.

#

It's the calibration graph for user 1 & 2 with last rating = good.

#

It's the calibration graph for user 1, 2 & 3 with last rating = good.

#

@polar maple user 3's retention is higher than user 1 & 2.

#

I guess it increases the average stability of the dataset.

polar maple
#

yeah i was thinking whether the people who use the Easy button just have lower stability overall

quasi shadow
#

So the stability with last rating = easy is lower than average stability.

#

Then, w[16] will tend to 1.0 or less.
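
For reference, a sketch of where w[16] enters the stability update after a successful review, as I understand the published FSRS formula (treat the exact form as an assumption and check it against the optimizer source). It only illustrates that w[16] = 1.0 makes Easy behave exactly like Good, which is what the optimizer would prefer if Easy reviews in the pooled data are followed by lower-than-average stability.

```python
import math

def stability_after_success(s, d, r, rating, w):
    """New stability after a successful review (rating 2=Hard, 3=Good, 4=Easy)."""
    hard_penalty = w[15] if rating == 2 else 1.0
    easy_bonus = w[16] if rating == 4 else 1.0
    growth = (
        math.exp(w[8]) * (11 - d) * s ** (-w[9])
        * (math.exp(w[10] * (1 - r)) - 1)
        * hard_penalty * easy_bonus
    )
    return s * (1 + growth)

# Parameter list posted earlier in the thread.
w = [0.1317, 1.5555, 2.4396, 11.1197, 4.6111, 0.7629, 2.9693, 0.0013, 1.7683,
     0.1634, 0.7567, 0.9032, 0.001, 0.4082, 0.6454, 0.5804, 1.3254, 0.627,
     0.038, 0.1525, 0.1665]

# Arbitrary example state: stability 10 days, difficulty 5, retrievability 0.9.
good = stability_after_success(10.0, 5.0, 0.9, 3, w)
easy = stability_after_success(10.0, 5.0, 0.9, 4, w)

w_flat = w[:16] + [1.0] + w[17:]  # same parameters, but with w[16] forced to 1.0
easy_flat = stability_after_success(10.0, 5.0, 0.9, 4, w_flat)

print(good, easy, easy_flat)  # easy_flat matches good when w[16] == 1.0
```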

polar maple
#

i wonder how low w[16] can go if you remove the clipper

quasi shadow
#

It is close to 0 when I debug the code.

polar maple
#

i changed the initial value of w[16] to 1.5 instead and got this

#

but i dont think this is necessarily related

quasi shadow
#

It's the calibration of the user 1 with all reviews.

#

It's user 2.

#

It's user 3.

#

😅 OK, these distributions are very different.

polar maple
#

an alternative method for pretraining that could result in more reasonable parameters but would definitely perform worse:

  • take the final parameters from the separately pretrained users
  • take a set of card histories by random sampling or some other method
  • for each card history: for each parameter set, compute R. Then take the median of these computed Rs, to get a final card history -> R association
  • optimize FSRS on these pairs to summarize the results
#

but this is only for if we really need to try something else
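
The median step in that list could look roughly like this. Everything here is a toy: `predict_r` stands in for running FSRS over a card history with one user's parameters, and `build_targets` is a hypothetical name for producing the (history, R) pairs that the final optimization would be fitted to.

```python
from statistics import median

def median_retrievability(history, parameter_sets, predict_r):
    """Median predicted R for one card history across all users' parameter sets."""
    return median(predict_r(history, params) for params in parameter_sets)

def build_targets(histories, parameter_sets, predict_r):
    """Pair each sampled card history with its median predicted R."""
    return [
        (history, median_retrievability(history, parameter_sets, predict_r))
        for history in histories
    ]

# Toy stand-in: a "history" is just elapsed days, a "parameter set" is one stability.
def toy_predict_r(elapsed_days, stability):
    return 0.9 ** (elapsed_days / stability)

targets = build_targets([1, 10, 30], [2.0, 10.0, 50.0], toy_predict_r)
print(targets)
```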

quasi shadow
#

The stability of user 1, 2 & 3.

#

I think our guess is true.

polar maple
#

👍

#

btw for other.py i can't reproduce the result for FSRS-6

#

nvm i can't for FSRS-6-recency either

quasi shadow
polar maple
#

yeah, i'm testing the first 3 users

quasi shadow
#

Did you upgrade fsrs-optimizer to v6.0.0?

polar maple
#

i'll try that now

quasi shadow
#

I changed the sorting method in v6.0.0 to make it consistent with the rust version.

#

So the result would be slightly different.

#

😅 The default sorting of pandas is unstable.

#

It cost me an entire day to figure that out 😅

polar maple
#

it is still slightly different, python other.py --algo FSRS-6 --recency on fsrs-optimizer v6.0.0 and branch Expt/S-decay-for-short-term-memory

#
{"metrics": {"RMSE": 0.37155, "LogLoss": 0.443219, "RMSE(bins)": 0.063689, "ICI": 0.02619, "AUC": 0.680656}, "user": 2, "size": 35900, "parameters": {"0": [0.0578, 2.156, 9.0984, 18.6335, 6.977, 0.4176, 2.8603, 0.001, 1.2517, 0.4143, 0.8218, 1.6942, 0.127, 0.2813, 2.6004, 0.1842, 1.7016, 0.6977, 0.2398, 0.1437, 0.1]}}
{"metrics": {"RMSE": 0.265645, "LogLoss": 0.267574, "RMSE(bins)": 0.039605, "ICI": 0.010973, "AUC": 0.640927}, "user": 3, "size": 4255, "parameters": {"0": [4.6381, 10.8228, 11.4011, 10.9234, 7.1465, 0.7717, 2.1306, 0.001, 1.4827, 0.1098, 0.9757, 1.8976, 0.0819, 0.476, 2.3429, 0.1742, 3.0004, 0.7536, 0.3332, 0.1437, 0.1056]}}
cosmic hedge
#

I'm just curious but would easy_bonus still converge to 1 if you set 'easy=good' for all users except 1

polar maple
#

easy_bonus just has to give information that when it is pressed, FSRS knows that the user fits this type of profile

quasi shadow
#

😅 OK, they are inconsistent.

#

I will debug it.

quasi shadow
#

OK, I figured it out.

unique salmon
#

@quasi shadow try what Alex suggested - normalize the weight of reviews from different users in such a way that each user contributes equally
For example, if someone has 100 reviews, the weight of each of their reviews should be 1/100
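
A small sketch of that weighting, assuming the loss is plain binary log loss. `per_user_weights` and `weighted_log_loss` are hypothetical helpers, not functions from the optimizer.

```python
import math

def per_user_weights(user_ids):
    """Weight each review by 1 / (number of reviews from the same user)."""
    counts = {}
    for uid in user_ids:
        counts[uid] = counts.get(uid, 0) + 1
    return [1.0 / counts[uid] for uid in user_ids]

def weighted_log_loss(y_true, p_pred, weights):
    """Weighted binary cross-entropy, normalized by the total weight."""
    loss = 0.0
    for y, p, w in zip(y_true, p_pred, weights):
        loss -= w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return loss / sum(weights)

user_ids = ["a"] * 4 + ["b"]          # user "a" has 4 reviews, user "b" has 1
weights = per_user_weights(user_ids)  # every user's weights sum to 1
print(weights)
print(weighted_log_loss([1, 1, 0, 1, 0], [0.9, 0.8, 0.3, 0.9, 0.2], weights))
```

With these weights a user with 100k reviews and a user with 1k reviews pull on the parameters equally, but it does nothing about the users having different stability distributions.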

quasi shadow
#

it doesn't solve the distribution issue.

robust hill
#

is there a known retention increase

#

when we make our own cards compared to using premade cards

quasi shadow
#

Please pull the latest commit.

cosmic hedge
# quasi shadow It takes ~55GB RAM😅

could filtering the columns help?

def process_user(user_id):
    dataset = pd.read_parquet(
        DATA_PATH / "revlogs", filters=[("user_id", "=", user_id)], columns=["card_id", "rating", "elapsed_days"]
    )
    dataset["delta_t"] = dataset["elapsed_days"]
    dataset = create_features(
        dataset, model_name=MODEL_NAME, secs_ivl=SECS_IVL
    )
    return user_id, dataset
cosmic hedge
#

converges to 1 for just user 2

unique salmon
#

It's somewhat like bogosort, but with FSRS parameters 🤣
Instead of "try sorting the list using a random number generator until you get a sorted list by pure chance", it's "try optimizing FSRS on random subsets of users until you get reasonable parameters"

quasi shadow
#

What are the criteria of "reasonable parameters"?

unique salmon
# quasi shadow What are the criteria of "reasonable parameters"?
  1. Calculate the 10th and 90th percentile of each parameter (use the same data that you use for plots), those will be the boundaries
  2. Run FSRS on a subset of users
for n in range(len(fsrs_pretrain_params)):
    if not (percentile_10[n] <= fsrs_pretrain_params[n] <= percentile_90[n]):
        raise ValueError('Parameter is outside of reasonable boundaries')
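
A slightly fuller sketch of the same check, including step 1 (computing the percentile boundaries). It assumes the per-user optimized parameters are available as a list of tuples; the data below is made up, and `percentile_bounds` / `out_of_bounds` are hypothetical names.

```python
from statistics import quantiles

def percentile_bounds(per_user_params):
    """10th and 90th percentile of each parameter across users."""
    lower, upper = [], []
    for column in zip(*per_user_params):
        deciles = quantiles(column, n=10)  # 9 cut points: 10%, 20%, ..., 90%
        lower.append(deciles[0])
        upper.append(deciles[-1])
    return lower, upper

def out_of_bounds(params, lower, upper):
    """Indices of parameters that fall outside the [10th, 90th] percentile range."""
    return [
        i for i, (p, lo, hi) in enumerate(zip(params, lower, upper))
        if not lo <= p <= hi
    ]

# Made-up two-parameter example: 11 "users" with evenly spread values.
per_user = [(float(i), 10.0 * i) for i in range(11)]
lower, upper = percentile_bounds(per_user)

print(out_of_bounds((5.0, 50.0), lower, upper))   # both parameters inside
print(out_of_bounds((5.0, 999.0), lower, upper))  # second parameter outside
```
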
quasi shadow
#

What's the size of the subset?

unique salmon
#

500

#

Or whatever you want

quasi shadow
#

If it's 500, we will have 10kC500 subsets.

#

😅 It's not a small number.

unique salmon
#

Good, so you will be done in a weekend

quasi shadow
#

505C500 is equal to 268318178226
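
For anyone who wants to check the counts, `math.comb` handles both cases exactly. The 10k-choose-500 value is far too long to paste, so the sketch only prints how many digits it has.

```python
import math

small = math.comb(505, 500)       # same as choosing which 5 users to leave out
huge = math.comb(10_000, 500)

print(small)
print(len(str(huge)), "digits")
```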

quasi shadow
unique salmon
#

Okay, two weekends

#

In all seriousness, idk

quasi shadow
#

two thousand years later

unique salmon
#

Ok, here's another stupid but less stupid method: The Average Joe Method

  1. Take parameters of a user
  2. Use them for FSRS "dry run"
  3. Repeat for all 10k users
    This way we will find the guy whose parameters fit all other Anki users the best. And it only requires running FSRS ten thousand times instead of 10^1000 times!
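
The loop for this could be sketched as follows. `evaluate` is a hypothetical callback standing in for an FSRS dry run (no optimization, just computing the loss of fixed parameters on one user's reviews), and the toy data is made up.

```python
def average_joe(user_params, evaluate):
    """Return (donor_id, mean_loss) for the user whose parameters fit everyone best.

    evaluate(params, user_id) -> loss of `params` on that user's reviews.
    """
    best = None
    for donor_id, params in user_params.items():
        mean_loss = sum(evaluate(params, uid) for uid in user_params) / len(user_params)
        if best is None or mean_loss < best[1]:
            best = (donor_id, mean_loss)
    return best

# Toy stand-in: each user's "parameters" are one number, and the loss on a user
# is the squared distance to that user's own optimum.
toy_params = {"u1": 1.0, "u2": 2.0, "u3": 10.0}

def toy_evaluate(params, uid):
    return (params - toy_params[uid]) ** 2

print(average_joe(toy_params, toy_evaluate))
```
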
unique salmon
#

So you try using parameters of user 1, user 2, user 3, etc.

#

That way one user will be chosen as the "donor" of parameters, and he will never even know 🤣

#

Though, after thinking about it, it probably won't improve the metrics by much compared to our current median method

#

Man, this whole situation sucks
I was hoping pretraining FSRS would be straightforward FeelsBadAnki

lapis hearth
#

Dae is back

#

Could someone notify him about FSRS 6

unique salmon
#

In other news, I still can't run the new benchmark code without getting errors. Except that's not news at this point. No amount of updating the optimizer and downloading the correct code from the correct branch helps
@quasi shadow I wanted to make calibration plots for each version of FSRS, for that I need .jsonl file with --raw flag for each version of FSRS, on all 10k users

#

Actually, wait
FSRS-6 runs, but now FSRS v1 doesn't
Maybe you made a change that broke older versions of FSRS?

#

Give me a minute

#

FSRS-6: ✅
FSRS-5: ✅
FSRS-4.5: ✅
FSRS v4: ✅
FSRS v3: ✅
FSRS v2: ✅
FSRS v1: ❌

#

huh

File "C:\Users\Andrew\srs-benchmark-final\other.py", line 126, in iter
    stabilities, difficulties = outputs[
ValueError: too many values to unpack (expected 2)
#

It only affects FSRS v1 🤔

cosmic hedge
unique salmon
#

Alright, now it will take me two weeks to run all versions of FSRS on all 10k users 😅
But as a result we will get the best visualization of algorithmic advancements!

#

Here's FSRS v1 on the first 169 users

unique salmon
#

FSRS v1 vs FSRS-6 with recency weighting on 307 users

#

🥹

#

Surely now nobody will complain about TR<DR, right...

lapis hearth
#

FSRS 5 is already doing a good job (i said that before) though what i was having fears about, was that FSRS was actually artificially selecting cards so that it pumps my TR to my DR

#

Because my DR is 95% and I dont believe that I actually know 95% of my cards. If I open them and reread them outside of Anki, it is as if I forgot them

unique salmon
#

We'll see what FSRS-5 looks like compared to FSRS-6-recency. It's a bit unfair because only one will have recency, but I'll already have to run this stuff for 10-14 days and I'd rather not add another 30 hours to it 🤣

#

Or even 70 hours

unique salmon
#

And I'm not sure what you mean by "selecting" - FSRS schedules intervals for every card according to DR

#

It doesn't pick like "Hmmm, I will schedule this card as if DR was a few % lower and I will schedule that other card as if DR was a few % higher"

tepid spoke
#

I actually wonder what'll happen to the retention now that I've run out of new cards

#

I always had ~85% on average, with the newer cards kinda saving it, and the old ones being in the mud

polar maple
#

e.g. can have a poor calibration graph but still reach the TR perfectly

#

or can have a perfect calibration but still perform badly
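
One way to read the first case, as a toy example with made-up numbers: the average predicted R matches the observed retention exactly, yet every calibration bin is off. `calibration_bins` is a hypothetical helper, not the benchmark's own binning.

```python
def mean(values):
    return sum(values) / len(values)

def calibration_bins(predicted, recalled, n_bins=10):
    """Return (mean predicted R, observed recall rate, count) per non-empty bin."""
    bins = {}
    for p, y in zip(predicted, recalled):
        index = min(int(p * n_bins), n_bins - 1)
        bins.setdefault(index, []).append((p, y))
    return [
        (mean([p for p, _ in items]), mean([y for _, y in items]), len(items))
        for _, items in sorted(bins.items())
    ]

# Made-up reviews: overconfident on one group of cards, underconfident on the other.
predicted = [0.95] * 10 + [0.85] * 10
recalled = [1] * 8 + [0] * 2 + [1] * 10

print(mean(predicted), mean(recalled))  # the overall averages agree
for row in calibration_bins(predicted, recalled):
    print(row)                          # but each bin is off by about 0.15
```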

unique salmon
polar maple
#

i think it should be expected to overestimate and underestimate sometimes especially for small collection sizes

unique salmon
#

Though we can tell that at least FSRS-6 recency isn't systematically wrong in one particular direction, except at very low R

polar maple
#

the forgetting curve isn't flat enough still 😁

unique salmon
#

Meanwhile FSRS v1

#

Yeah, that one is definitely systematically wrong

#

I'm curious what the graph for FSRS-5 will look like, whether it will be noticeably different from FSRS-6-recency

#

Also, I wonder if maybe we could use some other "tricks" other than recency weighting

polar maple
#

you mean FSRS-6 compared to FSRS-6-recency?

unique salmon
#

No, I mean FSRS-5 vs FSRS-6-recency

#

The only version I will run with recency is FSRS-6

polar maple
#

we've seen FSRS-5-recency before

#

500 users

unique salmon
#

God I hate having these four lines/curves on the same graph
It's so cluttered 😭

unique salmon
#

Like, idk, some kind of loss smoothing or something

#

Then again, I don't expect that there are sharp minima in the loss landscape of FSRS

polar maple
#

ill check if rewriting the initial stabilities as e^x instead of just x will help it learn better from backpropagation

#

i noticed that they barely move from training

#

but rn i implemented Reptile for FSRS so im just waiting for it to finish

#

hopefully we get 0.0001 log loss improvement lol, im not hoping for much
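
For anyone unfamiliar with Reptile, here is the basic update on a 1-D toy problem. This is not the FSRS implementation mentioned above, just the generic algorithm: adapt a copy of the parameters to one task with a few SGD steps, then move the shared parameters part of the way toward the adapted copy. The "users" are made-up quadratic losses.

```python
import random

def reptile(tasks, theta, inner_steps=5, inner_lr=0.1, meta_lr=0.5,
            iterations=200, seed=0):
    """Reptile meta-learning on 1-D toy tasks.

    Each task is a function grad(theta) returning the gradient of its loss.
    """
    rng = random.Random(seed)
    for _ in range(iterations):
        grad = rng.choice(tasks)            # sample one task ("user")
        adapted = theta
        for _ in range(inner_steps):        # inner loop: plain SGD on that task
            adapted -= inner_lr * grad(adapted)
        theta += meta_lr * (adapted - theta)  # outer step: move toward the adapted copy
    return theta

# Toy "users": losses (theta - optimum)^2 with different optima.
optima = [1.0, 2.0, 3.0]
tasks = [lambda t, o=o: 2 * (t - o) for o in optima]

theta = reptile(tasks, theta=10.0)
print(theta)  # ends up somewhere between the users' optima
```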

lapis hearth
#

Do you get what I am saying

#

Maybe if there is a way to visualize learning progress on these difficult cards then i wouldnt be so apprehensive

#

So I suspect that FSRS does know I have forgotten a particular 50 cards and gives me 5 of them a day. The rest of what it gives me, 95%, are easy cards, and I am fooled by "hey look, your TR is actually 95%, so everything is okay."

But the progress on those 5 difficult cards and ultimately the 50 difficult cards in total has not changed.

And I keep wondering, why it seems as if my learning progress on those 50 cards has reached a stalemate

#

Do you get what I am saying

#

Again if there is a way to visualize learning progress on those 50 difficult cards as a whole, i could be relieved

unique salmon
#

I get that there are leeches that just don't "stick", so you never feel like you've made progress memorizing them. But still, FSRS doesn't do this kind of "global" thing. When a card is scheduled depends directly only on the history of that card. I guess it indirectly depends on other cards in the sense that reviews of other cards affect parameters, and parameters affect this card. But FSRS doesn't directly use review histories of other cards to schedule card A
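
To make the "per card" point concrete, here is a sketch of how an interval follows from one card's stability and the desired retention. It assumes the FSRS-4.5/FSRS-5 power forgetting curve with a fixed decay of -0.5 (FSRS-6 makes the decay a trainable parameter) and ignores fuzz and other scheduling details in Anki.

```python
DECAY = -0.5                      # fixed in FSRS-4.5/FSRS-5 (assumption for this sketch)
FACTOR = 0.9 ** (1 / DECAY) - 1   # chosen so that R = 0.9 when elapsed days == stability

def retrievability(elapsed_days, stability):
    """Predicted probability of recall after elapsed_days."""
    return (1 + FACTOR * elapsed_days / stability) ** DECAY

def next_interval(stability, desired_retention):
    """Days until predicted R drops to desired_retention, for this one card."""
    return stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)

# The only inputs are the card's own stability and the preset's DR.
for dr in (0.95, 0.90, 0.85):
    print(dr, round(next_interval(10.0, dr), 2))
```

Nothing in `next_interval` looks at other cards, so there is no mechanism for picking easy cards to pad the retention figure.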

lapis hearth
#

And if I were to custom schedule those 50 cards, the actual cards which are in need of relearning and learning, my TR would drop to 50%

#

This is my fear

#

And this is why I have a subjective feeling that FSRS is scheduling "around" the leech cards

lapis hearth
unique salmon
#

Alex's neural net has "memory states" (if we loosely borrow FSRS terminology) not just for cards, but also for notes, decks, presets and the entire collection
So instead of "I will use the memory state of this card to schedule this card", it's "I will use the memory states of this card, the note, the deck, the preset and the whole collection to schedule this card"

lapis hearth
#

So my fear is true

#

Fuck

polar maple
#

i guess it also depends on whether your leeches have > 1 day stability

bold terrace