Comments (24)
Maybe decide this based on the number of datapoints in each case?
if S0_good > S0_easy:
    if n_datapoints_good > n_datapoints_easy:
        S0_easy = S0_good
    else:
        S0_good = S0_easy
However, if you look at the table in #5 (comment), there are cases where S0_again > S0_hard or S0_hard > S0_good, so this issue is not limited to the pair of Good and Easy.
from fsrs-optimizer.
There are 2 simple ways to solve this:
if S0_good > S0_easy:
    S0_good = S0_easy

or

if S0_good > S0_easy:
    S0_easy = S0_good
In the first method we artificially decrease S0 for Good; in the second, we artificially increase S0 for Easy. I don't know which one makes more sense, but probably the latter: if S0 for Good is based on a larger number of reviews, then it is calculated more accurately than S0 for Easy, so we shouldn't change it and should instead change the less accurate S0 for Easy.
In my opinion, the second approach makes more sense.
I suppose the idea above should be applied to all pairs: Again-Hard, Hard-Good and Good-Easy.
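Applied to all adjacent pairs, the count-based tie-breaking could be sketched as follows. The names (s0, counts, the function itself) are hypothetical, not the optimizer's actual variables, and a single forward pass may not resolve cascading violations, echoing the caveat that this is unlikely to work on the first try:

```python
# Hypothetical sketch: enforce S0_again <= S0_hard <= S0_good <= S0_easy,
# resolving each violation in favor of the grade with more datapoints.
def enforce_s0_monotonicity(s0: dict, counts: dict) -> dict:
    """s0 and counts map grade name -> initial stability / datapoint count."""
    order = ["again", "hard", "good", "easy"]
    s0 = dict(s0)  # don't mutate the caller's dict
    for lo, hi in zip(order, order[1:]):
        if s0[lo] > s0[hi]:
            # Trust the estimate backed by more reviews.
            if counts[lo] > counts[hi]:
                s0[hi] = s0[lo]
            else:
                s0[lo] = s0[hi]
    return s0
```

For example, if S0_hard = 2.0 exceeds S0_good = 1.5 but Good has far more datapoints, the pass lowers S0_hard to 1.5 and leaves the other grades alone.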
@L-M-Sherlock here are some good ideas:
- The one above by nb9618, but apply it to all pairs: Again-Hard, Hard-Good and Good-Easy. There will likely be issues with that, though. I don't expect it to work on the first try without creating new problems.
- When using additive smoothing, instead of using retention of the entire collection/deck, only use retention based on second reviews to calculate p0 (the initial guess).
- When using the outlier filter based on IQR, use ln(delta_t) rather than delta_t itself. Filtering based on IQR doesn't work well on data that isn't normally distributed, and delta_t certainly isn't.
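The second idea above can be sketched like this. The column names, the smoothing formula, and the alpha value are illustrative assumptions, not the optimizer's actual implementation:

```python
# Sketch: one shared initial guess p0 computed from second reviews only,
# then additive smoothing of per-grade retention toward that shared p0.
import pandas as pd

def smoothed_retention(df: pd.DataFrame, alpha: float = 2.0) -> pd.Series:
    """df columns: first_rating, review_i (1-based), recalled (0/1)."""
    second = df[df["review_i"] == 2]
    p0 = second["recalled"].mean()  # same initial guess for every grade
    g = second.groupby("first_rating")["recalled"].agg(["sum", "count"])
    # additive smoothing pulls small groups toward the shared p0
    return (g["sum"] + alpha * p0) / (g["count"] + alpha)
```

The key point is that p0 is a single number taken from second reviews across the whole collection, not a separate guess per grade.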
Of course, all of these changes should be evaluated using statistical significance tests; I hope by now you have set up an automated system to run tests on all 66 collections.
Oh, also: in the scheduler code, change // recommended setting: 0.8 ~ 0.9 to // recommended values: 0.75 ~ 0.97.
@L-M-Sherlock you've been inactive for a couple of days, so there is a good chance you missed my comment above. I'm pinging you just to remind you about it.
I am just tired of maintaining the optimizer module. You can check these parameters in the batch training on the collected data: open-spaced-repetition/fsrs4anki#351 (comment). There are some cases where the initial stability of Again is larger than the initial stability of Hard, or the initial stability of Good is larger than the initial stability of Easy. These cases could have different reasons, so we should deal with these problems according to the concrete cases.
Ok, forget about 1, but I would still ask you to test 2 and 3.
For 2, here is an extreme case: a user who always remembers in the next review after pressing Easy during the first learning step. In this case, the retention is 100%. If we use this value, the additive smoothing will be useless.
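That objection can be checked numerically. The smoothing formula and alpha value below are assumptions for illustration, not the optimizer's exact code:

```python
# If p0 itself comes from data with 100% retention, additive smoothing
# has nothing to pull the estimate toward: the prior equals the data.
def smooth(successes: int, n: int, p0: float, alpha: float = 2.0) -> float:
    return (successes + alpha * p0) / (n + alpha)

print(smooth(3, 3, p0=1.0))  # 1.0 -- the prior changes nothing
print(smooth(3, 3, p0=0.9))  # 0.96 -- a sub-100% prior tempers the estimate
```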
I think you misunderstood my idea a little bit. I didn't mean "use four different initial guesses for each grade", I meant "use the same initial guess for each grade". So just calculate average retention for all second reviews.
By the way, have you automated running statistical significance tests on all collections?
3. When using the outlier filter based on IQR, use ln(delta_t) rather than delta_t itself. Filtering based on IQR doesn't work well on data that isn't normally distributed, and delta_t certainly isn't.
I'm testing this in all 66 collections.
I think you misunderstood my idea a little bit. I didn't mean "use four different initial guesses for each grade", I meant "use the same initial guess for each grade". So just calculate average retention for all second reviews.
OK, I will test it after the above test. It will take nearly 3 hours.
I'm testing this in all 66 collections.
Before:
Weighted RMSE: 0.04149183369953192
Weighted Log loss: 0.3815897150075234
Weighted MAE: 0.02342977913950602
Weighted R-squared: 0.7697902622572932
After:
Weighted RMSE: 0.04174954832152736
Weighted Log loss: 0.38212856042129156
Weighted MAE: 0.02374078044685508
Weighted R-squared: 0.7672438581669868
p = 0.0045 (for RMSE)
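The thread does not say which paired test produced this p-value. As one dependency-free possibility, a two-sided sign test over per-collection RMSE deltas could look like this (purely a sketch, not the test actually used):

```python
# Hypothetical sketch: two-sided sign test on paired per-collection RMSEs.
from math import comb

def sign_test_p(before: list, after: list) -> float:
    """Two-sided sign test: does 'after' differ systematically from 'before'?"""
    wins = sum(a > b for a, b in zip(after, before))
    n = sum(a != b for a, b in zip(after, before))  # ignore exact ties
    k = min(wins, n - wins)
    # probability of a result at least this lopsided under a fair coin
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Example: RMSE increased in 60 of 66 collections -> very small p
print(sign_test_p([0.041] * 66, [0.042] * 60 + [0.040] * 6))
```

A sign test only uses the direction of each delta; a Wilcoxon signed-rank or paired t-test would also weigh the magnitudes.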
3. When using the outlier filter based on IQR, use ln(delta_t) rather than delta_t itself. Filtering based on IQR doesn't work well on data that isn't normally distributed, and delta_t certainly isn't.
It's worse than the current version with statistical significance.
Here is the code:
import numpy as np
import pandas as pd

def remove_outliers(group: pd.DataFrame) -> pd.DataFrame:
    # earlier filters, kept for reference:
    # threshold = np.mean(group['delta_t']) * 1.5
    # threshold = group['delta_t'].quantile(0.95)
    # IQR filter applied to ln(delta_t):
    Q1 = group['delta_t'].map(np.log).quantile(0.25)
    Q3 = group['delta_t'].map(np.log).quantile(0.75)
    IQR = Q3 - Q1
    threshold = Q3 + 1.5 * IQR
    group = group[group['delta_t'].map(np.log) <= threshold]
    return group
Huh, I'm surprised. Maybe the more data is removed, the easier it is for FSRS to fit the remaining data well? In other words, what if we cannot rely on RMSE when removing outliers because, between two methods that both aim at removing outliers, the one that removes more data will always result in a lower RMSE?
Removing more data does not always result in a lower RMSE. Removing too much data might lead to underfitting, where the model fails to capture the underlying trend of the data. This can also increase the RMSE.
Alright, then test the idea with p0 for additive smoothing, and that's it.
After that I would like you to benchmark all 5 algorithms, I'll explain it in a bit more detail in the relevant issue.
additive smoothing:
Weighted RMSE: 0.04147353655819303
Weighted Log loss: 0.3815885589708383
Weighted MAE: 0.023376754517799636
Weighted R-squared: 0.7699164899424069
p = 0.38
It is slightly better but not statistically significant.
Removing more data does not always result in a lower RMSE. Removing too much data might lead to underfitting, where the model fails to capture the underlying trend of the data. This can also increase the RMSE.
I agree that removing more data does not always result in a lower RMSE.
But here, we are selectively removing the data that lies at the right-hand side of the curve (not random data). So the remaining data is more homogeneous, and this might explain why the RMSE is lower.
So the remaining data is more homogeneous, and this might explain why the RMSE is lower.
Yeah, I'm just surprised that my approach is somehow worse, even though in theory IQR should work better with normally distributed data.
I think that the increase in RMSE that we saw when using log of delta_t is just an artifact. For example, when the optimizer filtered out all the cards with first rating = Again in my collection, the RMSE got a crazy low value (0.0056). I first mentioned this here: open-spaced-repetition/fsrs4anki#348 (comment)
I think that the increase in RMSE that we saw when using log of delta_t is just an artifact.
So we should not only consider the RMSE, right? We should have another criterion to decide whether an idea should be employed in FSRS.
Maybe decide this based on the number of datapoints in each case?
I will adopt this idea, not for the sake of enhancing the model's accuracy, but to alleviate users' confusion. Therefore, I will not run evaluation tests.
I think that the increase in RMSE that we saw when using log of delta_t is just an artifact.
So we should not only consider the RMSE, right? We should have another criterion to decide whether an idea should be employed in FSRS.
Yes, but I don't know which metric would be appropriate in this case.
Also, let's discuss this further in #16.