Dear authors, I have recently been reading your amazing work in this paper; however, I enco

Question about prediction using scaling law (datablations, closed)

huggingface commented on May 22, 2024

Comments (2)

x54-729 commented on May 22, 2024

That's clear enough for me, thank you for your explanation!

Muennighoff commented on May 22, 2024

The loss values predicted by the formula are unlikely to exactly match the final loss of the model.

This is because we borrow some of its parameters from the values put forth in Chinchilla, which were fitted using a different setup with different hyperparameters (learning rate, schedule, etc.), leading to different loss values (see Appendix B). However, the trend predicted by the formula is expected to be accurate, e.g. when plugging in two model configurations to compare which one should reach a better loss (as for Figure 1, right). Similarly, if you change U_d = D / 7 to U_d = D / 5 in your code, you will get a sense of how much your loss should improve in relative terms from having more unique data (e.g. U_d = D / 7 -> U_d = D / 5 gives a larger boost than U_d = D / 3 -> U_d = D / 1).
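As a rough illustration of that relative comparison, here is a minimal sketch of the data term only. It assumes the effective-data form D' = U_d + U_d * R* * (1 - exp(-R_d / R*)) with R_d = D / U_d - 1, and it holds the parameter term fixed, which is a simplification. The constants `R_D_STAR`, `B` and `BETA` are placeholders chosen for illustration, not the fitted values, and the function names are hypothetical, not taken from the repo.

```python
import math

# Placeholder constants for illustration only; NOT the fitted values.
R_D_STAR = 15.0  # assumed constant controlling how fast repeated data loses value
B = 1.0          # assumed data-term coefficient
BETA = 0.35      # assumed data-term exponent


def effective_data(total_tokens: float, unique_tokens: float) -> float:
    """Effective data D' = U_d + U_d * R* * (1 - exp(-R_d / R*))."""
    repeats = total_tokens / unique_tokens - 1.0  # R_d: epochs beyond the first
    return unique_tokens * (1.0 + R_D_STAR * (1.0 - math.exp(-repeats / R_D_STAR)))


def data_term(total_tokens: float, unique_tokens: float) -> float:
    """Data-dependent part of the predicted loss, B / D'^beta."""
    return B / effective_data(total_tokens, unique_tokens) ** BETA


D = 1.0  # total training tokens, in arbitrary units
gain_7_to_5 = data_term(D, D / 7) - data_term(D, D / 5)
gain_3_to_1 = data_term(D, D / 3) - data_term(D, D / 1)
print(f"U_d = D/7 -> D/5 lowers the data term by {gain_7_to_5:.4f}")
print(f"U_d = D/3 -> D/1 lowers the data term by {gain_3_to_1:.4f}")
```

Only the ordering of the two gains is meaningful here; the absolute numbers depend entirely on the placeholder constants.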

In Figure 5 (left), the loss predictions from the formula are all shifted by a constant so that they match the actual loss at 100% unique data. The point of this figure is to show how well the trend matches, not the actual predicted loss values. Hence the caption says "Loss curves predicted by our data-constrained scaling laws are shifted to exactly match the loss at 100% unique data.", but maybe this isn't made clear enough. Let me know if you have an idea for making this clearer in the paper / the repo! 🧐
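The constant shift can be sketched as follows. This is an assumed reading of the caption, not the plotting code from the repo; `shift_to_match` is a hypothetical helper and all loss values are made-up numbers.

```python
from typing import Dict


def shift_to_match(predicted: Dict[float, float], actual_at_full: float) -> Dict[float, float]:
    """Add one constant to every prediction so the 100% unique-data point matches."""
    offset = actual_at_full - predicted[1.0]
    return {fraction: loss + offset for fraction, loss in predicted.items()}


# Hypothetical predicted losses, keyed by fraction of unique data.
predicted = {1.0: 2.10, 0.5: 2.13, 0.25: 2.19, 0.1: 2.35}
actual_at_full = 2.40  # hypothetical measured loss at 100% unique data

shifted = shift_to_match(predicted, actual_at_full)
for fraction in sorted(shifted, reverse=True):
    print(f"{fraction:>5.0%} unique: shifted prediction {shifted[fraction]:.2f}")
```

Because every point moves by the same offset, the gaps between data budgets are unchanged, which is why the shifted curves still test the trend rather than the absolute values.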
