Comments (4)
Hi VespaLan,
The ABML method models the parameters of the base model with a multivariate normal distribution with a diagonal covariance matrix: w ~ p(w | theta), where theta = (mean, std). To obtain theta, we initialize it and then train. In addition, the original paper includes a hyper-prior p(theta): a normal-gamma distribution (commonly seen in statistics as a conjugate prior) that regularizes theta.
In my implementation, I was lazy and did not include this hyper-prior. What I did instead was set L2 regularization for theta in the optimizer; in that case, the hyper-prior is effectively a "normal-normal" distribution. If you want to include the normal-gamma hyper-prior as stated in the original ABML paper, you can add it to the meta_loss right before calling meta_loss.backward().
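As one possible way to do that, here is a minimal PyTorch sketch of a negative log normal-gamma term that could be added to the meta-loss. The function name and the hyper-parameter values (mu0, kappa, a, b) are illustrative placeholders, not the repo's code or the paper's Table 4 values; tau denotes the precision, i.e. 1/std**2.

```python
import torch

def normal_gamma_neg_log_prior(mu, tau, mu0=0.0, kappa=1.0, a=1.0, b=0.01):
    """Negative log-density (up to a constant) of a Normal-Gamma prior
    on (mu, tau), where tau is the precision of each meta-parameter.
    Hyper-parameter values here are placeholders, not the paper's."""
    # Unnormalized log Gamma(tau | a, b): (a - 1) * log(tau) - b * tau
    log_gamma = (a - 1.0) * torch.log(tau) - b * tau
    # Unnormalized log N(mu | mu0, 1 / (kappa * tau))
    log_normal = 0.5 * torch.log(kappa * tau) - 0.5 * kappa * tau * (mu - mu0) ** 2
    return -(log_gamma + log_normal).sum()
```

You would then add something like `meta_loss = meta_loss + normal_gamma_neg_log_prior(mean, std.pow(-2))` right before `meta_loss.backward()`, where `mean` and `std` are the two halves of theta.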
Let me know if you still have any further concerns about the implementation.
from few_shot_meta_learning.
Many thanks for your reply! I am new to the field of meta-learning, and that helps a lot. Yes, I still have questions about the details. Here is my intuition about implementing the normal-gamma distribution, and I wonder if it is correct: I found that in the paper, alpha and beta are not updated the way theta is. So perhaps I just have to randomly sample these parameters for the initialization of theta? (Or should I sample them every time theta updates, which would make this work like weight decay?)
Hi @VespaLan
I am not sure I understand you. The parameters a and b of the normal-gamma hyper-prior (alpha and beta in your case) are hyper-parameters, and they are chosen by hand (see Table 4 in the Appendix of the ABML paper). To initialize theta, you can either sample from the hyper-prior or initialize randomly.
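Sampling theta from the hyper-prior could look roughly like the following PyTorch sketch. The function name and the hyper-parameter values are made up for illustration (pick a and b from the paper's Table 4 instead); it samples a precision tau from a Gamma distribution and then a mean conditioned on tau, which is the generative structure of a normal-gamma prior.

```python
import torch

def init_theta_from_hyperprior(shape, mu0=0.0, kappa=1.0, a=2.0, b=0.1):
    """Sample an initial theta = (mean, std) per base-model parameter
    from a Normal-Gamma hyper-prior. Hyper-parameter values are
    illustrative, not the ABML paper's."""
    # tau ~ Gamma(a, b): precision of each meta-parameter
    tau = torch.distributions.Gamma(a, b).sample(torch.Size(shape))
    # mean ~ N(mu0, 1 / (kappa * tau)), conditioned on the sampled tau
    mean = torch.distributions.Normal(mu0, (kappa * tau).rsqrt()).sample()
    # std is recovered from the precision: std = 1 / sqrt(tau)
    std = tau.rsqrt()
    return mean, std
```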
Oh, I didn't notice that alpha and beta are given in the appendix... I see. Now I fully understand. Thank you so much for the quick and helpful response!