Comments (4)
> Rather, I meant whether there could be some improvement when some parameters' information tends to 0. But I understand better now that it's something very tricky to achieve.
Tricky indeed! Also consider that BADS is orders of magnitude faster than other Bayesian optimization methods (which unfortunately means that there is not much time to do fancy analyses between iterations).
> I am quite convinced that it doesn't make much sense, but I still wonder if heuristically disregarding exactly non-informative parameters would be an option to consider.
It's something worth considering in general. For example, finding a "principal" subspace of active parameters is one of the approaches used to perform Bayesian optimization in high-dimension. So, a smart approach might be a way to increase the dimensionality of the problems BADS can tackle.
On the other hand, we don't want this feature to cause issues in other circumstances. In many real-case scenarios, a parameter might be "unused" in certain regions but become active elsewhere, so it's hardly an on-off inference; and if a parameter is exactly unused, that information should probably come from the user. In fact, for the record, requiring users to actively provide what they know about their problem is a positive thing: there is no such thing as a truly black-box scenario; we often know more about the problem, and there is no reason to withhold this information from the algorithm.
from bads.
Thanks for the comment!
First, for those who are not aware: BADS explicitly supports "unused" parameters, in that you can manually tell BADS to ignore certain parameters by setting their lower and upper bounds to the same fixed value (see here).
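To illustrate the idea (a toy sketch with made-up bounds and a made-up objective, not code from BADS itself), fixing a parameter amounts to pinning its lower bound, upper bound, and starting point to the same value, so the optimizer only searches the remaining dimensions:

```python
import numpy as np

def objective(x):
    """Toy objective: only the first two coordinates matter."""
    return (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2

# Fix the third parameter by setting lower == upper == x0 for it.
# A BADS-style optimizer can then treat that dimension as a constant.
lower = np.array([-5.0, -5.0, 2.0])
upper = np.array([ 5.0,  5.0, 2.0])
x0    = np.array([ 0.0,  0.0, 2.0])

fixed = lower == upper   # boolean mask of fixed dimensions
free = ~fixed            # dimensions the optimizer actually searches
```

Only the `free` dimensions are explored; the fixed one is carried along unchanged in every evaluation.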
However, here you mean automated detection of unused (or less informative) parameters. That's in theory what "automatic relevance determination" (ARD) kernels do in Gaussian process regression; a dimension that has little or no impact on the target function would have a very large length scale (that is, the function would not change much along that dimension).
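The ARD behaviour is easy to see with any GP library (a sketch using scikit-learn's anisotropic RBF kernel, not BADS's actual GP internals): fit a target that ignores its second input, and the fitted length scale of that dimension grows much larger than the active one's.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(40, 2))
y = np.sin(3.0 * X[:, 0])          # the target ignores dimension 1

# Anisotropic (ARD) RBF: one length scale per input dimension.
kernel = RBF(length_scale=[1.0, 1.0])
gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-6, normalize_y=True)
gp.fit(X, y)

ls = gp.kernel_.length_scale       # fitted length scales after ML-II
# The inactive dimension ends up with a (much) larger length scale:
# the GP has learned that the function barely changes along it.
```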
And that's more or less what BADS will infer and make use of during the optimization (it uses an ARD kernel). So why does performance decrease if you add "non-informative" dimensions?
Consider that:
- BADS only builds a local GP approximation of the target function. Thus, global properties need to be re-learnt locally, and BADS might keep checking that a parameter still has no influence. This might seem like an issue, but it is in fact a major perk of the algorithm, as it can deal successfully with non-stationarity and other common features that easily break a global, stationary GP.
- There is more going on in BADS than GP regression, so I can see a few parts in which a non-informative dimension would still negatively affect the search and slow down the optimization.
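The first point can be sketched roughly as follows (a toy local surrogate; BADS's actual training-set selection also involves the current mesh size and other criteria): the GP is refit on the evaluations near the incumbent, so anything "global", like a dimension being inactive everywhere, has to be re-learnt each time the local training set changes.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def fit_local_gp(X, y, incumbent, k=10):
    """Fit a GP surrogate on the k evaluated points closest to the incumbent.

    A rough sketch of a local surrogate, not BADS's actual selection rule.
    """
    d = np.linalg.norm(X - incumbent, axis=1)
    idx = np.argsort(d)[:k]                      # k nearest evaluations
    gp = GaussianProcessRegressor(kernel=RBF([1.0] * X.shape[1]),
                                  alpha=1e-6, normalize_y=True)
    gp.fit(X[idx], y[idx])
    return gp, idx

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(50, 2))
y = np.abs(X[:, 0])          # non-stationary target (kink at the origin)
gp, idx = fit_local_gp(X, y, incumbent=np.zeros(2), k=10)
```

A global stationary GP would struggle with the kink in this target, while the local model only has to be accurate near the incumbent.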
In short, I don't find it surprising that BADS performs worse when more (non-informative) dimensions are provided. That's the cost of not giving the algorithm our prior information that those dimensions are actually useless globally. The rest of what you are suggesting already happens: BADS tends to follow the most promising directions (although of course there is room for improvement).
Finally, the issue of "unused" parameters is going to affect VBMC even more: VBMC infers posteriors, so there is no room for "non-informative" dimensions (that is, even dimensions in which the likelihood is flat need to have a non-trivial posterior). Indeed, there may be non-informative directions in the likelihood, but for a number of technical reasons (involving Bayesian quadrature), for maximal flexibility the GP needs to be placed on the log posterior (and not on the log likelihood).
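To make that last point concrete (a toy grid computation assuming a standard-normal prior; nothing here is from VBMC's code): when the likelihood is flat along a dimension, the posterior along that dimension simply reverts to the prior, so it is still a perfectly well-defined, non-trivial distribution that a GP on the log posterior has to capture.

```python
import numpy as np

# 1-D grid along the "unused" dimension.
theta = np.linspace(-5.0, 5.0, 1001)
dx = theta[1] - theta[0]

log_prior = -0.5 * theta**2        # standard normal prior, up to a constant
log_lik = np.zeros_like(theta)     # likelihood is flat along this dimension

log_post = log_prior + log_lik
post = np.exp(log_post - log_post.max())
post /= post.sum() * dx            # normalize on the grid

prior = np.exp(log_prior - log_prior.max())
prior /= prior.sum() * dx
# post equals prior: the flat likelihood direction inherits the prior,
# so the posterior there is non-trivial and must still be modeled.
```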
Thanks for the explanatory response :)
> First, for those who are not aware, BADS supports explicitly "unused" parameters in that you can manually tell BADS to ignore certain parameters by setting the upper/lower bounds to some fixed value (see here).
Yes, I am using this already, but from a practical point of view it would be great if BADS could account for it "automagically".
> The rest you are suggesting already happens; BADS tends to follow the most promising directions (although of course there is space for improvement).
Yes, sorry, I didn't mean it doesn't. It works great in that regard actually. Rather, I meant whether there could be some improvement when some parameters' information tends to 0. But I understand better now that it's something very tricky to achieve.
I am quite convinced that it doesn't make much sense, but I still wonder if heuristically disregarding exactly non-informative parameters would be an option to consider.
Yes, I agree with everything you said.