jamesrobertlloyd / gpss-research

Kernel structure discovery research code - likely to be unstable

License: MIT License
À la Bayesian Data Analysis 3
For speed, but then compute the NLL using the full data - randomising the subset should also guard against an unlucky subset being chosen at the beginning of the search
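A minimal sketch of this, assuming hypothetical `optimise_hypers` and `nll` callables in place of the repo's actual optimisation and evaluation routines:

```python
import numpy as np

def score_kernel(kernel, X, y, optimise_hypers, nll, subset_size=250, rng=None):
    """Optimise hyperparameters on a random subset for speed, then
    score with the negative log likelihood on the full data."""
    rng = rng or np.random.default_rng()
    n = X.shape[0]
    if n > subset_size:
        # A fresh random subset per call guards against one unlucky
        # subset steering the whole search.
        idx = rng.choice(n, size=subset_size, replace=False)
        hypers = optimise_hypers(kernel, X[idx], y[idx])
    else:
        hypers = optimise_hypers(kernel, X, y)
    return nll(kernel, hypers, X, y)
```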
Does the search multiply by Lin again?
Is the jitter size correct? (Too big and we lose optimised values; too small and spurious Lins will appear again)
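A sketch of the jittering step in question; the 0.1 scale is an assumption, not the repo's value:

```python
import numpy as np

def jitter_hypers(hypers, scale=0.1, rng=None):
    """Perturb optimised (log-space) hyperparameters before a restart.
    Too large a scale destroys the optimised values; too small and the
    optimiser falls back into the same spurious optima (e.g. Lins)."""
    rng = rng or np.random.default_rng()
    return np.asarray(hypers) + scale * rng.standard_normal(len(hypers))
```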
e.g. when data is known to be on a regular grid but is sparse
Only the constant kernel should - fits better with symbolic regression grammars
Need to look at gradient - can't just re-use SE*something logic
Before computing stats, Python can work out where a kernel applies - potentially even for a sum of kernels
e.g. should consider the expansion A + B + C -> (A + B) * D + C
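A sketch of generating such expansions, with kernels represented as nested tuples ('sum', ...) / ('prod', ...) purely for illustration:

```python
from itertools import combinations

def subset_product_expansions(summands, new_kernels):
    """From A + B + C, propose expansions such as (A + B) * D + C by
    multiplying each subset of two or more summands by a new kernel.
    (Size-one subsets are already covered by the usual S -> S * B move.)"""
    n = len(summands)
    for r in range(2, n):
        for subset in combinations(range(n), r):
            chosen = tuple(summands[i] for i in subset)
            rest = tuple(summands[i] for i in range(n) if i not in subset)
            for d in new_kernels:
                yield ('sum', ('prod', ('sum',) + chosen, d)) + rest

# list(subset_product_expansions(('A', 'B', 'C'), ('D',))) includes
# ('sum', ('prod', ('sum', 'A', 'B'), 'D'), 'C'), i.e. (A + B) * D + C
```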
This will likely tidy up some duplicate code, since we know that all operators have operands, etc.
Also - if we record properties like commutativity / distributivity etc., we can abstract their behaviour.
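A sketch of what that abstraction could look like (class names are assumptions, not the repo's actual types):

```python
class KernelOperator:
    """Every operator carries its operands; algebraic properties are
    recorded as class attributes so canonicalisation, simplification
    and hashing can treat all operators uniformly."""
    commutative = False
    distributes_over = ()

    def __init__(self, operands):
        self.operands = list(operands)

    def canonical(self):
        # Sorting the operands of a commutative operator yields a
        # canonical form, which also makes duplicate detection reliable.
        ops = [o.canonical() if isinstance(o, KernelOperator) else o
               for o in self.operands]
        if self.commutative:
            ops = sorted(ops, key=repr)
        return type(self)(ops)

class SumOperator(KernelOperator):
    commutative = True

class ProductOperator(KernelOperator):
    commutative = True
    distributes_over = (SumOperator,)
```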
Use bsxfun and generally optimise the formulae
Earthquakes
EEG
Changepoint papers?
Fault detection papers?
Multiresolution paper?
Fix some of the current data sets that were subsampled.
e.g. product kernel incorrectly uses output_variance
Whoops!
Bunch all the other components together when demonstrating a decomposition
Not variance - change the text accordingly
Changepoints etc. should select a dimension to act upon (but should pass all data shape and variables downstream)
The 10 fold cross validation needs to be updated
The new data shape parameters need to behave correctly
Should sometimes be very large, e.g. twice the data range - this is the neutral value (i.e. infinity)
Most aspects of the algorithm can scale appropriately, but it is hard to control everything - should we just standardise the data before running the search?
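A sketch of that preprocessing, under the assumption that the search operates on the standardised data and predictions are transformed back afterwards:

```python
import numpy as np

def standardise(X, y):
    """Standardise inputs and outputs before the search so that default
    hyperparameter scales are sensible for any data set; return the
    statistics needed to undo the transform on predictions."""
    X_mean, X_std = X.mean(axis=0), X.std(axis=0)
    y_mean, y_std = y.mean(), y.std()
    return ((X - X_mean) / X_std,
            (y - y_mean) / y_std,
            (X_mean, X_std, y_mean, y_std))
```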
e.g. SE * Per - the posterior mean tends to zero, but this is just due to uncertainty about the period
In general, SE*Per should talk about a range of plausible periods
Periodic components seem the most difficult to find - probably requiring good initial values of hyperparameters
Might help for parsimony but does not feel like the right way forward
% signs!
A teaser for learning output warping or a demonstration of deficiency?
How would we compare marginal likelihoods? Check out the warped GP paper.
Rather than the broad mixture that is RQ - maybe try a tighter mixture e.g. a Gaussian centred on a particular lengthscale?
If the sum of two kernel components dramatically reduces uncertainty (at points where the uncertainty is greater than zero, e.g. blackouts / changepoints), then these components probably belong together, e.g. A + A + B -> 2A + B
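A heuristic sketch of that test; the threshold is an assumed cutoff on the relative variance reduction:

```python
import numpy as np

def should_merge(var_i, var_j, cov_ij, threshold=0.5):
    """If two components are strongly anticorrelated, the pointwise
    posterior variance of their sum, Var(f_i + f_j), is far below
    Var(f_i) + Var(f_j), and they probably belong together."""
    var_sum = var_i + var_j + 2.0 * cov_ij   # Var(f_i + f_j)
    naive = var_i + var_j                    # if the two were independent
    mask = naive > 1e-8                      # only where uncertainty > 0
    reduction = 1.0 - var_sum[mask] / naive[mask]
    return np.mean(reduction) > threshold
```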
Reparametrise it
It is optimised along with the other parameters and should be treated similarly (e.g. jitter)
e.g. for the Const kernel, which does not depend on dimension
Alternatively - we should not always use masks - only when appropriate
One of these solutions is needed to make hashing in multi-d correct
Location should place mass outside of data range since this is what results in linearly increasing / decreasing variance
e.g. multiplying by const, SE*SE
Safest to do when in additive form since it won't affect the search
Derivative w.r.t. location can be inf * 0 - this can be fixed either by changing the order of calculation or by thresholding quantities at realmax
Another (fiddly) way would be to use signed log transforms
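A sketch of the realmax-thresholding option (the blunt fix; reordering the calculation is cleaner):

```python
import numpy as np

REALMAX = np.finfo(float).max

def safe_product(a, b):
    """Clamp potentially infinite factors to realmax so that inf * 0
    becomes realmax * 0 = 0 instead of NaN."""
    return np.clip(a, -REALMAX, REALMAX) * np.clip(b, -REALMAX, REALMAX)
```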
Need to restrict to relevant regions - plug in a constant to the changepoints to see where the variance should be measured?
/scratch/home/Research/GPs/gpss-research/experiments/2013-09-26.py
( M(0, SE(ell=0.3, sf=5.5)) + ( M(0, FT(ell=-1.9, p=-0.0, sf=3.2)) x M(0, LN(off=-0.8, ell=1.3, loc=1950.7)) ) )
yielded as much as
( M(0, SE(ell=0.3, sf=5.5)) + M(0, FT(ell=-1.9, p=-0.0, sf=3.2)) )
Seems wrong - but I might have been mistaken
Search operators
Whether or not to include the MAE-best kernel in the search as well as the marginal-likelihood best
Will increase the need for anticorrelated component detection (and combination), since Laplace will recognise this as being OK
Is there a bug in the change window expansion code?
Better to think of it as variable phase - comment about the period in terms of Fourier transforms
Or only when a few full data iters are done as well?
The lengthscale parameter is scale invariant (relative to the period), so a default value of zero makes sense - also check that no out-of-bounds checks are made
It is also numerically unstable - past a certain lengthscale the function might as well just call CovCos
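A sketch of that fallback for a centred periodic kernel; the cutoff and the grid approximation of the per-period mean are assumptions:

```python
import numpy as np

ELL_MAX = 5.0  # assumed cutoff; past this the exact formula loses precision

def centred_periodic(tau, ell, period):
    """Centred (zero-mean over one period) periodic covariance. In the
    large-lengthscale limit (k - m) / (1 - m) -> cos(2*pi*tau/period),
    but computing it directly suffers catastrophic cancellation, so
    past ELL_MAX we just call the cosine."""
    if ell > ELL_MAX:
        return np.cos(2.0 * np.pi * tau / period)
    k = np.exp(-2.0 * np.sin(np.pi * tau / period) ** 2 / ell ** 2)
    grid = np.linspace(0.0, period, 200, endpoint=False)
    m = np.mean(np.exp(-2.0 * np.sin(np.pi * grid / period) ** 2 / ell ** 2))
    return (k - m) / (1.0 - m)
```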