
Comments (9)

matthewcarbone commented on May 18, 2024

Sounds good! I'll try and think about this some more when I can and respond with anything I come up with πŸ‘

from talos.

matthewcarbone commented on May 18, 2024

Ok so there are a couple things I've been thinking about, and a few layers of computational complexity to worry about. I'll elaborate.

We cannot really choose the best HP point with confidence in Talos' current state. It certainly works, but perhaps not to the degree some users would want. The reason is twofold. Consider a data set and a train/validation/test split.

  • First (assume testing data is set aside), often the validation loss / accuracy is highly dependent on the chunk of data chosen to be the validation set. Talos will currently only give a ballpark answer to the question of "which HP is best" for this reason. A statistical average over these different splits is often necessary. This is of course k-fold cross-validation, and it is definitely an option we should give users the ability to implement.
  • Second, I have also seen in my own work that even for a fixed HP point and a fixed k-fold split, the random initialization of the weights sometimes has an impact on the answer. In my experience this occurs more often in CNNs, but regardless, it is still another layer of statistical averaging that should in principle be done before any conclusions can be drawn.

You see where I'm going with this! If Talos currently runs in O(N) time, where N is the number of HP permutations, then in a perfect world, to be confident about our choice of "best HP", we would need brute-force O(NKL) time, where K is the number of folds and L is the number of random initializations we average over.
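
The fully averaged search described above can be sketched as a triple loop: one model fit per (permutation, fold, seed) triple, hence O(NKL) fits. This is an illustrative sketch, not Talos code; `score_fn` is a hypothetical stand-in for a real train-and-evaluate routine.

```python
def exhaustive_search(params, score_fn, k_folds=5, n_seeds=3):
    """Score every permutation, averaged over folds and seeds: O(N*K*L) fits."""
    averaged = {}
    for p in params:                           # N permutations
        scores = [score_fn(p, fold, seed)      # one model fit per triple
                  for fold in range(k_folds)   # K folds
                  for seed in range(n_seeds)]  # L random initializations
        averaged[p] = sum(scores) / len(scores)
    # "best" HP only after full statistical averaging
    return max(averaged, key=averaged.get)

# Toy scorer where permutation 'b' is best on average.
toy = {'a': 0.7, 'b': 0.9, 'c': 0.8}
best = exhaustive_search(list(toy), lambda p, fold, seed: toy[p])
# best == 'b'
```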

So I guess my comment on this discussion would be: should we focus on a Validator() yet, or should we try to be smarter about directing the flow of Scan() so that it finds the optimal HP sooner? Perhaps one of the search methods mentioned in previous issues? I dunno, @mikkokotila, but let's discuss before we start doing more work. πŸ˜„

By the way, this is where I bring in hardware accelerators, since it is much too slow on CPUs.

Google Colab is an amazing option for anyone who doesn't have access to an HPC cluster!


mikkokotila commented on May 18, 2024

I agree with everything you say above. Let's figure out the optimization layer first, and then move to validating. With this in mind, I'm working on a major overhaul / refactoring of the codebase so that it is less anxiety-inducing to make major changes. Scan() is already completely cleaned, the param handling is completely rebuilt, as is reductions. These are the three things that play some role in the optimization aspect.

As you may have noted, in the initial architecture I've assumed an approach where:

  1. a scan is started after downsampling the permutation space (the first reduction)
  2. reducers can be applied as the scan progresses
  3. each time a reducer is applied, the number of available permutations is lowered

The idea is that we could have many different strategies, which all take as input the results from the previous rounds (from the experiment log) and based on that input reduce the complexity of the rest of the experiment. This in effect happens by removing select items from self.param_log.
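
That pruning idea can be sketched in a few lines. This is a hypothetical illustration of the mechanism, not Talos' actual implementation: a strategy reads the experiment log so far and removes the permutations it deems unpromising from the remaining param log.

```python
def apply_reducer(param_log, experiment_log, strategy):
    """Remove permutations a strategy rules out based on past rounds."""
    to_drop = strategy(experiment_log)   # ids chosen from results so far
    return [p for p in param_log if p not in to_drop]

# Example strategy: drop the worst-scoring half of the permutations tried so far.
def drop_worst_half(experiment_log):
    ranked = sorted(experiment_log, key=experiment_log.get)  # ascending score
    return set(ranked[:len(ranked) // 2])

log = {1: 0.60, 2: 0.85, 3: 0.70, 4: 0.90}  # permutation id -> val score
param_log = [1, 2, 3, 4, 5, 6]              # 5 and 6 not yet tried
param_log = apply_reducer(param_log, log, drop_worst_half)
# param_log == [2, 4, 5, 6]
```

Each subsequent round then draws only from the surviving permutations, so the experiment's remaining complexity shrinks every time a reducer fires.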

I've written an article, which should be ready to publish in the next few days, that goes a little deeper into the reasoning for this approach (as opposed to the approach where random/grid search is considered taxonomically parallel to something like Bayesian optimization).

Google Colab seems amazing, will definitely try it! :)


mikkokotila commented on May 18, 2024

This is related to #17 where some additional comments can be found.


matthewcarbone commented on May 18, 2024

> I agree with everything you say above. Let's figure out the optimization layer first, and then move to validating. With this in mind, I'm working on a major overhaul / refactoring of the codebase so that it is less anxiety-inducing to make major changes. Scan() is already completely cleaned, the param handling is completely rebuilt, as is reductions.

That is awesome news. I stuck a TODO in Scan() for precisely this reason. Can't wait to see it!

> I've written an article, which should be ready to publish in the next few days, that goes a little deeper into the reasoning for this approach (as opposed to the approach where random/grid search is considered taxonomically parallel to something like Bayesian optimization).

Fantastic! We may want to begin linking things in the wiki or something πŸ‘

> Google Colab seems amazing, will definitely try it! :)

Please do. It really lowers the barrier to entry for this kind of work, which I feel is incredibly important to the scientific community. There are lots of smart people out there who want to do ML but don't have the firepower to train deep networks. If you need any help figuring it out, feel free to email me or something. Took me a bit to figure it all out πŸ˜„, no need for both of us to waste time!

In any case, after your refactoring I'll reread anything and help you clean up. Then we can move forward!


mikkokotila commented on May 18, 2024

This is a nice article (with a comprehensive collection) on the metrics topic:

What’s WRONG with Metrics?


mikkokotila commented on May 18, 2024

@x94carbone just a heads up that this is moving :) It seems that saving the model does not need any messing around with tf session / graph objects: we can just save the model as JSON inside a list in the Scan() object, and the model weights in a separate list in the Scan() object. Then we load the model from the JSON and set its weights from the corresponding stored weights. This is the same way one would do it from a file. Very clean. I will start testing this now.
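
A minimal sketch of that storage pattern, assuming `tensorflow.keras` (the list names here are illustrative, not Talos attributes): architecture as a JSON string, weights as a plain list of arrays, then a round trip back to a usable model.

```python
import numpy as np
from tensorflow.keras.models import Sequential, model_from_json
from tensorflow.keras.layers import Dense, Input

model = Sequential([Input(shape=(8,)),
                    Dense(4, activation='relu'),
                    Dense(1, activation='sigmoid')])

saved_models = [model.to_json()]       # architecture only, as a JSON string
saved_weights = [model.get_weights()]  # list of numpy arrays, no tf session needed

# Later: rebuild the model from the stored pieces.
restored = model_from_json(saved_models[0])
restored.set_weights(saved_weights[0])

# The restored weights match the originals exactly.
assert all(np.array_equal(a, b)
           for a, b in zip(model.get_weights(), restored.get_weights()))
```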

I will then move on to implementing k-fold cross-validation for the "best model", which we can then use to build towards the discussion we've had, i.e. several best models being cross-validated and competed against each other in some meaningful way across various sampling methods.


mikkokotila commented on May 18, 2024

Well well. We have now implemented f1-score based k-fold cross-validation. If you look at /utils/predict the whole thing becomes quite apparent. The workflow is very simple. After you have concluded the experiment with Scan():

    p = ta.Predict(s)
    p.evaluate(x, y, average='macro')

Here 's' is the Scan object, and x and y are the cross-validation data. In this case it's multi-class (i.e. y dims > 1), so average is set to 'macro'.
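
For intuition, a rough sketch of what a k-fold macro-f1 evaluation does under the hood, using scikit-learn's KFold and f1_score (this is an illustration of the idea, not Talos' actual implementation; `model_fn` is a hypothetical factory returning a fresh, unfitted model):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import f1_score

def kfold_f1(model_fn, x, y, folds=5, average='macro'):
    """Mean f1 over k folds, refitting a fresh model on each fold."""
    scores = []
    kf = KFold(n_splits=folds, shuffle=True, random_state=0)
    for train_idx, val_idx in kf.split(x):
        model = model_fn()                       # fresh model each fold
        model.fit(x[train_idx], y[train_idx])
        preds = model.predict(x[val_idx])
        scores.append(f1_score(y[val_idx], preds, average=average))
    return float(np.mean(scores))

# Toy multi-class usage with a simple classifier.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

x, y = make_classification(n_samples=200, n_classes=3,
                           n_informative=4, random_state=0)
score = kfold_f1(lambda: LogisticRegression(max_iter=500), x, y)
```

Averaging over folds like this addresses the validation-split sensitivity discussed earlier in the thread.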

TODO:

  • add a data-splitting facility that supports the workflow, including both Scan and evaluation
  • add a proper Validation layer which takes several best models and does what Predict.evaluate is doing now


mikkokotila commented on May 18, 2024

This is now available through Evaluate() and Autom8(). Closing here.

