michaelhush / m-loop
M-LOOP: Machine-learning online optimization package
Home Page: http://m-loop.readthedocs.io/en/latest/
License: MIT License
Hello,
I'm using M-LOOP with my experiment and so far I've had great results with the gaussian_learner.
I have run into a few issues with the neural_net controller though. It looks like the neuralnet.py uses some deprecated tensorflow features. In particular, when attempting to optimize with it using tensorflow 2.0.0 I get the following error:
Process NeuralNetLearner-1:
Traceback (most recent call last):
File "C:\Users\user_name\.conda\envs\labscript\lib\multiprocessing\process.py", line 297, in _bootstrap
self.run()
File "C:\Users\user_name\.conda\envs\labscript\lib\site-packages\mloop\learners.py", line 1869, in run
self.create_neural_net()
File "C:\Users\user_name\.conda\envs\labscript\lib\site-packages\mloop\learners.py", line 1619, in create_neural_net
n.init()
File "C:\Users\user_name\.conda\envs\labscript\lib\site-packages\mloop\neuralnet.py", line 493, in init
self.net = self._make_net(self.last_net_reg)
File "C:\Users\user_name\.conda\envs\labscript\lib\site-packages\mloop\neuralnet.py", line 431, in _make_net
return SampledNeuralNet(creator, 1)
File "C:\Users\user_name\.conda\envs\labscript\lib\site-packages\mloop\neuralnet.py", line 298, in __init__
self.nets = [self.net_creator() for _ in range(count)]
File "C:\Users\user_name\.conda\envs\labscript\lib\site-packages\mloop\neuralnet.py", line 298, in <listcomp>
self.nets = [self.net_creator() for _ in range(count)]
File "C:\Users\user_name\.conda\envs\labscript\lib\site-packages\mloop\neuralnet.py", line 430, in <lambda>
self.losses_list)
File "C:\Users\user_name\.conda\envs\labscript\lib\site-packages\mloop\neuralnet.py", line 60, in __init__
self.tf_session = tf.Session(graph=self.graph)
AttributeError: module 'tensorflow' has no attribute 'Session'
After some Googling, it seems that this is a deprecated way to use tensorflow. One possible workaround is to replace import tensorflow as tf with the following in neuralnet.py:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
That prevented the error and allowed the optimization to proceed, but it produced the following warnings:
WARNING:tensorflow:From C:\Users\user_name\.conda\envs\labscript\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:65: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2020-02-26 18:24:16.228468: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations: AVX
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
2020-02-26 18:24:16.230358: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 4. Tune using inter_op_parallelism_threads for best performance.
So it looks like that workaround will likely stop working in a future version of tensorflow. Maybe there is a better way to fix it?
For reference, I have gotten the neural_net controller to work with tensorflow 1.15.0, although it does issue several deprecation warnings.
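One way to keep the workaround from breaking on installs that still have tensorflow 1.x is to choose the import at run time. The following is only a sketch: load_tf_v1_api is a hypothetical helper, not part of M-LOOP, and it assumes that any tensorflow exposing Session at the top level is a 1.x release. It does not address the longer-term deprecation of the compatibility layer.

```python
def load_tf_v1_api():
    """Return a module that exposes the TensorFlow 1.x API, or None.

    Hypothetical helper for illustration; not part of M-LOOP.
    """
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed, so there is nothing to adapt.
        return None

    if hasattr(tf, "Session"):
        # TensorFlow 1.x: the graph/session API is available directly.
        return tf

    # TensorFlow 2.x: fall back to the compatibility module and switch
    # off the 2.x defaults, as in the workaround above.
    import tensorflow.compat.v1 as tf_v1
    tf_v1.disable_v2_behavior()
    return tf_v1


tf = load_tf_v1_api()
if tf is not None:
    print("Session API available:", hasattr(tf, "Session"))
```

With this pattern the rest of the module can keep calling tf.Session(...) unchanged on either major version.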
Cheers,
Zak
Describe the bug
I noticed that M-LOOP seems to overestimate the uncertainty in the presence of noise, both in real experiments and on synthetic data. To illustrate this, I try to optimize an "experiment" in which I can easily control the level of noise. While M-LOOP is able to fit the function quite well, the predicted uncertainty greatly exceeds the spread of the observed data. Furthermore, if I fit the data "by hand" using the hyperparameters from M-LOOP, the uncertainty seems more realistic. I am worried that this will affect the performance in the real experiment, so I wanted to ask whether this behavior is to be expected.
To Reproduce
# Imports for M-LOOP
import mloop.interfaces as mli
import mloop.controllers as mlc
import mloop.visualizations as mlv
import mloop.utilities as mlu
# sklearn imports
import sklearn.gaussian_process as skg
import sklearn.gaussian_process.kernels as skk
import numpy as np
import matplotlib.pyplot as plt
import sys

# Noise level is passed as the first command-line argument
noise_level = float(sys.argv[1])
input_dict = {
    'max_num_runs' : 30,
    'num_params' : 1,
    'min_boundary' : [0],
    'max_boundary' : [1],
    'cost_has_noise' : True
}
# Tabulated noise-free cost landscape on the interval [0, 1]
with open("cost.npy", 'rb') as file:
    cost_array = np.load(file)
params_array = np.linspace(0, 1, len(cost_array))

def cost_fct(x, xp, yp, noise):
    # Interpolate the tabulated cost and add Gaussian noise
    res = np.interp(x, xp, yp)
    res += np.random.normal(0, noise)
    return res

class CustomInterface(mli.Interface):
    def __init__(self):
        super(CustomInterface,self).__init__()
    def get_next_cost_dict(self,params_dict):
        params = params_dict['params']
        cost = cost_fct(params[0], params_array,cost_array, noise_level)
        uncer = 0
        bad = False
        cost_dict = {'cost':cost, 'uncer':uncer, 'bad':bad}
        return cost_dict

def main():
    print(cost_array)
    interface = CustomInterface()
    controller = mlc.create_controller(interface, **input_dict)
    controller.optimize()
    # visualization
    vis = mlv.GaussianProcessVisualizer(controller.ml_learner.total_archive_filename)
    vis.plot_cross_sections()
    # plotting
    plt.figure(1)
    plt.scatter(controller.out_params, controller.in_costs, label="sampled data")
    plt.plot(params_array, cost_array, '-', label="true data")
    plt.title("Landscape M-LOOP (noise = %0.3f)" % noise_level)
    plt.legend()
    # manual fit
    gp_kernel = skk.RBF(vis.length_scale)
    gp_kernel += skk.WhiteKernel(vis.noise_level)
    # Note: alpha is computed but not passed to the regressor below
    alpha = vis.all_uncers**2
    gaussian_process = skg.GaussianProcessRegressor(kernel=gp_kernel, n_restarts_optimizer=10)
    gaussian_process.fit(vis.all_params,vis.all_costs)
    params = np.linspace(0, 1, 100).reshape(-1, 1)
    (cost, uncer) = gaussian_process.predict(params, return_std=True)
    # plotting
    plt.figure(2)
    plt.title("Landscape (noise = %0.3f)" % noise_level)
    plt.plot(params, cost, 'r-', label="fit")
    plt.plot(params,cost+uncer, 'r--')
    plt.plot(params, cost-uncer, 'r--')
    plt.scatter(controller.out_params, controller.in_costs, label="sampled data")
    plt.plot(params_array, cost_array, '-', label="true data")
    plt.xlim(0, 1)
    plt.legend()
    plt.show()

if __name__ == '__main__':
    main()
Expected behavior
Good fits, but very large uncertainty
Describe the bug
Creating a learner visualizer instance causes M-LOOP to create a directory called M-LOOP_archives in the current working directory.
To Reproduce
Steps to reproduce the behavior:
1. Make sure there is no M-LOOP_archives directory in the current working directory.
2. Run the code below.
3. See that an M-LOOP_archives directory has been created in the current working directory.
learner_archive_filename = '' # Set to learner archive file, including path and extension.
import mloop.visualizations as mlv
learner_visualizer = mlv.create_learner_visualizer_from_archive(learner_archive_filename)
Expected behavior
The M-LOOP_archives directory should be created if necessary when an optimization is started, since M-LOOP needs a place to store the new archive files for the new optimization. However, that directory should not be created when instantiating a visualizer instance to plot data from an existing learner archive, as no new files will be created.
Additional context
Results from an optimization are plotted using the visualizer classes. The learner visualizer classes inherit from the learner classes themselves, so learners.Learner.__init__() is run when a learner visualizer is instantiated. That method then creates the directory M-LOOP_archives in the current working directory if it doesn't already exist. That behavior makes sense when Learner.__init__() is being run while creating a learner to start an optimization, but ideally it shouldn't happen when just creating a visualizer.
It might be possible to fix this by passing learner_archive_filename=None to the parent __init__() methods in the visualizer classes. That should avoid the directory creation because of the if learner_archive_filename is None statement in Learner.__init__(). I haven't tried this though.
Not sure if I'll get a chance to fix this soon, but figured I'd post a bug report about it now before I forget. It's a pretty minor and inconsequential bug anyway, though it does lead to some file system clutter.
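The idea of deferring the directory creation can be illustrated without M-LOOP. The class below is a hypothetical stand-in, not the real Learner: it only records where the archive would go and creates the directory the first time something is actually saved, so a read-only user such as a visualizer never touches the file system.

```python
import os
import tempfile


class LazyArchiveWriter:
    """Hypothetical stand-in for a learner that owns an archive directory."""

    def __init__(self, archive_dir):
        # Only remember the location; do not create anything yet.
        self.archive_dir = archive_dir

    def save(self, filename, text):
        # Create the directory on first write; exist_ok makes repeats harmless.
        os.makedirs(self.archive_dir, exist_ok=True)
        path = os.path.join(self.archive_dir, filename)
        with open(path, "w") as out_file:
            out_file.write(text)
        return path


base = tempfile.mkdtemp()
archive_dir = os.path.join(base, "M-LOOP_archives")
writer = LazyArchiveWriter(archive_dir)
created_on_init = os.path.isdir(archive_dir)   # checked right after __init__
saved_path = writer.save("learner_archive.txt", "cost = 0.0\n")
created_on_save = os.path.isdir(archive_dir)   # checked after the first save
```

Whether this or the learner_archive_filename=None route fits M-LOOP better depends on how the visualizer classes use the parent constructor, which has not been checked here.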
Describe the bug
When running the tests, execution hangs. Pointed out in #111 (comment).
To Reproduce
Steps to reproduce the behavior:
pytest -v
Execution hangs at test_examples.py::TestExamples::test_shell_interface_config.
Desktop (please complete the following information):
Additional context
Manually running the code from that test gives the following error:
Traceback (most recent call last):
File "C:\Users\user_name\Software\anaconda3\envs\mloop_install_test_2\lib\threading.py", line 926, in _bootstrap_inner
self.run()
File "c:\users\user_name\software\m-loop\mloop\interfaces.py", line 106, in run
cost_dict = self.get_next_cost_dict(params_dict)
File "c:\users\user_name\software\m-loop\mloop\interfaces.py", line 287, in get_next_cost_dict
self.param_names.append('param' + str(ind+1))
AttributeError: 'NoneType' object has no attribute 'append'
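The AttributeError suggests that param_names is still None when names are appended to it. A common guard for this pattern is to replace None with a fresh list before appending. The function below is a hypothetical illustration of that guard, reusing the 'param' + str(ind+1) naming seen in the traceback; it is not the actual M-LOOP code.

```python
def fill_param_names(param_names, num_params):
    """Return a list of num_params names, generating defaults as needed."""
    # Copy the input so the caller's list is not modified; None becomes [].
    names = list(param_names) if param_names is not None else []
    # Append default names for any parameters that are still unnamed.
    for ind in range(len(names), num_params):
        names.append('param' + str(ind + 1))
    return names


default_names = fill_param_names(None, 3)
partial_names = fill_param_names(['detuning'], 3)
print(default_names)
print(partial_names)
```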
mlv.show_all_default_visualizations_from_archive() no longer supports the upload_cross_sections option for uploading to plotly, and the script lanscape_vis.py in the tools directory should be updated to reflect that.
This is an easy fix, which I hopefully will have a chance to take care of in a few weeks.
Make an interface for experiments that are run from the command line, with the result returned on the console.
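A minimal sketch of what such an interface could do, using only the standard library. The function name run_shell_experiment and the convention that the experiment prints a line of the form cost = <number> are assumptions made for illustration, not an existing M-LOOP API.

```python
import subprocess
import sys


def run_shell_experiment(command):
    """Run a command and parse 'cost = <number>' from its console output."""
    completed = subprocess.run(
        command, capture_output=True, text=True, check=True
    )
    for line in completed.stdout.splitlines():
        key, sep, value = line.partition('=')
        if sep and key.strip() == 'cost':
            return {'cost': float(value), 'uncer': 0, 'bad': False}
    # No cost line was printed, so flag the run as bad.
    return {'cost': float('nan'), 'uncer': 0, 'bad': True}


# Stand-in "experiment": a Python one-liner that prints its result.
fake_experiment = [sys.executable, '-c', "print('cost = 1.5')"]
cost_dict = run_shell_experiment(fake_experiment)
print(cost_dict)
```

The returned dictionary uses the same cost, uncer and bad keys that get_next_cost_dict returns elsewhere on this page, so a function like this could be called from inside a custom interface.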
Hi all,
I came across this issue when trying to plot the results from an optimization run done using the neural_net controller. Here is the code to produce the error:
import mloop.visualizations as mlv
filename = r"C:\path\to\learner_archive_2020-02-22_06-44.txt"
file_type='txt'
visualization = mlv.NeuralNetVisualizer(filename, file_type)
and here's the resulting traceback:
AttributeError                            Traceback (most recent call last)
<ipython-input-1-21542a9fb6a5> in <module>
     12 # mlv.configure_plots()
     13 # mlv.create_neural_net_learner_visualizations(learner_archive,file_type='txt')
---> 14 visualization = mlv.NeuralNetVisualizer(filename, file_type)
     15
     16 plt.show()
~\.conda\envs\labscript\lib\site-packages\mloop\visualizations.py in __init__(self, filename, file_type, **kwargs)
    608                          nn_training_file_type = file_type,
    609                          update_hyperparameters = False,
--> 610                          **kwargs)
    611
    612         import plotly.plotly as py
~\.conda\envs\labscript\lib\site-packages\mloop\learners.py in __init__(self, trust_region, default_bad_cost, default_bad_uncertainty, nn_training_filename, nn_training_file_type, minimum_uncertainty, predict_global_minima_at_end, **kwargs)
   1478             #Data from previous experiment
   1479             self.all_params = np.array(self.training_dict['all_params'], dtype=float)
-> 1480             self.all_costs = mlu.safe_squeeze(self.training_dict['all_costs'])
   1481             self.all_uncers = mlu.safe_squeeze(self.training_dict['all_uncers'])
   1482
AttributeError: module 'mloop.utilities' has no attribute 'safe_squeeze'
Maybe this function was superseded by safe_cast_to_array()?
I'm happy to submit the learner archive if desired. I'd also be happy to fix the bug and issue a pull request if it's a simple matter of changing safe_squeeze() to safe_cast_to_array().
Cheers,
Zak
Hello! I tried to run the code several times and sometimes (but not always) it stops in the middle, and not always in the same place. This is some output from one of the runs where it stopped (the last lines of the terminal output, PowerShell on Windows 10):
INFO cost 32.265573116854235 +/- 0.0
INFO Run: 70 (machine learner)
INFO params [-500. -965.9951292 -600.97861861 -300. -352.33433901]
INFO cost 1000.0 +/- 0.0
INFO Run: 71 (trainer)
INFO params [-1298.35679689 -816.87250569 -615.15777157 -426.3981931
-499.14163221]
INFO cost 1000.0 +/- 0.0
INFO Run: 72 (machine learner)
INFO params [-1265.03281961 -1137.39862994 -657.10271614 -719.20824797
-377.11512437]
INFO cost 1000.0 +/- 0.0
INFO Run: 73 (machine learner)
INFO params [-1096.93149896 -1076.23267012 -614.1589204 -706.57107924
-351.21274201]
INFO cost 49.006086364989656 +/- 0.0
INFO Run: 74 (machine learner)
INFO params [-1500. -1313.5352554 -765.02461555 -796.65924047
-369.23886747]
INFO cost 107.2868748223299 +/- 0.0
INFO Run: 75 (trainer)
INFO params [-726.25157754 -628.76582468 -604.00469597 -403.95902915 -397.53860198]
Sometimes it stops at a very early run (<10), sometimes it goes for a few hundred runs, and sometimes it finishes, all without changing anything in the Python script, just re-running it. This is the get_next_cost_dict method:
def get_next_cost_dict(self,params_dict):
    params = params_dict['params']
    V1 = params[0]
    V2 = params[1]
    V3 = params[2]
    V4 = params[3]
    V5 = params[4]
    filename = "out_test.txt"
    try:
        os.remove(filename)
    except:
        print("")
    subprocess.call(r"powershell.exe & '.\SIMION 8.1.lnk' --nogui fastadj .\electrode.PA0 " + "1=" + str(V1) + ",2=" + str(V2) + ",3=" + str(V3) + ",4=" + str(V4) + ",5=" + str(V5))
    subprocess.call(r"powershell.exe & '.\SIMION 8.1.lnk' --nogui fly --restore-potentials=0 --recording-output=out_test.txt .\electrode.iob")
    while True:
        if os.path.isfile(".\out_test.txt"):
            data = np.loadtxt(filename,skiprows=1,usecols=0)
            if len(data)==500:
                break
            else:
                time.sleep(0.2)
        else:
            time.sleep(0.2)
    data_simion = np.loadtxt(filename,skiprows=1)
    idx = np.where(data_simion[:,0]==50)
    data_simion = data_simion[idx]
    pos_y = data_simion[:,1]
    pos_z = data_simion[:,2]
    radius = np.sqrt((pos_y)**2+(pos_z)**2)
    if len(data_simion)<400:
        resolution = 1000
    else:
        resolution = np.std(radius)/np.mean(radius)*200
    new_func_value = resolution
    os.remove(filename)
    cost = np.sum(new_func_value)
    uncer = 0
    bad = False
    cost_dict = {'cost':cost, 'uncer':uncer, 'bad':bad}
    return cost_dict
and this is the main function:
def main():
    filename = "learner_archive_" + str(strftime("%Y-%m-%d_%H-%M")) + ".txt"
    interface = CustomInterface()
    controller = mlc.create_controller(interface,
                                       controller_type='neural_net',
                                       #controller_type='gaussian_process',
                                       max_num_runs = 500,
                                       num_params = 5,
                                       min_boundary = [-1500,-1500,-1000,-1000,-500],
                                       max_boundary = [-500,-500,-300,-300,-100],
                                       first_params = [-1000,-1000,-500,-500,-300],
                                       training_type = "differential_evolution",
                                       num_training_runs = 50)
    controller.optimize()
    print('Best parameters found:')
    best_params = controller.best_params
    best_cost = controller.best_cost
    print(best_params)
    print(best_cost)
I get this behavior both for NN and Gaussian process.
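Whatever the root cause turns out to be, the polling loop in get_next_cost_dict never gives up, so a simulation run that writes fewer than 500 rows would look exactly like a hang. A sketch of a bounded wait follows; wait_for_lines is a hypothetical helper and the timeout values are arbitrary choices. On timeout the interface could return a cost dictionary with 'bad' set to True instead of blocking.

```python
import os
import tempfile
import time


def wait_for_lines(path, expected_lines, timeout=30.0, poll=0.2):
    """Return True once path holds expected_lines lines, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.isfile(path):
            with open(path) as in_file:
                if sum(1 for _ in in_file) >= expected_lines:
                    return True
        time.sleep(poll)
    return False


# Demonstration with a temporary file standing in for the simulation output.
tmp_dir = tempfile.mkdtemp()
complete = os.path.join(tmp_dir, "complete.txt")
with open(complete, "w") as out_file:
    out_file.write("header\n" + "1.0\n" * 5)

found = wait_for_lines(complete, expected_lines=6, timeout=1.0, poll=0.05)
missing = wait_for_lines(os.path.join(tmp_dir, "absent.txt"), 6,
                         timeout=0.2, poll=0.05)
print(found, missing)
```

Logging which branch was taken on each run would also show whether the stop happens inside the wait loop or somewhere else.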
Many experiments are already run from Python. Add a tutorial to the documentation on how to use M-LOOP as a Python library.
Describe the bug
Syntax error when installing:
python setup.py develop
To Reproduce
Upon recloning:
git clone git://github.com/michaelhush/M-LOOP.git
Running installation:
python setup.py develop
Expected behavior
Successful completion of installation.
Screenshots
/tmp/easy_install-y_M61L/pytest-runner-5.3.1/temp/easy_install-UeIPEu/setuptools_scm-6.0.1/src
<pkg_resources.WorkingSet object at 0xb59f03d0>
Traceback (most recent call last):
File "setup.py", line 73, in
main()
File "setup.py", line 68, in main
'Topic :: Scientific/Engineering :: Physics']
File "/usr/lib/python2.7/dist-packages/setuptools/init.py", line 144, in setup
_install_setup_requires(attrs)
File "/usr/lib/python2.7/dist-packages/setuptools/init.py", line 139, in _install_setup_requires
dist.fetch_build_eggs(dist.setup_requires)
File "/usr/lib/python2.7/dist-packages/setuptools/dist.py", line 724, in fetch_build_eggs
replace_conflicting=True,
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 782, in resolve
replace_conflicting=replace_conflicting
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 1065, in best_match
return self.obtain(req, installer)
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 1077, in obtain
return installer(requirement)
File "/usr/lib/python2.7/dist-packages/setuptools/dist.py", line 791, in fetch_build_egg
return cmd.easy_install(req)
File "/usr/lib/python2.7/dist-packages/setuptools/command/easy_install.py", line 704, in easy_install
return self.install_item(spec, dist.location, tmpdir, deps)
File "/usr/lib/python2.7/dist-packages/setuptools/command/easy_install.py", line 730, in install_item
dists = self.install_eggs(spec, download, tmpdir)
File "/usr/lib/python2.7/dist-packages/setuptools/command/easy_install.py", line 915, in install_eggs
return self.build_and_install(setup_script, setup_base)
File "/usr/lib/python2.7/dist-packages/setuptools/command/easy_install.py", line 1183, in build_and_install
self.run_setup(setup_script, setup_base, args)
File "/usr/lib/python2.7/dist-packages/setuptools/command/easy_install.py", line 1169, in run_setup
run_setup(setup_script, args)
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 253, in run_setup
raise
File "/usr/lib/python2.7/contextlib.py", line 35, in exit
self.gen.throw(type, value, traceback)
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 195, in setup_context
yield
File "/usr/lib/python2.7/contextlib.py", line 35, in exit
self.gen.throw(type, value, traceback)
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 166, in save_modules
saved_exc.resume()
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 141, in resume
six.reraise(type, exc, self._tb)
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 154, in save_modules
yield saved
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 195, in setup_context
yield
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 250, in run_setup
_execfile(setup_script, ns)
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 45, in _execfile
exec(code, globals, locals)
File "/tmp/easy_install-y_M61L/pytest-runner-5.3.1/setup.py", line 21, in
name = 'M-LOOP',
File "/usr/lib/python2.7/dist-packages/setuptools/init.py", line 144, in setup
_install_setup_requires(attrs)
File "/usr/lib/python2.7/dist-packages/setuptools/init.py", line 139, in _install_setup_requires
dist.fetch_build_eggs(dist.setup_requires)
File "/usr/lib/python2.7/dist-packages/setuptools/dist.py", line 724, in fetch_build_eggs
replace_conflicting=True,
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 782, in resolve
replace_conflicting=replace_conflicting
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 1065, in best_match
return self.obtain(req, installer)
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 1077, in obtain
return installer(requirement)
File "/usr/lib/python2.7/dist-packages/setuptools/dist.py", line 791, in fetch_build_egg
return cmd.easy_install(req)
File "/usr/lib/python2.7/dist-packages/setuptools/command/easy_install.py", line 704, in easy_install
return self.install_item(spec, dist.location, tmpdir, deps)
File "/usr/lib/python2.7/dist-packages/setuptools/command/easy_install.py", line 730, in install_item
dists = self.install_eggs(spec, download, tmpdir)
File "/usr/lib/python2.7/dist-packages/setuptools/command/easy_install.py", line 915, in install_eggs
return self.build_and_install(setup_script, setup_base)
File "/usr/lib/python2.7/dist-packages/setuptools/command/easy_install.py", line 1183, in build_and_install
self.run_setup(setup_script, setup_base, args)
File "/usr/lib/python2.7/dist-packages/setuptools/command/easy_install.py", line 1169, in run_setup
run_setup(setup_script, args)
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 253, in run_setup
raise
File "/usr/lib/python2.7/contextlib.py", line 35, in exit
self.gen.throw(type, value, traceback)
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 195, in setup_context
yield
File "/usr/lib/python2.7/contextlib.py", line 35, in exit
self.gen.throw(type, value, traceback)
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 166, in save_modules
saved_exc.resume()
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 141, in resume
six.reraise(type, exc, self._tb)
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 154, in save_modules
yield saved
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 195, in setup_context
yield
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 250, in run_setup
_execfile(setup_script, ns)
File "/usr/lib/python2.7/dist-packages/setuptools/sandbox.py", line 45, in _execfile
exec(code, globals, locals)
File "/tmp/easy_install-y_M61L/pytest-runner-5.3.1/temp/easy_install-UeIPEu/setuptools_scm-6.0.1/setup.py", line 52, in
download_url = 'https://github.com/michaelhush/M-LOOP/tarball/3.2.1',
File "/tmp/easy_install-y_M61L/pytest-runner-5.3.1/temp/easy_install-UeIPEu/setuptools_scm-6.0.1/setup.py", line 29, in scm_config
File "/tmp/easy_install-y_M61L/pytest-runner-5.3.1/temp/easy_install-UeIPEu/setuptools_scm-6.0.1/src/setuptools_scm/init.py", line 8, in
File "/tmp/easy_install-y_M61L/pytest-runner-5.3.1/temp/easy_install-UeIPEu/setuptools_scm-6.0.1/src/setuptools_scm/config.py", line 6, in
File "/tmp/easy_install-y_M61L/pytest-runner-5.3.1/temp/easy_install-UeIPEu/setuptools_scm-6.0.1/src/setuptools_scm/utils.py", line 41
print(*k)
^
SyntaxError: invalid syntax
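The paths in the traceback show that the install ran under Python 2.7. On Python 2, print(*k) is only valid syntax when print_function has been imported from __future__, which would explain why setuptools_scm fails to compile there. A setup script can fail earlier with a clearer message by checking the interpreter version first. The sketch below is an assumption about how such a guard could look; the minimum version (3, 6) is illustrative and is not taken from M-LOOP's setup.py.

```python
import sys


def check_python_version(version_info, minimum=(3, 6)):
    """Return an error message if the interpreter is too old, else None."""
    found = tuple(version_info[:2])
    if found < minimum:
        return ("Python %d.%d or newer is required, found %d.%d"
                % (minimum + found))
    return None


# In a setup script this would run before setup() is called.
message = check_python_version(sys.version_info)
if message is not None:
    sys.exit(message)

# What the guard reports for the interpreter seen in the traceback.
old_message = check_python_version((2, 7, 18))
print(old_message)
```

Note that the guard itself has to be written in syntax that Python 2 can parse, otherwise it fails in the same way as the file it is protecting.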
Desktop (please complete the following information):
Describe the bug
I am trying to use data from previous experiments as input for my optimizer. Unfortunately this does not work if I use a Gaussian Process controller (yet, for a Differential Evolution controller everything seems to work). Is this a known issue and is there a workaround?
To Reproduce
# Imports for M-LOOP
import mloop.interfaces as mli
import mloop.controllers as mlc
import mloop.visualizations as mlv
import mloop.utilities as mlu
# Other imports (numpy is used below but was missing from the original snippet)
import numpy as np

class CustomInterface(mli.Interface):
    def __init__(self):
        super(CustomInterface,self).__init__()
        self.minimum_params = np.array([0,0.1,-0.1])
    def get_next_cost_dict(self,params_dict):
        params = params_dict['params']
        cost = -np.sum(np.sinc(params - self.minimum_params))
        uncer = 0
        bad = False
        cost_dict = {'cost':cost, 'uncer':uncer, 'bad':bad}
        return cost_dict

def main():
    input_dict = mlu.get_dict_from_file('config.txt', 'txt')
    interface = CustomInterface()
    controller = mlc.create_controller(interface, **input_dict)
    controller.optimize()

if __name__ == '__main__':
    main()
with the configuration file
max_num_runs = 15
target_cost = -2.99
num_params = 3
min_boundary = [-2,-2,-2]
max_boundary = [2,2,2]
controller_type = 'differential_evolution'
param_names=None
training_type = 'differential_evolution'
gp_training_filename = "training_data.txt"
gp_training_file_type = 'txt'
where I either use controller_type = 'differential_evolution' or controller_type = 'gaussian_process'. training_data.txt is simply a learner archive from a previous run.
Expected behavior
Previous run controller_type = 'differential_evolution' and current run controller_type = 'differential_evolution': works
Previous run controller_type = 'gaussian_process' and current run controller_type = 'differential_evolution': works
Previous run controller_type = 'gaussian_process' and current run controller_type = 'gaussian_process': TypeError: __init__() got multiple values for keyword argument 'param_names'
Previous run controller_type = 'differential_evolution' and current run controller_type = 'gaussian_process': TypeError: __init__() got multiple values for keyword argument 'param_names'
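This kind of TypeError is what Python raises when the same keyword reaches a function twice, once explicitly and once through an unpacked dictionary. Whether that is what happens inside M-LOOP here has not been checked; the sketch below only reproduces the mechanism with hypothetical names (make_learner, training_dict) and shows one way to avoid it by dropping the duplicate key before forwarding.

```python
def make_learner(param_names=None, **kwargs):
    """Hypothetical constructor standing in for a learner's __init__."""
    return {'param_names': param_names, 'other_options': kwargs}


# Options loaded from a previous archive that already contain param_names.
training_dict = {'param_names': ['a', 'b', 'c'], 'num_params': 3}

try:
    # param_names arrives twice: explicitly and via **training_dict.
    make_learner(param_names=None, **training_dict)
    error_message = None
except TypeError as error:
    error_message = str(error)

# Dropping the duplicate key before forwarding avoids the clash.
forwarded = {k: v for k, v in training_dict.items() if k != 'param_names'}
learner = make_learner(param_names=training_dict['param_names'], **forwarded)
print(error_message)
```

If this is the mechanism at work, removing param_names=None from the configuration file might be worth trying as a workaround, although that is a guess rather than a confirmed fix.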
Desktop (please complete the following information):
The following problem has turned up recently in my work with M-LOOP, and I haven't been able to understand why:
I'm using the file interface with another program, let's call it TARGET. In my interface I have added some terminal output to print when TARGET starts and stops. I have begun experiencing the following sequence in the terminal:
INFO Run: 1
INFO params [<paramset1>]
Running TARGET...
TARGET done.
INFO cost <cost1>
INFO Run: 2
INFO params [<paramset2>]
Running TARGET...
INFO cost <cost1>
INFO Run: 3
INFO params [<paramset3>]
TARGET done.
INFO cost <cost2>
The problem here is that M-LOOP for some reason reads the same cost twice. This means it assigns the wrong cost to a set of parameters twice: first when it copies the previous cost, and again when the next cost is applied to a new, wrong set of parameters. Note that nothing happens in the interface between the point where TARGET starts and the point where it stops; it simply waits for TARGET to finish.
I have checked whether exp_output.txt for some reason might not be deleted, causing M-LOOP to think the experiment finished again, but this is not the case. It is deleted at the correct point in the program. I have the latest version of M-LOOP. This seems a serious problem to me, as the program will completely misunderstand the parameter-cost space in this way. But for some reason it has only recently become a problem, and it only happens sometimes.
Can you help me fix this?
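One thing that may be worth ruling out on the experiment side is that the output file becomes visible to M-LOOP before it is completely written. A common precaution, sketched below, is to write the result under a temporary name and then rename it into place. The helper write_output_atomically is hypothetical, and the cost = ... line format is an assumption about the exp_output.txt layout; this is not a diagnosis of the behavior described above.

```python
import os
import tempfile


def write_output_atomically(path, cost):
    """Write the cost to a temporary file, then rename it onto path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename
    # stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as out_file:
        out_file.write("cost = %r\n" % cost)
    # The final name only appears once the contents are complete.
    os.replace(tmp_path, path)


out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "exp_output.txt")
write_output_atomically(out_path, 0.25)
with open(out_path) as in_file:
    contents = in_file.read()
print(contents)
```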
The Gaussian process is run as a separate process. Unfortunately, Windows does not fork Python when it runs a new process; instead it creates a new session and pickles the process object. Currently the GaussianProcessLearner cannot be pickled. Currently looking for a workaround.
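The underlying constraint can be reproduced without M-LOOP: any object handed to a spawned process must survive pickling, and attributes such as locks do not. The classes below are hypothetical stand-ins for a learner, and dropping the offending attribute in __getstate__ is only one possible workaround, not necessarily the one M-LOOP will adopt.

```python
import pickle
import threading


class PlainLearner:
    """Hypothetical learner holding an attribute that cannot be pickled."""

    def __init__(self):
        self.length_scale = 1.0
        self.lock = threading.Lock()


class PicklableLearner(PlainLearner):
    """Same learner, but the lock is left out of the pickled state."""

    def __getstate__(self):
        state = self.__dict__.copy()
        del state['lock']              # locks cannot be pickled
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.lock = threading.Lock()   # recreate the lock after unpickling


try:
    pickle.dumps(PlainLearner())
    plain_is_picklable = True
except TypeError:
    plain_is_picklable = False

restored = pickle.loads(pickle.dumps(PicklableLearner()))
print(plain_is_picklable, restored.length_scale)
```

Which attributes of the real learner block pickling is not stated in the report, so the lock here is purely illustrative.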
Hi,
Using the new neural net learner we have experienced the following error at the end of a several hundred run long execution.
Traceback (most recent call last):
File "c:\program files\anaconda3\lib\multiprocessing\process.py", line 249, in _bootstrap
self.run()
File "c:\program files\anaconda3\lib\site-packages\mloop\learners.py", line 1914, in run
self.fit_neural_net()
AttributeError: 'NeuralNetLearner' object has no attribute 'fit_neural_net'
The logs, archives, and config files are given in:
mloop neural net fail.txt
mloopfiles.zip
Previously we have had good success using the gaussian process learner.
Thanks.
Ashby
A differential optimizer as a complement to the machine learning algorithm, or to benchmark its performance, would be helpful.
Hi, I've been using M-LOOP, but I've had an issue where the learner archive is not being created, and this happens only for the neural_net learner type. It works fine for the other learners. To distill the issue, I noticed that the same thing happens with the simpler tutorial code on the M-LOOP website. I am using M-LOOP 3.2.1.
To Reproduce
Execute the following code. (It's the same as the tutorial code on the website, but I have specified the controller_type='neural_net')
# Imports for python 2 compatibility
from __future__ import absolute_import, division, print_function
__metaclass__ = type

# Imports for M-LOOP
import mloop.interfaces as mli
import mloop.controllers as mlc
import mloop.visualizations as mlv

# Other imports
import numpy as np
import time

# Declare your custom class that inherits from the Interface class
class CustomInterface(mli.Interface):

    # Initialization of the interface, including this method is optional
    def __init__(self):
        # You must include the super command to call the parent class, Interface, constructor
        super(CustomInterface, self).__init__()

        # Attributes of the interface can be added here
        # If you want to precalculate any variables etc. this is the place to do it
        # In this example we will just define the location of the minimum
        self.minimum_params = np.array([0, 0.1, -0.1])

    # You must include the get_next_cost_dict method in your class
    # this method is called whenever M-LOOP wants to run an experiment
    def get_next_cost_dict(self, params_dict):
        # Get parameters from the provided dictionary
        params = params_dict['params']

        # Here you can include the code to run your experiment given a particular set of parameters
        # In this example we will just evaluate a sum of sinc functions
        cost = -np.sum(np.sinc(params - self.minimum_params))
        # There is no uncertainty in our result
        uncer = 0
        # The evaluation will always be a success
        bad = False
        # Add a small time delay to mimic a real experiment
        time.sleep(.01)

        # The cost, uncertainty and bad boolean must all be returned as a dictionary
        # You can include other variables you want to record as well if you want
        cost_dict = {'cost': cost, 'uncer': uncer, 'bad': bad}
        return cost_dict

def main():
    # M-LOOP can be run with three commands

    # First create your interface
    interface = CustomInterface()
    # Next create the controller. Provide it with your interface and any options you want to set
    controller = mlc.create_controller(interface,
                                       controller_type='neural_net',
                                       max_num_runs=1000,
                                       target_cost=-2.99,
                                       num_params=3,
                                       min_boundary=[-2, -2, -2],
                                       max_boundary=[2, 2, 2])
    # To run M-LOOP and find the optimal parameters just use the controller method optimize
    controller.optimize()

    # The results of the optimization will be saved to files and can also be accessed as attributes of the controller.
    print('Best parameters found:')
    print(controller.best_params)

    # You can also run the default sets of visualizations for the controller with one command
    mlv.show_all_default_visualizations(controller)

# Ensures main is run when this code is run as a script
if __name__ == '__main__':
    main()
The error I get from running this is
Traceback (most recent call last):
File "C:/Users/matth/PycharmProjects/pythonProject/mloop_quick_test.py", line 77, in <module>
main()
File "C:/Users/matth/PycharmProjects/pythonProject/mloop_quick_test.py", line 72, in main
mlv.show_all_default_visualizations(controller)
File "C:\Users\matth\labscript-suite\Python_38\lib\site-packages\mloop\visualizations.py", line 91, in show_all_default_visualizations
create_learner_visualizations(
File "C:\Users\matth\labscript-suite\Python_38\lib\site-packages\mloop\visualizations.py", line 258, in create_learner_visualizations
visualizer = create_learner_visualizer_from_archive(
File "C:\Users\matth\labscript-suite\Python_38\lib\site-packages\mloop\visualizations.py", line 198, in create_learner_visualizer_from_archive
controller_type = mlu.get_controller_type_from_learner_archive(filename)
File "C:\Users\matth\labscript-suite\Python_38\lib\site-packages\mloop\utilities.py", line 274, in get_controller_type_from_learner_archive
learner_dict = get_dict_from_file(learner_filename, file_type)
File "C:\Users\matth\labscript-suite\Python_38\lib\site-packages\mloop\utilities.py", line 222, in get_dict_from_file
dictionary = txt_file_to_dict(filename)
File "C:\Users\matth\labscript-suite\Python_38\lib\site-packages\mloop\utilities.py", line 158, in txt_file_to_dict
with open(filename,'r') as in_file:
FileNotFoundError: [Errno 2] No such file or directory: './M-LOOP_archives/learner_archive_2022-01-05_13-35.txt'
Process finished with exit code 1
because the learner archive file was not created. I do not get this error with the GP or DE learners. Could someone suggest a fix for this? Is there something that I misunderstood? Thanks!
M-LOOP's Gaussian process and neural network optimizers are implementations of Bayesian optimization. The docs should include those magic words, probably at least in the introductory section.
Describe the bug
An error occurs when trying to compute the best predicted parameters and the associated cost.
To Reproduce
Simply run the provided example python_controlled_experiment.py. At the end of the optimization, when trying to find the predicted optimum, there is a sklearn error, and the optimum is not computed. When adding noise to the cost function to force the predicted optimum to differ from the best found result, I still get the error.
Expected behavior
At this step I would expect to get the predicted best parameters.
Screenshots
Process GaussianProcessLearner-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "~/M-LOOP/mloop/learners.py", line 2048, in run
self.find_global_minima()
File "~/M-LOOP/mloop/learners.py", line 2095, in find_global_minima
self.predicted_best_cost = self.cost_scaler.inverse_transform(self.predicted_best_scaled_cost)
File "~/mloop/lib/python3.8/site-packages/scikit_learn-1.0-py3.8-linux-x86_64.egg/sklearn/preprocessing/_data.py", line 1016, in inverse_transform
X = check_array(
File "~/mloop/lib/python3.8/site-packages/scikit_learn-1.0-py3.8-linux-x86_64.egg/sklearn/utils/validation.py", line 761, in check_array
raise ValueError(
ValueError: Expected 2D array, got 1D array instead:
array=[-1.62357476].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Desktop (please complete the following information):
Currently the interface class inherits from multiprocessing. However, running the interface to the experiment in a forked Python environment can cause trouble on certain operating systems, particularly when calling matplotlib or other libraries that are not multiprocessing-safe.
Hi,
I installed M-LOOP from source. I ran '$python setup.py develop', which completed with no errors. Then I ran '$python setup.py test' and a warning occurred:
============================== warnings summary ===============================
mloop\testing.py:71
D:\grocery\M-LOOP\mloop\testing.py:71: DeprecationWarning: invalid escape sequence \s
'''
-- Docs: https://docs.pytest.org/en/latest/warnings.html
================= 18 passed, 1 warnings in 107.85s (0:01:47) ==================
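This kind of warning usually means that a regular (non-raw) string literal, here the docstring that starts at mloop\testing.py:71, contains a backslash followed by a character that is not a recognized escape. A minimal sketch of the two usual fixes follows; the function names are hypothetical and this is not the actual M-LOOP code.

```python
def with_escaped_backslash():
    """Whitespace is matched by \\s+ in the regular expression."""


def with_raw_docstring():
    r"""Whitespace is matched by \s+ in the regular expression."""


# Both docstrings hold the same text, and neither form contains an
# invalid escape sequence.
print(with_escaped_backslash.__doc__ == with_raw_docstring.__doc__)
```

Prefixing the docstring with r is typically the smaller change, since the text inside the string stays untouched.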
The entries for controller_archive_filename and controller_archive_file_type are wrong about what the default values are.
I followed the installation instructions using anaconda, and creating a python environment with scikit-learn (named chipGA). I get an error preventing me from installing M-LOOP:
$ python setup.py test
Traceback (most recent call last):
File "~/anaconda/envs/chipGA/lib/python3.5/site-packages/setuptols-26.1.1-py3.5.egg/setuptools/dist.py", line 434, in fetch_build_egg
AttributeError: 'Distribution' object has no attribute '_egg_fetcher'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "setup.py", line 41, in
'Topic :: Scientific/Engineering :: Physics']
File "/anaconda/envs/chipGA/lib/python3.5/distutils/core.py", line 108, in setup
_setup_distribution = dist = klass(attrs)
File "/anaconda/envs/chipGA/lib/python3.5/site-packages/setuptols-26.1.1-py3.5.egg/setuptools/dist.py", line 348, in __init__
File "/anaconda/envs/chipGA/lib/python3.5/site-packages/setuptols-26.1.1-py3.5.egg/setuptools/dist.py", line 394, in fetch_build_eggs
File "/anaconda/envs/chipGA/lib/python3.5/site-packages/setuptols-26.1.1-py3.5.egg/pkg_resources/__init__.py", line 851, in resolve
File "/anaconda/envs/chipGA/lib/python3.5/site-packages/setuptols-26.1.1-py3.5.egg/pkg_resources/__init__.py", line 1123, in best_match
File "/anaconda/envs/chipGA/lib/python3.5/site-packages/setuptols-26.1.1-py3.5.egg/pkg_resources/__init__.py", line 1135, in obtain
File "/anaconda/envs/chipGA/lib/python3.5/site-packages/setuptols-26.1.1-py3.5.egg/setuptools/dist.py", line 453, in fetch_build_egg
File "/anaconda/envs/chipGA/lib/python3.5/site-packages/setuptols-26.1.1-py3.5.egg/setuptools/dist.py", line 418, in get_egg_cache_dir
PermissionError: [Errno 13] Permission denied: './.eggs'
Here's the output of 'conda list' for the environment:
mkl 11.3.3 0
numpy 1.11.1 py35_0
openssl 1.0.2h 2
pip 8.1.2 py35_0
python 3.5.2 0
readline 6.2 2
scikit-learn 0.17.1 np111py35_2
scipy 0.18.0 np111py35_0
setuptools 26.1.1 py35_0
sqlite 3.13.0 0
tk 8.5.18 0
wheel 0.29.0 py35_0
xz 5.2.2 0
zlib 1.2.8 3
When I run the command M-LOOP -c [my_config.txt] in the root directory where the "M-LOOP" command is invoked, I get two errors. I solved the first one, but then the second error appeared:
1st error:
root@10814d9fcf9e:/notebooks/M-LOOP# M-LOOP -c [my_config.txt]
Traceback (most recent call last):
File "/usr/local/bin/M-LOOP", line 6, in
exec(compile(open(file).read(), file, 'exec'))
File "/notebooks/M-LOOP/bin/M-LOOP", line 38
main(sys.argv[1:])
IndentationError: unindent does not match any outer indentation level
I solved this by modifying line 37 of the /bin/M-LOOP file:
if __name__=="__main__":
    mp.freeze_support()
    main(sys.argv[1:])
and then ran: M-LOOP -c [my_config.txt]
2nd error:
Traceback (most recent call last):
File "/usr/local/bin/M-LOOP", line 6, in
exec(compile(open(file).read(), file, 'exec'))
File "/notebooks/M-LOOP/bin/M-LOOP", line 38, in
main(sys.argv[1:])
File "/notebooks/M-LOOP/bin/M-LOOP", line 34, in main
_ = mll.launch_from_file(config_filename)
File "/notebooks/M-LOOP/mloop/launchers.py", line 26, in launch_from_file
file_kwargs = mlu.get_dict_from_file(config_filename,'txt')
File "/notebooks/M-LOOP/mloop/utilities.py", line 159, in get_dict_from_file
dictionary = txt_file_to_dict(filename)
File "/notebooks/M-LOOP/mloop/utilities.py", line 113, in txt_file_to_dict
with open(filename,'r') as in_file:
FileNotFoundError: [Errno 2] No such file or directory: '[my_config.txt]'
I am confused about this :/ The file is in the root directory where M-LOOP is called.
One thing I would find convenient is the ability to arrange the plots created by mloop.visualizations more freely (e.g. to position the plots in a specific figure/subfigure). For that, one could make the lines
figure_counter += 1
plt.figure(figure_counter)
optional. I think that would be a useful feature, but obviously not necessary.
Installing on Windows machines is currently quite complicated. A binary installer would be useful, preferably with the ability to update the installation.
Hi,
Apologies, as this is not really an issue but a request for advice. I am currently using M-LOOP for optimising ring resonators on a silicon chip. I would like to employ an ANN to do this alongside online optimisation, having already been using the GPR in M-LOOP. I am aware the Neural Net controller can be used out of the box, so to speak; however, I am unsure how to find the depth of this ANN, its activation functions, etc. I am wondering how one can create their own ANN to be used in M-LOOP. For example, can one be created in keras and then imported into M-LOOP, or created in M-LOOP itself? I have read the paper 'Applying machine learning optimization methods to the production of a quantum gas', where they have a self-defined ANN and use it in M-LOOP, but I do not know how to do this myself.
Thank you for any help!
Is your feature request related to a problem? Please describe.
I'm trying to use your software for some optimizations of experimental parameters. One potential problem is that some of my parameters have discrete values, e.g., integers 0, 1, ..., 255, in addition to other parameters that may be continuous. I don't know how to treat these parameters within your framework.
Describe the solution you'd like
Describe alternatives you've considered
I'm not so familiar with the mathematical background for your optimization methods. Thus, I'm not sure to what extent discrete parameter ranges are compatible with Gaussian processes or the other algorithms employed. However, the library scikit-optimize (https://pypi.org/project/scikit-optimize/) also uses Gaussian process regression and does allow discrete or categorical parameter ranges. As scikit-learn is commonly used for hyperparameter optimization in machine learning (many dimensions of discrete values), I assume that there are practical ways to allow discrete values in Gaussian process regression.
I have been using scikit-optimize. However, your library seems to have some features that scikit-optimize is lacking, such as parallel data acquisition and processing. Therefore I will be trying it as well and comparing results.
An obvious "alternative" is to treat the discrete spaces as continuous and round off the parameters when they are fed to experimental control. For reasonably fine-grained parameter ranges, such as 256 or 1000 distinct values, I imagine this might work, even though it technically violates some smoothness assumptions implied by the algorithms? I would love to hear some comments from you if this is feasible or would break the optimization. What kinds of hyper-parameters or modes of operation would be best suited to this scenario?
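The round-off workaround described above could be sketched roughly as follows. This is only an illustration: the helper name snap_discrete and its arguments are hypothetical, not part of M-LOOP, and it assumes the rounding happens in the user's own code just before the parameters are sent to the experimental control.

```python
def snap_discrete(params, discrete_bounds):
    """Round selected parameters to the nearest integer within their bounds.

    params: list of floats proposed by the optimizer.
    discrete_bounds: dict mapping a parameter index to its (low, high)
        integer range; indices not listed are left continuous.
    """
    snapped = list(params)
    for index, (low, high) in discrete_bounds.items():
        value = int(round(snapped[index]))
        # Clip in case rounding pushed the value outside the allowed range.
        snapped[index] = min(max(value, low), high)
    return snapped


# Parameters 0 and 2 are integers in 0..255; parameter 1 stays continuous.
proposed = [127.6, 0.4321, 300.2]
applied = snap_discrete(proposed, {0: (0, 255), 2: (0, 255)})
print(applied)
```

Whether the measured cost should then be reported against the proposed values or the rounded ones is a separate design choice that this sketch does not settle.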
While testing M-LOOP by manually deleting exp_input and creating exp_output.txt, using the simple configuration
num_params = 2; max_num_runs = 3
I was unable to get the program to stop. Using configuration
num_params = 2; max_num_runs = 3; target_cost = 0.1
it ends seemingly at random, both running more than three times, and continuing past costs < 0.1. See attached file for an example of behaviour with the second configuration.
strange_output.txt
Is it not supposed to stop before 7 runs, or at the instant it surpasses target_cost?
I'm running on Linux Mint.
Describe the bug
With recent versions of scikit-learn (somewhere between 1.1.2 and 1.2.1), the scaler classes now prohibit using *args in the call signatures for their __init__() methods. M-LOOP's custom ParameterScaler class inherits from scikit-learn's MinMaxScaler and uses *args to pass any additional positional arguments to the parent class. We don't actually use that capability at all (it was just put there to be flexible if the parent class call signature changes), so it's easy to just remove *args from the call signature.
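To illustrate the shape of the change without depending on scikit-learn, here is a small sketch using a stand-in parent class. StrictParent, its check_init_signature() method, and both scaler classes are hypothetical names invented for this example; the real check lives inside scikit-learn and is not reproduced here.

```python
import inspect


class StrictParent:
    """Stand-in for a parent class that forbids *args in __init__ signatures."""

    def __init__(self, feature_range=(0, 1)):
        self.feature_range = feature_range

    @classmethod
    def check_init_signature(cls):
        # Reject any __init__ that declares a *args parameter.
        for param in inspect.signature(cls.__init__).parameters.values():
            if param.kind is inspect.Parameter.VAR_POSITIONAL:
                raise RuntimeError(cls.__name__ + ".__init__ must not use *args")


class ScalerWithArgs(StrictParent):
    # Old style: forwards unused positional arguments to the parent.
    def __init__(self, min_boundary, max_boundary, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.min_boundary = min_boundary
        self.max_boundary = max_boundary


class ScalerWithoutArgs(StrictParent):
    # New style: *args removed, keyword arguments still forwarded.
    def __init__(self, min_boundary, max_boundary, **kwargs):
        super().__init__(**kwargs)
        self.min_boundary = min_boundary
        self.max_boundary = max_boundary


ScalerWithoutArgs.check_init_signature()  # passes silently
try:
    ScalerWithArgs.check_init_signature()
    rejected = False
except RuntimeError:
    rejected = True
print(rejected)
```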
To Reproduce
Steps to reproduce the behavior:
Desktop (please complete the following information):
I realized that when using the function plot_hyperparameters_vs_run I could get an error quite easily if I had fewer distinct length scales than parameters (e.g. when having only one length scale for all). I could easily circumvent this by simply specifying a parameter subset, but fixing this might be useful.
Thank you for the great work you put into this project. I love using M-LOOP!
Describe the bug
The test_gaussian_process_complete_config() test fails due to the presence of an unused keyword argument in the config file.
To Reproduce
Steps to reproduce the behavior:
pytest -v -k gaussian_process_complete_config
Desktop (please complete the following information):
Additional context
The issue is that the gp_training_override_kwargs argument is passed to GaussianProcessLearner, but that option was removed in 845a74b. I forgot to update a couple places in the code when doing that. In particular I forgot to remove it from the complete Gaussian process learner config file, which is the reason that this test fails. I also forgot to remove it from the call to super() in GaussianProcessVisualizer.__init__(), though that doesn't have any effect here. Fortunately these are easy fixes.
Here is the output of the test for reference:
==================================================================================================== test session starts =====================================================================================================
platform win32 -- Python 3.7.6, pytest-5.3.5, py-1.8.1, pluggy-0.13.1 -- C:\Users\user_name\Software\anaconda3\envs\mloop_install_test_2\python.exe
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('C:\\Users\\user_name\\Software\\M-LOOP\\.hypothesis\\examples')
rootdir: C:\Users\user_name\Software\M-LOOP
plugins: hypothesis-5.5.4, arraydiff-0.3, astropy-header-0.1.2, doctestplus-0.5.0, openfiles-0.4.0, remotedata-0.3.2
collected 20 items / 19 deselected / 1 selected
tests/test_examples.py::TestExamples::test_gaussian_process_complete_config FAILED [100%]
========================================================================================================== FAILURES ==========================================================================================================
_____________________________________________________________________________________ TestExamples.test_gaussian_process_complete_config _____________________________________________________________________________________
self = <test_examples.TestExamples testMethod=test_gaussian_process_complete_config>
def test_gaussian_process_complete_config(self):
controller = mll.launch_from_file(mlu.mloop_path+'/../examples/gaussian_process_complete_config.txt',
interface_type = 'test',
no_delay = False,
> **self.override_dict)
test_examples.py:100:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
config_filename = 'c:\\users\\user_name\\software\\m-loop\\mloop/../examples/gaussian_process_complete_config.txt', kwargs = {'console_log_level': 10, 'file_log_level': 30, 'interface_type': 'test', 'no_delay': False, ...}
file_kwargs = {'gp_training_override_kwargs': False}, interface = <TestInterface(Thread-1, initial)>, controller = <mloop.controllers.GaussianProcessController object at 0x00000238A3112C88>
extras_kwargs = {'visualizations': False}
def launch_from_file(config_filename,
**kwargs):
'''
Launch M-LOOP using a configuration file. See configuration file documentation.
Args:
config_filename (str): Filename of configuration file
**kwargs : keywords that override the keywords in the file.
Returns:
controller (Controller): Controller for optimization.
'''
try:
file_kwargs = mlu.get_dict_from_file(config_filename,'txt')
except (IOError, OSError):
print('Unable to open M-LOOP configuration file:' + repr(config_filename))
raise
file_kwargs.update(kwargs)
#Main run sequence
#Create interface and extract unused keywords
interface = mli.create_interface(**file_kwargs)
file_kwargs = interface.remaining_kwargs
#Create controller and extract unused keywords
controller = mlc.create_controller(interface, **file_kwargs)
file_kwargs = controller.remaining_kwargs
#Extract keywords for post processing extras, and raise an error if any keywords were unused.
extras_kwargs = _pop_extras_kwargs(file_kwargs)
if file_kwargs:
logging.getLogger(__name__).error('Unused extra options provided:' + repr(file_kwargs))
> raise ValueError
E ValueError
..\mloop\launchers.py:42: ValueError
---------------------------------------------------------------------------------------------------- Captured stdout call ----------------------------------------------------------------------------------------------------
INFO M-LOOP version 3.2.1
DEBUG M-LOOP Logger configured.
DEBUG Creating interface.
DEBUG Setting default landscapes
INFO Using the test interface with the experiment.
DEBUG Controller init completed.
DEBUG Learner init completed.
DEBUG Random learner init completed.
DEBUG Learner init completed.
ERROR Unused extra options provided:{'gp_training_override_kwargs': False}
----------------------------------------------------------------------------------------------------- Captured log call ------------------------------------------------------------------------------------------------------
INFO mloop:utilities.py:86 M-LOOP version 3.2.1
DEBUG mloop:utilities.py:87 M-LOOP Logger configured.
DEBUG mloop.interfaces:interfaces.py:82 Creating interface.
DEBUG mloop.testing:testing.py:33 Setting default landscapes
INFO mloop.interfaces:interfaces.py:40 Using the test interface with the experiment.
DEBUG mloop.controllers:controllers.py:235 Controller init completed.
DEBUG mloop.learners.1:learners.py:185 Learner init completed.
DEBUG mloop.learners.1:learners.py:469 Random learner init completed.
DEBUG mloop.learners.2:learners.py:185 Learner init completed.
ERROR mloop.launchers:launchers.py:41 Unused extra options provided:{'gp_training_override_kwargs': False}
============================================================================================== 1 failed, 19 deselected in 3.37s ==============================================================================================
Right now the docs suggest running python setup.py develop to perform an editable install, but the better approach is to use pip install -e <path_to_mloop>. See e.g. https://stackoverflow.com/questions/30306099/pip-install-editable-vs-python-setup-py-develop.
Similarly, the suggested testing command at the moment is python setup.py test, which is now a deprecated approach (see pypa/setuptools#1684, pytest-dev/pytest#5534, pytest-dev/pytest#5546). The command pytest runs the tests, so that should be the suggested method.
I am trying to reproduce some visualizations as you have written in http://m-loop.readthedocs.io/en/latest/visualizations.html#reproducing-visualizations, but when I run the sample code with one of my archives, I get the following error:
/path/anaconda3/lib/python3.5/site-packages/sklearn/gaussian_process/gpr.py:308: UserWarning: Predicted variances smaller than 0. Setting those variances to 0.
warnings.warn("Predicted variances smaller than 0. "
Traceback (most recent call last):
File "revisualize.py", line 6, in
mlv.create_gaussian_process_learner_visualizations('M-LOOP_archives/learner_archive_2017-02-13_10-07.txt',file_type='txt')
File "/path/M-LOOP/mloop/visualizations.py", line 360, in create_gaussian_process_learner_visualizations
visualization.plot_all_minima_vs_cost()
File "/path/M-LOOP/mloop/visualizations.py", line 489, in plot_all_minima_vs_cost
if not self.has_all_minima:
AttributeError: 'GaussianProcessVisualizer' object has no attribute 'has_all_minima'
This is both for processes that have finished and have been interrupted. How do I fix this? Archives below.
controller_archive_2017-02-13_10-07.txt
learner_archive_2017-02-13_10-07.txt
In particular:
GaussianProcessVisualizer.plot_hyperparameters_vs_run() '...vs fit number'.
GaussianProcessLearner.plot_noise_level_vs_run() has the same issues as plot_hyperparameters_vs_run().
NeuralNetVisualizer.plot_losses()
To make things more clear I was thinking of making the following changes:
GaussianProcessVisualizer methods mentioned above to end in vs_fit().
NeuralNetVisualizer.plot_losses() to say epochs instead of training run.
@charmasaur do you have any opinions on this? If not I'll just make the changes listed above, probably some time in the next week.
Looks like there's two small errors in the evolution_strategy docstring section of the DifferentialEvolutionLearner in learners.py. The strategy 'best1' is listed twice (one should be 'best2') and the default is said to be 'best2' but appears to be set to 'best1' in DifferentialEvolutionLearner.__init__().
Make M-LOOP compatible with python 2 and 3
I'm trying to study the evolution of the GP learner over time (i.e. how does the fit improve after 10, 50, 100... runs). My original method involved collecting the parameters from the controller_archive dictionary for each run where a lower cost was found than all previous runs:
(plotting cost of each run vs run number, blue points are runs, blue line marks current minimum cost and red markers represent some examples of new, better costs being found)
However this method starts to work less well when noise is added to the fit, as the cost function can't decrease as quickly, and I think it would be interesting to see how the learner's best guess changes over time.
The ideal solution for this would be to optionally run mloop.learners.find_global_minima() and record the output every x runs in the learner_archive or controller_archive. Optionally because it will probably cost performance even though it's only a quick search, if it runs often.
Currently, find_global_minima() is only run if the Learner attribute predict_global_minima_at_end is True, after optimisation. This change would mean the loop within Learner.run() is modified to include an if statement:
self.best_params_every_x_runs = []
for run_index in range(self.generation_num):
    self.log.debug('Gaussian process learner generating parameter:'+ str(self.params_count+1))
    next_params = self.find_next_parameters()
    self.params_out_queue.put(next_params)
    # Optionally record the predicted global minimum every x runs.
    if record_minima_over_time and run_index % x == 0:
        self.best_params_every_x_runs.append(self.find_global_minima())
    if self.end_event.is_set():
        raise LearnerInterrupt()
Here, x is the repetition rate, i.e. how often the global minimum is checked, and record_minima_over_time is a boolean which is True if recording is enabled. This is from the GaussianProcessLearner but I assume the NeuralNetLearner would work similarly.
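For anyone who wants to try the idea in isolation, the proposed loop can be mimicked with a self-contained stand-in. Everything in this sketch is hypothetical: StandInLearner is not an M-LOOP class, its find_next_parameters() and find_global_minima() are placeholders, and it assumes find_global_minima() returns the predicted best parameters.

```python
class StandInLearner:
    """Hypothetical stand-in that only mimics the shape of the proposed loop."""

    def __init__(self, generation_num, record_minima_over_time=False, x=1):
        self.generation_num = generation_num
        self.record_minima_over_time = record_minima_over_time
        self.x = x  # how often (in runs) the minimum is recorded
        self.best_params_every_x_runs = []
        self._history = []

    def find_next_parameters(self):
        # Placeholder for the real search: values shrink with each run.
        return [1.0 / (len(self._history) + 1)]

    def find_global_minima(self):
        # Placeholder: report the smallest parameters seen so far.
        return min(self._history)

    def run(self):
        for run_index in range(self.generation_num):
            self._history.append(self.find_next_parameters())
            if self.record_minima_over_time and run_index % self.x == 0:
                self.best_params_every_x_runs.append(self.find_global_minima())


learner = StandInLearner(generation_num=10, record_minima_over_time=True, x=5)
learner.run()
print(len(learner.best_params_every_x_runs))
```

With recording disabled the list simply stays empty, so the extra search cost is only paid when the option is switched on.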
I'd be happy to implement this in principle but I haven't fully understood how the controller_archive process works yet, and I'm very new to open-source & github. I checked the current parameters in both controller & learner archive and they don't appear to contain this data already either. If there's a way to do this without a pull request, please let me know!
I am following your installation instructions on Ubuntu 16.04. I have python 3.5.2 and fully functional tensorflow-gpu 1.2.0 installed on it, but I receive the following error:
Processing dependencies for M-LOOP==2.2.0
Searching for tensorflow>=1.2.0
Reading https://pypi.python.org/simple/tensorflow/
No local packages or working download links found for tensorflow>=1.2.0
error: Could not find suitable distribution for Requirement.parse('tensorflow>=1.2.0')
Bug Description
Since version 1.24, numpy no longer has the attributes np.float (or np.int etc). They have been deprecated since numpy 1.20. They are still used in M-LOOP, which will cause issues when using the latest numpy. See https://numpy.org/doc/stable/release/1.20.0-notes.html#using-the-aliases-of-builtin-types-like-np-int-is-deprecated for the deprecation note.
To Reproduce
Steps to reproduce the behavior:
AttributeError: module 'numpy' has no attribute 'float'.
Expected behavior
Don't use stuff that doesn't exist.
Additional context
This is trivial to fix (just replace np.float with float). I can create a PR if that helps.
Currently if you want to provide a training archive to an optimization using a Gaussian process, you have to pass the name of the file as gp_training_filename. On the other hand, if you want to do the same thing for the neural net you have to pass the name of the file as nn_training_filename. That's a bit annoying because you have to change the argument name if you change which learner you want to use, even though you're passing in the same file to fulfill the same role. For concreteness, I should mention that both learners are able to take an archive from a previous optimization run with any learner. So it's not the case that these arguments have different names because they need to be from optimizations run with different learners. We should deprecate these arguments and just use training_filename argument for the MachineLearner class (then update the docs accordingly).
Along these same lines, we should deprecate the gp_training_file_type and nn_training_file_type arguments since the file type should just be automatically determined from the file extension. Mentions of these options should then be removed from the documentation.
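One possible shape for such a deprecation is sketched below. The helper names resolve_training_filename and file_type_from_extension are hypothetical and do not exist in M-LOOP; the sketch only assumes that the old keyword names should keep working for a while and emit a warning.

```python
import os
import warnings


def resolve_training_filename(training_filename=None, **kwargs):
    """Return the training archive name, accepting the deprecated aliases."""
    for old_name in ("gp_training_filename", "nn_training_filename"):
        old_value = kwargs.pop(old_name, None)
        if old_value is not None:
            warnings.warn(
                old_name + " is deprecated; use training_filename instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            # The new argument wins if both are given.
            if training_filename is None:
                training_filename = old_value
    return training_filename


def file_type_from_extension(filename):
    """Infer the file type from the extension, e.g. 'archive.txt' -> 'txt'."""
    extension = os.path.splitext(filename)[1]
    return extension.lstrip(".").lower()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    name = resolve_training_filename(gp_training_filename="learner_archive.txt")
print(name, file_type_from_extension(name), len(caught))
```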
Describe the bug
At the end of an optimization with the neural net learner, the predicted cross sections show signs of overfitting. I haven't done a full study on how this affects optimizations, but it leads to extra local minima in the predicted cost landscape which may make it harder to pick new parameter values. It also makes it harder for the user to interpret the cross sections as they have to guess which features are real and which are due to overfitting.
Expected behavior
The predicted landscapes from the neural net learner should fit the data to a reasonable degree; they should be relatively smooth without sharp wiggles and spikes.
Desktop:
To Reproduce
This can be seen by running an optimization with the neural net learner. For convenience I've attached the files from an optimization that demonstrates this behavior: 2020-11-04_1667_raman_cool_n_stage_4.zip. I've also included some example code below to play around with the results in those attached files. I'll omit many of the plots generated but attach enough of them to demonstrate the issue. Generally I'll include one of the cross section plots generated by a single net and the plot that shows the min/max/average of the predictions of the different nets.
First, here is what the predicted cross sections look like at the end of the optimization:
# Set options (Set learner_archive to the correct path for your machine)
learner_archive = 'path/to/learner_archive_2020-11-04_22-53.txt'
# Load in data from archive.
import mloop.visualizations as mlv
learner_visualizer = mlv.create_learner_visualizer_from_archive(learner_archive)
# Show cross sections generated during optimization.
learner_visualizer.do_cross_sections()
Which yields four plots, including these two:
There are a lot of sharp edges, wiggles, and local minima which are likely due to overfitting. To check that we can delete the neural nets then recreate them with the same regularization coefficient. The only difference is that their weights will be reinitialized to random values then trained.
# Delete the existing nets and create/train new ones.
for net in learner_visualizer.neural_net:
    net.destroy()
learner_visualizer.create_neural_net()
for j in range(learner_visualizer.num_nets):
    learner_visualizer._fit_neural_net(j)
# Plot new fitted cross sections.
learner_visualizer.do_cross_sections()
This yields
The predicted cross sections are now much smoother, as the cost landscapes presumably are. As a further check we can make sure that training the nets more causes overfitting again. The following section took a while, ~5 minutes, to run on my machine.
n_extra_trainings = 100
for k in range(n_extra_trainings):
    for j in range(learner_visualizer.num_nets):
        learner_visualizer._fit_neural_net(j)
# Plot new fitted cross sections.
learner_visualizer.do_cross_sections()
The nets seem overfit again. For example, the blue curve now has a large spike which is not a real feature of the cost landscape. Running the training routines too many times leads to overfitting, and the training routines are run many times throughout an M-LOOP optimization.
Again, I haven't looked much into how this affects M-LOOP's attempts to optimize. This issue certainly does make it harder for the user to interpret the final cost landscapes though, e.g. to figure out which parameters are important or not. Maybe one solution would be to reinitialize the nets periodically during the optimization?