vadim0x60 / cibi
Cibi: lifelong reinforcement learning via program generation and scrum
License: Apache License 2.0
Senior Developers have an absolute character limit and they break when code written by Junior Developers exceeds it
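A minimal sketch of a guard, assuming a hypothetical max_chars limit (not cibi's actual API): truncate or reject over-long junior code before the Senior Developer breaks on it.

def accept_program(code: str, max_chars: int = 1024) -> str:
    # Hypothetical guard: truncate instead of crashing on over-long programs
    if len(code) > max_chars:
        return code[:max_chars]
    return code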
I changed the behavior of this parameter without changing its name, for backwards compatibility, for now.
This will enable:
In most of the code we treat the memory as arbitrary integers, but in some places we treat it as if it were 0-255 only
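One way to make it consistent, sketched under the assumption of byte-sized Brainfuck cells (write_cell is a hypothetical helper, not cibi's actual API):

def write_cell(memory, i, value):
    # Keep every cell in [0, 255] everywhere, not just in some places
    memory[i] = value % 256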
BrainCoder implements federated learning. We skipped this part for now, but it would be handy to have
Cibi 2 should not work with experiments that specify a different version in the spec.
Cibi 3 should only work with experiments that specify cibi 3 as the required version in the spec
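A minimal sketch of such a guard (the spec key name cibi_version is an assumption):

def check_spec_version(spec, current_version=3):
    required = spec.get('cibi_version')
    if required != current_version:
        raise ValueError(f'Spec requires cibi {required}, '
                         f'but this is cibi {current_version}')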
Probabilities of various mutations/crossovers should be trained just like the LSTM weights are trained
In version 2 this should be set with a flag in the experiment spec; in version 3 it should become the default
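A hedged sketch of the idea, not cibi's actual code: keep logits over the mutation/crossover operators and nudge them with a REINFORCE-style update, analogous to how the LSTM weights are trained. The operator names and the reward signal are illustrative.

import numpy as np

rng = np.random.default_rng(0)
ops = ['point_mutation', 'crossover', 'shuffle']  # illustrative operators
logits = np.zeros(len(ops))

def sample_op():
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(len(ops), p=p), p

def reinforce_update(op_idx, p, reward, lr=0.1):
    grad = -p.copy()
    grad[op_idx] += 1.0  # gradient of log p(op) w.r.t. the logits
    logits[:] = logits + lr * reward * grad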
With install_requires and command line utilities
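A minimal setup.py sketch (the console-script name and dependency list are illustrative; cibi.train:run_experiments is the click command seen in the traceback below):

from setuptools import setup, find_packages

setup(
    name='cibi',
    packages=find_packages(),
    install_requires=['tensorflow', 'gym', 'deap', 'click', 'pyyaml'],
    entry_points={
        'console_scripts': [
            'cibi-train = cibi.train:run_experiments',
        ],
    },
)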
Make a script to transition experimental results that are still valid to version 3
It's bad. Summaries aren't summaries but huge files full of unnecessary and even redundant info. Simple questions like "Which were the best programs?" take an hour of regex-wielding to answer.
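A minimal sketch of what summary.yml could contain instead (the keys are illustrative assumptions, not the current format):

import yaml

def write_summary(path, programs_with_scores):
    best = sorted(programs_with_scores, key=lambda ps: ps[1], reverse=True)
    summary = {
        'best_programs': [{'code': p, 'score': s} for p, s in best[:10]],
        'best_score': best[0][1],
    }
    with open(path, 'w') as f:
        yaml.safe_dump(summary, f)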
So that it is a good example for cibi
We should positively reinforce the developers of the parents of good programs
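A hedged sketch of that credit assignment (the parents and developer attributes are assumptions about the codebase structure, and the decay factor is arbitrary):

def reinforce_lineage(program, reward, decay=0.5):
    # Pass a discounted share of the reward up the family tree
    for parent in getattr(program, 'parents', []):
        parent.developer.reward(reward * decay)
        reinforce_lineage(parent, reward * decay, decay)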
Officially we support restarts, but when you actually restart an experiment sometimes this happens:
Traceback (most recent call last):
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 'then' and 'else' must have the same size. but received: [4,23] vs. [4,13]
[[{{node local/policy/rnn/while/Select_2}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/mcs001/20194474/cibi/cibi/train.py", line 132, in run_experiments
rollout = agent.attend_gym(env, max_reps=max_episode_length, render=render)
File "/home/mcs001/20194474/cibi/cibi/agent.py", line 31, in attend_gym
self.init()
File "/home/mcs001/20194474/cibi/cibi/scrum_master.py", line 82, in init
self.reprogram()
File "/home/mcs001/20194474/cibi/cibi/scrum_master.py", line 145, in reprogram
self.write_programs()
File "/home/mcs001/20194474/cibi/cibi/scrum_master.py", line 136, in write_programs
programs = self.lead_developer.write_programs(self.archive_branch)
File "/home/mcs001/20194474/cibi/cibi/senior_developer.py", line 286, in write_programs
return self.developer.write_programs(self.session, inspiration_branch)
File "/home/mcs001/20194474/cibi/cibi/senior_developer.py", line 182, in write_programs
return self.model.write_programs(session, inspiration_branch)
File "/home/mcs001/20194474/cibi/cibi/lm.py", line 653, in write_programs
self.sampled_batch.log_probs])
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 'then' and 'else' must have the same size. but received: [4,23] vs. [4,13]
[[node local/policy/rnn/while/Select_2 (defined at /.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
Original stack trace for 'local/policy/rnn/while/Select_2':
File "/cibi/cibi/train.py", line 161, in <module>
run_experiments()
File "/.local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/.local/lib/python3.7/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/.local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/.local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/cibi/cibi/train.py", line 126, in run_experiments
train_dir, events_dir, scrum_config, seed_codebase) as agent:
File "/cibi/cibi/scrum_master.py", line 163, in hire_team
for dev in developers]
File "/cibi/cibi/scrum_master.py", line 163, in <listcomp>
for dev in developers]
File "/cibi/cibi/senior_developer.py", line 250, in hire
self.set_language(language)
File "/cibi/cibi/senior_developer.py", line 127, in set_language
verbose_level=model_v)
File "/cibi/cibi/lm.py", line 318, in __init__
loop_fn=loop_fn)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/rnn.py", line 1252, in raw_rnn
swap_memory=swap_memory)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/control_flow_ops.py", line 2753, in while_loop
return_same_structure)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/control_flow_ops.py", line 2245, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/control_flow_ops.py", line 2170, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/rnn.py", line 1234, in body
emit_output = _copy_some_through(zero_emit, emit_output)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/rnn.py", line 1232, in _copy_some_through
return nest.map_structure(copy_fn, current, candidate)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/util/nest.py", line 536, in map_structure
structure[0], [func(*x) for x in entries],
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/util/nest.py", line 536, in <listcomp>
structure[0], [func(*x) for x in entries],
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/rnn.py", line 1230, in copy_fn
return array_ops.where(elements_finished, cur_i, cand_i)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/util/dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 3759, in where
return gen_math_ops.select(condition=condition, x=x, y=y, name=name)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_math_ops.py", line 9439, in select
"Select", condition=condition, t=x, e=y, name=name)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
Early stopping? Iteration limit? Keyboard interrupt?
It'd be nice to compare PQT against them
There's a program synthesis benchmark corpus making the rounds, PSB2: see if cibi can be tested on it. https://zenodo.org/record/5084812#.YOw9XHUzaV4
I have never tested distributed training so it probably doesn't work
The quality metric of a program in the codebase is based on a single execution, so it might be way off if the program is unstable. All further operations with this program are based on this (possibly erroneous) number.
This seems to be the main and most obvious bottleneck on the way to higher performance on all benchmarks.
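A minimal mitigation sketch, assuming programs can simply be re-executed (evaluate is a hypothetical stand-in for one scored execution):

def estimate_quality(program, evaluate, n_runs=5):
    # Average over several executions so an unstable program
    # isn't scored off a single lucky or unlucky run
    return sum(evaluate(program) for _ in range(n_runs)) / n_runs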
If a program is cycled, it's supposed to never terminate: that's the whole point. But for some reason, some of them still do.
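One plausible mechanism, sketched under the assumption that cycling wraps the program in an outer Brainfuck loop (not necessarily cibi's actual implementation):

def cycle(program: str) -> str:
    # The trailing '+' is meant to keep the loop cell nonzero, but if the
    # program leaves the pointer on a cell holding 255, '+' wraps to 0
    # with byte-sized cells and the 'infinite' loop exits
    return '+[' + program + '+]'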
Hi Vadim,
thanks for the great package! Unfortunately, I got stuck in the installation with the following version issues:
Successfully built gym gym-sepsis wrapt
ERROR: astroid 2.3.3 requires typed-ast<1.5,>=1.4.0; implementation_name == "cpython" and python_version < "3.8", which is not installed.
ERROR: thinc 6.10.3 has requirement wrapt<1.11.0,>=1.10.0, but you'll have wrapt 1.12.1 which is incompatible.
ERROR: spacy 2.0.12 has requirement regex==2017.4.5, but you'll have regex 2020.2.20 which is incompatible.
ERROR: econml 0.6.1 has requirement matplotlib<3.1, but you'll have matplotlib 3.1.3 which is incompatible.
ERROR: econml 0.6.1 has requirement scikit-learn~=0.21.0, but you'll have scikit-learn 0.22.1 which is incompatible.
ERROR: astroid 2.3.3 has requirement wrapt==1.11.*, but you'll have wrapt 1.12.1 which is incompatible.
Installing collected packages: pyglet, box2d-py, gym, deap, gym-sepsis, wrapt
Attempting uninstall: wrapt
Found existing installation: wrapt 1.10.11
ERROR: Cannot uninstall 'wrapt'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
Are these critical to solve or is there another workaround, other than up/down versioning some of the packages?
They got lost on the way somewhere
And have been failing for a while now. Do a git bisect
My current GP implementation lacks them
I might have copied something we don't need here from Brain Coder
Hard to reproduce. Occurs in Taxi experiments from time to time.
Traceback attached.
Based on the BrainCoder repo, implement a self-reprogramming Brainfuck agent
And neural developers are really good at cheating
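A very rough sketch of the self-reprogramming loop (every name here is hypothetical): the current program both acts in the environment and emits the source of its successor.

def self_reprogram_step(run_bf, program, observation):
    # run_bf executes a Brainfuck program and splits its output into
    # actions and emitted code (a hypothetical interface)
    actions, emitted_code = run_bf(program, observation)
    next_program = emitted_code or program  # keep the old program if nothing emitted
    return actions, next_program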
Maybe it's normal and we just have to wait, but then we need better logging/monitoring to see why and how it happens.
Re-write task_launcher() so that it filters them out
Staying true to the metaphor, the next step is probably adding an Auditor to the team.
The most important value per experiment, the score, is left out of summary.yml.
It's confusing. Cibi 3?
The following code
archive_branch = make_prod_codebase(deduplication=True,
                                    save_file=scrum_config['program_file'])
is supposed to load programs from the program file. It doesn't.
Check whether this bug has damaged the currently launched experiments.
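A sketch of the intended loading behavior (the one-program-per-line file format is an assumption):

import os

def load_programs(save_file):
    if not os.path.exists(save_file):
        return []
    with open(save_file) as f:
        return [line.strip() for line in f if line.strip()]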