vadim0x60 / cibi
Cibi: lifelong reinforcement learning via program generation and scrum
License: Apache License 2.0
Senior Developers have an absolute character limit and they break when code written by Junior Developers exceeds it
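A minimal sketch of a guard, assuming a hypothetical max_chars limit (not cibi's actual API): truncate or reject over-long junior code before the Senior Developer breaks on it.

def accept_program(code: str, max_chars: int = 1024) -> str:
    # Hypothetical guard: truncate instead of crashing on over-long programs
    if len(code) > max_chars:
        return code[:max_chars]
    return code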
I changed the behavior of this parameter without changing its name, for backwards compatibility, for now.
This will enable:
In most of the code we treat the memory as arbitrary integers, but in some places we treat it as if it were 0-255 only
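One way to make it consistent, sketched under the assumption of byte-sized Brainfuck cells (write_cell is a hypothetical helper, not cibi's actual API):

def write_cell(memory, i, value):
    # Keep every cell in [0, 255] everywhere, not just in some places
    memory[i] = value % 256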
BrainCoder implements federated learning. We skipped this part for now, but it would be handy to have
Cibi 2 should not work with experiments that specify a different version in the spec.
Cibi 3 should only work with experiments that specify cibi 3 as the required version in the spec
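A minimal sketch of such a guard (the spec key name cibi_version is an assumption):

def check_spec_version(spec, current_version=3):
    required = spec.get('cibi_version')
    if required != current_version:
        raise ValueError(f'Spec requires cibi {required}, '
                         f'but this is cibi {current_version}')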
Probabilities of various mutations/crossovers should be trained just like the LSTM weights are trained
In version 2 this should be set with a flag in the experiment spec; in version 3 it should become the default
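A hedged sketch of the idea, not cibi's actual code: keep logits over the mutation/crossover operators and nudge them with a REINFORCE-style update, analogous to how the LSTM weights are trained. The operator names and the reward signal are illustrative.

import numpy as np

rng = np.random.default_rng(0)
ops = ['point_mutation', 'crossover', 'shuffle']  # illustrative operators
logits = np.zeros(len(ops))

def sample_op():
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(len(ops), p=p), p

def reinforce_update(op_idx, p, reward, lr=0.1):
    grad = -p.copy()
    grad[op_idx] += 1.0  # gradient of log p(op) w.r.t. the logits
    logits[:] = logits + lr * reward * grad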
With install_requires and command line utilities
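A minimal setup.py sketch (the console-script name and dependency list are illustrative; cibi.train:run_experiments is the click command seen in the traceback below):

from setuptools import setup, find_packages

setup(
    name='cibi',
    packages=find_packages(),
    install_requires=['tensorflow', 'gym', 'deap', 'click', 'pyyaml'],
    entry_points={
        'console_scripts': [
            'cibi-train = cibi.train:run_experiments',
        ],
    },
)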
Make a script to transition experimental results that are still valid to version 3
It's bad. Summaries aren't summaries but huge files full of unnecessary and even redundant info. Simple questions like "Which were the best programs?" take an hour of regex-wielding to answer.
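A minimal sketch of what summary.yml could contain instead (the keys are illustrative assumptions, not the current format):

import yaml

def write_summary(path, programs_with_scores):
    best = sorted(programs_with_scores, key=lambda ps: ps[1], reverse=True)
    summary = {
        'best_programs': [{'code': p, 'score': s} for p, s in best[:10]],
        'best_score': best[0][1],
    }
    with open(path, 'w') as f:
        yaml.safe_dump(summary, f)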
So that it is a good example for cibi
We should positively reinforce the developers of the parents of good programs
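A hedged sketch of that credit assignment (the parents and developer attributes are assumptions about the codebase structure, and the decay factor is arbitrary):

def reinforce_lineage(program, reward, decay=0.5):
    # Pass a discounted share of the reward up the family tree
    for parent in getattr(program, 'parents', []):
        parent.developer.reward(reward * decay)
        reinforce_lineage(parent, reward * decay, decay)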
Officially we support restarts, but when you actually restart an experiment sometimes this happens:
Traceback (most recent call last):
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 'then' and 'else' must have the same size. but received: [4,23] vs. [4,13]
[[{{node local/policy/rnn/while/Select_2}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/mcs001/20194474/cibi/cibi/train.py", line 132, in run_experiments
rollout = agent.attend_gym(env, max_reps=max_episode_length, render=render)
File "/home/mcs001/20194474/cibi/cibi/agent.py", line 31, in attend_gym
self.init()
File "/home/mcs001/20194474/cibi/cibi/scrum_master.py", line 82, in init
self.reprogram()
File "/home/mcs001/20194474/cibi/cibi/scrum_master.py", line 145, in reprogram
self.write_programs()
File "/home/mcs001/20194474/cibi/cibi/scrum_master.py", line 136, in write_programs
programs = self.lead_developer.write_programs(self.archive_branch)
File "/home/mcs001/20194474/cibi/cibi/senior_developer.py", line 286, in write_programs
return self.developer.write_programs(self.session, inspiration_branch)
File "/home/mcs001/20194474/cibi/cibi/senior_developer.py", line 182, in write_programs
return self.model.write_programs(session, inspiration_branch)
File "/home/mcs001/20194474/cibi/cibi/lm.py", line 653, in write_programs
self.sampled_batch.log_probs])
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/home/mcs001/20194474/.local/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 'then' and 'else' must have the same size. but received: [4,23] vs. [4,13]
[[node local/policy/rnn/while/Select_2 (defined at /.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
Original stack trace for 'local/policy/rnn/while/Select_2':
File "/cibi/cibi/train.py", line 161, in <module>
run_experiments()
File "/.local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/.local/lib/python3.7/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/.local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/.local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/cibi/cibi/train.py", line 126, in run_experiments
train_dir, events_dir, scrum_config, seed_codebase) as agent:
File "/cibi/cibi/scrum_master.py", line 163, in hire_team
for dev in developers]
File "/cibi/cibi/scrum_master.py", line 163, in <listcomp>
for dev in developers]
File "/cibi/cibi/senior_developer.py", line 250, in hire
self.set_language(language)
File "/cibi/cibi/senior_developer.py", line 127, in set_language
verbose_level=model_v)
File "/cibi/cibi/lm.py", line 318, in __init__
loop_fn=loop_fn)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/rnn.py", line 1252, in raw_rnn
swap_memory=swap_memory)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/control_flow_ops.py", line 2753, in while_loop
return_same_structure)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/control_flow_ops.py", line 2245, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/control_flow_ops.py", line 2170, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/rnn.py", line 1234, in body
emit_output = _copy_some_through(zero_emit, emit_output)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/rnn.py", line 1232, in _copy_some_through
return nest.map_structure(copy_fn, current, candidate)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/util/nest.py", line 536, in map_structure
structure[0], [func(*x) for x in entries],
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/util/nest.py", line 536, in <listcomp>
structure[0], [func(*x) for x in entries],
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/rnn.py", line 1230, in copy_fn
return array_ops.where(elements_finished, cur_i, cand_i)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/util/dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 3759, in where
return gen_math_ops.select(condition=condition, x=x, y=y, name=name)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_math_ops.py", line 9439, in select
"Select", condition=condition, t=x, e=y, name=name)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
Early stopping? Iteration limit? Keyboard interrupt?
It'd be nice to compare PQT against them
There's a program synthesis benchmark corpus making the rounds, PSB2: see if cibi can be tested on it. https://zenodo.org/record/5084812#.YOw9XHUzaV4
I have never tested distributed training so it probably doesn't work
The quality metric of a program in the codebase is based on a single execution, so it might be way off if the program is unstable. All further operations with this program are based on this (possibly erroneous) number.
This seems to be the main and most obvious bottleneck on the way to higher performance on all benchmarks.
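A minimal mitigation sketch, assuming programs can simply be re-executed (evaluate is a hypothetical stand-in for one scored execution):

def estimate_quality(program, evaluate, n_runs=5):
    # Average over several executions so an unstable program
    # isn't scored off a single lucky or unlucky run
    return sum(evaluate(program) for _ in range(n_runs)) / n_runs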
If a program is cycled, it's supposed to never terminate: that's the whole point. But for some reason, some of them still do.
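One plausible mechanism, sketched under the assumption that cycling wraps the program in an outer Brainfuck loop (not necessarily cibi's actual implementation):

def cycle(program: str) -> str:
    # The trailing '+' is meant to keep the loop cell nonzero, but if the
    # program leaves the pointer on a cell holding 255, '+' wraps to 0
    # with byte-sized cells and the 'infinite' loop exits
    return '+[' + program + '+]'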
Hi Vadim,
thanks for the great package! Unfortunately, I got stuck in the installation with the following version issues:
Successfully built gym gym-sepsis wrapt
ERROR: astroid 2.3.3 requires typed-ast<1.5,>=1.4.0; implementation_name == "cpython" and python_version < "3.8", which is not installed.
ERROR: thinc 6.10.3 has requirement wrapt<1.11.0,>=1.10.0, but you'll have wrapt 1.12.1 which is incompatible.
ERROR: spacy 2.0.12 has requirement regex==2017.4.5, but you'll have regex 2020.2.20 which is incompatible.
ERROR: econml 0.6.1 has requirement matplotlib<3.1, but you'll have matplotlib 3.1.3 which is incompatible.
ERROR: econml 0.6.1 has requirement scikit-learn~=0.21.0, but you'll have scikit-learn 0.22.1 which is incompatible.
ERROR: astroid 2.3.3 has requirement wrapt==1.11.*, but you'll have wrapt 1.12.1 which is incompatible.
Installing collected packages: pyglet, box2d-py, gym, deap, gym-sepsis, wrapt
Attempting uninstall: wrapt
Found existing installation: wrapt 1.10.11
ERROR: Cannot uninstall 'wrapt'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
Are these critical to solve or is there another workaround, other than up/down versioning some of the packages?
They got lost on the way somewhere
And have been failing for a while now. Do a git bisect
My current GP implementation lacks them
I might have copied something we don't need here from Brain Coder
Hard to reproduce. Occurs in Taxi experiments from time to time.
Traceback attached.
Based on the BrainCoder repo, implement a self-reprogramming Brainfuck agent
And neural developers are really good at cheating
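A very rough sketch of the self-reprogramming loop (every name here is hypothetical): the current program both acts in the environment and emits the source of its successor.

def self_reprogram_step(run_bf, program, observation):
    # run_bf executes a Brainfuck program and splits its output into
    # actions and emitted code (a hypothetical interface)
    actions, emitted_code = run_bf(program, observation)
    next_program = emitted_code or program  # keep the old program if nothing emitted
    return actions, next_program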
Maybe it's normal and we just have to wait, but then we need better logging/monitoring to see why and how it happens.
Re-write task_launcher() so that it filters them out
Staying true to the metaphor, the next step is probably adding an Auditor to the team.
The most important value per experiment, the score, is left out of summary.yml.
It's confusing. Cibi 3?
The following code
archive_branch = make_prod_codebase(deduplication=True,
                                    save_file=scrum_config['program_file'])
is supposed to load programs from the program file. It doesn't.
Check whether this bug has damaged the currently launched experiments.
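A sketch of the intended loading behavior (the one-program-per-line file format is an assumption):

import os

def load_programs(save_file):
    if not os.path.exists(save_file):
        return []
    with open(save_file) as f:
        return [line.strip() for line in f if line.strip()]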