ufal / neuralmonkey

An open-source tool for sequence learning in NLP built on TensorFlow.

License: BSD 3-Clause "New" or "Revised" License
The package should provide an executable. Right now, training can be executed with `python -u -m neuralmonkey.train whatever.ini`; we should document this somewhere until we find a better way to do it. One of us will have to learn how to manage a proper Python package.
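For the record, a minimal sketch of what the entry point could look like in `setup.py`; the `neuralmonkey-train` name and the assumption that `neuralmonkey.train` exposes a `main()` function are mine:

```python
# setup.py -- minimal sketch; assumes neuralmonkey.train has a main() function
from setuptools import setup, find_packages

setup(
    name="neuralmonkey",
    version="0.1",
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            # installs a `neuralmonkey-train` executable that calls main()
            "neuralmonkey-train = neuralmonkey.train:main",
        ],
    },
)
```

With that, `pip install .` would put a `neuralmonkey-train` executable on the PATH instead of the `python -u -m` incantation.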
This should have been done in the `tf9` branch before the merge.
When you catch general exceptions there, every exception (e.g. an import error) from the module that you want something from gets caught, and the error message about the non-existence of something that is clearly there is quite confusing. By the way, you should never just catch general exceptions. Fix this after I finish #4.
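To illustrate the point, a sketch of catching only the specific exception (the function and its names are hypothetical, not the actual code):

```python
import importlib

def get_object(module_name, attr_name):
    # An ImportError raised here (or inside the imported module) propagates
    # unchanged instead of being masked as "this thing does not exist".
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr_name)
    except AttributeError:
        # Only report a missing attribute when it is really missing.
        raise AttributeError(
            "{} has no attribute {}".format(module_name, attr_name))
```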
If you attempt to create more encoders and do not provide their names, it will crash later on a collision of the variable scopes. I would suggest a mechanism (probably in `utils.py`) that would always be asked for a name and would append a number if there were a collision, as sketched below.
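Something along these lines (just a sketch; the function name is made up):

```python
# Sketch for utils.py: hand out scope names, appending a number on collision.
_used_names = set()

def get_unique_name(name):
    """Return `name`, or `name_2`, `name_3`, ... if `name` is already taken."""
    if name not in _used_names:
        _used_names.add(name)
        return name
    index = 2
    while "{}_{}".format(name, index) in _used_names:
        index += 1
    unique = "{}_{}".format(name, index)
    _used_names.add(unique)
    return unique
```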
Variables are saved only if they are the best so far. They should be saved whenever the score makes it into the top-n scores.
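A sketch of how the top-n bookkeeping could work (the class is hypothetical), using a min-heap so the worst kept score is always on top:

```python
import heapq

class TopNKeeper(object):
    """Keep the n best (score, path) pairs; save when a score makes the cut."""

    def __init__(self, n):
        self.n = n
        self.heap = []  # min-heap: the worst of the kept scores is at heap[0]

    def should_save(self, score):
        return len(self.heap) < self.n or score > self.heap[0][0]

    def add(self, score, path):
        if len(self.heap) < self.n:
            heapq.heappush(self.heap, (score, path))
        elif score > self.heap[0][0]:
            heapq.heapreplace(self.heap, (score, path))
```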
Why is this `logbook` thing in master? Is it done, or is it work-in-progress? If it's done, Flask should be a dependency. It is not used anywhere; is it a stand-alone tool? Then it should be documented somewhere.
Test scripts should be moved away from the root directory. Also, why are there two files, `tests_run.sh` and `run_tests.sh`, and what is their purpose? This should all be done in the `tests` directory, along with `unit-tests_run.sh`, `mypy_run.sh`, `lint_run.sh` and others.

Also, one of the test-running scripts (`tests_run`, I think) should use the `-P` (or `--directory-prefix`) option of wget instead of `cd`-ing there and back again. The `test-output` directory should be generated somewhere other than the root of the repository, preferably in a temporary location: either in `/tmp` or in a `tmp` subdirectory of `tests`.
While I try to make #6 happen, I keep finding issues that I am not capable of or willing to address. I will maintain a checklist of what needs to be done here. By the way, you would not believe how much code that clearly is not working (`random.random > 0.5`, unused variables, ...) I've encountered so far.
- `processors/bpe.py`: classes are questionable here
- `processors/german.py`: classes are questionable
- `utils.py`: this needs to be a class!
- `logging.py`: various
- `decoding_function.py`: there are lots of arguments, some unused; this needs a complete refactor
- `mlp.py`: this really should not be a class; also, aren't there any implementations of the multilayer perceptron that we can use?
- `readers/plain_text_reader.py`: this should not be a class
- `config/config_loader.py`: general exceptions, see #12
- `config/configuration.py`: general exceptions, see #12
- `config/config_generator.py`: I think this should be abandoned, see #17
- `encoders/sentence_encoder.py`: crazy big object, 13 parameters...
- `encoders/image_encoder.py`: ditto
- `encoders/cnn_encoder.py`: ditto
- `bidirectional_rnn_layer.py`: class is questionable
- `tokenize_data.py`: oh, the horror
- `decoders/sequence_classifier.py`: too many instance attributes
- `decoders/decoder.py`: big and ugly object
- `image_utils.py`: questionable class
- `prepare_str_images.py`: general exception
- `precompute_image_features.py`: too long, break it up
- `trainers/copynet.py`: undefined variable!
- `trainers/cross_entropy_trainer.py`: questionable class
- `trainers/mixer.py`: various errors
- `decompound_truecase.py`: javabridge
- `runners/runner.py`: questionable class
- `runners/beamsearch.py`: questionable class
- `runners/copynet_runner.py`: undefined variable!
- `runners/perplexity.py`: questionable class
- `logbook/logbook.py`: dependencies
- `cells/noisy_gru_cells.py`: various
- `learning_utils.py`: pure evil, half-screen levels of indentation
- `caffe_image_features.py`: imports and other things
- `lazy_dataset.py`: argument numbers, non-existent members
- `reformat_downloaded_image_features.py`: imports

Since we are hosting this publicly, it should have a license. I personally like MIT or BSD3.
Why does batch size in ini files appear both in `[main]` and `[runner]`? Which one is used?
Once we have the run.py script, it should be extremely easy using Flask (which is already a dependency). It will receive a dictionary of dataset series (the same way we have it right now) as JSON and send back a JSON with outputs and some statistics.
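A minimal sketch of the endpoint, assuming some `run_on_dataset` helper wraps the model (all names here are made up):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def run_on_dataset(series):
    # Stand-in for the actual model invocation.
    return {"target": []}, {"time": 0.0}

@app.route("/run", methods=["POST"])
def run():
    series = request.get_json()  # e.g. {"source": ["a sentence", ...]}
    outputs, stats = run_on_dataset(series)
    return jsonify({"outputs": outputs, "statistics": stats})

if __name__ == "__main__":
    app.run(port=5000)
```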
Can we define one random_seed in the top level of the configuration, that will be used everywhere?
With the webservice ready, it is time to set up something like this:
http://quest.ms.mff.cuni.cz/moses/demo.php
Here's the checklist
Also, we should create a new label for issues related to the web service.
Commit add9bdc (fix saving variables) introduced a bug: it creates a symlink with the wrong relative address. For example, when I set my output to the directory `test-out`, it creates a link to `test-out/data.whatever` in that folder, so the script looks for `test-out/test-out/data.whatever` instead of `test-out/data.whatever`.
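The fix should be to pass only the basename as the symlink target, since a relative target is resolved against the directory containing the link (file names below are illustrative):

```python
import os

link_path = os.path.join("test-out", "variables.data")  # the link to create
target = os.path.join("test-out", "data.whatever")      # the real file

# Wrong: os.symlink(target, link_path) resolves to test-out/test-out/data.whatever.
# Right: only the basename, resolved relative to the link's own directory.
os.symlink(os.path.basename(target), link_path)
```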
My preferred solution would be to put the commit into a separate branch (rewinding current master by one commit) and merge #13. That will enable running `tests/small.ini` on Travis. The new branch can be merged when the bug is fixed. What do you think, @jindrahelcl?
Implement the minimum risk trainer as described in http://arxiv.org/abs/1512.02433.
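For reference, my reading of the paper's objective as a plain-numpy sketch: renormalize sharpened probabilities over a sampled subset of candidates and minimize the expected cost.

```python
import numpy as np

def minimum_risk_loss(log_probs, costs, alpha=0.005):
    """Expected risk over sampled candidates.

    log_probs: model log-probabilities of the sampled translations
    costs: their costs, e.g. 1 - sentence-level BLEU
    alpha: the sharpness hyperparameter from the paper
    """
    scaled = alpha * np.asarray(log_probs)
    q = np.exp(scaled - np.max(scaled))  # max-subtraction for stability
    q /= q.sum()                         # Q(y|x; theta, alpha) over the sample
    return float(np.dot(q, costs))       # expected risk to be minimized
```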
Since 5a4498, the original (meaning not post-processed) decoded output and the pre-processed reference are shown in the validation log. This makes the validation output twice as large and ultimately more hideous, but it's useful when debugging pre- and postprocessing.
`neuralmonkey/estimate_scheduled_sampling.py` depends on scipy. Scipy is quite a big dependency; I'd hate to install it just for the one function. Can we do something about this?
Often, data are split across multiple files. The lazy dataset, which is designed for loading bigger datasets, should be able to take a list of files / wildcards specifying the files it will read, as in the sketch below.
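Roughly like this (a sketch; the function name is made up):

```python
import glob
import itertools

def iterate_files(patterns):
    """Lazily iterate over lines of all files matching the given patterns."""
    paths = sorted(itertools.chain.from_iterable(
        glob.glob(pattern) for pattern in patterns))
    for path in paths:
        with open(path) as f:
            for line in f:
                yield line.rstrip("\n")
```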
Evaluation functions should be refactored into callable and comparable objects to simplify the training loop function. They can also define their own name, so the output in the log need not be the name of the function.
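Something like this (the interface is just a proposal):

```python
class Evaluator(object):
    """Wrap a metric function with a display name and a comparison rule."""

    def __init__(self, func, name, higher_is_better=True):
        self.func = func
        self.name = name
        self.higher_is_better = higher_is_better

    def __call__(self, decoded, references):
        return self.func(decoded, references)

    def is_better(self, score, baseline):
        return score > baseline if self.higher_is_better else score < baseline
```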
IP addresses, queries, results... all of this should be stored in some files.
This should mainly be rewriting `tensorflow.python.ops.rnn*` to `tf.nn.rnn*`.
If we do not pin concrete versions of dependencies in requirements, things like the error in #54 might happen from time to time. On the other hand, if we freeze the dependencies, we should check for updates from time to time, which means more work. I'm leaning towards automatic updates and letting the build fail from time to time. We test things fairly regularly now, so we should be able to catch and repair breaking changes. What do you think?
The lazy dataset should have a building function similar to the standard (in-memory) dataset in the `config` module. Moreover, its `__init__` method should be refactored the same way.
I'm not quite sure what we are trying to achieve here. What is the goal of this package? How does it differ from tflearn and similar frameworks? Are we writing something that is already done somewhere? If not, what is new here?
These questions should be clearly answered in the README, if we want anybody to use this.
... and get rid of NLTK, of which I don't trust a single line of code.
What is `imagenet_synset_words.txt` doing in the repo?
When I ran my `tests/small.ini` configuration, it failed with an error about a lambda wanting two arguments when just one was given. I solved this in my branch by removing the second (unused) argument of the lambda on line 36 in learning_utils, but I'm not sure whether this breaks anything else. Can you have a look at this, @jindrahelcl?

Edit: The correct lambda was on line 155, but maybe they should be the same?
Write support for ensemble models. The idea is, in the end, to give the running script multiple *.ini files (or one with links to other experiments).
Right now we can specify a random seed in the configuration, but it does not work.
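A sketch of what honoring the seed might involve (the helper is hypothetical; note that the TF graph-level seed must be set before any ops are created):

```python
import random

import numpy as np
import tensorflow as tf

def set_random_seed(seed):
    """Seed every RNG in play."""
    random.seed(seed)
    np.random.seed(seed)
    tf.set_random_seed(seed)
```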
Now, the encoders are listed multiple times in the configuration: in the main section and as arguments of a decoder. Duplication is a frequent source of errors, so they should be listed only in the decoder.
The implementation of beam search relies on placeholders which are fed nothing if the ground truth sequence is not provided.
@jlibovicky, in #15 you mentioned that it would be hard to generate an ini file. Why do we need to do that?
I'm not quite sure what the design (if any) of the configuration manipulation is. I thought that there is an ini file that gets parsed into an abstract representation (some terrible Python object); then we build a computation graph according to the representation and run it. Is there anything else happening?
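If that is all, the skeleton is basically this (a sketch of my understanding, not the actual code):

```python
import configparser

parser = configparser.ConfigParser()
parser.read("experiment.ini")

# The "terrible Python object": plain dicts, one per ini section.
config = {section: dict(parser[section]) for section in parser.sections()}

# ...then encoders/decoders get built from `config` and the TF graph is run.
```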
It is probably an encoding issue connected to the transfer to Python 3.5. Values in <> get ignored when the logbook serves the ini files.
We should be able to create models that have more decoders at the same time, e.g. one that would classify a sentence and output a sequence at the same time.
Should `subword_nmt` be a submodule? Are we doing the imports right?
Cannot import 'Levenshtein' when trying to run pylint on `evaluation.py`. This means that a package is missing in requirements.txt.
We need to:

- put an `__init__.py` file in every directory
- use absolute imports (`import neuralmonkey.vocabulary` instead of `import vocabulary`)
- update `pylintrc`

I've already done this for the `tests/python` directory; I hope it does not break anything.
When I run `bin/neuralmonkey-logbook --port 5050 --logdir tests/tmp-test-output`, I get a screen with "click experiment on the left", but there is no experiment on the left. Why is that?
There are many code review tools integrated with GitHub (e.g. Reviewable). Should we use one of them in our workflow?
We should run pylint on everything. For easy automatic checking, every file should have a 10/10 score. To achieve this, you may have to locally disable some warnings (`# pylint: disable=...`); use this only if it is really necessary. After you eliminate all errors and warnings, add this line to the file:

`# tests: lint`

All files containing this line are checked with pylint by `lint_run.sh`, which you should always run before you commit anything and which is automatically run on Travis CI after you push to GitHub.

You can see the list of files that have not been checked yet with `test_status.sh`. This issue will be closed when that list is empty.