
minerva-training-materials's Introduction

Minerva

Minerva is an educational project that lets you learn advanced data science on real-life, curated problems.


Getting started

  1. Follow the Installation Guide for setup instructions.
  2. Familiarize yourself with our approach: check the User Guide or go straight to the Fashion MNIST problem and start solving.
  3. When ready, go to the Right Whale Recognition problem to start working on a complex problem.

Hands-on approach to learning

With Minerva you will reproduce, piece by piece, solutions to some of the most difficult data science problems, especially competition challenges. Since each problem is quite complex, we divided it into a collection of small, self-contained pieces called tasks.

A task is a single step in a machine learning pipeline; it has its own learning objectives, a description, and a piece of code that needs to be implemented. Your job is to create a technical implementation that fills this gap. You use your engineering skills, extensive experimentation, and our feedback to make sure that your implementation meets a certain quality level. We know what the final score of a well-implemented pipeline should be, so as you solve tasks and re-implement parts of the pipeline, we check whether your implementation does the job well enough to keep the score high.
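To make this concrete, a task leaves a small, well-defined gap for you to fill. The sketch below is only illustrative: the names CONFIG and solution echo the issues further down this page, while the arguments and body are made up.

# illustrative only: a task asks you to implement a gap roughly shaped like this
CONFIG = {'learning_rate': 0.001}   # hyper-parameters you are free to tune

def solution(X, y):
    # your implementation goes here; it is plugged into the full pipeline and
    # the pipeline's final score decides whether the task passes
    ...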

Reproduce Kaggle winning solutions in a transparent way → learn advanced data science

Working on tasks that, taken together, form the solution to a problem lets you reproduce a Kaggle winning solution piece by piece. This is our hands-on approach to learning: you work on each part of the winning implementation yourself.

Available problems

  • Fashion MNIST: Get started with Minerva by solving an easy pipeline on the nice fashion-mnist dataset.
  • Whales: Reproduce the Right Whale Recognition Kaggle winning solution!

(more problems will be published in the future, so stay tuned)

Disclaimer

In this open source solution you will find references to neptune.ml. It is a platform, free for community users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution; you may run it as a plain Python script 😉.
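For example, the dry run used throughout the issues below can be launched as an ordinary script, without Neptune:

python main.py -- dry_run --problem fashion_mnist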

User support

You can seek support in two ways:

  1. Check the Minerva wiki for typical problems and questions.
  2. Create an issue with the label question in case the Minerva wiki does not have an answer to your question.

Contributing to Minerva

Check CONTRIBUTING for more information.

About the name

Minerva is the Roman goddess of wisdom, arts, and crafts, usually depicted in strong association with knowledge. Her sacred creature, the 'owl of Minerva', symbolizes wisdom and knowledge. We think this name fits our project very well, since it is all about acquiring knowledge and skills.

minerva-training-materials's People

Contributors

buus2, dependabot[bot], jakubczakon, kamil-kaczmarek, pknut, pziecina, rafajak, taraspiotr


minerva-training-materials's Issues

missing package h5py

For

neptune run -- 'dry_train --problem fashion_mnist'

I obtained

Traceback (most recent call last):
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 138, in <module>
    execute()
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 134, in execute
    execfile(job_filepath, job_globals)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/past/builtins/misc.py", line 82, in execfile
    exec_(code, myglobals, mylocals)
  File "main.py", line 72, in <module>
    action()
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 20, in dry_train
    dry_run(problem, dev_mode, cloud_mode, train_mode=True)
  File "main.py", line 44, in dry_run
    pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/fashion_mnist/problem_manager.py", line 22, in dry_run
    trainer.train()
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/fashion_mnist/trainer.py", line 23, in train
    'inference': False}})
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/backend/base.py", line 75, in fit_transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/backend/base.py", line 81, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/backend/base.py", line 93, in _cached_fit_transform
    self.transformer.save(self.cache_filepath_step_transformer)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/backend/models/keras/models.py", line 50, in save
    self.model.save(filepath)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/keras/engine/topology.py", line 2573, in save
    save_model(self, filepath, overwrite, include_optimizer)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/keras/models.py", line 60, in save_model
    raise ImportError('`save_model` requires h5py.')
ImportError: `save_model` requires h5py.

Everything worked fine after pip3 install h5py.

Please add h5py to requirements.
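A minimal fail-fast sketch (an assumption, not code from the repo) that would surface the missing dependency at startup instead of at model-saving time:

# hypothetical guard, e.g. near the top of main.py
try:
    import h5py  # noqa: F401  (keras' save_model needs it to write .h5 files)
except ImportError as error:
    raise ImportError("h5py is required to save Keras models; run pip3 install h5py") from error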

[whales, task 3 output] what is validation/test score?

The validation score (2.0059) is not equal to the validation loss (1.01667) or the validation accuracy (0.77713). Similarly, the test score is hard to interpret. How are these two scores calculated?

226894.311837 | 2018-04-15 01-09-37 minerva >>> epoch 250 current lr: 0.0003252930814335209
226894.312173 | 2018-04-15 01-09-37 minerva >>> epoch 249 loss: 0.03353
226894.312389 | 2018-04-15 01-09-37 minerva >>> epoch 249 accuracy: 0.99986
226981.769858 | 2018-04-15 01-11-05 minerva >>> epoch 249 validation loss: 1.01667
226981.770167 | 2018-04-15 01-11-05 minerva >>> epoch 249 validation accuracy: 0.77713
227067.955128 | 2018-04-15 01-12-31 minerva >>> training finished...
<...>
227715.884304 | Validation score is 2.0059
227715.884506 | Test score is 2.1295

227715.884696 | That is a solid validation
227715.884888 | Sorry, but this score is not high enough to pass the task
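For context, the original Right Whale Recognition competition was scored with multi-class log loss, and the scores above sit in that range. A minimal sketch of that metric, under the assumption (not confirmed by the repo) that this is what the evaluator computes:

import numpy as np

def multiclass_log_loss(y_true, y_proba, eps=1e-15):
    # y_true: integer class ids, y_proba: (n_samples, n_classes) predicted probabilities
    y_proba = np.clip(y_proba, eps, 1 - eps)
    return -np.mean(np.log(y_proba[np.arange(len(y_true)), y_true]))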

please add jupyter to requirements

Without the jupyter package, the submit mode doesn't work. Try, for example,

python main.py -- submit --problem fashion_mnist --task_nr 1

to check it.

error "TypeError: 'NoneType' object is not callable" after the experiment

For
python run_minerva.py -- dry_run --problem fashion_mnist
at the end I obtain

Test score is 0.9067
That is a solid validation
Congrats you solved the task!
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7f35e6e7cb38>>
Traceback (most recent call last):
  File "/home/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 595, in __del__
TypeError: 'NoneType' object is not callable
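This looks like the well-known TensorFlow teardown error raised from Session.__del__ during interpreter shutdown; the run itself finished successfully. A minimal workaround sketch (an assumption, not code from the repo):

import atexit
from keras import backend as K

# release the TensorFlow session explicitly so that Session.__del__ no longer
# runs during interpreter shutdown, which is what triggers the spurious TypeError
atexit.register(K.clear_session)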

[whales, task3] task submission ends after 249 epochs with "Sorry, your validation split is messed up. Fix it please."

Submitting the task (via 'neptune run') was unsuccessful and ended after 249 epochs with the error "Sorry, your validation split is messed up. Fix it please." The validation split isn't part of any task within the Whales problem - perhaps this error is leaking in from the Fashion-MNIST part.

(side note: the log below suggests that the numbering of the 'current lr' prints is shifted by one - epoch 250 vs. 249 in the 5th and 6th rows, respectively)

156834.540647 Connection lost. Retrying...
156834.540935 2018-04-06 08-00-15 minerva >>> epoch 249 batch 110 ...
156834.557396 Connection lost. Retrying...
156834.557587 2018-04-06 08-00-17 minerva >>> epoch 249 average batch time: 0:00:04.0
156834.557771 2018-04-06 08-00-17 minerva >>> epoch 250 current lr: 0.0003252930814335209
156834.557955 2018-04-06 08-00-17 minerva >>> epoch 249 loss: 0.03416
156834.558143 2018-04-06 08-00-17 minerva >>> epoch 249 accuracy: 0.99975
156834.840497 Connection lost. Retrying...
156834.840747 Connection restored!
156895.617163 2018-04-06 08-01-23 minerva >>> epoch 249 validation loss: 0.98257
156895.617506 2018-04-06 08-01-23 minerva >>> epoch 249 validation accuracy: 0.78689
156949.918186 2018-04-06 08-02-18 minerva >>> training finished...
157301.446679 2018-04-06 08-08-09 minerva >>> step classifier_network saving transformer...
157301.793026 2018-04-06 08-08-09 minerva >>> step classifier_network saving outputs...
157301.793355 2018-04-06 08-08-09 minerva >>> step classifier_calibrator adapting inputs
157302.072814 2018-04-06 08-08-10 minerva >>> step classifier_calibrator saving transformer...
157302.073103 2018-04-06 08-08-10 minerva >>> step classifier_calibrator saving outputs...
157302.073299 2018-04-06 08-08-10 minerva >>> step classifier_encoder adapting inputs
157302.073493 2018-04-06 08-08-10 minerva >>> step classifier_encoder loading...
157302.073687 2018-04-06 08-08-10 minerva >>> step classifier_encoder transforming...
157302.256781 2018-04-06 08-08-10 minerva >>> step classifier_output adapting inputs
157302.257108 2018-04-06 08-08-10 minerva >>> step classifier_output loading...
157302.257317 2018-04-06 08-08-10 minerva >>> step classifier_output transforming...
157302.744636 2018-04-06 08-08-10 minerva >>> step classifier_encoder adapting inputs
157302.744928 2018-04-06 08-08-10 minerva >>> step classifier_encoder loading...
157302.745115 2018-04-06 08-08-10 minerva >>> step classifier_encoder transforming...
157302.745298 2018-04-06 08-08-10 minerva >>> step classifier_loader adapting inputs
157302.745478 2018-04-06 08-08-10 minerva >>> step classifier_loader loading...
157302.745656 2018-04-06 08-08-10 minerva >>> step classifier_loader transforming...
157302.745832 2018-04-06 08-08-10 minerva >>> step classifier_network unpacking inputs
157302.746007 2018-04-06 08-08-10 minerva >>> step classifier_network loading...
157302.746183 2018-04-06 08-08-10 minerva >>> step classifier_network transforming...
157351.151419 2018-04-06 08-08-59 minerva >>> step classifier_calibrator adapting inputs
157351.151625 2018-04-06 08-08-59 minerva >>> step classifier_calibrator loading...
157351.15182 2018-04-06 08-08-59 minerva >>> step classifier_calibrator transforming...
157351.330583 2018-04-06 08-08-59 minerva >>> step classifier_encoder adapting inputs
157351.330964 2018-04-06 08-08-59 minerva >>> step classifier_encoder loading...
157351.33123 2018-04-06 08-08-59 minerva >>> step classifier_encoder transforming...
157351.331464 2018-04-06 08-08-59 minerva >>> step classifier_output adapting inputs
157351.331668 2018-04-06 08-08-59 minerva >>> step classifier_output loading...
157351.331846 2018-04-06 08-08-59 minerva >>> step classifier_output transforming...
157351.33203 2018-04-06 08-08-59 minerva >>> step classifier_encoder adapting inputs
157351.332214 2018-04-06 08-08-59 minerva >>> step classifier_encoder loading...
157351.332397 2018-04-06 08-08-59 minerva >>> step classifier_encoder transforming...
157351.332591 2018-04-06 08-08-59 minerva >>> step classifier_loader adapting inputs
157351.332774 2018-04-06 08-08-59 minerva >>> step classifier_loader loading...
157351.332958 2018-04-06 08-08-59 minerva >>> step classifier_loader transforming...
157351.333137 2018-04-06 08-08-59 minerva >>> step classifier_network unpacking inputs
157351.333316 2018-04-06 08-08-59 minerva >>> step classifier_network loading...
157351.53075 2018-04-06 08-08-59 minerva >>> step classifier_network transforming...
157411.719687 2018-04-06 08-09-59 minerva >>> step classifier_calibrator adapting inputs
157411.719895 2018-04-06 08-09-59 minerva >>> step classifier_calibrator loading...
157411.720097 2018-04-06 08-09-59 minerva >>> step classifier_calibrator transforming...
157411.720338 2018-04-06 08-09-59 minerva >>> step classifier_encoder adapting inputs
157411.720538 2018-04-06 08-09-59 minerva >>> step classifier_encoder loading...
157411.720747 2018-04-06 08-09-59 minerva >>> step classifier_encoder transforming...
157411.720945 2018-04-06 08-09-59 minerva >>> step classifier_output adapting inputs
157411.721135 2018-04-06 08-09-59 minerva >>> step classifier_output loading...
157411.721333 2018-04-06 08-09-59 minerva >>> step classifier_output transforming...
157411.928652  
157411.92899 Validation score is 1.9927
157411.929189 Test score is 2.2540
157411.929416 Sorry, your validation split is messed up. Fix it please.

solution dir paths don't work (make local dirs and public dirs the same)

Fashion mnist

These work:

  • in neptune.yaml: solution_dir: resources/fashion_mnist/solution
  • command
    python main.py -- dry_run --problem fashion_mnist --train_mode False
    or
    neptune run -- dry_run --problem fashion_mnist --train_mode False

This doesn't work:

  • in neptune.yaml: solution_dir: /public/minerva/resources/fashion_mnist/solution
  • command
neptune send \
--environment keras-2.0-gpu-py3 \
--worker gcp-gpu-medium \
-- dry_run --problem fashion_mnist --train_mode False

Please make the resources in the GitHub repo and in Neptune's /public storage exactly the same.

Whales

This doesn't work:

  • in neptune.yaml:
data_dir: resources/whales/data
solution_dir: resources/whales/solution
  • command
    python main.py -- dry_run --problem whales --train_mode False

Error: ValueError: Specified solution_dir is missing 'transformers' directory. Use dry_run with train_mode=True or specify the path to trained pipeline.

It seems that the automatic sub-problem inference doesn't reach this point.

Avoid empty solution functions unless necessary

In some tasks there is nothing to do in the solution function, yet the empty function is still there. I'd suggest simply removing this function (or, analogously, CONFIG) unless it is actually needed.
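For illustration, the kind of stub this refers to (the names follow the issue, the body is hypothetical):

# a task with nothing to change still ships a stub like this one; the suggestion
# above is to drop such stubs (and the empty CONFIG) from the notebook entirely
CONFIG = {}

def solution(output):
    return output  # pass-through, nothing to implement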

Changing run_minerva.py and config.yaml to default Neptune names

I suggest considering the following renames:

  • run_minerva.py -> main.py,
  • config.yaml -> neptune.yaml.

This way you can shorten neptune commands, e.g. from
neptune run run_minerva.py --config config.yaml -- dry_run --problem fashion_mnist

to

neptune run -- dry_run --problem fashion_mnist

problem with psutil during the installation of requirements

OS: Ubuntu 16.04.
Command: pip3 install -r minerva/requirements.txt
Result (no-error lines before the psutil error omitted):

  Running setup.py bdist_wheel for psutil ... error
  Complete output from command /home/patryk/Documents/edukacyjne/Minerva/1217/minerva/bin/python3.5 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-j2yhq163/psutil/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpt1r_vi31pip-wheel- --python-tag cp35:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.5
  creating build/lib.linux-x86_64-3.5/psutil
  copying psutil/_pssunos.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_common.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_pslinux.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/__init__.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_psposix.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_compat.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_psosx.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_pswindows.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_psbsd.py -> build/lib.linux-x86_64-3.5/psutil
  creating build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_windows.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_memory_leaks.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/runner.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/__init__.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_linux.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_testutils.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_system.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_sunos.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_bsd.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_posix.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_process.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_osx.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_misc.py -> build/lib.linux-x86_64-3.5/psutil/tests
  running build_ext
  building 'psutil._psutil_linux' extension
  creating build/temp.linux-x86_64-3.5
  creating build/temp.linux-x86_64-3.5/psutil
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DPSUTIL_VERSION=430 -I/usr/include/python3.5m -I/home/patryk/Documents/edukacyjne/Minerva/1217/minerva/include/python3.5m -c psutil/_psutil_linux.c -o build/temp.linux-x86_64-3.5/psutil/_psutil_linux.o
  psutil/_psutil_linux.c:12:20: fatal error: Python.h: No such file or directory
  compilation terminated.
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
  
  ----------------------------------------
  Failed building wheel for psutil
  Running setup.py clean for psutil
Failed to build psutil
Installing collected packages: psutil, pathlib2, neptune-cli, opencv-python, pandas, pydot, pydot-ng, scikit-learn, torchvision
  Running setup.py install for psutil ... error
    Complete output from command /home/patryk/Documents/edukacyjne/Minerva/1217/minerva/bin/python3.5 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-j2yhq163/psutil/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-tclqna_o-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/patryk/Documents/edukacyjne/Minerva/1217/minerva/include/site/python3.5/psutil:
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.5
    creating build/lib.linux-x86_64-3.5/psutil
    copying psutil/_pssunos.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_common.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_pslinux.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/__init__.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_psposix.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_compat.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_psosx.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_pswindows.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_psbsd.py -> build/lib.linux-x86_64-3.5/psutil
    creating build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_windows.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_memory_leaks.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/runner.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/__init__.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_linux.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_testutils.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_system.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_sunos.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_bsd.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_posix.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_process.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_osx.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_misc.py -> build/lib.linux-x86_64-3.5/psutil/tests
    running build_ext
    building 'psutil._psutil_linux' extension
    creating build/temp.linux-x86_64-3.5
    creating build/temp.linux-x86_64-3.5/psutil
    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DPSUTIL_VERSION=430 -I/usr/include/python3.5m -I/home/patryk/Documents/edukacyjne/Minerva/1217/minerva/include/python3.5m -c psutil/_psutil_linux.c -o build/temp.linux-x86_64-3.5/psutil/_psutil_linux.o
    psutil/_psutil_linux.c:12:20: fatal error: Python.h: No such file or directory
    compilation terminated.
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
    
    ----------------------------------------
Command "/home/patryk/Documents/edukacyjne/Minerva/1217/minerva/bin/python3.5 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-j2yhq163/psutil/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-tclqna_o-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/patryk/Documents/edukacyjne/Minerva/1217/minerva/include/site/python3.5/psutil" failed with error code 1 in /tmp/pip-build-j2yhq163/psutil/

Standard solutions from Stack Overflow didn't help.

--train_mode False raises an error

python run_minerva.py -- dry_run --problem fashion_mnist

works whereas

python run_minerva.py -- dry_run --problem fashion_mnist --train_mode False

raises an error:

~/Documents/edukacyjne/Minerva/0401/minerva$ python run_minerva.py -- dry_run --problem fashion_mnist --train_mode False
2018-01-11 13-08-12 minerva >>> starting experiment...
Using TensorFlow backend.
2018-01-11 13-08-14 minerva >>> running: None
neptune: Executing in Offline Mode.
2018-01-11 13-08-14 minerva >>> Saving graph to path/to/your/solution/class_predictions_graph.json
2018-01-11 13-08-14 minerva >>> step input unpacking inputs
2018-01-11 13-08-14 minerva >>> step input loading...
2018-01-11 13-08-14 minerva >>> step input transforming...
2018-01-11 13-08-14 minerva >>> step keras_model unpacking inputs
Epoch 1/200
2018-01-11 13:08:15.268968: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-11 13:08:15.269056: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-11 13:08:15.269094: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-11 13:08:15.269123: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-11 13:08:15.269150: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
46/47 [============================>.] - ETA: 1s - loss: 0.4396 - acc: 0.9772/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/callbacks.py:494: RuntimeWarning: Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: acc,loss
  (self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
Traceback (most recent call last):
  File "run_minerva.py", line 46, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "run_minerva.py", line 27, in dry_run
    pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 16, in dry_run
    _evaluate(trainer)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 39, in _evaluate
    score_valid, score_test = trainer.evaluate()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/trainer.py", line 22, in evaluate
    score_valid = self._evaluate(X_valid, y_valid)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/trainer.py", line 29, in _evaluate
    'inference': True}})
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 102, in transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 74, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 84, in _cached_fit_transform
    step_output_data = self.transformer.fit_transform(**step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 206, in fit_transform
    self.fit(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/keras/models_keras.py", line 28, in fit
    **self.training_config)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/engine/training.py", line 2187, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/callbacks.py", line 73, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/keras/callbacks_keras.py", line 19, in on_epoch_end
    self.ctx.channel_send('Log-loss validation', self.epoch_id, logs['val_loss'])
KeyError: 'val_loss'
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit
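The failure happens because no validation data is produced in this mode, so Keras never adds val_loss to logs and the Neptune callback crashes on the missing key. A defensive sketch of that callback, reduced to the call seen in the traceback (callbacks_keras.py line 19); the class name and attributes are assumptions:

class NeptuneMonitor:
    def __init__(self, ctx):
        self.ctx = ctx
        self.epoch_id = 0

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        val_loss = logs.get('val_loss')      # absent when no validation data was passed
        if val_loss is not None:
            self.ctx.channel_send('Log-loss validation', self.epoch_id, val_loss)
        self.epoch_id += 1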

_prep_cache issue

For

neptune send --environment pytorch-0.2.0-gpu-py3 --worker gcp-gpu-medium -- dry_eval --problem whales

I obtained

114.465695 | Traceback (most recent call last):
-- | --
114.466053 | File "/usr/local/lib/python3.6/dist-packages/deepsense/neptune/job_wrapper.py", line 138, in <module>
114.466311 | execute()
114.466593 | File "/usr/local/lib/python3.6/dist-packages/deepsense/neptune/job_wrapper.py", line 134, in execute
114.466841 | execfile(job_filepath, job_globals)
114.467081 | File "/usr/local/lib/python3.6/dist-packages/past/builtins/misc.py", line 82, in execfile
114.467322 | exec_(code, myglobals, mylocals)
114.467577 | File "main.py", line 72, in <module>
114.467857 | action()
114.468074 | File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 722, in __call__
114.468385 | return self.main(*args, **kwargs)
114.468668 | File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 697, in main
114.46896 | rv = self.invoke(ctx)
114.46924 | File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
114.469555 | return _process_result(sub_ctx.command.invoke(sub_ctx))
114.469857 | File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 895, in invoke
114.470132 | return ctx.invoke(self.callback, **ctx.params)
114.470374 | File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 535, in invoke
114.47058 | return callback(*args, **kwargs)
114.470817 | File "main.py", line 28, in dry_eval
114.471031 | dry_run(problem, dev_mode, cloud_mode, train_mode=False)
114.471244 | File "main.py", line 40, in dry_run
114.471457 | pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
114.47169 | File "/neptune/minerva/whales/problem_manager.py", line 26, in dry_run
114.471911 | handle_empty_solution_dir(train_mode, config, pipeline)
114.472125 | File "/neptune/minerva/utils.py", line 54, in handle_empty_solution_dir
114.472339 | transformers_in_pipeline = set(pipeline(config).all_steps.keys())
114.472554 | File "/neptune/minerva/whales/pipelines.py", line 44, in alignment_pipeline
114.47277 | cache_dirpath=config['global']['cache_dirpath'])
114.472991 | File "/neptune/minerva/backend/base.py", line 183, in __init__
114.473226 | super().__init__(*args, **kwargs)
114.473485 | File "/neptune/minerva/backend/base.py", line 24, in __init__
114.473748 | self._prep_cache(cache_dirpath, save_outputs)
114.474027 | File "/neptune/minerva/backend/base.py", line 33, in _prep_cache
114.474336 | os.makedirs(os.path.join(cache_dirpath, dirname), exist_ok=True)
114.474574 | File "/usr/lib/python3.6/os.py", line 220, in makedirs
114.47482 | mkdir(name, mode)
114.475053 | OSError: [Errno 30] Read-only file system: '/public/minerva/resources/whales/solution/alignment/outputs'

I chose:

data_dir: /public/whales
solution_dir: /public/minerva/resources/whales/solution
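The traceback shows _prep_cache creating its output directories under cache_dirpath, which here resolves to the read-only /public mount. A minimal workaround sketch (the config key names follow the traceback, the writable path is only a placeholder):

import os

def redirect_cache(config, writable_dir='/output/minerva_cache'):
    # point the pipeline cache at a writable location instead of /public
    os.makedirs(writable_dir, exist_ok=True)
    config['global']['cache_dirpath'] = writable_dir
    return config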

[whales, task 7] AttributeError("Can't pickle local object 'solution.<locals>.DatasetLocalizer'",))

neptune run main.py -- submit --problem whales --task_nr 7

The above results in the error below.

-- | --
12.114279 | [NbConvertApp] Writing 4898 bytes to /mnt/ml-team/homes/usr/minerva/resources/whales/tasks/task7.py
12.612039 | /mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
12.612379 | from ._conv import register_converters as _register_converters
13.203399 | Using TensorFlow backend.
25.387684 | Traceback (most recent call last):
25.387881 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 103, in execute
25.388077 | execfile(job_filepath, job_globals)
25.388271 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/past/builtins/misc.py", line 82, in execfile
25.388466 | exec_(code, myglobals, mylocals)
25.388659 | File "main.py", line 66, in <module>
25.388854 | action()
25.389049 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
25.389249 | return self.main(*args, **kwargs)
25.389443 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
25.389637 | rv = self.invoke(ctx)
25.38983 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
25.390016 | return _process_result(sub_ctx.command.invoke(sub_ctx))
25.390203 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
25.39039 | return ctx.invoke(self.callback, **ctx.params)
25.390576 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
25.390775 | return callback(*args, **kwargs)
25.390969 | File "main.py", line 61, in submit
25.391163 | pm.submit_task(task_sub_problem, task_nr, file_path, dev_mode)
25.391356 | File "/mnt/ml-team/homes/usr/minerva/minerva/whales/problem_manager.py", line 44, in submit_task
25.391565 | new_trainer.train()
25.39176 | File "/mnt/ml-team/homes/usr/minerva/minerva/whales/trainer.py", line 48, in train
25.391954 | 'train_mode': True,
25.392147 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 75, in fit_transform
25.392339 | step_inputs[input_step.name] = input_step.fit_transform(data)
25.392535 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 75, in fit_transform
25.392728 | step_inputs[input_step.name] = input_step.fit_transform(data)
25.392918 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 81, in fit_transform
25.393109 | step_output_data = self._cached_fit_transform(step_inputs)
25.393302 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 91, in _cached_fit_transform
25.393489 | step_output_data = self.transformer.fit_transform(**step_inputs)
25.393675 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 218, in fit_transform
25.39386 | self.fit(*args, **kwargs)
25.394049 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/models/pytorch/models.py", line 55, in fit
25.394249 | for batch_id, data in enumerate(batch_gen):
25.394443 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 417, in __iter__
25.394635 | return DataLoaderIter(self)
25.394829 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 234, in __init__
25.395021 | w.start()
25.395211 | File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
25.395401 | self._popen = self._Popen(self)
25.395623 | File "/usr/lib/python3.5/multiprocessing/context.py", line 212, in _Popen
25.395817 | return _default_context.get_context().Process._Popen(process_obj)
25.396009 | File "/usr/lib/python3.5/multiprocessing/context.py", line 274, in _Popen
25.396201 | return Popen(process_obj)
25.396396 | File "/usr/lib/python3.5/multiprocessing/popen_spawn_posix.py", line 33, in __init__
25.396587 | super().__init__(process_obj)
25.396777 | File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
25.396966 | self._launch(process_obj)
25.397153 | File "/usr/lib/python3.5/multiprocessing/popen_spawn_posix.py", line 48, in _launch
25.39734 | reduction.dump(process_obj, fp)
25.397531 | File "/usr/lib/python3.5/multiprocessing/reduction.py", line 59, in dump
25.397722 | ForkingPickler(file, protocol).dump(obj)
25.397914 | AttributeError: Can't pickle local object 'solution.<locals>.DatasetLocalizer'
25.398359 |  
25.398564 | During handling of the above exception, another exception occurred:
25.398762 |  
25.398955 | Traceback (most recent call last):
25.399148 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 109, in <module>
25.399342 | execute()
25.413426 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 105, in execute
25.413676 | raise ExperimentExecutionException("Exception during experiment execution", ex)
25.413876 | deepsense.neptune.exceptions.ExperimentExecutionException: ('Exception during experiment execution', AttributeError("Can't pickle local object 'solution.<locals>.DatasetLocalizer'",))


Is --subproblem parameter really needed?

Each step in whales always belongs to exactly one sub-problem. So why should the user care about the sub-problem by typing --sub_problem sth on the command line? I think the sub-problem could be inferred from the task automatically; optionally, the user could still have the possibility (not an obligation) to choose one in dry run mode.
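A minimal sketch of the proposed inference (the task-to-sub-problem assignment below is an assumption; only the shape of the mapping matters):

# hypothetical mapping; the real assignment of task numbers to sub-problems
# lives in the whales problem definition
TASK_TO_SUB_PROBLEM = {3: 'classification', 7: 'localization'}

def infer_sub_problem(task_nr, sub_problem=None):
    # keep --sub_problem as an optional override, otherwise derive it from the task
    return sub_problem or TASK_TO_SUB_PROBLEM[task_nr]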

dry_train and dry_eval?

I propose to replace:

  • dry_run with train_mode=False by dry_eval,
  • dry_run with train_mode=True by dry_train.

This way, the user could consciously choose whether:

  • to only evaluate the results of the in-house solution, to quickly check that everything (including paths to solutions and data) is set up correctly,
  • to spend a lot of time re-training the in-house solution and save it to a path of their choice.

Also, after these changes the readmes should become easier to understand.

Alternatively, I propose replacing train_mode with only_eval/eval_only.
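A sketch of the proposal as thin click wrappers around the existing dry_run helper (the group name and option set are assumptions based on main.py in the tracebacks above):

import click

def dry_run(problem, dev_mode, cloud_mode, train_mode):
    ...  # existing helper in main.py (signature taken from the tracebacks)

@click.group()
def action():
    pass

@action.command()
@click.option('--problem', required=True)
def dry_train(problem):
    dry_run(problem, dev_mode=False, cloud_mode=False, train_mode=True)

@action.command()
@click.option('--problem', required=True)
def dry_eval(problem):
    dry_run(problem, dev_mode=False, cloud_mode=False, train_mode=False)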

Better distinguishing the result

In the current version, the information about whether I did or didn't pass the task is just a line in stdout. However, this statement is so important that it would be nice to make it stand out in a more effective way.
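A purely illustrative sketch of a more visible verdict, reusing the existing messages:

def print_verdict(passed):
    # make the pass/fail decision stand out from the rest of the stdout stream
    banner = '=' * 60
    verdict = ('Congrats you solved the task!' if passed
               else 'Sorry, but this score is not high enough to pass the task')
    print('\n'.join(['', banner, verdict, banner, '']))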

`--train_mode False` in whales doesn't work

For

python run_minerva.py -- dry_run --problem whales --sub_problem localization --train_mode False

I obtain

2018-01-18 13-34-59 minerva >>> starting experiment...
2018-01-18 13-35-01 minerva >>> running: localization
neptune: Executing in Offline Mode.
2018-01-18 13-35-01 minerva >>> step localizer_loader unpacking inputs
2018-01-18 13-35-01 minerva >>> step localizer_loader loading...
2018-01-18 13-35-01 minerva >>> step localizer_loader transforming...
2018-01-18 13-35-01 minerva >>> step localizer_network unpacking inputs
2018-01-18 13-35-01 minerva >>> initializing model weights...
2018-01-18 13-35-01 minerva >>> starting training...
2018-01-18 13-35-01 minerva >>> initial lr: 0.0005
2018-01-18 13-35-01 minerva >>> epoch 0 ...
2018-01-18 13-35-18 minerva >>> epoch 0 batch 0 ...
/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/torch/nn/modules/container.py:67: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  input = module(input)
2018-01-18 13-35-26 minerva >>> epoch 0 batch 0 loss:     4.85479
2018-01-18 13-35-26 minerva >>> epoch 0 batch 0 accuracy: 0.00000
2018-01-18 13-35-26 minerva >>> epoch 0 average batch time: 0:00:08.0
(...) [ANALOGOUS STUFF FOR BATCHES 1-14]
/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/torch/nn/modules/container.py:67: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  input = module(input)
2018-01-18 13-36-04 minerva >>> epoch 0 batch 15 loss:     4.70346
2018-01-18 13-36-04 minerva >>> epoch 0 batch 15 accuracy: 0.00000
2018-01-18 13-36-04 minerva >>> epoch 0 model saved to output/path_to_your_solution/checkpoints/localizer_network/model_epoch0.torch
2018-01-18 13-36-04 minerva >>> epoch 1 current lr: 0.0005
2018-01-18 13-36-04 minerva >>> epoch 0 loss:     4.78818
2018-01-18 13-36-04 minerva >>> epoch 0 accuracy: 0.02295
Traceback (most recent call last):
  File "run_minerva.py", line 46, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "run_minerva.py", line 27, in dry_run
    pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/whales/problem_manager.py", line 25, in dry_run
    _evaluate(trainer, sub_problem)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/whales/problem_manager.py", line 49, in _evaluate
    score_valid, score_test = trainer.evaluate()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/trainer.py", line 22, in evaluate
    score_valid = self._evaluate(X_valid, y_valid)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/whales/trainer.py", line 68, in _evaluate
    'train_mode': False,
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 102, in transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 68, in fit_transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 74, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 84, in _cached_fit_transform
    step_output_data = self.transformer.fit_transform(**step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 206, in fit_transform
    self.fit(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/pytorch/models.py", line 61, in fit
    self.callbacks.on_epoch_end()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/pytorch/callbacks.py", line 86, in on_epoch_end
    callback.on_epoch_end(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/pytorch/callbacks.py", line 154, in on_epoch_end
    val_loss, val_acc = score_model_multi_output(self.model, self.loss_function, self.validation_datagen)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/pytorch/validation.py", line 50, in score_model_multi_output
    for batch_id, data in enumerate(batch_gen):
TypeError: 'NoneType' object is not iterable
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit

The same error arises when I use Neptune cloud. Everything works with default --train_mode True.

what's the difference between dry_train and dry_eval?

For the fashion_mnist problem, if solution_dir already contains the trained model, then training in dry_train mode takes only one epoch. Thus, dry_train and dry_eval do the same thing and there is no need to distinguish them.

Changing 'problems' to 'tasks'

I think it's worth renaming the relevant folders from problems to tasks, since they contain tasks, not problems.

[whales, dry_eval] "Sorry, but this score is not high enough to pass the task" in classification

neptune run -- dry_eval --problem whales

The above fails after evaluating the classification algorithm (i.e. it doesn't pass the task) and then returns to the command prompt.

2018-04-03 10-22-58 minerva >>> running: classification
2018-04-03 10-23-00 minerva >>> step classifier_encoder adapting inputs
2018-04-03 10-23-00 minerva >>> step classifier_encoder loading...
2018-04-03 10-23-00 minerva >>> step classifier_encoder transforming...
2018-04-03 10-23-00 minerva >>> step classifier_loader adapting inputs
2018-04-03 10-23-00 minerva >>> step classifier_loader loading...
2018-04-03 10-23-00 minerva >>> step classifier_loader transforming...
2018-04-03 10-23-00 minerva >>> step classifier_network unpacking inputs
2018-04-03 10-23-00 minerva >>> step classifier_network loading...
2018-04-03 10-23-01 minerva >>> step classifier_network transforming...
100%|██████████| 16/16 [00:35<00:00,  2.24s/it]
2018-04-03 10-23-37 minerva >>> step classifier_calibrator adapting inputs
2018-04-03 10-23-37 minerva >>> step classifier_calibrator loading...
2018-04-03 10-23-37 minerva >>> step classifier_calibrator transforming...
2018-04-03 10-23-37 minerva >>> step classifier_encoder adapting inputs
2018-04-03 10-23-37 minerva >>> step classifier_encoder loading...
2018-04-03 10-23-37 minerva >>> step classifier_encoder transforming...
2018-04-03 10-23-37 minerva >>> step classifier_output adapting inputs
2018-04-03 10-23-37 minerva >>> step classifier_output loading...
2018-04-03 10-23-37 minerva >>> step classifier_output transforming...
2018-04-03 10-23-37 minerva >>> step classifier_encoder adapting inputs
2018-04-03 10-23-37 minerva >>> step classifier_encoder loading...
2018-04-03 10-23-37 minerva >>> step classifier_encoder transforming...
2018-04-03 10-23-37 minerva >>> step classifier_loader adapting inputs
2018-04-03 10-23-37 minerva >>> step classifier_loader loading...
2018-04-03 10-23-37 minerva >>> step classifier_loader transforming...
2018-04-03 10-23-37 minerva >>> step classifier_network unpacking inputs
2018-04-03 10-23-37 minerva >>> step classifier_network loading...
2018-04-03 10-23-37 minerva >>> step classifier_network transforming...
100%|██████████| 15/15 [00:26<00:00,  1.74s/it]
2018-04-03 10-24-03 minerva >>> step classifier_calibrator adapting inputs
2018-04-03 10-24-03 minerva >>> step classifier_calibrator loading...
2018-04-03 10-24-03 minerva >>> step classifier_calibrator transforming...
2018-04-03 10-24-03 minerva >>> step classifier_encoder adapting inputs
2018-04-03 10-24-03 minerva >>> step classifier_encoder loading...
2018-04-03 10-24-03 minerva >>> step classifier_encoder transforming...
2018-04-03 10-24-03 minerva >>> step classifier_output adapting inputs
2018-04-03 10-24-03 minerva >>> step classifier_output loading...
2018-04-03 10-24-03 minerva >>> step classifier_output transforming...

Validation score is 2.1188
Test score is 2.0907
That is a solid validation
Sorry, but this score is not high enough to pass the task
Calculated experiment snapshot size: 9.08 MB

1.408208 | 2018-04-03 10-31-59 minerva >>> starting experiment...
-- | --
3.79994 | 2018-04-03 10-32-01 minerva >>> running: alignment
5.394486 | 2018-04-03 10-32-03 minerva >>> step aligner_encoder adapting inputs
5.394776 | 2018-04-03 10-32-03 minerva >>> step aligner_encoder loading...
5.394961 | 2018-04-03 10-32-03 minerva >>> step aligner_encoder transforming...
5.395138 | 2018-04-03 10-32-03 minerva >>> step aligner_loader adapting inputs
5.395315 | 2018-04-03 10-32-03 minerva >>> step aligner_loader loading...
5.39549 | 2018-04-03 10-32-03 minerva >>> step aligner_loader transforming...
5.395664 | 2018-04-03 10-32-03 minerva >>> step aligner_network unpacking inputs
5.395837 | 2018-04-03 10-32-03 minerva >>> step aligner_network loading...
7.896086 | 2018-04-03 10-32-05 minerva >>> step aligner_network transforming...
46.352937 | 2018-04-03 10-32-44 minerva >>> step aligner_unbinner unpacking inputs
46.353462 | 2018-04-03 10-32-44 minerva >>> step aligner_unbinner loading...
46.35368 | 2018-04-03 10-32-44 minerva >>> step aligner_unbinner transforming...
46.353893 | 2018-04-03 10-32-44 minerva >>> step aligner_adjuster adapting inputs
46.575793 | 2018-04-03 10-32-44 minerva >>> step aligner_adjuster loading...
46.576187 | 2018-04-03 10-32-44 minerva >>> step aligner_adjuster transforming...
46.576443 | 2018-04-03 10-32-44 minerva >>> step aligner_output adapting inputs
46.576672 | 2018-04-03 10-32-44 minerva >>> step aligner_output loading...
46.576923 | 2018-04-03 10-32-44 minerva >>> step aligner_output transforming...
46.577152 | 2018-04-03 10-32-44 minerva >>> step aligner_encoder adapting inputs
46.57738 | 2018-04-03 10-32-44 minerva >>> step aligner_encoder loading...
46.57762 | 2018-04-03 10-32-44 minerva >>> step aligner_encoder transforming...
46.577851 | 2018-04-03 10-32-44 minerva >>> step aligner_loader adapting inputs
46.578076 | 2018-04-03 10-32-44 minerva >>> step aligner_loader loading...
46.578317 | 2018-04-03 10-32-44 minerva >>> step aligner_loader transforming...
46.57855 | 2018-04-03 10-32-44 minerva >>> step aligner_network unpacking inputs
46.578779 | 2018-04-03 10-32-44 minerva >>> step aligner_network loading...
46.579016 | 2018-04-03 10-32-44 minerva >>> step aligner_network transforming...
74.03968 | 2018-04-03 10-33-11 minerva >>> step aligner_unbinner unpacking inputs
74.03987 | 2018-04-03 10-33-11 minerva >>> step aligner_unbinner loading...
74.040058 | 2018-04-03 10-33-11 minerva >>> step aligner_unbinner transforming...
74.040246 | 2018-04-03 10-33-11 minerva >>> step aligner_adjuster adapting inputs
74.040454 | 2018-04-03 10-33-11 minerva >>> step aligner_adjuster loading...
74.040641 | 2018-04-03 10-33-11 minerva >>> step aligner_adjuster transforming...
74.040828 | 2018-04-03 10-33-11 minerva >>> step aligner_output adapting inputs
74.041018 | 2018-04-03 10-33-11 minerva >>> step aligner_output loading...
74.041203 | 2018-04-03 10-33-11 minerva >>> step aligner_output transforming...
74.041387 |  
74.041575 | Validation score is 62.5229
74.04176 | Test score is 64.9918
74.041945 | That is a solid validation
74.042129 | Congrats you solved the task!
74.042313 | 2018-04-03 10-33-11 minerva >>> running: localization
75.068115 | 2018-04-03 10-33-12 minerva >>> step localizer_loader unpacking inputs
75.068325 | 2018-04-03 10-33-12 minerva >>> step localizer_loader loading...
75.068492 | 2018-04-03 10-33-12 minerva >>> step localizer_loader transforming...
75.068652 | 2018-04-03 10-33-12 minerva >>> step localizer_network unpacking inputs
75.068768 | 2018-04-03 10-33-12 minerva >>> step localizer_network loading...
75.068882 | 2018-04-03 10-33-12 minerva >>> step localizer_network transforming...
101.504964 | 2018-04-03 10-33-39 minerva >>> step localizer_unbinner unpacking inputs
101.505176 | 2018-04-03 10-33-39 minerva >>> step localizer_unbinner loading...
101.505376 | 2018-04-03 10-33-39 minerva >>> step localizer_unbinner transforming...
101.505558 | 2018-04-03 10-33-39 minerva >>> step localizer_output adapting inputs
101.505748 | 2018-04-03 10-33-39 minerva >>> step localizer_output loading...
101.505929 | 2018-04-03 10-33-39 minerva >>> step localizer_output transforming...
101.506116 | 2018-04-03 10-33-39 minerva >>> step localizer_loader unpacking inputs
101.506312 | 2018-04-03 10-33-39 minerva >>> step localizer_loader loading...
101.506507 | 2018-04-03 10-33-39 minerva >>> step localizer_loader transforming...
101.506699 | 2018-04-03 10-33-39 minerva >>> step localizer_network unpacking inputs
101.506881 | 2018-04-03 10-33-39 minerva >>> step localizer_network loading...
101.50707 | 2018-04-03 10-33-39 minerva >>> step localizer_network transforming...
128.154553 | 2018-04-03 10-34-06 minerva >>> step localizer_unbinner unpacking inputs
128.154753 | 2018-04-03 10-34-06 minerva >>> step localizer_unbinner loading...
128.154942 | 2018-04-03 10-34-06 minerva >>> step localizer_unbinner transforming...
128.155128 | 2018-04-03 10-34-06 minerva >>> step localizer_output adapting inputs
128.15532 | 2018-04-03 10-34-06 minerva >>> step localizer_output loading...
128.155509 | 2018-04-03 10-34-06 minerva >>> step localizer_output transforming...
128.155701 |  
128.155905 | Validation score is 108.2913
128.156099 | Test score is 94.9431
128.156303 | That is a solid validation
128.156496 | Congrats you solved the task!
128.156686 | 2018-04-03 10-34-06 minerva >>> running: classification
129.684731 | 2018-04-03 10-34-07 minerva >>> step classifier_encoder adapting inputs
129.684952 | 2018-04-03 10-34-07 minerva >>> step classifier_encoder loading...
129.685101 | 2018-04-03 10-34-07 minerva >>> step classifier_encoder transforming...
129.685222 | 2018-04-03 10-34-07 minerva >>> step classifier_loader adapting inputs
129.685368 | 2018-04-03 10-34-07 minerva >>> step classifier_loader loading...
129.685489 | 2018-04-03 10-34-07 minerva >>> step classifier_loader transforming...
129.685635 | 2018-04-03 10-34-07 minerva >>> step classifier_network unpacking inputs
129.68579 | 2018-04-03 10-34-07 minerva >>> step classifier_network loading...
129.685929 | 2018-04-03 10-34-07 minerva >>> step classifier_network transforming...
158.137679 | 2018-04-03 10-34-36 minerva >>> step classifier_calibrator adapting inputs
158.138025 | 2018-04-03 10-34-36 minerva >>> step classifier_calibrator loading...
158.138231 | 2018-04-03 10-34-36 minerva >>> step classifier_calibrator transforming...
158.138611 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder adapting inputs
158.138809 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder loading...
158.138997 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder transforming...
158.139212 | 2018-04-03 10-34-36 minerva >>> step classifier_output adapting inputs
158.139738 | 2018-04-03 10-34-36 minerva >>> step classifier_output loading...
158.139933 | 2018-04-03 10-34-36 minerva >>> step classifier_output transforming...
158.140115 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder adapting inputs
158.140311 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder loading...
158.140506 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder transforming...
158.140697 | 2018-04-03 10-34-36 minerva >>> step classifier_loader adapting inputs
158.14089 | 2018-04-03 10-34-36 minerva >>> step classifier_loader loading...
158.141069 | 2018-04-03 10-34-36 minerva >>> step classifier_loader transforming...
158.141254 | 2018-04-03 10-34-36 minerva >>> step classifier_network unpacking inputs
158.141435 | 2018-04-03 10-34-36 minerva >>> step classifier_network loading...
158.347917 | 2018-04-03 10-34-36 minerva >>> step classifier_network transforming...
196.301351 | 2018-04-03 10-35-14 minerva >>> step classifier_calibrator adapting inputs
196.301537 | 2018-04-03 10-35-14 minerva >>> step classifier_calibrator loading...
196.301715 | 2018-04-03 10-35-14 minerva >>> step classifier_calibrator transforming...
196.301893 | 2018-04-03 10-35-14 minerva >>> step classifier_encoder adapting inputs
196.302067 | 2018-04-03 10-35-14 minerva >>> step classifier_encoder loading...
196.302249 | 2018-04-03 10-35-14 minerva >>> step classifier_encoder transforming...
196.302466 | 2018-04-03 10-35-14 minerva >>> step classifier_output adapting inputs
196.302687 | 2018-04-03 10-35-14 minerva >>> step classifier_output loading...
196.302904 | 2018-04-03 10-35-14 minerva >>> step classifier_output transforming...
196.505624 |  
196.505941 | Validation score is 2.1188
196.506135 | Test score is 2.0907
196.506323 | That is a solid validation
196.506537 | Sorry, but this score is not high enough to pass the task



neptune send: pip failed to install the requirements

For

neptune send -- dry_eval --problem fashion_mnist

I obtain

5.019441 | [pip]    Could not find a version that satisfies the requirement  ipython==6.2.1 (from -r /tmp/tmpryjdlc (line 21)) (from versions: 0.10,  0.10.1, 0.10.2, 0.11, 0.12, 0.12.1, 0.13, 0.13.1, 0.13.2, 1.0.0, 1.1.0,  1.2.0, 1.2.1, 2.0.0, 2.1.0, 2.2.0, 2.3.0, 2.3.1, 2.4.0, 2.4.1, 3.0.0,  3.1.0, 3.2.0, 3.2.1, 3.2.2, 3.2.3, 4.0.0b1, 4.0.0, 4.0.1, 4.0.2, 4.0.3,  4.1.0rc1, 4.1.0rc2, 4.1.0, 4.1.1, 4.1.2, 4.2.0, 4.2.1, 5.0.0b1, 5.0.0b2,  5.0.0b3, 5.0.0b4, 5.0.0rc1, 5.0.0, 5.1.0, 5.2.0, 5.2.1, 5.2.2, 5.3.0,  5.4.0, 5.4.1, 5.5.0)
5.019586 | [pip] No matching distribution found for ipython==6.2.1 (from -r /tmp/tmpryjdlc (line 21))
5.019714 | Traceback (most recent call last):
5.019839 | File "/usr/local/lib/python2.7/dist-packages/deepsense/neptune/job_wrapper.py", line 138, in <module>
5.020117 | execute()
5.020519 | File "/usr/local/lib/python2.7/dist-packages/deepsense/neptune/job_wrapper.py", line 119, in execute
5.020902 | install_requirements()
5.021025 | File "/usr/local/lib/python2.7/dist-packages/deepsense/neptune/job_wrapper.py", line 51, in install_requirements
5.021172 | install_pip_requirements(os.environ['PIP_REQUIREMENTS'])
5.021552 | File "/usr/local/lib/python2.7/dist-packages/deepsense/neptune/job_wrapper.py", line 44, in _check_pip_install_result
5.021796 | raise RuntimeError('pip failed to install the requirements. '
5.024408 | RuntimeError: pip failed to install the requirements. For more details, see the stdout/stderr channels.
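
Judging by the paths in this traceback (/usr/local/lib/python2.7/...), the job runs under Python 2.7, while ipython 6.x only supports Python 3.3+, which is why pip cannot find a matching version. A guess, not verified: explicitly selecting a Python 3 environment, as the other commands on this page do, might avoid the problem, e.g. something along the lines of

neptune send --environment keras-2.0-gpu-py3 -- dry_eval --problem fashion_mnist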

[whales, task 5] error after 149 epochs: FileNotFoundError(2, 'No such file or directory')

neptune run -- submit --problem whales --task_nr 5

raises an error after the 149th epoch:

2018-04-15 11-05-41 minerva >>> epoch 149 batch 112 accuracy: 0.19444

262538.163211 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 103, in execute
262538.163383 | execfile(job_filepath, job_globals)
262538.163556 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/past/builtins/misc.py", line 82, in execfile
262538.163723 | exec_(code, myglobals, mylocals)
262538.163892 | File "main.py", line 66, in <module>
262538.164061 | action()
262538.16423 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
262538.164399 | return self.main(*args, **kwargs)
262538.164568 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
262538.164776 | rv = self.invoke(ctx)
262538.165005 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
262538.165232 | return _process_result(sub_ctx.command.invoke(sub_ctx))
262538.16546 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
262538.165716 | return ctx.invoke(self.callback, **ctx.params)
262538.165954 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
262538.166181 | return callback(*args, **kwargs)
262538.166413 | File "main.py", line 61, in submit
262538.166645 | pm.submit_task(task_sub_problem, task_nr, file_path, dev_mode)
262538.166854 | File "/mnt/ml-team/homes/usr/minerva/minerva/whales/problem_manager.py", line 44, in submit_task
262538.167046 | new_trainer.train()
262538.167237 | File "/mnt/ml-team/homes/usr/minerva/minerva/whales/trainer.py", line 48, in train
262538.167427 | 'train_mode': True,
262538.167617 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 75, in fit_transform
262538.167806 | step_inputs[input_step.name] = input_step.fit_transform(data)
262538.167994 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 75, in fit_transform
262538.168179 | step_inputs[input_step.name] = input_step.fit_transform(data)
262538.168367 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 81, in fit_transform
262538.168556 | step_output_data = self._cached_fit_transform(step_inputs)
262538.168743 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 91, in _cached_fit_transform
262538.168971 | step_output_data = self.transformer.fit_transform(**step_inputs)
262538.169201 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 218, in fit_transform
262538.169429 | self.fit(*args, **kwargs)
262538.169659 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/models/pytorch/models.py", line 61, in fit
262538.169913 | self.callbacks.on_epoch_end()
262538.170143 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/models/pytorch/callbacks.py", line 87, in on_epoch_end
262538.17037 | callback.on_epoch_end(*args, **kwargs)
262538.170591 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/models/pytorch/callbacks.py", line 320, in on_epoch_end
262538.170778 | save_model(self.model, full_path)
262538.170972 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/models/pytorch/utils.py", line 68, in save_model
262538.171159 | torch.save(model.state_dict(), path)
262538.17134 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/torch/serialization.py", line 135, in save
262538.384228 | return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
262538.384434 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/torch/serialization.py", line 115, in _with_file_like
262538.384553 | f = open(f, mode)
262538.384664 | FileNotFoundError: [Errno 2] No such file or directory: 'resources/whales/solution/localization/submit_solution/checkpoints/localizer_network/model_epoch149.torch'
262538.384773 |  
262538.384882 | During handling of the above exception, another exception occurred:
262538.384988 |  
262538.385094 | Traceback (most recent call last):
262538.385199 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 109, in <module>
262538.385304 | execute()
262538.385408 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 105, in execute
262538.385511 | raise ExperimentExecutionException("Exception during experiment execution", ex)
262538.385615 | deepsense.neptune.exceptions.ExperimentExecutionException: ('Exception during experiment execution', FileNotFoundError(2, 'No such file or directory'))
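
The failing call is torch.save inside save_model, and the missing path points at a checkpoints directory, so it looks like the checkpoint folder is never created before saving. A minimal sketch of a more defensive save_model, based on the call visible in the traceback rather than the actual minerva source:

import os
import torch

def save_model(model, path):
    # make sure the parent directory exists, otherwise open(path, 'wb')
    # inside torch.save raises FileNotFoundError
    os.makedirs(os.path.dirname(path), exist_ok=True)
    torch.save(model.state_dict(), path)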


strange things during dry_run on fashion_mnist

When I run

python run_minerva.py -- dry_run --problem fashion_mnist

I receive

2018-01-09 16-19-22 minerva-whales >>> starting experiment...
Using TensorFlow backend.
2018-01-09 16-19-23 minerva-whales >>> running: None
neptune: Executing in Offline Mode.
2018-01-09 16-19-23 minerva-whales >>> Saving graph to /mnt/ml-team/minerva/cache/whales/new_experiment/alignment/class_predictions_graph.json
2018-01-09 16-19-24 minerva-whales >>> step input unpacking inputs
2018-01-09 16-19-24 minerva-whales >>> step input saving transformer...
2018-01-09 16-19-24 minerva-whales >>> step input saving outputs...
2018-01-09 16-19-24 minerva-whales >>> step keras_model unpacking inputs
Epoch 1/200
2018-01-09 16:19:25.148275: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148334: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148367: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148394: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148421: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
46/47 [============================>.] - ETA: 1s - loss: 0.3966 - acc: 0.9724/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/callbacks.py:494: RuntimeWarning: Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: loss,acc
  (self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
Traceback (most recent call last):
  File "run_minerva.py", line 46, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "run_minerva.py", line 27, in dry_run
    pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 16, in dry_run
    _evaluate(trainer)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 39, in _evaluate
    score_valid, score_test = trainer.evaluate()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/trainer.py", line 22, in evaluate
    score_valid = self._evaluate(X_valid, y_valid)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/trainer.py", line 29, in _evaluate
    'inference': True}})
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 102, in transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 74, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 84, in _cached_fit_transform
    step_output_data = self.transformer.fit_transform(**step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 206, in fit_transform
    self.fit(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/keras/models_keras.py", line 28, in fit
    **self.training_config)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/engine/training.py", line 2187, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/callbacks.py", line 73, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/keras/callbacks_keras.py", line 21, in on_epoch_end
    self.ctx.channel_send('Log-loss validation', self.epoch_id, logs['val_loss'])
KeyError: 'val_loss'
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7fc4dd75bdd8>>
Traceback (most recent call last):
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 595, in __del__
TypeError: 'NoneType' object is not callable

Three things are strange here:

  1. The logger name is minerva-whales in the first line, although I'm running the fashion_mnist problem.
  2. It seems to perform training even though I don't pass --train_mode and train mode is supposed to be off by default.
  3. We can see an error about val_loss, which is unavailable (see the sketch below).
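
Regarding point 3: in Keras, val_loss only appears in the epoch logs when validation data is supplied to fit / fit_generator, so any callback that reads logs['val_loss'] (like the channel_send call in callbacks_keras.py above) raises a KeyError otherwise. A minimal generic illustration, not the minerva pipeline code:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(1, input_shape=(4,))])
model.compile(optimizer='sgd', loss='mse')
X, y = np.random.rand(32, 4), np.random.rand(32, 1)

# only 'loss' is in the logs here, so reading logs['val_loss'] would fail
model.fit(X, y, epochs=1, verbose=0)

# with validation data, 'val_loss' is reported after each epoch
model.fit(X, y, validation_split=0.2, epochs=1, verbose=0)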

tasks are not registered

For

python main.py -- submit --problem whales --task_nr 1 --filepath resources/whales/tasks/task1.ipynb

I obtain

Traceback (most recent call last):
  File "main.py", line 72, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 67, in submit
    pm.submit_task(task_subproblem, task_nr, filepath, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/whales/problem_manager.py", line 42, in submit_task
    task_handler = registered_tasks[task_nr](trainer)
KeyError: 1

I checked that the error is raised because registered_tasks is {}, but that is all I understand. I encountered the same issue for the fashion_mnist problem.
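
For context, a registry like registered_tasks is usually filled by a decorator at import time, so an empty dict typically means the module that defines and registers the task handlers was never imported before the lookup. The sketch below only illustrates the pattern and is not minerva's actual code:

registered_tasks = {}

def register_task(nr):
    # decorator that adds a task handler class to the registry under its number
    def decorator(cls):
        registered_tasks[nr] = cls
        return cls
    return decorator

@register_task(1)
class Task1Handler:
    def __init__(self, trainer):
        self.trainer = trainer

# registered_tasks[1] now resolves; if the module containing the @register_task
# definitions is never imported, the lookup ends in KeyError: 1 as above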

bug in neptune send for submit mode

After typing

neptune send run_minerva.py --environment keras-2.0-gpu-py3 --worker gcp-gpu-medium --config config.yaml -- submit --problem fashion_mnist --task_nr 2 --filepath resources/fashion_mnist/problems/task2.ipynb

with default task2.ipynb I see the following in Neptune:

18.442197 | 2018-01-17 11-30-08 minerva >>> starting experiment...
18.801251 | Using TensorFlow backend.
34.768749 | 2018-01-17 11-30-24 minerva >>> Saving graph to /output/path_to_your_solution/class_predictions_graph.json
35.123845 | [NbConvertApp] WARNING | pattern '/neptune/resources/fashion_mnist/problems/task2.ipynb' matched no files
35.366995 | Traceback (most recent call last):
35.367113 | File "/usr/local/lib/python3.5/dist-packages/deepsense/neptune/job_wrapper.py", line 138, in <module>
35.36723 | execute()
35.367347 | File "/usr/local/lib/python3.5/dist-packages/deepsense/neptune/job_wrapper.py", line 134, in execute
35.367463 | execfile(job_filepath, job_globals)
35.36759 | File "/usr/local/lib/python3.5/dist-packages/past/builtins/misc.py", line 82, in execfile
35.367724 | exec_(code, myglobals, mylocals)
35.367855 | File "run_minerva.py", line 46, in <module>
35.367979 | action()
35.368106 | File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 722, in __call__
35.368223 | return self.main(*args, **kwargs)
35.368341 | File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 697, in main
35.368456 | rv = self.invoke(ctx)
35.368649 | File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 1066, in invoke
35.368779 | return _process_result(sub_ctx.command.invoke(sub_ctx))
35.368922 | File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 895, in invoke
35.369046 | return ctx.invoke(self.callback, **ctx.params)
35.369161 | File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 535, in invoke
35.369324 | return callback(*args, **kwargs)
35.369478 | File "run_minerva.py", line 41, in submit
35.36963 | pm.submit_task(sub_problem, task_nr, filepath, dev_mode, cloud_mode)
35.369746 | File "/neptune/minerva/fashion_mnist/problem_manager.py", line 25, in submit_task
35.369861 | user_task_solution, user_config = _fetch_task_solution(filepath)
35.369984 | File "/neptune/minerva/fashion_mnist/problem_manager.py", line 33, in _fetch_task_solution
35.370102 | with TaskSolutionParser(filepath) as task_solution:
35.370218 | File "/neptune/minerva/backend/task_manager.py", line 45, in __enter__
35.370334 | if module_filename not in os.listdir(module_dir):
35.3705 | FileNotFoundError: [Errno 2] No such file or directory: '/neptune/resources/fashion_mnist/problems'

My config.yaml file is:

project-key: MIN

name: minerva

parameters:

  # Local setup
#  data_dir: path/to/your/data # for instance resources/whales/data
#  solution_dir:  output/path_to_your_solution # for instance /output/resources/whales/solution/localization

  # Cloud setup
  data_dir: /public/whales
  solution_dir: /output/path_to_your_solution

exclude:
  - resources
  - output
  - neptune.log
  - offline_job.log
  - .idea
  - .git
  - .ipynb_checkpoints

# Comment if local
pip-requirements-file: requirements.txt

Everything works fine for the dry run mode, i.e., for typing:

neptune send run_minerva.py \
--environment keras-2.0-gpu-py3 \
--worker gcp-gpu-medium \
--config config.yaml \
-- dry_run --problem fashion_mnist

Problem with task 1 of fashion_mnist

For

# Keras imports (assumed to be provided in the task notebook preamble)
from keras.layers import Input, Conv2D, MaxPool2D, Flatten, Dense, Dropout
from keras.models import Model

CONFIG = {'input_size': 28,
          'classes': 10}

def solution(input_size, classes):
    input_shape = (input_size, input_size, 1)
    images = Input(shape=input_shape)
    
    x = Conv2D(16, 3, padding='same', activation='relu')(images)
    x = Conv2D(16, 3, padding='same', activation='relu')(x)
    x = MaxPool2D()(x)
    
    x = Conv2D(16, 3, padding='same', activation='relu')(x)
    x = Conv2D(16, 3, padding='same', activation='relu')(x)
    x = MaxPool2D()(x)
    
    x = Flatten()(x)
    
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.5)(x)
    
    predictions = Dense(classes, activation='softmax', name='output')(x)
    
    model = Model(inputs=images, outputs=predictions)
    return model

I obtain

(minerva_venv) patryk@patryk-miziula:~/Documents/edukacyjne/Minerva/0401/minerva$ python run_minerva.py -- submit --problem fashion_mnist --task_nr 1 --filepath resources/fashion_mnist/problems/task1.ipynb
2018-01-11 18-34-57 minerva >>> starting experiment...
Using TensorFlow backend.
neptune: Executing in Offline Mode.
2018-01-11 18-34-58 minerva >>> Saving graph to output/path_to_your_solution/class_predictions_graph.json
[NbConvertApp] Converting notebook resources/fashion_mnist/problems/task1.ipynb to python
[NbConvertApp] Writing 2031 bytes to resources/fashion_mnist/problems/task1.py
Traceback (most recent call last):
  File "run_minerva.py", line 46, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "run_minerva.py", line 41, in submit
    pm.submit_task(sub_problem, task_nr, filepath, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 25, in submit_task
    task_handler = registered_tasks[task_nr](trainer)
KeyError: 1
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit

Relatively high entry threshold in task 1 of fashion mnist

I think inexperienced users may run into difficulties with the first task of the first problem they see (task 1 of fashion mnist).

  • If you don't know deep learning, you won't learn it from this task, and you won't get any clues about how and where to start.
  • If you know deep learning but not Keras, you'll have a hard time trying to learn it from this task; for instance, you may end up struggling to write the model in the Sequential style, whereas the functional API is required there (see the sketch below).

Similar remarks apply to the other tasks.

Overall, I think that adding some extra hints for newcomers would broaden Minerva's potential audience with little effort.
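
For instance, a short hint contrasting the two Keras styles (purely illustrative, not part of the task) could save newcomers a lot of guessing:

from keras.models import Sequential, Model
from keras.layers import Input, Dense

# Sequential style: layers are stacked implicitly
seq_model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax'),
])

# functional style (the one the task expects): tensors are wired explicitly
inputs = Input(shape=(784,))
x = Dense(64, activation='relu')(inputs)
outputs = Dense(10, activation='softmax')(x)
func_model = Model(inputs=inputs, outputs=outputs)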

[whales, task 3] Error: no such option: -s

In task 3:

This is an alignment subtask so add -s classification to the execution command

neptune run -- submit --problem whales --task_nr 3 -s classification

The above throws an error:

Error: no such option: -s
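
For what it's worth, click only understands a short flag like -s if it is declared as an alias on the option; a hypothetical sketch (the real option and command names in minerva may differ):

import click

@click.command()
@click.option('--problem', required=True)
@click.option('--task_nr', type=int, required=True)
@click.option('-s', '--sub_problem', default=None,
              help='sub-problem to work on, e.g. alignment or classification')
def submit(problem, task_nr, sub_problem):
    # without the '-s' alias above, passing -s on the command line
    # triggers exactly the 'no such option: -s' error
    click.echo('problem={} task={} sub_problem={}'.format(problem, task_nr, sub_problem))

if __name__ == '__main__':
    submit()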

Default `--filepath`

I think it's worth providing a default --filepath for submitting, pointing to the task's location in the repo, e.g.

resources/fashion_mnist/problems/task1.ipynb

for task 1 of fashion mnist. This way we get rid of a redundant command-line parameter (see the sketch below).
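
A hypothetical sketch of how such a default could be derived from --problem and --task_nr (the helper name and exact paths are illustrative, since the task notebooks live in slightly different subdirectories per problem):

def default_task_filepath(problem, task_nr):
    # e.g. resources/fashion_mnist/problems/task1.ipynb
    return 'resources/{}/problems/task{}.ipynb'.format(problem, task_nr)

# in the CLI, --filepath could default to None and fall back to:
# filepath = filepath or default_task_filepath(problem, task_nr)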

[whales, task 7] AttributeError: Can't pickle local object 'solution.<locals>.DatasetLocalizer'

Task 7 raises an error:

  File "main.py", line 66, in <module>
    action()
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 61, in submit
    pm.submit_task(task_sub_problem, task_nr, file_path, dev_mode)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/whales/problem_manager.py", line 44, in submit_task
    new_trainer.train()
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/whales/trainer.py", line 48, in train
    'train_mode': True,
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/base.py", line 75, in fit_transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/base.py", line 75, in fit_transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/base.py", line 82, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/base.py", line 92, in _cached_fit_transform
    step_output_data = self.transformer.fit_transform(**step_inputs)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/base.py", line 219, in fit_transform
    self.fit(*args, **kwargs)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/models/pytorch/models.py", line 55, in fit
    for batch_id, data in enumerate(batch_gen):
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 417, in __iter__
    return DataLoaderIter(self)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 234, in __init__
    w.start()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 274, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_spawn_posix.py", line 33, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_spawn_posix.py", line 48, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/lib/python3.5/multiprocessing/reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'solution.<locals>.DatasetLocalizer'

The class DatasetLocalizer can't be pickled since it's not defined at the top level of a module.
https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled

I tried moving the class definition outside of the function, but since only the function is imported, the worker processes can't find the module:


  File "<string>", line 1, in <module>
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 116, in _main
    self = pickle.load(from_parent)
ImportError: No module named 'task7'

Edit:
similar stackoverflow topic:
https://stackoverflow.com/questions/36994839/i-can-pickle-local-objects-if-i-use-a-derived-class
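
Two workarounds that usually apply here, sketched below but not tested against minerva: define the Dataset class at the top level of a module that the DataLoader workers can import, or avoid worker subprocesses entirely by setting num_workers=0 so nothing has to be pickled.

import torch
from torch.utils.data import Dataset, DataLoader

# defined at the top level of an importable module -> picklable by worker processes
class DatasetLocalizer(Dataset):
    def __init__(self, X, y):
        self.X, self.y = X, y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]

dataset = DatasetLocalizer(torch.randn(8, 3), torch.zeros(8))

# num_workers=0 loads batches in the main process, so no pickling is involved
loader = DataLoader(dataset, batch_size=4, num_workers=0)

for batch in loader:
    pass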

[fashion-MNIST, task 4] KeyError while submitting the solution

neptune run -- submit --problem fashion_mnist --task_nr 4


5.372691 | /mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
5.372979 | from ._conv import register_converters as _register_converters
5.373169 | Using TensorFlow backend.
12.010692 | [NbConvertApp] Converting notebook /mnt/ml-team/homes/rafal.jakubanis/minerva/resources/fashion_mnist/tasks/task4.ipynb to python
12.892425 | [NbConvertApp] Writing 641 bytes to /mnt/ml-team/homes/rafal.jakubanis/minerva/resources/fashion_mnist/tasks/task4.py
13.172734 | Traceback (most recent call last):
13.173071 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 103, in execute
13.17331 | execfile(job_filepath, job_globals)
13.17354 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/past/builtins/misc.py", line 82, in execfile
13.173774 | exec_(code, myglobals, mylocals)
13.174001 | File "main.py", line 66, in <module>
13.174223 | action()
13.174454 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
13.174719 | return self.main(*args, **kwargs)
13.174942 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
13.175152 | rv = self.invoke(ctx)
13.175377 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
13.175583 | return _process_result(sub_ctx.command.invoke(sub_ctx))
13.175779 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
13.17597 | return ctx.invoke(self.callback, **ctx.params)
13.17616 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
13.176348 | return callback(*args, **kwargs)
13.176538 | File "main.py", line 61, in submit
13.176747 | pm.submit_task(task_sub_problem, task_nr, file_path, dev_mode)
13.176932 | File "/mnt/ml-team/homes/rafal.jakubanis/minerva/minerva/fashion_mnist/problem_manager.py", line 34, in submit_task
13.177115 | new_trainer = task_handler.substitute(user_task_solution, user_config)
13.177299 | File "/mnt/ml-team/homes/rafal.jakubanis/minerva/minerva/backend/task_manager.py", line 14, in substitute
13.177482 | self.modify_trainer(user_solution, user_config)
13.177665 | File "/mnt/ml-team/homes/rafal.jakubanis/minerva/minerva/fashion_mnist/tasks.py", line 54, in modify_trainer
13.177847 | self.trainer.pipeline.get_step('loader').is_substituted = True
13.17803 | File "/mnt/ml-team/homes/rafal.jakubanis/minerva/minerva/backend/base.py", line 54, in get_step
13.178213 | return self.all_steps[name]
13.178396 | KeyError: 'loader'
13.178578 |  
13.178761 | During handling of the above exception, another exception occurred:
13.178949 |  
13.179133 | Traceback (most recent call last):
13.179326 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 109, in <module>
13.17951 | execute()
13.179693 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 105, in execute
13.179914 | raise ExperimentExecutionException("Exception during experiment execution", ex)
13.180092 | deepsense.neptune.exceptions.ExperimentExecutionException: ('Exception during experiment execution', KeyError('loader',))
 


train_mode=False in dry_run mode

I strongly suggest setting train_mode to False by default in dry_run mode. That way the user can quickly check that everything (including the paths to the original solutions) works. However, in my opinion it should still be possible to set train_mode=True, so that users can re-train the solutions and save them in a path of their choice.
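
A minimal sketch of what that could look like with click (option and command names are assumptions based on the commands quoted on this page, not the actual minerva CLI):

import click

@click.command()
@click.option('--problem', required=True)
@click.option('--train_mode', is_flag=True, default=False,
              help='re-train the reference solution instead of only evaluating it')
def dry_run(problem, train_mode):
    # with is_flag, training stays off unless --train_mode is passed explicitly
    click.echo('dry_run on {} (train_mode={})'.format(problem, train_mode))

if __name__ == '__main__':
    dry_run()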
