
minerva-training-materials's Introduction

Minerva

Minerva is an educational project that lets you learn advanced data science on real-life, curated problems.


Getting started

  1. Follow the Installation Guide for setup instructions.
  2. Familiarize yourself with our approach: check the User Guide or go straight to the Fashion MNIST problem and start solving.
  3. When ready, go to the Right Whale Recognition problem to start working on a complex problem.

Hands-on approach to learning

With Minerva you will reproduce, piece by piece, solutions to some of the most difficult data science problems, especially competition challenges. Since each problem is quite complex, we divided it into a collection of small, self-contained pieces called tasks.

A task is a single step in a machine learning pipeline; it has its own learning objectives, a description, and a piece of code that needs to be implemented. Your job is to create a technical implementation that fills this gap. You use your engineering skills, extensive experimentation, and our feedback to make sure that your implementation meets a certain quality level. We know what the final score of a well-implemented pipeline should be, so as you solve tasks and re-implement parts of the pipeline, we check whether your implementation does the job well enough to keep the score high.
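To make this concrete, a task leaves a small, well-defined gap for you to fill. The sketch below is only illustrative: the names CONFIG and solution echo the issues further down this page, while the arguments and body are made up.

# illustrative only: a task asks you to implement a gap roughly shaped like this
CONFIG = {'learning_rate': 0.001}   # hyper-parameters you are free to tune

def solution(X, y):
    # your implementation goes here; it is plugged into the full pipeline and
    # the pipeline's final score decides whether the task passes
    ...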

Reproduce Kaggle winning solutions in a transparent way → learn advanced data science

Working on tasks that, taken together, form the solution to a problem lets you reproduce a Kaggle winning solution piece by piece. This is our hands-on approach to learning: you work on each part of the winning implementation yourself.

Available problems

  • Fashion MNIST: Get started with Minerva by solving an easy pipeline on the nice fashion-mnist dataset.
  • Whales: Reproduce the Right Whale Recognition Kaggle winning solution!

(more problems will be published in the future, so stay tuned)

Disclaimer

In this open source solution you will find references to neptune.ml. It is a platform, free for community users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution; you may run it as a plain Python script 😉.
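For example, the dry run used throughout the issues below can be launched as an ordinary script, without Neptune:

python main.py -- dry_run --problem fashion_mnist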

User support

You can seek support in two ways:

  1. Check the Minerva wiki for typical problems and questions.
  2. Create an issue with the label question in case the Minerva wiki does not have an answer to your question.

Contributing to Minerva

Check CONTRIBUTING for more information.

About the name

Minerva is the Roman goddess of wisdom, arts, and crafts, usually depicted in strong association with knowledge. Her sacred creature, the 'owl of Minerva', symbolizes wisdom and knowledge. We think this name fits our project very well, since it is all about acquiring knowledge and skills.

minerva-training-materials's People

Contributors

buus2, dependabot[bot], jakubczakon, kamil-kaczmarek, pknut, pziecina, rafajak, taraspiotr


minerva-training-materials's Issues

missing package h5py

For

neptune run -- 'dry_train --problem fashion_mnist'

I obtained

Traceback (most recent call last):
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 138, in <module>
    execute()
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 134, in execute
    execfile(job_filepath, job_globals)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/past/builtins/misc.py", line 82, in execfile
    exec_(code, myglobals, mylocals)
  File "main.py", line 72, in <module>
    action()
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 20, in dry_train
    dry_run(problem, dev_mode, cloud_mode, train_mode=True)
  File "main.py", line 44, in dry_run
    pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/fashion_mnist/problem_manager.py", line 22, in dry_run
    trainer.train()
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/fashion_mnist/trainer.py", line 23, in train
    'inference': False}})
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/backend/base.py", line 75, in fit_transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/backend/base.py", line 81, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/backend/base.py", line 93, in _cached_fit_transform
    self.transformer.save(self.cache_filepath_step_transformer)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva/minerva/backend/models/keras/models.py", line 50, in save
    self.model.save(filepath)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/keras/engine/topology.py", line 2573, in save
    save_model(self, filepath, overwrite, include_optimizer)
  File "/mnt/ml-team/homes/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/keras/models.py", line 60, in save_model
    raise ImportError('`save_model` requires h5py.')
ImportError: `save_model` requires h5py.

Everything worked fine after pip3 install h5py.

Please add h5py to requirements.
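A minimal fail-fast sketch (an assumption, not code from the repo) that would surface the missing dependency at startup instead of at model-saving time:

# hypothetical guard, e.g. near the top of main.py
try:
    import h5py  # noqa: F401  (keras' save_model needs it to write .h5 files)
except ImportError as error:
    raise ImportError("h5py is required to save Keras models; run pip3 install h5py") from error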

[whales, task 3 output] what is validation/test score?

The validation score (2.0059) is not equal to the validation loss (1.01667) or the validation accuracy (0.77713). Similarly, the test score is hard to interpret. How are these two scores calculated?

226894.311837 | 2018-04-15 01-09-37 minerva >>> epoch 250 current lr: 0.0003252930814335209
226894.312173 | 2018-04-15 01-09-37 minerva >>> epoch 249 loss: 0.03353
226894.312389 | 2018-04-15 01-09-37 minerva >>> epoch 249 accuracy: 0.99986
226981.769858 | 2018-04-15 01-11-05 minerva >>> epoch 249 validation loss: 1.01667
226981.770167 | 2018-04-15 01-11-05 minerva >>> epoch 249 validation accuracy: 0.77713
227067.955128 | 2018-04-15 01-12-31 minerva >>> training finished...
<...>
227715.884304 | Validation score is 2.0059
227715.884506 | Test score is 2.1295

227715.884696 | That is a solid validation
227715.884888 | Sorry, but this score is not high enough to pass the task
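For context, the original Right Whale Recognition competition was scored with multi-class log loss, and the scores above sit in that range. A minimal sketch of that metric, under the assumption (not confirmed by the repo) that this is what the evaluator computes:

import numpy as np

def multiclass_log_loss(y_true, y_proba, eps=1e-15):
    # y_true: integer class ids, y_proba: (n_samples, n_classes) predicted probabilities
    y_proba = np.clip(y_proba, eps, 1 - eps)
    return -np.mean(np.log(y_proba[np.arange(len(y_true)), y_true]))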

please add jupyter to requirements

Without the jupyter package, the submit mode doesn't work. Try, for example,

python main.py -- submit --problem fashion_mnist --task_nr 1

to check it.

error "TypeError: 'NoneType' object is not callable" after the experiment

For
python run_minerva.py -- dry_run --problem fashion_mnist
at the end I obtain

Test score is 0.9067
That is a solid validation
Congrats you solved the task!
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7f35e6e7cb38>>
Traceback (most recent call last):
  File "/home/patryk.miziula/Minerva/minerva_venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 595, in __del__
TypeError: 'NoneType' object is not callable
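This looks like the well-known TensorFlow teardown error raised from Session.__del__ during interpreter shutdown; the run itself finished successfully. A minimal workaround sketch (an assumption, not code from the repo):

import atexit
from keras import backend as K

# release the TensorFlow session explicitly so that Session.__del__ no longer
# runs during interpreter shutdown, which is what triggers the spurious TypeError
atexit.register(K.clear_session)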

[whales, task3] task submission ends after 249 epochs with "Sorry, your validation split is messed up. Fix it please."

Submitting the task (via 'neptune run') was unsuccessful and ended after 249 epochs with the error "Sorry, your validation split is messed up. Fix it please." The validation split isn't part of any task within the Whales problem - perhaps this error is leaking in from the Fashion-MNIST part.

(side note: the log below suggests that the numbering of the 'current lr' prints is shifted by one - epoch 250 vs. 249 in the 5th and 6th rows, respectively)

156834.540647 Connection lost. Retrying...
156834.540935 2018-04-06 08-00-15 minerva >>> epoch 249 batch 110 ...
156834.557396 Connection lost. Retrying...
156834.557587 2018-04-06 08-00-17 minerva >>> epoch 249 average batch time: 0:00:04.0
156834.557771 2018-04-06 08-00-17 minerva >>> epoch 250 current lr: 0.0003252930814335209
156834.557955 2018-04-06 08-00-17 minerva >>> epoch 249 loss: 0.03416
156834.558143 2018-04-06 08-00-17 minerva >>> epoch 249 accuracy: 0.99975
156834.840497 Connection lost. Retrying...
156834.840747 Connection restored!
156895.617163 2018-04-06 08-01-23 minerva >>> epoch 249 validation loss: 0.98257
156895.617506 2018-04-06 08-01-23 minerva >>> epoch 249 validation accuracy: 0.78689
156949.918186 2018-04-06 08-02-18 minerva >>> training finished...
157301.446679 2018-04-06 08-08-09 minerva >>> step classifier_network saving transformer...
157301.793026 2018-04-06 08-08-09 minerva >>> step classifier_network saving outputs...
157301.793355 2018-04-06 08-08-09 minerva >>> step classifier_calibrator adapting inputs
157302.072814 2018-04-06 08-08-10 minerva >>> step classifier_calibrator saving transformer...
157302.073103 2018-04-06 08-08-10 minerva >>> step classifier_calibrator saving outputs...
157302.073299 2018-04-06 08-08-10 minerva >>> step classifier_encoder adapting inputs
157302.073493 2018-04-06 08-08-10 minerva >>> step classifier_encoder loading...
157302.073687 2018-04-06 08-08-10 minerva >>> step classifier_encoder transforming...
157302.256781 2018-04-06 08-08-10 minerva >>> step classifier_output adapting inputs
157302.257108 2018-04-06 08-08-10 minerva >>> step classifier_output loading...
157302.257317 2018-04-06 08-08-10 minerva >>> step classifier_output transforming...
157302.744636 2018-04-06 08-08-10 minerva >>> step classifier_encoder adapting inputs
157302.744928 2018-04-06 08-08-10 minerva >>> step classifier_encoder loading...
157302.745115 2018-04-06 08-08-10 minerva >>> step classifier_encoder transforming...
157302.745298 2018-04-06 08-08-10 minerva >>> step classifier_loader adapting inputs
157302.745478 2018-04-06 08-08-10 minerva >>> step classifier_loader loading...
157302.745656 2018-04-06 08-08-10 minerva >>> step classifier_loader transforming...
157302.745832 2018-04-06 08-08-10 minerva >>> step classifier_network unpacking inputs
157302.746007 2018-04-06 08-08-10 minerva >>> step classifier_network loading...
157302.746183 2018-04-06 08-08-10 minerva >>> step classifier_network transforming...
157351.151419 2018-04-06 08-08-59 minerva >>> step classifier_calibrator adapting inputs
157351.151625 2018-04-06 08-08-59 minerva >>> step classifier_calibrator loading...
157351.15182 2018-04-06 08-08-59 minerva >>> step classifier_calibrator transforming...
157351.330583 2018-04-06 08-08-59 minerva >>> step classifier_encoder adapting inputs
157351.330964 2018-04-06 08-08-59 minerva >>> step classifier_encoder loading...
157351.33123 2018-04-06 08-08-59 minerva >>> step classifier_encoder transforming...
157351.331464 2018-04-06 08-08-59 minerva >>> step classifier_output adapting inputs
157351.331668 2018-04-06 08-08-59 minerva >>> step classifier_output loading...
157351.331846 2018-04-06 08-08-59 minerva >>> step classifier_output transforming...
157351.33203 2018-04-06 08-08-59 minerva >>> step classifier_encoder adapting inputs
157351.332214 2018-04-06 08-08-59 minerva >>> step classifier_encoder loading...
157351.332397 2018-04-06 08-08-59 minerva >>> step classifier_encoder transforming...
157351.332591 2018-04-06 08-08-59 minerva >>> step classifier_loader adapting inputs
157351.332774 2018-04-06 08-08-59 minerva >>> step classifier_loader loading...
157351.332958 2018-04-06 08-08-59 minerva >>> step classifier_loader transforming...
157351.333137 2018-04-06 08-08-59 minerva >>> step classifier_network unpacking inputs
157351.333316 2018-04-06 08-08-59 minerva >>> step classifier_network loading...
157351.53075 2018-04-06 08-08-59 minerva >>> step classifier_network transforming...
157411.719687 2018-04-06 08-09-59 minerva >>> step classifier_calibrator adapting inputs
157411.719895 2018-04-06 08-09-59 minerva >>> step classifier_calibrator loading...
157411.720097 2018-04-06 08-09-59 minerva >>> step classifier_calibrator transforming...
157411.720338 2018-04-06 08-09-59 minerva >>> step classifier_encoder adapting inputs
157411.720538 2018-04-06 08-09-59 minerva >>> step classifier_encoder loading...
157411.720747 2018-04-06 08-09-59 minerva >>> step classifier_encoder transforming...
157411.720945 2018-04-06 08-09-59 minerva >>> step classifier_output adapting inputs
157411.721135 2018-04-06 08-09-59 minerva >>> step classifier_output loading...
157411.721333 2018-04-06 08-09-59 minerva >>> step classifier_output transforming...
157411.928652  
157411.92899 Validation score is 1.9927
157411.929189 Test score is 2.2540
157411.929416 Sorry, your validation split is messed up. Fix it please.

solution dir paths don't work (make local dirs and public dirs the same)

Fashion mnist

These work:

  • in neptune.yaml: solution_dir: resources/fashion_mnist/solution
  • command
    python main.py -- dry_run --problem fashion_mnist --train_mode False
    or
    neptune run -- dry_run --problem fashion_mnist --train_mode False

This doesn't work:

  • in neptune.yaml: solution_dir: /public/minerva/resources/fashion_mnist/solution
  • command
neptune send \
--environment keras-2.0-gpu-py3 \
--worker gcp-gpu-medium \
-- dry_run --problem fashion_mnist --train_mode False

Please make the resources in the GitHub repo and in Neptune's /public storage exactly the same.

Whales

This doesn't work:

  • in neptune.yaml:
data_dir: resources/whales/data
solution_dir: resources/whales/solution
  • command
    python main.py -- dry_run --problem whales --train_mode False

Error: ValueError: Specified solution_dir is missing 'transformers' directory. Use dry_run with train_mode=True or specify the path to trained pipeline.

It seems that the automatic sub-problem inference doesn't reach this point.

Avoid empty solution functions unless necessary

In some tasks there is nothing to do in the solution function, yet the empty function is still there. I'd suggest simply removing this function (or, analogously, CONFIG) unless it is actually needed.
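For illustration, the kind of stub this refers to (the names follow the issue, the body is hypothetical):

# a task with nothing to change still ships a stub like this one; the suggestion
# above is to drop such stubs (and the empty CONFIG) from the notebook entirely
CONFIG = {}

def solution(output):
    return output  # pass-through, nothing to implement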

Changing run_minerva.py and config.yaml to default Neptune names

I suggest considering the following renames:

  • run_minerva.py -> main.py,
  • config.yaml -> neptune.yaml.

This way you can shorten neptune commands, e.g. from
neptune run run_minerva.py --config config.yaml -- dry_run --problem fashion_mnist

to

neptune run -- dry_run --problem fashion_mnist

problem with psutil during the installation of requirements

OS: Ubuntu 16.04.
Command: pip3 install -r minerva/requirements.txt
Result (no-error lines before the psutil error omitted):

  Running setup.py bdist_wheel for psutil ... error
  Complete output from command /home/patryk/Documents/edukacyjne/Minerva/1217/minerva/bin/python3.5 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-j2yhq163/psutil/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpt1r_vi31pip-wheel- --python-tag cp35:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.5
  creating build/lib.linux-x86_64-3.5/psutil
  copying psutil/_pssunos.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_common.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_pslinux.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/__init__.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_psposix.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_compat.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_psosx.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_pswindows.py -> build/lib.linux-x86_64-3.5/psutil
  copying psutil/_psbsd.py -> build/lib.linux-x86_64-3.5/psutil
  creating build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_windows.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_memory_leaks.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/runner.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/__init__.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_linux.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_testutils.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_system.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_sunos.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_bsd.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_posix.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_process.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_osx.py -> build/lib.linux-x86_64-3.5/psutil/tests
  copying psutil/tests/test_misc.py -> build/lib.linux-x86_64-3.5/psutil/tests
  running build_ext
  building 'psutil._psutil_linux' extension
  creating build/temp.linux-x86_64-3.5
  creating build/temp.linux-x86_64-3.5/psutil
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DPSUTIL_VERSION=430 -I/usr/include/python3.5m -I/home/patryk/Documents/edukacyjne/Minerva/1217/minerva/include/python3.5m -c psutil/_psutil_linux.c -o build/temp.linux-x86_64-3.5/psutil/_psutil_linux.o
  psutil/_psutil_linux.c:12:20: fatal error: Python.h: No such file or directory
  compilation terminated.
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
  
  ----------------------------------------
  Failed building wheel for psutil
  Running setup.py clean for psutil
Failed to build psutil
Installing collected packages: psutil, pathlib2, neptune-cli, opencv-python, pandas, pydot, pydot-ng, scikit-learn, torchvision
  Running setup.py install for psutil ... error
    Complete output from command /home/patryk/Documents/edukacyjne/Minerva/1217/minerva/bin/python3.5 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-j2yhq163/psutil/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-tclqna_o-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/patryk/Documents/edukacyjne/Minerva/1217/minerva/include/site/python3.5/psutil:
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.5
    creating build/lib.linux-x86_64-3.5/psutil
    copying psutil/_pssunos.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_common.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_pslinux.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/__init__.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_psposix.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_compat.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_psosx.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_pswindows.py -> build/lib.linux-x86_64-3.5/psutil
    copying psutil/_psbsd.py -> build/lib.linux-x86_64-3.5/psutil
    creating build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_windows.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_memory_leaks.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/runner.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/__init__.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_linux.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_testutils.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_system.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_sunos.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_bsd.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_posix.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_process.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_osx.py -> build/lib.linux-x86_64-3.5/psutil/tests
    copying psutil/tests/test_misc.py -> build/lib.linux-x86_64-3.5/psutil/tests
    running build_ext
    building 'psutil._psutil_linux' extension
    creating build/temp.linux-x86_64-3.5
    creating build/temp.linux-x86_64-3.5/psutil
    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DPSUTIL_VERSION=430 -I/usr/include/python3.5m -I/home/patryk/Documents/edukacyjne/Minerva/1217/minerva/include/python3.5m -c psutil/_psutil_linux.c -o build/temp.linux-x86_64-3.5/psutil/_psutil_linux.o
    psutil/_psutil_linux.c:12:20: fatal error: Python.h: No such file or directory
    compilation terminated.
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
    
    ----------------------------------------
Command "/home/patryk/Documents/edukacyjne/Minerva/1217/minerva/bin/python3.5 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-j2yhq163/psutil/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-tclqna_o-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/patryk/Documents/edukacyjne/Minerva/1217/minerva/include/site/python3.5/psutil" failed with error code 1 in /tmp/pip-build-j2yhq163/psutil/

Standard solutions from Stack Overflow didn't help.

--train_mode False raises an error

python run_minerva.py -- dry_run --problem fashion_mnist

works whereas

python run_minerva.py -- dry_run --problem fashion_mnist --train_mode False

raises an error:

~/Documents/edukacyjne/Minerva/0401/minerva$ python run_minerva.py -- dry_run --problem fashion_mnist --train_mode False
2018-01-11 13-08-12 minerva >>> starting experiment...
Using TensorFlow backend.
2018-01-11 13-08-14 minerva >>> running: None
neptune: Executing in Offline Mode.
2018-01-11 13-08-14 minerva >>> Saving graph to path/to/your/solution/class_predictions_graph.json
2018-01-11 13-08-14 minerva >>> step input unpacking inputs
2018-01-11 13-08-14 minerva >>> step input loading...
2018-01-11 13-08-14 minerva >>> step input transforming...
2018-01-11 13-08-14 minerva >>> step keras_model unpacking inputs
Epoch 1/200
2018-01-11 13:08:15.268968: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-11 13:08:15.269056: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-11 13:08:15.269094: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-11 13:08:15.269123: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-11 13:08:15.269150: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
46/47 [============================>.] - ETA: 1s - loss: 0.4396 - acc: 0.9772/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/callbacks.py:494: RuntimeWarning: Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: acc,loss
  (self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
Traceback (most recent call last):
  File "run_minerva.py", line 46, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "run_minerva.py", line 27, in dry_run
    pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 16, in dry_run
    _evaluate(trainer)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 39, in _evaluate
    score_valid, score_test = trainer.evaluate()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/trainer.py", line 22, in evaluate
    score_valid = self._evaluate(X_valid, y_valid)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/trainer.py", line 29, in _evaluate
    'inference': True}})
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 102, in transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 74, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 84, in _cached_fit_transform
    step_output_data = self.transformer.fit_transform(**step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 206, in fit_transform
    self.fit(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/keras/models_keras.py", line 28, in fit
    **self.training_config)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/engine/training.py", line 2187, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/callbacks.py", line 73, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/keras/callbacks_keras.py", line 19, in on_epoch_end
    self.ctx.channel_send('Log-loss validation', self.epoch_id, logs['val_loss'])
KeyError: 'val_loss'
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit
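The failure happens because no validation data is produced in this mode, so Keras never adds val_loss to logs and the Neptune callback crashes on the missing key. A defensive sketch of that callback, reduced to the call seen in the traceback (callbacks_keras.py line 19); the class name and attributes are assumptions:

class NeptuneMonitor:
    def __init__(self, ctx):
        self.ctx = ctx
        self.epoch_id = 0

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        val_loss = logs.get('val_loss')      # absent when no validation data was passed
        if val_loss is not None:
            self.ctx.channel_send('Log-loss validation', self.epoch_id, val_loss)
        self.epoch_id += 1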

_prep_cache issue

For

neptune send --environment pytorch-0.2.0-gpu-py3 --worker gcp-gpu-medium -- dry_eval --problem whales

I obtained

114.465695 | Traceback (most recent call last):
-- | --
114.466053 | File "/usr/local/lib/python3.6/dist-packages/deepsense/neptune/job_wrapper.py", line 138, in <module>
114.466311 | execute()
114.466593 | File "/usr/local/lib/python3.6/dist-packages/deepsense/neptune/job_wrapper.py", line 134, in execute
114.466841 | execfile(job_filepath, job_globals)
114.467081 | File "/usr/local/lib/python3.6/dist-packages/past/builtins/misc.py", line 82, in execfile
114.467322 | exec_(code, myglobals, mylocals)
114.467577 | File "main.py", line 72, in <module>
114.467857 | action()
114.468074 | File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 722, in __call__
114.468385 | return self.main(*args, **kwargs)
114.468668 | File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 697, in main
114.46896 | rv = self.invoke(ctx)
114.46924 | File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
114.469555 | return _process_result(sub_ctx.command.invoke(sub_ctx))
114.469857 | File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 895, in invoke
114.470132 | return ctx.invoke(self.callback, **ctx.params)
114.470374 | File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 535, in invoke
114.47058 | return callback(*args, **kwargs)
114.470817 | File "main.py", line 28, in dry_eval
114.471031 | dry_run(problem, dev_mode, cloud_mode, train_mode=False)
114.471244 | File "main.py", line 40, in dry_run
114.471457 | pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
114.47169 | File "/neptune/minerva/whales/problem_manager.py", line 26, in dry_run
114.471911 | handle_empty_solution_dir(train_mode, config, pipeline)
114.472125 | File "/neptune/minerva/utils.py", line 54, in handle_empty_solution_dir
114.472339 | transformers_in_pipeline = set(pipeline(config).all_steps.keys())
114.472554 | File "/neptune/minerva/whales/pipelines.py", line 44, in alignment_pipeline
114.47277 | cache_dirpath=config['global']['cache_dirpath'])
114.472991 | File "/neptune/minerva/backend/base.py", line 183, in __init__
114.473226 | super().__init__(*args, **kwargs)
114.473485 | File "/neptune/minerva/backend/base.py", line 24, in __init__
114.473748 | self._prep_cache(cache_dirpath, save_outputs)
114.474027 | File "/neptune/minerva/backend/base.py", line 33, in _prep_cache
114.474336 | os.makedirs(os.path.join(cache_dirpath, dirname), exist_ok=True)
114.474574 | File "/usr/lib/python3.6/os.py", line 220, in makedirs
114.47482 | mkdir(name, mode)
114.475053 | OSError: [Errno 30] Read-only file system: '/public/minerva/resources/whales/solution/alignment/outputs'

I chose:

data_dir: /public/whales
solution_dir: /public/minerva/resources/whales/solution
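The traceback shows _prep_cache creating its output directories under cache_dirpath, which here resolves to the read-only /public mount. A minimal workaround sketch (the config key names follow the traceback, the writable path is only a placeholder):

import os

def redirect_cache(config, writable_dir='/output/minerva_cache'):
    # point the pipeline cache at a writable location instead of /public
    os.makedirs(writable_dir, exist_ok=True)
    config['global']['cache_dirpath'] = writable_dir
    return config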

[whales, task 7] AttributeError("Can't pickle local object 'solution.<locals>.DatasetLocalizer'",))

neptune run main.py -- submit --problem whales --task_nr 7

The above results in the error below.

-- | --
12.114279 | [NbConvertApp] Writing 4898 bytes to /mnt/ml-team/homes/usr/minerva/resources/whales/tasks/task7.py
12.612039 | /mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
12.612379 | from ._conv import register_converters as _register_converters
13.203399 | Using TensorFlow backend.
25.387684 | Traceback (most recent call last):
25.387881 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 103, in execute
25.388077 | execfile(job_filepath, job_globals)
25.388271 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/past/builtins/misc.py", line 82, in execfile
25.388466 | exec_(code, myglobals, mylocals)
25.388659 | File "main.py", line 66, in <module>
25.388854 | action()
25.389049 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
25.389249 | return self.main(*args, **kwargs)
25.389443 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
25.389637 | rv = self.invoke(ctx)
25.38983 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
25.390016 | return _process_result(sub_ctx.command.invoke(sub_ctx))
25.390203 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
25.39039 | return ctx.invoke(self.callback, **ctx.params)
25.390576 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
25.390775 | return callback(*args, **kwargs)
25.390969 | File "main.py", line 61, in submit
25.391163 | pm.submit_task(task_sub_problem, task_nr, file_path, dev_mode)
25.391356 | File "/mnt/ml-team/homes/usr/minerva/minerva/whales/problem_manager.py", line 44, in submit_task
25.391565 | new_trainer.train()
25.39176 | File "/mnt/ml-team/homes/usr/minerva/minerva/whales/trainer.py", line 48, in train
25.391954 | 'train_mode': True,
25.392147 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 75, in fit_transform
25.392339 | step_inputs[input_step.name] = input_step.fit_transform(data)
25.392535 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 75, in fit_transform
25.392728 | step_inputs[input_step.name] = input_step.fit_transform(data)
25.392918 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 81, in fit_transform
25.393109 | step_output_data = self._cached_fit_transform(step_inputs)
25.393302 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 91, in _cached_fit_transform
25.393489 | step_output_data = self.transformer.fit_transform(**step_inputs)
25.393675 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 218, in fit_transform
25.39386 | self.fit(*args, **kwargs)
25.394049 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/models/pytorch/models.py", line 55, in fit
25.394249 | for batch_id, data in enumerate(batch_gen):
25.394443 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 417, in __iter__
25.394635 | return DataLoaderIter(self)
25.394829 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 234, in __init__
25.395021 | w.start()
25.395211 | File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
25.395401 | self._popen = self._Popen(self)
25.395623 | File "/usr/lib/python3.5/multiprocessing/context.py", line 212, in _Popen
25.395817 | return _default_context.get_context().Process._Popen(process_obj)
25.396009 | File "/usr/lib/python3.5/multiprocessing/context.py", line 274, in _Popen
25.396201 | return Popen(process_obj)
25.396396 | File "/usr/lib/python3.5/multiprocessing/popen_spawn_posix.py", line 33, in __init__
25.396587 | super().__init__(process_obj)
25.396777 | File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
25.396966 | self._launch(process_obj)
25.397153 | File "/usr/lib/python3.5/multiprocessing/popen_spawn_posix.py", line 48, in _launch
25.39734 | reduction.dump(process_obj, fp)
25.397531 | File "/usr/lib/python3.5/multiprocessing/reduction.py", line 59, in dump
25.397722 | ForkingPickler(file, protocol).dump(obj)
25.397914 | AttributeError: Can't pickle local object 'solution.<locals>.DatasetLocalizer'
25.398359 |  
25.398564 | During handling of the above exception, another exception occurred:
25.398762 |  
25.398955 | Traceback (most recent call last):
25.399148 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 109, in <module>
25.399342 | execute()
25.413426 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 105, in execute
25.413676 | raise ExperimentExecutionException("Exception during experiment execution", ex)
25.413876 | deepsense.neptune.exceptions.ExperimentExecutionException: ('Exception during experiment execution', AttributeError("Can't pickle local object 'solution.<locals>.DatasetLocalizer'",))


Is --subproblem parameter really needed?

Each step in whales always belongs to exactly one sub-problem. So why should the user care about the sub-problem by typing --sub_problem sth on the command line? I think the sub-problem could be inferred from the task automatically; optionally, the user could still have the possibility (not an obligation) to choose one in dry run mode.
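A minimal sketch of the proposed inference (the task-to-sub-problem assignment below is an assumption; only the shape of the mapping matters):

# hypothetical mapping; the real assignment of task numbers to sub-problems
# lives in the whales problem definition
TASK_TO_SUB_PROBLEM = {3: 'classification', 7: 'localization'}

def infer_sub_problem(task_nr, sub_problem=None):
    # keep --sub_problem as an optional override, otherwise derive it from the task
    return sub_problem or TASK_TO_SUB_PROBLEM[task_nr]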

dry_train and dry_eval?

I propose to replace:

  • dry_run with train_mode=False by dry_eval,
  • dry_run with train_mode=True by dry_train.

This way, the user could consciously choose whether:

  • to only evaluate the results of the in-house solution, to quickly check that everything (including paths to solutions and data) is set up correctly,
  • to spend a lot of time re-training the in-house solution and save it to a path of their choice.

Also, after these changes the readmes should become easier to understand.

Alternatively, I propose replacing train_mode with only_eval/eval_only.
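A sketch of the proposal as thin click wrappers around the existing dry_run helper (the group name and option set are assumptions based on main.py in the tracebacks above):

import click

def dry_run(problem, dev_mode, cloud_mode, train_mode):
    ...  # existing helper in main.py (signature taken from the tracebacks)

@click.group()
def action():
    pass

@action.command()
@click.option('--problem', required=True)
def dry_train(problem):
    dry_run(problem, dev_mode=False, cloud_mode=False, train_mode=True)

@action.command()
@click.option('--problem', required=True)
def dry_eval(problem):
    dry_run(problem, dev_mode=False, cloud_mode=False, train_mode=False)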

Better distinguishing the result

In the current version, the information about whether I did or didn't pass the task is just a line in stdout. However, this statement is so important that it would be nice to make it stand out in a more effective way.
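A purely illustrative sketch of a more visible verdict, reusing the existing messages:

def print_verdict(passed):
    # make the pass/fail decision stand out from the rest of the stdout stream
    banner = '=' * 60
    verdict = ('Congrats you solved the task!' if passed
               else 'Sorry, but this score is not high enough to pass the task')
    print('\n'.join(['', banner, verdict, banner, '']))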

`--train_mode False` in whales doesn't work

For

python run_minerva.py -- dry_run --problem whales --sub_problem localization --train_mode False

I obtain

2018-01-18 13-34-59 minerva >>> starting experiment...
2018-01-18 13-35-01 minerva >>> running: localization
neptune: Executing in Offline Mode.
2018-01-18 13-35-01 minerva >>> step localizer_loader unpacking inputs
2018-01-18 13-35-01 minerva >>> step localizer_loader loading...
2018-01-18 13-35-01 minerva >>> step localizer_loader transforming...
2018-01-18 13-35-01 minerva >>> step localizer_network unpacking inputs
2018-01-18 13-35-01 minerva >>> initializing model weights...
2018-01-18 13-35-01 minerva >>> starting training...
2018-01-18 13-35-01 minerva >>> initial lr: 0.0005
2018-01-18 13-35-01 minerva >>> epoch 0 ...
2018-01-18 13-35-18 minerva >>> epoch 0 batch 0 ...
/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/torch/nn/modules/container.py:67: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  input = module(input)
2018-01-18 13-35-26 minerva >>> epoch 0 batch 0 loss:     4.85479
2018-01-18 13-35-26 minerva >>> epoch 0 batch 0 accuracy: 0.00000
2018-01-18 13-35-26 minerva >>> epoch 0 average batch time: 0:00:08.0
(...) [ANALOGOUS STUFF FOR BATCHES 1-14]
/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/torch/nn/modules/container.py:67: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  input = module(input)
2018-01-18 13-36-04 minerva >>> epoch 0 batch 15 loss:     4.70346
2018-01-18 13-36-04 minerva >>> epoch 0 batch 15 accuracy: 0.00000
2018-01-18 13-36-04 minerva >>> epoch 0 model saved to output/path_to_your_solution/checkpoints/localizer_network/model_epoch0.torch
2018-01-18 13-36-04 minerva >>> epoch 1 current lr: 0.0005
2018-01-18 13-36-04 minerva >>> epoch 0 loss:     4.78818
2018-01-18 13-36-04 minerva >>> epoch 0 accuracy: 0.02295
Traceback (most recent call last):
  File "run_minerva.py", line 46, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "run_minerva.py", line 27, in dry_run
    pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/whales/problem_manager.py", line 25, in dry_run
    _evaluate(trainer, sub_problem)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/whales/problem_manager.py", line 49, in _evaluate
    score_valid, score_test = trainer.evaluate()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/trainer.py", line 22, in evaluate
    score_valid = self._evaluate(X_valid, y_valid)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/whales/trainer.py", line 68, in _evaluate
    'train_mode': False,
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 102, in transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 68, in fit_transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 74, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 84, in _cached_fit_transform
    step_output_data = self.transformer.fit_transform(**step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 206, in fit_transform
    self.fit(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/pytorch/models.py", line 61, in fit
    self.callbacks.on_epoch_end()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/pytorch/callbacks.py", line 86, in on_epoch_end
    callback.on_epoch_end(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/pytorch/callbacks.py", line 154, in on_epoch_end
    val_loss, val_acc = score_model_multi_output(self.model, self.loss_function, self.validation_datagen)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/pytorch/validation.py", line 50, in score_model_multi_output
    for batch_id, data in enumerate(batch_gen):
TypeError: 'NoneType' object is not iterable
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit

The same error arises when I use Neptune cloud. Everything works with default --train_mode True.

what's the difference between dry_train and dry_eval?

For the fashion_mnist problem, if solution_dir already contains the trained model, then training in dry_train mode takes only one epoch. Thus, dry_train and dry_eval do the same thing and there is no need to distinguish them.

Changing 'problems' to 'tasks'

I think it's worth renaming the relevant folders from problems to tasks, since they contain tasks, not problems.

[whales, dry_eval] "Sorry, but this score is not high enough to pass the task" in classification

neptune run -- dry_eval --problem whales

The above fails after evaluating the classification algorithm (i.e. it doesn't pass the task) and then returns to the command prompt.

2018-04-03 10-22-58 minerva >>> running: classification
2018-04-03 10-23-00 minerva >>> step classifier_encoder adapting inputs
2018-04-03 10-23-00 minerva >>> step classifier_encoder loading...
2018-04-03 10-23-00 minerva >>> step classifier_encoder transforming...
2018-04-03 10-23-00 minerva >>> step classifier_loader adapting inputs
2018-04-03 10-23-00 minerva >>> step classifier_loader loading...
2018-04-03 10-23-00 minerva >>> step classifier_loader transforming...
2018-04-03 10-23-00 minerva >>> step classifier_network unpacking inputs
2018-04-03 10-23-00 minerva >>> step classifier_network loading...
2018-04-03 10-23-01 minerva >>> step classifier_network transforming...
100%|██████████| 16/16 [00:35<00:00,  2.24s/it]
2018-04-03 10-23-37 minerva >>> step classifier_calibrator adapting inputs
2018-04-03 10-23-37 minerva >>> step classifier_calibrator loading...
2018-04-03 10-23-37 minerva >>> step classifier_calibrator transforming...
2018-04-03 10-23-37 minerva >>> step classifier_encoder adapting inputs
2018-04-03 10-23-37 minerva >>> step classifier_encoder loading...
2018-04-03 10-23-37 minerva >>> step classifier_encoder transforming...
2018-04-03 10-23-37 minerva >>> step classifier_output adapting inputs
2018-04-03 10-23-37 minerva >>> step classifier_output loading...
2018-04-03 10-23-37 minerva >>> step classifier_output transforming...
2018-04-03 10-23-37 minerva >>> step classifier_encoder adapting inputs
2018-04-03 10-23-37 minerva >>> step classifier_encoder loading...
2018-04-03 10-23-37 minerva >>> step classifier_encoder transforming...
2018-04-03 10-23-37 minerva >>> step classifier_loader adapting inputs
2018-04-03 10-23-37 minerva >>> step classifier_loader loading...
2018-04-03 10-23-37 minerva >>> step classifier_loader transforming...
2018-04-03 10-23-37 minerva >>> step classifier_network unpacking inputs
2018-04-03 10-23-37 minerva >>> step classifier_network loading...
2018-04-03 10-23-37 minerva >>> step classifier_network transforming...
100%|██████████| 15/15 [00:26<00:00,  1.74s/it]
2018-04-03 10-24-03 minerva >>> step classifier_calibrator adapting inputs
2018-04-03 10-24-03 minerva >>> step classifier_calibrator loading...
2018-04-03 10-24-03 minerva >>> step classifier_calibrator transforming...
2018-04-03 10-24-03 minerva >>> step classifier_encoder adapting inputs
2018-04-03 10-24-03 minerva >>> step classifier_encoder loading...
2018-04-03 10-24-03 minerva >>> step classifier_encoder transforming...
2018-04-03 10-24-03 minerva >>> step classifier_output adapting inputs
2018-04-03 10-24-03 minerva >>> step classifier_output loading...
2018-04-03 10-24-03 minerva >>> step classifier_output transforming...

Validation score is 2.1188
Test score is 2.0907
That is a solid validation
Sorry, but this score is not high enough to pass the task
Calculated experiment snapshot size: 9.08 MB

1.408208 | 2018-04-03 10-31-59 minerva >>> starting experiment...
-- | --
3.79994 | 2018-04-03 10-32-01 minerva >>> running: alignment
5.394486 | 2018-04-03 10-32-03 minerva >>> step aligner_encoder adapting inputs
5.394776 | 2018-04-03 10-32-03 minerva >>> step aligner_encoder loading...
5.394961 | 2018-04-03 10-32-03 minerva >>> step aligner_encoder transforming...
5.395138 | 2018-04-03 10-32-03 minerva >>> step aligner_loader adapting inputs
5.395315 | 2018-04-03 10-32-03 minerva >>> step aligner_loader loading...
5.39549 | 2018-04-03 10-32-03 minerva >>> step aligner_loader transforming...
5.395664 | 2018-04-03 10-32-03 minerva >>> step aligner_network unpacking inputs
5.395837 | 2018-04-03 10-32-03 minerva >>> step aligner_network loading...
7.896086 | 2018-04-03 10-32-05 minerva >>> step aligner_network transforming...
46.352937 | 2018-04-03 10-32-44 minerva >>> step aligner_unbinner unpacking inputs
46.353462 | 2018-04-03 10-32-44 minerva >>> step aligner_unbinner loading...
46.35368 | 2018-04-03 10-32-44 minerva >>> step aligner_unbinner transforming...
46.353893 | 2018-04-03 10-32-44 minerva >>> step aligner_adjuster adapting inputs
46.575793 | 2018-04-03 10-32-44 minerva >>> step aligner_adjuster loading...
46.576187 | 2018-04-03 10-32-44 minerva >>> step aligner_adjuster transforming...
46.576443 | 2018-04-03 10-32-44 minerva >>> step aligner_output adapting inputs
46.576672 | 2018-04-03 10-32-44 minerva >>> step aligner_output loading...
46.576923 | 2018-04-03 10-32-44 minerva >>> step aligner_output transforming...
46.577152 | 2018-04-03 10-32-44 minerva >>> step aligner_encoder adapting inputs
46.57738 | 2018-04-03 10-32-44 minerva >>> step aligner_encoder loading...
46.57762 | 2018-04-03 10-32-44 minerva >>> step aligner_encoder transforming...
46.577851 | 2018-04-03 10-32-44 minerva >>> step aligner_loader adapting inputs
46.578076 | 2018-04-03 10-32-44 minerva >>> step aligner_loader loading...
46.578317 | 2018-04-03 10-32-44 minerva >>> step aligner_loader transforming...
46.57855 | 2018-04-03 10-32-44 minerva >>> step aligner_network unpacking inputs
46.578779 | 2018-04-03 10-32-44 minerva >>> step aligner_network loading...
46.579016 | 2018-04-03 10-32-44 minerva >>> step aligner_network transforming...
74.03968 | 2018-04-03 10-33-11 minerva >>> step aligner_unbinner unpacking inputs
74.03987 | 2018-04-03 10-33-11 minerva >>> step aligner_unbinner loading...
74.040058 | 2018-04-03 10-33-11 minerva >>> step aligner_unbinner transforming...
74.040246 | 2018-04-03 10-33-11 minerva >>> step aligner_adjuster adapting inputs
74.040454 | 2018-04-03 10-33-11 minerva >>> step aligner_adjuster loading...
74.040641 | 2018-04-03 10-33-11 minerva >>> step aligner_adjuster transforming...
74.040828 | 2018-04-03 10-33-11 minerva >>> step aligner_output adapting inputs
74.041018 | 2018-04-03 10-33-11 minerva >>> step aligner_output loading...
74.041203 | 2018-04-03 10-33-11 minerva >>> step aligner_output transforming...
74.041387 |  
74.041575 | Validation score is 62.5229
74.04176 | Test score is 64.9918
74.041945 | That is a solid validation
74.042129 | Congrats you solved the task!
74.042313 | 2018-04-03 10-33-11 minerva >>> running: localization
75.068115 | 2018-04-03 10-33-12 minerva >>> step localizer_loader unpacking inputs
75.068325 | 2018-04-03 10-33-12 minerva >>> step localizer_loader loading...
75.068492 | 2018-04-03 10-33-12 minerva >>> step localizer_loader transforming...
75.068652 | 2018-04-03 10-33-12 minerva >>> step localizer_network unpacking inputs
75.068768 | 2018-04-03 10-33-12 minerva >>> step localizer_network loading...
75.068882 | 2018-04-03 10-33-12 minerva >>> step localizer_network transforming...
101.504964 | 2018-04-03 10-33-39 minerva >>> step localizer_unbinner unpacking inputs
101.505176 | 2018-04-03 10-33-39 minerva >>> step localizer_unbinner loading...
101.505376 | 2018-04-03 10-33-39 minerva >>> step localizer_unbinner transforming...
101.505558 | 2018-04-03 10-33-39 minerva >>> step localizer_output adapting inputs
101.505748 | 2018-04-03 10-33-39 minerva >>> step localizer_output loading...
101.505929 | 2018-04-03 10-33-39 minerva >>> step localizer_output transforming...
101.506116 | 2018-04-03 10-33-39 minerva >>> step localizer_loader unpacking inputs
101.506312 | 2018-04-03 10-33-39 minerva >>> step localizer_loader loading...
101.506507 | 2018-04-03 10-33-39 minerva >>> step localizer_loader transforming...
101.506699 | 2018-04-03 10-33-39 minerva >>> step localizer_network unpacking inputs
101.506881 | 2018-04-03 10-33-39 minerva >>> step localizer_network loading...
101.50707 | 2018-04-03 10-33-39 minerva >>> step localizer_network transforming...
128.154553 | 2018-04-03 10-34-06 minerva >>> step localizer_unbinner unpacking inputs
128.154753 | 2018-04-03 10-34-06 minerva >>> step localizer_unbinner loading...
128.154942 | 2018-04-03 10-34-06 minerva >>> step localizer_unbinner transforming...
128.155128 | 2018-04-03 10-34-06 minerva >>> step localizer_output adapting inputs
128.15532 | 2018-04-03 10-34-06 minerva >>> step localizer_output loading...
128.155509 | 2018-04-03 10-34-06 minerva >>> step localizer_output transforming...
128.155701 |  
128.155905 | Validation score is 108.2913
128.156099 | Test score is 94.9431
128.156303 | That is a solid validation
128.156496 | Congrats you solved the task!
128.156686 | 2018-04-03 10-34-06 minerva >>> running: classification
129.684731 | 2018-04-03 10-34-07 minerva >>> step classifier_encoder adapting inputs
129.684952 | 2018-04-03 10-34-07 minerva >>> step classifier_encoder loading...
129.685101 | 2018-04-03 10-34-07 minerva >>> step classifier_encoder transforming...
129.685222 | 2018-04-03 10-34-07 minerva >>> step classifier_loader adapting inputs
129.685368 | 2018-04-03 10-34-07 minerva >>> step classifier_loader loading...
129.685489 | 2018-04-03 10-34-07 minerva >>> step classifier_loader transforming...
129.685635 | 2018-04-03 10-34-07 minerva >>> step classifier_network unpacking inputs
129.68579 | 2018-04-03 10-34-07 minerva >>> step classifier_network loading...
129.685929 | 2018-04-03 10-34-07 minerva >>> step classifier_network transforming...
158.137679 | 2018-04-03 10-34-36 minerva >>> step classifier_calibrator adapting inputs
158.138025 | 2018-04-03 10-34-36 minerva >>> step classifier_calibrator loading...
158.138231 | 2018-04-03 10-34-36 minerva >>> step classifier_calibrator transforming...
158.138611 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder adapting inputs
158.138809 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder loading...
158.138997 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder transforming...
158.139212 | 2018-04-03 10-34-36 minerva >>> step classifier_output adapting inputs
158.139738 | 2018-04-03 10-34-36 minerva >>> step classifier_output loading...
158.139933 | 2018-04-03 10-34-36 minerva >>> step classifier_output transforming...
158.140115 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder adapting inputs
158.140311 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder loading...
158.140506 | 2018-04-03 10-34-36 minerva >>> step classifier_encoder transforming...
158.140697 | 2018-04-03 10-34-36 minerva >>> step classifier_loader adapting inputs
158.14089 | 2018-04-03 10-34-36 minerva >>> step classifier_loader loading...
158.141069 | 2018-04-03 10-34-36 minerva >>> step classifier_loader transforming...
158.141254 | 2018-04-03 10-34-36 minerva >>> step classifier_network unpacking inputs
158.141435 | 2018-04-03 10-34-36 minerva >>> step classifier_network loading...
158.347917 | 2018-04-03 10-34-36 minerva >>> step classifier_network transforming...
196.301351 | 2018-04-03 10-35-14 minerva >>> step classifier_calibrator adapting inputs
196.301537 | 2018-04-03 10-35-14 minerva >>> step classifier_calibrator loading...
196.301715 | 2018-04-03 10-35-14 minerva >>> step classifier_calibrator transforming...
196.301893 | 2018-04-03 10-35-14 minerva >>> step classifier_encoder adapting inputs
196.302067 | 2018-04-03 10-35-14 minerva >>> step classifier_encoder loading...
196.302249 | 2018-04-03 10-35-14 minerva >>> step classifier_encoder transforming...
196.302466 | 2018-04-03 10-35-14 minerva >>> step classifier_output adapting inputs
196.302687 | 2018-04-03 10-35-14 minerva >>> step classifier_output loading...
196.302904 | 2018-04-03 10-35-14 minerva >>> step classifier_output transforming...
196.505624 |  
196.505941 | Validation score is 2.1188
196.506135 | Test score is 2.0907
196.506323 | That is a solid validation
196.506537 | Sorry, but this score is not high enough to pass the task



neptune send: pip failed to install the requirements

For

neptune send -- dry_eval --problem fashion_mnist

I obtain

5.019441 | [pip]    Could not find a version that satisfies the requirement  ipython==6.2.1 (from -r /tmp/tmpryjdlc (line 21)) (from versions: 0.10,  0.10.1, 0.10.2, 0.11, 0.12, 0.12.1, 0.13, 0.13.1, 0.13.2, 1.0.0, 1.1.0,  1.2.0, 1.2.1, 2.0.0, 2.1.0, 2.2.0, 2.3.0, 2.3.1, 2.4.0, 2.4.1, 3.0.0,  3.1.0, 3.2.0, 3.2.1, 3.2.2, 3.2.3, 4.0.0b1, 4.0.0, 4.0.1, 4.0.2, 4.0.3,  4.1.0rc1, 4.1.0rc2, 4.1.0, 4.1.1, 4.1.2, 4.2.0, 4.2.1, 5.0.0b1, 5.0.0b2,  5.0.0b3, 5.0.0b4, 5.0.0rc1, 5.0.0, 5.1.0, 5.2.0, 5.2.1, 5.2.2, 5.3.0,  5.4.0, 5.4.1, 5.5.0)
5.019586 | [pip] No matching distribution found for ipython==6.2.1 (from -r /tmp/tmpryjdlc (line 21))
5.019714 | Traceback (most recent call last):
5.019839 | File "/usr/local/lib/python2.7/dist-packages/deepsense/neptune/job_wrapper.py", line 138, in <module>
5.020117 | execute()
5.020519 | File "/usr/local/lib/python2.7/dist-packages/deepsense/neptune/job_wrapper.py", line 119, in execute
5.020902 | install_requirements()
5.021025 | File "/usr/local/lib/python2.7/dist-packages/deepsense/neptune/job_wrapper.py", line 51, in install_requirements
5.021172 | install_pip_requirements(os.environ['PIP_REQUIREMENTS'])
5.021552 | File "/usr/local/lib/python2.7/dist-packages/deepsense/neptune/job_wrapper.py", line 44, in _check_pip_install_result
5.021796 | raise RuntimeError('pip failed to install the requirements. '
5.024408 | RuntimeError: pip failed to install the requirements. For more details, see the stdout/stderr channels.
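
Judging by the paths in this traceback (/usr/local/lib/python2.7/...), the job runs under Python 2.7, while ipython 6.x only supports Python 3.3+, which is why pip cannot find a matching version. A guess, not verified: explicitly selecting a Python 3 environment, as the other commands on this page do, might avoid the problem, e.g. something along the lines of

neptune send --environment keras-2.0-gpu-py3 -- dry_eval --problem fashion_mnist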

[whales, task 5] error after 149 epochs: FileNotFoundError(2, 'No such file or directory')

neptune run -- submit --problem whales --task_nr 5

raises an error after the 149th epoch:

2018-04-15 11-05-41 minerva >>> epoch 149 batch 112 accuracy: 0.19444

262538.163211 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 103, in execute
262538.163383 | execfile(job_filepath, job_globals)
262538.163556 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/past/builtins/misc.py", line 82, in execfile
262538.163723 | exec_(code, myglobals, mylocals)
262538.163892 | File "main.py", line 66, in <module>
262538.164061 | action()
262538.16423 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
262538.164399 | return self.main(*args, **kwargs)
262538.164568 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
262538.164776 | rv = self.invoke(ctx)
262538.165005 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
262538.165232 | return _process_result(sub_ctx.command.invoke(sub_ctx))
262538.16546 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
262538.165716 | return ctx.invoke(self.callback, **ctx.params)
262538.165954 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
262538.166181 | return callback(*args, **kwargs)
262538.166413 | File "main.py", line 61, in submit
262538.166645 | pm.submit_task(task_sub_problem, task_nr, file_path, dev_mode)
262538.166854 | File "/mnt/ml-team/homes/usr/minerva/minerva/whales/problem_manager.py", line 44, in submit_task
262538.167046 | new_trainer.train()
262538.167237 | File "/mnt/ml-team/homes/usr/minerva/minerva/whales/trainer.py", line 48, in train
262538.167427 | 'train_mode': True,
262538.167617 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 75, in fit_transform
262538.167806 | step_inputs[input_step.name] = input_step.fit_transform(data)
262538.167994 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 75, in fit_transform
262538.168179 | step_inputs[input_step.name] = input_step.fit_transform(data)
262538.168367 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 81, in fit_transform
262538.168556 | step_output_data = self._cached_fit_transform(step_inputs)
262538.168743 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 91, in _cached_fit_transform
262538.168971 | step_output_data = self.transformer.fit_transform(**step_inputs)
262538.169201 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/base.py", line 218, in fit_transform
262538.169429 | self.fit(*args, **kwargs)
262538.169659 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/models/pytorch/models.py", line 61, in fit
262538.169913 | self.callbacks.on_epoch_end()
262538.170143 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/models/pytorch/callbacks.py", line 87, in on_epoch_end
262538.17037 | callback.on_epoch_end(*args, **kwargs)
262538.170591 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/models/pytorch/callbacks.py", line 320, in on_epoch_end
262538.170778 | save_model(self.model, full_path)
262538.170972 | File "/mnt/ml-team/homes/usr/minerva/minerva/backend/models/pytorch/utils.py", line 68, in save_model
262538.171159 | torch.save(model.state_dict(), path)
262538.17134 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/torch/serialization.py", line 135, in save
262538.384228 | return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
262538.384434 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/torch/serialization.py", line 115, in _with_file_like
262538.384553 | f = open(f, mode)
262538.384664 | FileNotFoundError: [Errno 2] No such file or directory: 'resources/whales/solution/localization/submit_solution/checkpoints/localizer_network/model_epoch149.torch'
262538.384773 |  
262538.384882 | During handling of the above exception, another exception occurred:
262538.384988 |  
262538.385094 | Traceback (most recent call last):
262538.385199 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 109, in <module>
262538.385304 | execute()
262538.385408 | File "/mnt/ml-team/homes/usr/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 105, in execute
262538.385511 | raise ExperimentExecutionException("Exception during experiment execution", ex)
262538.385615 | deepsense.neptune.exceptions.ExperimentExecutionException: ('Exception during experiment execution', FileNotFoundError(2, 'No such file or directory'))
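
The failing call is torch.save inside save_model, and the missing path points at a checkpoints directory, so it looks like the checkpoint folder is never created before saving. A minimal sketch of a more defensive save_model, based on the call visible in the traceback rather than the actual minerva source:

import os
import torch

def save_model(model, path):
    # make sure the parent directory exists, otherwise open(path, 'wb')
    # inside torch.save raises FileNotFoundError
    os.makedirs(os.path.dirname(path), exist_ok=True)
    torch.save(model.state_dict(), path)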


strange things during dry_run on fashion_mnist

When I run

python run_minerva.py -- dry_run --problem fashion_mnist

I receive

2018-01-09 16-19-22 minerva-whales >>> starting experiment...
Using TensorFlow backend.
2018-01-09 16-19-23 minerva-whales >>> running: None
neptune: Executing in Offline Mode.
2018-01-09 16-19-23 minerva-whales >>> Saving graph to /mnt/ml-team/minerva/cache/whales/new_experiment/alignment/class_predictions_graph.json
2018-01-09 16-19-24 minerva-whales >>> step input unpacking inputs
2018-01-09 16-19-24 minerva-whales >>> step input saving transformer...
2018-01-09 16-19-24 minerva-whales >>> step input saving outputs...
2018-01-09 16-19-24 minerva-whales >>> step keras_model unpacking inputs
Epoch 1/200
2018-01-09 16:19:25.148275: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148334: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148367: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148394: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148421: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
46/47 [============================>.] - ETA: 1s - loss: 0.3966 - acc: 0.9724/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/callbacks.py:494: RuntimeWarning: Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: loss,acc
  (self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
Traceback (most recent call last):
  File "run_minerva.py", line 46, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "run_minerva.py", line 27, in dry_run
    pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 16, in dry_run
    _evaluate(trainer)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 39, in _evaluate
    score_valid, score_test = trainer.evaluate()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/trainer.py", line 22, in evaluate
    score_valid = self._evaluate(X_valid, y_valid)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/trainer.py", line 29, in _evaluate
    'inference': True}})
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 102, in transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 74, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 84, in _cached_fit_transform
    step_output_data = self.transformer.fit_transform(**step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 206, in fit_transform
    self.fit(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/keras/models_keras.py", line 28, in fit
    **self.training_config)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/engine/training.py", line 2187, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/callbacks.py", line 73, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/keras/callbacks_keras.py", line 21, in on_epoch_end
    self.ctx.channel_send('Log-loss validation', self.epoch_id, logs['val_loss'])
KeyError: 'val_loss'
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7fc4dd75bdd8>>
Traceback (most recent call last):
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 595, in __del__
TypeError: 'NoneType' object is not callable

Three things are strange here:

  1. The logger name is minerva-whales in the first line, although I'm running the fashion_mnist problem.
  2. It seems to perform training even though I don't pass --train_mode and train mode is supposed to be off by default.
  3. We can see an error about val_loss, which is unavailable (see the sketch below).
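
Regarding point 3: in Keras, val_loss only appears in the epoch logs when validation data is supplied to fit / fit_generator, so any callback that reads logs['val_loss'] (like the channel_send call in callbacks_keras.py above) raises a KeyError otherwise. A minimal generic illustration, not the minerva pipeline code:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(1, input_shape=(4,))])
model.compile(optimizer='sgd', loss='mse')
X, y = np.random.rand(32, 4), np.random.rand(32, 1)

# only 'loss' is in the logs here, so reading logs['val_loss'] would fail
model.fit(X, y, epochs=1, verbose=0)

# with validation data, 'val_loss' is reported after each epoch
model.fit(X, y, validation_split=0.2, epochs=1, verbose=0)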

tasks are not registered

For

python main.py -- submit --problem whales --task_nr 1 --filepath resources/whales/tasks/task1.ipynb

I obtain

Traceback (most recent call last):
  File "main.py", line 72, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 67, in submit
    pm.submit_task(task_subproblem, task_nr, filepath, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/whales/problem_manager.py", line 42, in submit_task
    task_handler = registered_tasks[task_nr](trainer)
KeyError: 1

I checked that the error is raised because registered_tasks is {}, but that is all I understand. I encountered the same issue for the fashion_mnist problem.
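
For context, a registry like registered_tasks is usually filled by a decorator at import time, so an empty dict typically means the module that defines and registers the task handlers was never imported before the lookup. The sketch below only illustrates the pattern and is not minerva's actual code:

registered_tasks = {}

def register_task(nr):
    # decorator that adds a task handler class to the registry under its number
    def decorator(cls):
        registered_tasks[nr] = cls
        return cls
    return decorator

@register_task(1)
class Task1Handler:
    def __init__(self, trainer):
        self.trainer = trainer

# registered_tasks[1] now resolves; if the module containing the @register_task
# definitions is never imported, the lookup ends in KeyError: 1 as above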

bug in neptune send for submit mode

After typing

neptune send run_minerva.py --environment keras-2.0-gpu-py3 --worker gcp-gpu-medium --config config.yaml -- submit --problem fashion_mnist --task_nr 2 --filepath resources/fashion_mnist/problems/task2.ipynb

with default task2.ipynb I see the following in Neptune:

18.442197 | 2018-01-17 11-30-08 minerva >>> starting experiment...
18.801251 | Using TensorFlow backend.
34.768749 | 2018-01-17 11-30-24 minerva >>> Saving graph to /output/path_to_your_solution/class_predictions_graph.json
35.123845 | [NbConvertApp] WARNING | pattern '/neptune/resources/fashion_mnist/problems/task2.ipynb' matched no files
35.366995 | Traceback (most recent call last):
35.367113 | File "/usr/local/lib/python3.5/dist-packages/deepsense/neptune/job_wrapper.py", line 138, in <module>
35.36723 | execute()
35.367347 | File "/usr/local/lib/python3.5/dist-packages/deepsense/neptune/job_wrapper.py", line 134, in execute
35.367463 | execfile(job_filepath, job_globals)
35.36759 | File "/usr/local/lib/python3.5/dist-packages/past/builtins/misc.py", line 82, in execfile
35.367724 | exec_(code, myglobals, mylocals)
35.367855 | File "run_minerva.py", line 46, in <module>
35.367979 | action()
35.368106 | File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 722, in __call__
35.368223 | return self.main(*args, **kwargs)
35.368341 | File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 697, in main
35.368456 | rv = self.invoke(ctx)
35.368649 | File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 1066, in invoke
35.368779 | return _process_result(sub_ctx.command.invoke(sub_ctx))
35.368922 | File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 895, in invoke
35.369046 | return ctx.invoke(self.callback, **ctx.params)
35.369161 | File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 535, in invoke
35.369324 | return callback(*args, **kwargs)
35.369478 | File "run_minerva.py", line 41, in submit
35.36963 | pm.submit_task(sub_problem, task_nr, filepath, dev_mode, cloud_mode)
35.369746 | File "/neptune/minerva/fashion_mnist/problem_manager.py", line 25, in submit_task
35.369861 | user_task_solution, user_config = _fetch_task_solution(filepath)
35.369984 | File "/neptune/minerva/fashion_mnist/problem_manager.py", line 33, in _fetch_task_solution
35.370102 | with TaskSolutionParser(filepath) as task_solution:
35.370218 | File "/neptune/minerva/backend/task_manager.py", line 45, in __enter__
35.370334 | if module_filename not in os.listdir(module_dir):
35.3705 | FileNotFoundError: [Errno 2] No such file or directory: '/neptune/resources/fashion_mnist/problems'

My config.yaml file is:

project-key: MIN

name: minerva

parameters:

  # Local setup
#  data_dir: path/to/your/data # for instance resources/whales/data
#  solution_dir:  output/path_to_your_solution # for instance /output/resources/whales/solution/localization

  # Cloud setup
  data_dir: /public/whales
  solution_dir: /output/path_to_your_solution

exclude:
  - resources
  - output
  - neptune.log
  - offline_job.log
  - .idea
  - .git
  - .ipynb_checkpoints

# Comment if local
pip-requirements-file: requirements.txt

Everything works fine for the dry run mode, i.e., for typing:

neptune send run_minerva.py \
--environment keras-2.0-gpu-py3 \
--worker gcp-gpu-medium \
--config config.yaml \
-- dry_run --problem fashion_mnist

Problem with task 1 of fashion_mnist

For

# Keras imports (assumed to be provided in the task notebook preamble)
from keras.layers import Input, Conv2D, MaxPool2D, Flatten, Dense, Dropout
from keras.models import Model

CONFIG = {'input_size': 28,
          'classes': 10}

def solution(input_size, classes):
    input_shape = (input_size, input_size, 1)
    images = Input(shape=input_shape)
    
    x = Conv2D(16, 3, padding='same', activation='relu')(images)
    x = Conv2D(16, 3, padding='same', activation='relu')(x)
    x = MaxPool2D()(x)
    
    x = Conv2D(16, 3, padding='same', activation='relu')(x)
    x = Conv2D(16, 3, padding='same', activation='relu')(x)
    x = MaxPool2D()(x)
    
    x = Flatten()(x)
    
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.5)(x)
    
    predictions = Dense(classes, activation='softmax', name='output')(x)
    
    model = Model(inputs=images, outputs=predictions)
    return model

I obtain

(minerva_venv) patryk@patryk-miziula:~/Documents/edukacyjne/Minerva/0401/minerva$ python run_minerva.py -- submit --problem fashion_mnist --task_nr 1 --filepath resources/fashion_mnist/problems/task1.ipynb
2018-01-11 18-34-57 minerva >>> starting experiment...
Using TensorFlow backend.
neptune: Executing in Offline Mode.
2018-01-11 18-34-58 minerva >>> Saving graph to output/path_to_your_solution/class_predictions_graph.json
[NbConvertApp] Converting notebook resources/fashion_mnist/problems/task1.ipynb to python
[NbConvertApp] Writing 2031 bytes to resources/fashion_mnist/problems/task1.py
Traceback (most recent call last):
  File "run_minerva.py", line 46, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "run_minerva.py", line 41, in submit
    pm.submit_task(sub_problem, task_nr, filepath, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 25, in submit_task
    task_handler = registered_tasks[task_nr](trainer)
KeyError: 1
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit

Relatively high entry threshold in task 1 of fashion mnist

I think inexperienced users may run into difficulties with the first task of the first problem they see (task 1 of fashion mnist).

  • If you don't know deep learning, you won't learn it from this task, and you won't get any clues about how and where to start.
  • If you know deep learning but not Keras, you'll have a hard time trying to learn it from this task; for instance, you may end up struggling to write the model in the Sequential style, whereas the functional API is required there (see the sketch below).

Similar remarks apply to the other tasks.

Overall, I think that adding some extra hints for newcomers would broaden Minerva's potential audience with little effort.
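
For instance, a short hint contrasting the two Keras styles (purely illustrative, not part of the task) could save newcomers a lot of guessing:

from keras.models import Sequential, Model
from keras.layers import Input, Dense

# Sequential style: layers are stacked implicitly
seq_model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax'),
])

# functional style (the one the task expects): tensors are wired explicitly
inputs = Input(shape=(784,))
x = Dense(64, activation='relu')(inputs)
outputs = Dense(10, activation='softmax')(x)
func_model = Model(inputs=inputs, outputs=outputs)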

[whales, task 3] Error: no such option: -s

In task 3:

This is an alignment subtask so add -s classification to the execution command

neptune run -- submit --problem whales --task_nr 3 -s classification

The above throws an error:

Error: no such option: -s
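
For what it's worth, click only understands a short flag like -s if it is declared as an alias on the option; a hypothetical sketch (the real option and command names in minerva may differ):

import click

@click.command()
@click.option('--problem', required=True)
@click.option('--task_nr', type=int, required=True)
@click.option('-s', '--sub_problem', default=None,
              help='sub-problem to work on, e.g. alignment or classification')
def submit(problem, task_nr, sub_problem):
    # without the '-s' alias above, passing -s on the command line
    # triggers exactly the 'no such option: -s' error
    click.echo('problem={} task={} sub_problem={}'.format(problem, task_nr, sub_problem))

if __name__ == '__main__':
    submit()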

Default `--filepath`

I think it's worth providing a default --filepath for submitting, pointing to the task's location in the repo, e.g.

resources/fashion_mnist/problems/task1.ipynb

for task 1 of fashion mnist. This way we get rid of a redundant command-line parameter (see the sketch below).
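
A hypothetical sketch of how such a default could be derived from --problem and --task_nr (the helper name and exact paths are illustrative, since the task notebooks live in slightly different subdirectories per problem):

def default_task_filepath(problem, task_nr):
    # e.g. resources/fashion_mnist/problems/task1.ipynb
    return 'resources/{}/problems/task{}.ipynb'.format(problem, task_nr)

# in the CLI, --filepath could default to None and fall back to:
# filepath = filepath or default_task_filepath(problem, task_nr)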

[whales, task 7] AttributeError: Can't pickle local object 'solution.<locals>.DatasetLocalizer'

Task 7 raises an error:

  File "main.py", line 66, in <module>
    action()
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 61, in submit
    pm.submit_task(task_sub_problem, task_nr, file_path, dev_mode)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/whales/problem_manager.py", line 44, in submit_task
    new_trainer.train()
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/whales/trainer.py", line 48, in train
    'train_mode': True,
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/base.py", line 75, in fit_transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/base.py", line 75, in fit_transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/base.py", line 82, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/base.py", line 92, in _cached_fit_transform
    step_output_data = self.transformer.fit_transform(**step_inputs)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/base.py", line 219, in fit_transform
    self.fit(*args, **kwargs)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-training-materials/minerva/backend/models/pytorch/models.py", line 55, in fit
    for batch_id, data in enumerate(batch_gen):
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 417, in __iter__
    return DataLoaderIter(self)
  File "/mnt/ml-team/homes/piotr.tarasiewicz/minerva/minerva-env/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 234, in __init__
    w.start()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/context.py", line 274, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_spawn_posix.py", line 33, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.5/multiprocessing/popen_spawn_posix.py", line 48, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/lib/python3.5/multiprocessing/reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'solution.<locals>.DatasetLocalizer'

The class DatasetLocalizer can't be pickled since it's not defined at the top level of a module.
https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled

I tried moving the class definition outside of the function, but since only the function is imported, the worker processes can't find the module:


  File "<string>", line 1, in <module>
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 116, in _main
    self = pickle.load(from_parent)
ImportError: No module named 'task7'

Edit:
similar stackoverflow topic:
https://stackoverflow.com/questions/36994839/i-can-pickle-local-objects-if-i-use-a-derived-class
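
Two workarounds that usually apply here, sketched below but not tested against minerva: define the Dataset class at the top level of a module that the DataLoader workers can import, or avoid worker subprocesses entirely by setting num_workers=0 so nothing has to be pickled.

import torch
from torch.utils.data import Dataset, DataLoader

# defined at the top level of an importable module -> picklable by worker processes
class DatasetLocalizer(Dataset):
    def __init__(self, X, y):
        self.X, self.y = X, y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]

dataset = DatasetLocalizer(torch.randn(8, 3), torch.zeros(8))

# num_workers=0 loads batches in the main process, so no pickling is involved
loader = DataLoader(dataset, batch_size=4, num_workers=0)

for batch in loader:
    pass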

[fashion-MNIST, task 4] KeyError while submitting the solution

neptune run -- submit --problem fashion_mnist --task_nr 4


5.372691 | /mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
5.372979 | from ._conv import register_converters as _register_converters
5.373169 | Using TensorFlow backend.
12.010692 | [NbConvertApp] Converting notebook /mnt/ml-team/homes/rafal.jakubanis/minerva/resources/fashion_mnist/tasks/task4.ipynb to python
12.892425 | [NbConvertApp] Writing 641 bytes to /mnt/ml-team/homes/rafal.jakubanis/minerva/resources/fashion_mnist/tasks/task4.py
13.172734 | Traceback (most recent call last):
13.173071 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 103, in execute
13.17331 | execfile(job_filepath, job_globals)
13.17354 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/past/builtins/misc.py", line 82, in execfile
13.173774 | exec_(code, myglobals, mylocals)
13.174001 | File "main.py", line 66, in <module>
13.174223 | action()
13.174454 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
13.174719 | return self.main(*args, **kwargs)
13.174942 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
13.175152 | rv = self.invoke(ctx)
13.175377 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
13.175583 | return _process_result(sub_ctx.command.invoke(sub_ctx))
13.175779 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
13.17597 | return ctx.invoke(self.callback, **ctx.params)
13.17616 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
13.176348 | return callback(*args, **kwargs)
13.176538 | File "main.py", line 61, in submit
13.176747 | pm.submit_task(task_sub_problem, task_nr, file_path, dev_mode)
13.176932 | File "/mnt/ml-team/homes/rafal.jakubanis/minerva/minerva/fashion_mnist/problem_manager.py", line 34, in submit_task
13.177115 | new_trainer = task_handler.substitute(user_task_solution, user_config)
13.177299 | File "/mnt/ml-team/homes/rafal.jakubanis/minerva/minerva/backend/task_manager.py", line 14, in substitute
13.177482 | self.modify_trainer(user_solution, user_config)
13.177665 | File "/mnt/ml-team/homes/rafal.jakubanis/minerva/minerva/fashion_mnist/tasks.py", line 54, in modify_trainer
13.177847 | self.trainer.pipeline.get_step('loader').is_substituted = True
13.17803 | File "/mnt/ml-team/homes/rafal.jakubanis/minerva/minerva/backend/base.py", line 54, in get_step
13.178213 | return self.all_steps[name]
13.178396 | KeyError: 'loader'
13.178578 |  
13.178761 | During handling of the above exception, another exception occurred:
13.178949 |  
13.179133 | Traceback (most recent call last):
13.179326 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 109, in <module>
13.17951 | execute()
13.179693 | File "/mnt/ml-team/homes/rafal.jakubanis/envs/minerva_venv/lib/python3.5/site-packages/deepsense/neptune/job_wrapper.py", line 105, in execute
13.179914 | raise ExperimentExecutionException("Exception during experiment execution", ex)
13.180092 | deepsense.neptune.exceptions.ExperimentExecutionException: ('Exception during experiment execution', KeyError('loader',))
 


train_mode=False in dry_run mode

I strongly suggest setting train_mode to False by default in dry_run mode. That way the user can quickly check that everything (including the paths to the original solutions) works. However, in my opinion it should still be possible to set train_mode=True, so that users can re-train the solutions and save them in a path of their choice.
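
A minimal sketch of what that could look like with click (option and command names are assumptions based on the commands quoted on this page, not the actual minerva CLI):

import click

@click.command()
@click.option('--problem', required=True)
@click.option('--train_mode', is_flag=True, default=False,
              help='re-train the reference solution instead of only evaluating it')
def dry_run(problem, train_mode):
    # with is_flag, training stays off unless --train_mode is passed explicitly
    click.echo('dry_run on {} (train_mode={})'.format(problem, train_mode))

if __name__ == '__main__':
    dry_run()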
