substra / substra-backend
Backend of the Substra software
Home Page: https://docs.substra.org
License: Apache License 2.0
As a consequence of the move from gcr.io to Docker Hub, we can now use the public substrafoundation/substra-tools images
instead of eu.gcr.io/substra-208412/substra-tools.
There are references to this image in the Dockerfiles and charts here:
charts/substra-backend/values.yaml
12: # - eu.gcr.io/substra-208412/substra-tools:0.0.1
fixtures/chunantes/objectives/objective0/Dockerfile
1:FROM eu.gcr.io/substra-208412/substra-tools:0.0.1
fixtures/owkin/objectives/objective0/Dockerfile
1:FROM eu.gcr.io/substra-208412/substra-tools:0.0.1
but also inside the zip and tar.gz files here:
./fixtures/chunantes/algos/algo0/algo.zip
./fixtures/chunantes/algos/algo4/algo.zip
./fixtures/chunantes/algos/algo2/algo.zip
./fixtures/chunantes/algos/algo1/algo.tar.gz
./fixtures/chunantes/algos/algo3/algo.tar.gz
./fixtures/chunantes/algos/algo0/algo.tar.gz
./fixtures/chunantes/algos/algo4/algo.tar.gz
I may have forgotten a location, but this should cover all the cases we use right now.
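To double-check that no location was missed, a small script along these lines could walk the tree and look inside the fixture archives as well as plain files (a sketch; only the fixture layout listed above is assumed):

```python
import os
import tarfile
import zipfile

OLD_IMAGE = b"eu.gcr.io/substra-208412/substra-tools"

def scan_archive(path):
    """List members of a zip or tar.gz archive that mention the old image."""
    hits = []
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            hits = [n for n in zf.namelist() if OLD_IMAGE in zf.read(n)]
    elif tarfile.is_tarfile(path):
        with tarfile.open(path) as tf:
            hits = [m.name for m in tf.getmembers()
                    if m.isfile() and OLD_IMAGE in tf.extractfile(m).read()]
    return hits

def find_refs(root):
    """Walk `root` and report every file, or archive member, with a reference."""
    refs = []
    for dirpath, _, filenames in os.walk(root):
        for fn in filenames:
            path = os.path.join(dirpath, fn)
            if fn.endswith((".zip", ".tar.gz")):
                refs += ["%s:%s" % (path, m) for m in scan_archive(path)]
            else:
                with open(path, "rb") as f:
                    if OLD_IMAGE in f.read():
                        refs.append(path)
    return refs
```

Running it from the repository root would print both the Dockerfile references and the ones baked into the algo.zip / algo.tar.gz fixtures.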
I was adding lots of chained tuples on the demo env and the execution of the very first one failed with the following traceback in the worker:
[2020-01-09 10:29:11,067: ERROR/ForkPoolWorker-1] ['PHANTOM_READ_CONFLICT', 'PHANTOM_READ_CONFLICT', 'PHANTOM_READ_CONFLICT', 'PHANTOM_READ_CONFLICT']
Traceback (most recent call last):
File "/usr/src/app/substrapp/ledger_utils.py", line 167, in call_ledger
response = loop.run_until_complete(chaincode_calls[call_type](**params))
File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "/usr/local/lib/python3.6/dist-packages/hfc/fabric/client.py", line 1729, in chaincode_invoke
raise Exception(statuses)
Exception: ['PHANTOM_READ_CONFLICT', 'PHANTOM_READ_CONFLICT', 'PHANTOM_READ_CONFLICT', 'PHANTOM_READ_CONFLICT']
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/src/app/substrapp/ledger_utils.py", line 180, in call_ledger
response = [r for r in e.args[0] if r.response.status != 200][0].response.message
File "/usr/src/app/substrapp/ledger_utils.py", line 180, in <listcomp>
response = [r for r in e.args[0] if r.response.status != 200][0].response.message
AttributeError: 'str' object has no attribute 'response'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/src/app/substrapp/tasks/tasks.py", line 472, in on_success
log_success_tuple(tuple_type, subtuple['key'], retval['result'])
File "/usr/src/app/substrapp/ledger_utils.py", line 371, in log_success_tuple
_update_tuple_status(tuple_type, tuple_key, 'done', extra_kwargs=extra_kwargs)
File "/usr/src/app/substrapp/ledger_utils.py", line 324, in _update_tuple_status
update_ledger(fcn=invoke_fcn, args=invoke_args, sync=True)
File "/usr/src/app/substrapp/ledger_utils.py", line 107, in _wrapper
return fn(*args, **kwargs)
File "/usr/src/app/substrapp/ledger_utils.py", line 233, in update_ledger
return _invoke_ledger(*args, **kwargs)
File "/usr/src/app/substrapp/ledger_utils.py", line 212, in _invoke_ledger
response = call_ledger('invoke', fcn=fcn, args=args, kwargs=params)
File "/usr/src/app/substrapp/ledger_utils.py", line 182, in call_ledger
raise LedgerError(str(e))
substrapp.ledger_utils.LedgerError: ['PHANTOM_READ_CONFLICT', 'PHANTOM_READ_CONFLICT', 'PHANTOM_READ_CONFLICT', 'PHANTOM_READ_CONFLICT']
[2020-01-09 10:29:11,078: INFO/ForkPoolWorker-1] Task substrapp.tasks.tasks.compute_task[96d4f391-356b-41a8-a90f-419a43c8ce00] succeeded in 2.1781632349884603s: {'worker': 'Org1.worker', 'queue': 'Org1.worker', 'computePlanID': '0ac595c5cfcd1711446bcb8f68df720e09f0252aac948994e7d54e236d198536', 'result': {'end_head_model_file_hash': '6485c28e994d905ef74b77026a48b18914cf1562049085a329f99981841185ed', 'end_head_model_file': 'https://substra-backend.org1.substra-demo.owkin.com/model/6485c28e994d905ef74b77026a48b18914cf1562049085a329f99981841185ed/file/', 'end_trunk_model_file_hash': 'ed30311078bfc91cec7a2b6d027bfa4cd45de7f5056609fe29d92d4e5ffad984', 'end_trunk_model_file': 'https://substra-backend.org1.substra-demo.owkin.com/model/ed30311078bfc91cec7a2b6d027bfa4cd45de7f5056609fe29d92d4e5ffad984/file/'}}
See #94 for a potential fix.
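The secondary AttributeError happens because line 180 of ledger_utils.py assumes every element of e.args[0] is a proposal response with a .response attribute, whereas fabric-sdk-py can also return bare status strings such as 'PHANTOM_READ_CONFLICT'. A defensive extraction could look like this (a hypothetical helper, not the actual patch from #94):

```python
def extract_error_message(statuses):
    """Pull a readable error out of a failed fabric-sdk-py invoke.

    `statuses` (e.args[0] in call_ledger) may hold plain strings such as
    'PHANTOM_READ_CONFLICT' as well as proposal responses carrying a
    `.response` attribute; the original list comprehension assumed only
    the latter.
    """
    for status in statuses:
        if isinstance(status, str):
            return status
        response = getattr(status, "response", None)
        if response is not None and response.status != 200:
            return response.message
    return str(statuses)
```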
Sometimes, when we start substra-backend with docker-compose (cf. docker/start.py) with dev settings, the node-register app can fail (see logs below) and block the container, as runserver will not exit.
In production, we start substra-backend with uwsgi, whose need-app parameter prevents this issue:
INFO - 2019-11-04 13:43:30,885 - events.apps - Start the event application.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/hfc/fabric/client.py", line 1711, in chaincode_invoke
timeout=wait_for_event_timeout)
File "/usr/local/lib/python3.6/asyncio/tasks.py", line 362, in wait_for
raise futures.TimeoutError()
concurrent.futures._base.TimeoutError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/src/app/substrapp/ledger_utils.py", line 159, in call_ledger
response = loop.run_until_complete(chaincode_calls[call_type](**params))
File "/usr/local/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "/usr/local/lib/python3.6/site-packages/hfc/fabric/client.py", line 1721, in chaincode_invoke
raise TimeoutError('waitForEvent timed out.')
TimeoutError: waitForEvent timed out.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "manage.py", line 15, in <module>
execute_from_command_line(sys.argv)
File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
utility.execute()
File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 357, in execute
django.setup()
File "/usr/local/lib/python3.6/site-packages/django/__init__.py", line 24, in setup
apps.populate(settings.INSTALLED_APPS)
File "/usr/local/lib/python3.6/site-packages/django/apps/registry.py", line 120, in populate
app_config.ready()
File "/usr/src/app/node-register/apps.py", line 10, in ready
invoke_ledger(fcn='registerNode', args=[''], sync=True)
exception calling callback for <Future at 0x7f60aeeb2cc0 state=finished raised _Rendezvous>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.6/site-packages/aiogrpc/utils.py", line 126, in _next
return next(self._iterator)
File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 388, in __next__
return self._next()
File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 382, in _next
raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.CANCELLED
details = "Locally cancelled by application!"
debug_error_string = "None"
>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
callback(self)
File "/usr/local/lib/python3.6/asyncio/futures.py", line 417, in _call_set_state
dest_loop.call_soon_threadsafe(_set_state, destination, source)
File "/usr/local/lib/python3.6/asyncio/base_events.py", line 637, in call_soon_threadsafe
self._check_closed()
File "/usr/local/lib/python3.6/asyncio/base_events.py", line 377, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
File "/usr/src/app/substrapp/ledger_utils.py", line 90, in _wrapper
return fn(*args, **kwargs)
File "/usr/src/app/substrapp/ledger_utils.py", line 208, in invoke_ledger
response = call_ledger('invoke', fcn=fcn, args=args, kwargs=params)
File "/usr/src/app/substrapp/ledger_utils.py", line 161, in call_ledger
raise LedgerTimeout(str(e))
substrapp.ledger_utils.LedgerTimeout: waitForEvent timed out.
exception calling callback for <Future at 0x7f60aef01cf8 state=finished raised _Rendezvous>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.6/site-packages/aiogrpc/utils.py", line 126, in _next
return next(self._iterator)
File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 388, in __next__
return self._next()
File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 382, in _next
raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.CANCELLED
details = "Locally cancelled by application!"
debug_error_string = "None"
>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
callback(self)
File "/usr/local/lib/python3.6/asyncio/futures.py", line 417, in _call_set_state
dest_loop.call_soon_threadsafe(_set_state, destination, source)
File "/usr/local/lib/python3.6/asyncio/base_events.py", line 637, in call_soon_threadsafe
self._check_closed()
File "/usr/local/lib/python3.6/asyncio/base_events.py", line 377, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
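Until the dev setup mirrors uwsgi's need-app behaviour, the node-register ready() hook could fail fast instead of leaving runserver hanging. A sketch of the idea (register_fn stands in for the real invoke_ledger call; this is not the actual fix):

```python
import logging
import sys

logger = logging.getLogger(__name__)

def register_node_or_exit(register_fn):
    """Call the ledger node registration and abort the process on failure.

    Mimics uwsgi's need-app behaviour for the dev runserver: rather than
    blocking the container with a half-initialised app, log the traceback
    and exit with a non-zero status so the orchestrator can restart us.
    """
    try:
        register_fn()
    except Exception:
        logger.exception("node registration failed; exiting instead of hanging")
        sys.exit(1)
```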
Requests error status 400: {"message":[{"name":["This field may not be null."]}],"pkhash":"3a768e71c323e3cf62fb43c5fca6b4e8f7f2975bc13972163a8946e7c5ee6b8c"}
substra.sdk.exceptions.InvalidRequest: 400 Client Error: Bad Request for url: http://substra-backend.node-1.com/data_manager/: [{'name': ['This field may not be null.']}]
> /Users/mypath/register_my_dataset.py(119)main()
-> dataset_key = client.add_dataset(DATASET, exist_ok=True)['pkhash']
(Pdb) pp DATASET
{'data_opener': './dataset/opener.py',
'description': './dataset/description.md',
'name': 'My Dataset',
'permissions': {'authorized_ids': [], 'public': True},
'type': 'csv'}
This error has been reported twice today.
Downgrading to an earlier version of the backend and hlf-k8s seems to fix the issue.
With backend/celeryworker/celerybeat on version 0.0.12-alpha.4:
# After starting network/peers
$ docker images | grep worker
substrafoundation/celeryworker 0.0.12-alpha.4 2351fa3aa980 4 days ago 794MB
# Add traintuple...
# It takes ~5 min. With version 0.0.11 it's much faster.
# After the traintuple has been processed
$ docker images | grep worker
substrafoundation/celeryworker latest c2ae5a9c7489 3 days ago 794MB
substrafoundation/celeryworker 0.0.12-alpha.4 2351fa3aa980 4 days ago 794MB
substrafoundation/celeryworker 0.0.12-alpha.3 14611cb4d69f 11 days ago 794MB
substrafoundation/celeryworker 0.0.12-alpha.2 1df63f2ea2e5 2 weeks ago 766MB
substrafoundation/celeryworker 0.0.12-alpha.1 29afa2954ce2 2 weeks ago 766MB
substrafoundation/celeryworker 0.0.11 098db9729666 6 weeks ago 766MB
substrafoundation/celeryworker 0.0.11-alpha.3 d010a2653162 6 weeks ago 766MB
substrafoundation/celeryworker 0.0.11-alpha.2 dc825690761c 6 weeks ago 766MB
substrafoundation/celeryworker 0.0.11-alpha.1 33370e58b413 6 weeks ago 766MB
substrafoundation/celeryworker 0.0.10 f3a397018bd4 7 weeks ago 766MB
substrafoundation/celeryworker dev 4a4002ef6992 8 weeks ago 766MB
substrafoundation/celeryworker 0.0.9 2d5a2efe39e5 2 months ago 765MB
The default "local" installation of the substra stack (skaffold run) assumes that custom /etc/hosts entries present on the host are used during DNS resolution from inside running pods.
That appears to be the case in some setups:
However, it appears not to be the case in other setups:
So, effectively, the default local installation of the substra stack only works in some environments.
While trying to add a bunch of assets, I regularly end up with the following message (as returned by the SDK):
ERROR substra.sdk.rest_client:rest_client.py:116 Requests error status 400: {"message":"<_Rendezvous of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"failed to connect to all addresses\"\n\tdebug_error_string = \"{\"created\":\"@1576051644.524371597\",\"description\":\"Failed to pick subchannel\",\"file\":\"src/core/ext/filters/client_channel/client_channel.cc\",\"file_line\":3934,\"referenced_errors\":[{\"created\":\"@1576051644.524366407\",\"description\":\"failed to connect to all addresses\",\"file\":\"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc\",\"file_line\":393,\"grpc_status\":14}]}\"\n>"}
When under load, nginx sporadically returns 502 responses
Repro
(Linux, docker driver, minikube w/ ingress addon)
substra login
for i in `seq 200`; do
substra get traintuple $i &
done
In the logs
2020/07/13 17:44:56 [error] 2217#2217: *672834 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 172.17.0.1, server: substra-backend.node-1.com, request: "GET /traintuple/184/ HTTP/1.1", upstream: "http://172.18.0.47:8000/traintuple/184/", host: "substra-backend.node-1.com"
[...]
172.17.0.1 - - [13/Jul/2020:17:44:56 +0000] "GET /traintuple/184/ HTTP/1.1" 400 26 "-" "python-requests/2.24.0" 260 0.027 [org-1-backend-org-1-substra-backend-server-http] [] 172.18.0.47:8000, 172.18.0.47:8000 0, 26 0.000, 0.024 502, 400 be8dae30c7f82749a8b130ddf459875f
For 200 consecutive requests, I consistently get 1-4 "502" responses.
Interestingly, the first time I run the test, I only get one 502. When I re-run the test, I get 3-4 502s, and I keep getting 3-4 502s in subsequent tests. This might be related to the fact that we currently use the cheaper algorithm.
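The shell loop above can also be driven from Python, which makes tallying the sporadic 502s easier (a sketch: the target URL and concurrency level are assumptions, and authentication is left out):

```python
import collections
import concurrent.futures
import urllib.error
import urllib.request

def hammer(url, n=200, workers=50):
    """Fire n concurrent GETs at `url` and return a Counter of status codes."""
    def fetch(_):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.status
        except urllib.error.HTTPError as exc:
            return exc.code          # 4xx/5xx responses still carry a code
        except urllib.error.URLError:
            return None              # connection-level failure
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return collections.Counter(pool.map(fetch, range(n)))

# e.g. hammer("http://substra-backend.node-1.com/traintuple/184/")
```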
Here is an example URL that will trigger the error 500:
http://substra-backend.node-1.com/model/?search=objective%253Aname%253ASkin%252520Lesion%252520Classification%252520Objective
Bug found by @GuillaumeCisco
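The search parameter in that URL is percent-encoded one level too deep; stepping through the decoding with plain urllib.parse (nothing project-specific) shows what the view actually receives:

```python
from urllib.parse import unquote

query = "objective%253Aname%253ASkin%252520Lesion%252520Classification%252520Objective"

once = unquote(query)    # '%25' -> '%': the value is still percent-encoded
twice = unquote(once)    # ':' appears, but the spaces are still '%20'
thrice = unquote(twice)  # only now is the intended search string recovered
```

A view that expects a singly-encoded value will therefore see stray `%` sequences, which is a plausible trigger for the 500.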
This is required as the current local folder deletion mechanism is not compatible with adding tuples to a "finished" compute plan.
With the new version of the server deployment, the static files are generated in the init container in production but are not present in the server container at the end. We should either remove this init container or copy those static files into the server container once they are generated.
The model view has a lot of peculiarities:
/model doesn't return models per se, but instead returns the list of all traintuples/composite traintuples/aggregatetuples, each with the associated certified testtuple (even though this concept doesn't really exist anymore).
/model/<traintuple_key> only works for traintuples and returns the traintuple and all the linked testtuples. It also creates a local cache of each outModel. It fails with a 500 if the key doesn't match a traintuple but rather a composite traintuple, for example.
/model/<tuple_key>/details (where tuple_key is either a composite_traintuple_key, a traintuple_key or an aggregate_key) returns the tuple and all the linked testtuples.
/model/<model_hash>/file streams the content of an outModel.
A streamlined schema could be:
/model returns the list of all traintuples/composite traintuples/aggregatetuples.
/model/<tuple_key> returns the matching traintuple / composite traintuple / aggregatetuple with all linked testtuples.
/model/<tuple_key>/file streams the content of the outModel (for traintuple / aggregatetuple) or the content of the outTrunkModel (for composite traintuple).
If the medias volume is not mounted properly on the worker, the tuple execution fails with the following message:
ERROR 2020-04-15 13:43:08,556 substrapp.tasks.tasks 708 139767323916096 [00-01-0126-1e30844]
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 385, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 650, in __protected_call__
return self.run(*args, **kwargs)
File "/usr/src/app/substrapp/tasks/tasks.py", line 550, in compute_task
max_retries=int(getattr(settings, 'CELERY_TASK_MAX_RETRIES')))
File "/usr/local/lib/python3.6/dist-packages/celery/app/task.py", line 704, in retry
raise_with_context(exc)
File "/usr/src/app/substrapp/tasks/tasks.py", line 543, in compute_task
prepare_materials(subtuple, tuple_type)
File "/usr/src/app/substrapp/tasks/utils.py", line 26, in timed
result = function(*args, **kw)
File "/usr/src/app/substrapp/tasks/tasks.py", line 580, in prepare_materials
prepare_opener(directory, subtuple)
File "/usr/src/app/substrapp/tasks/utils.py", line 26, in timed
result = function(*args, **kw)
File "/usr/src/app/substrapp/tasks/tasks.py", line 333, in prepare_opener
raise Exception('DataOpener Hash in Subtuple is not the same as in local db')
Exception: DataOpener Hash in Subtuple is not the same as in local db
This message is ambiguous and doesn't help find the cause of the error when debugging. The get_hash function must be improved.
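A more actionable message would include both hashes and the likely cause. A sketch of what prepare_opener could raise instead (argument names here are illustrative, not the real signature):

```python
def check_opener_hash(ledger_hash, local_hash, subtuple_key):
    """Raise a descriptive error on DataOpener hash mismatch.

    Including both hash values and the subtuple key makes the failure
    actionable, instead of the opaque 'DataOpener Hash in Subtuple is
    not the same as in local db'.
    """
    if ledger_hash != local_hash:
        raise Exception(
            "DataOpener hash mismatch for subtuple %s: the ledger has %r "
            "but the local db has %r. Check that the medias volume is "
            "properly mounted on the worker."
            % (subtuple_key, ledger_hash, local_hash)
        )
```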
To be consistent and to follow PEP 8 rules regarding Python file naming, all middleware files (located in the backend/libs folder) should follow the snake_case convention.
I ran into a complex issue with a wait_for_event that times out.
It's not perfectly reproducible, but I often see it.
When creating a traintuple with LEDGER_CALL_RETRY = True (the default in prod; with False, the default in dev settings, the user gets a Timeout error instead):
create second traintuple
Object with key(s) '363f70dcc3bf22fdce65e36c957e855b7cd3e2828e6909f34ccc97ee6218541a' already exists.
If we look at the log in the backend
exception calling callback for <Future at 0x7f1217d4bb00 state=finished raised _Rendezvous>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.6/site-packages/aiogrpc/utils.py", line 126, in _next
return next(self._iterator)
File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 388, in __next__
return self._next()
File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 382, in _next
raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.CANCELLED
details = "Locally cancelled by application!"
debug_error_string = "None"
>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
callback(self)
File "/usr/local/lib/python3.6/asyncio/futures.py", line 417, in _call_set_state
dest_loop.call_soon_threadsafe(_set_state, destination, source)
File "/usr/local/lib/python3.6/asyncio/base_events.py", line 637, in call_soon_threadsafe
self._check_closed()
File "/usr/local/lib/python3.6/asyncio/base_events.py", line 377, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
exception calling callback for <Future at 0x7f1217dba940 state=finished raised _Rendezvous>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.6/site-packages/aiogrpc/utils.py", line 126, in _next
return next(self._iterator)
File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 388, in __next__
return self._next()
File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 382, in _next
raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.CANCELLED
details = "Locally cancelled by application!"
debug_error_string = "None"
>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
callback(self)
File "/usr/local/lib/python3.6/asyncio/futures.py", line 417, in _call_set_state
dest_loop.call_soon_threadsafe(_set_state, destination, source)
File "/usr/local/lib/python3.6/asyncio/base_events.py", line 637, in call_soon_threadsafe
self._check_closed()
File "/usr/local/lib/python3.6/asyncio/base_events.py", line 377, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Function invoke_ledger failed (<class 'substrapp.ledger_utils.LedgerTimeout'>): waitForEvent timed out. retrying in 2s
We can see that the wait_for_event times out and triggers the retry strategy. But the traintuple was already committed in the ledger and created a todo event.
So when retrying, we just get a 409, which is returned to the user.
INFO - 2019-10-24 09:23:47,455 - events.apps - Processing task a2171a1c09738c677748346d22d2b5eea47f874a3b4f4b75224674235892de72: type=traintuple status=todo with tx status: VALID
[24/Oct/2019 09:23:47] "POST /traintuple/ HTTP/1.1" 409 115
It misleads the final user.
Moreover, it could turn into a big issue: for assets which are in localdb, such a failure (a 409) will trigger an instance.delete().
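One way to stop misleading the user would be a retry wrapper that interprets a conflict raised by a retry, after a timeout, as the original invoke having been committed. A hedged sketch (the exception names mirror substrapp.ledger_utils but are defined locally as stand-ins):

```python
import time

class LedgerTimeout(Exception):
    """Stand-in for substrapp.ledger_utils.LedgerTimeout."""

class LedgerConflict(Exception):
    """Stand-in for the 409 'already exists' ledger error."""

def invoke_with_retry(invoke, retries=3, delay=0.1):
    """Retry a ledger invoke on timeout.

    A conflict on the very first attempt is a genuine duplicate and is
    re-raised; a conflict on a retry means the earlier, timed-out attempt
    was in fact committed, so we report success instead of a 409.
    """
    last = None
    for attempt in range(retries):
        try:
            return invoke()
        except LedgerTimeout as exc:
            last = exc
            time.sleep(delay)
        except LedgerConflict as exc:
            if attempt == 0:
                raise
            return {"already_exists": True, "detail": str(exc)}
    raise last
```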
The current auto-generated swagger documentation (/doc) is broken, probably as a result of our last DRF update.
The library it relied on has also been deprecated since mid-2019, so instead of just fixing it we'll need to switch to something new.
A potential replacement is https://github.com/axnsan12/drf-yasg
We need to make sure that the fixed swagger doc correctly specifies how file uploads are handled. That was the purpose of the SchemaGenerator class in backend/views.py.
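If we go with drf-yasg, the wiring would look roughly like this (a sketch following the library's README; the title, version and route names are assumptions, and file-upload handling still needs to be checked):

```python
# urls.py sketch for drf-yasg, replacing the broken /doc view
from django.urls import path, re_path
from drf_yasg import openapi
from drf_yasg.views import get_schema_view
from rest_framework import permissions

schema_view = get_schema_view(
    openapi.Info(title="Substra backend API", default_version="v1"),
    public=True,
    permission_classes=(permissions.AllowAny,),
)

urlpatterns = [
    # raw schema at /doc.json or /doc.yaml, interactive UI at /doc/
    re_path(r"^doc(?P<format>\.json|\.yaml)$",
            schema_view.without_ui(cache_timeout=0), name="schema-json"),
    path("doc/", schema_view.with_ui("swagger", cache_timeout=0),
         name="schema-swagger-ui"),
]
```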
[backend-org-1-substra-backend-server-54b7879994-j262b substra-backend] INFO - 2019-12-16 15:27:30,977 - events.apps - Processing task 3298e66712747719fd670cba24164479a4c778fa217856e441c98a6ca98ee770: type=testtuple status=todo with tx status: MVCC_READ_CONFLICT
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:30,994: INFO/MainProcess] Received task: substrapp.tasks.tasks.prepare_tuple[3298e66712747719fd670cba24164479a4c778fa217856e441c98a6ca98ee770]
[backend-org-2-substra-backend-server-5bcfbc8888-7tdqh substra-backend] INFO - 2019-12-16 15:27:31,007 - events.apps - Processing task 3298e66712747719fd670cba24164479a4c778fa217856e441c98a6ca98ee770: type=testtuple status=todo with tx status: MVCC_READ_CONFLICT
[backend-org-2-substra-backend-server-5bcfbc8888-7tdqh substra-backend] DEBUG - 2019-12-16 15:27:31,007 - events.apps - Skipping task 3298e66712747719fd670cba24164479a4c778fa217856e441c98a6ca98ee770: owner does not match (MyOrg1MSP vs MyOrg2MSP)
[backend-org-2-substra-backend-worker-64d655d689-cf7kg worker] [2019-12-16 15:27:31,010: ERROR/ForkPoolWorker-1] MVCC read conflict for ('logSuccessCompositeTrain', ['{"key": "503e6b9b0700c9f857cdf750d641e5d758720b93e9d8bbb6fb6836fdc8a3c849", "log": "", "outHeadModel": {"hash": "5cfbe4ac6f29b0a34fa881c209239fc91f547f89327deaa5a6dea5026eab0e68", "storageAddress": "http://substra-backend.node-2.com/model/5cfbe4ac6f29b0a34fa881c209239fc91f547f89327deaa5a6dea5026eab0e68/file/"}, "outTrunkModel": {"hash": "4af8d42042e5a3ad19bd12e98e1ff731c56a856a5f9263bb96cfc497c45e00c4", "storageAddress": "http://substra-backend.node-2.com/model/4af8d42042e5a3ad19bd12e98e1ff731c56a856a5f9263bb96cfc497c45e00c4/file/"}}'])
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,027: INFO/ForkPoolWorker-1] DISCOVERY: adding channel peers query
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,027: INFO/ForkPoolWorker-1] DISCOVERY: adding config query
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,027: INFO/ForkPoolWorker-1] DISCOVERY: adding chaincodes/collection query
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,103: INFO/ForkPoolWorker-1] create peer delivery stream
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,107: INFO/ForkPoolWorker-1] create peer delivery stream
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,134: INFO/MainProcess] Received task: substrapp.tasks.tasks.compute_task[28fc383e-dfd7-4368-bc46-29f0d173afc7]
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,145: INFO/ForkPoolWorker-1] Task substrapp.tasks.tasks.prepare_tuple[8dc933581e8e54c448c6a288f291d60625a0fd0ebcd7acf117d25ce4cce62d89] succeeded in 0.19399090000115393s: None
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,174: INFO/ForkPoolWorker-1] DISCOVERY: adding channel peers query
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,174: INFO/ForkPoolWorker-1] DISCOVERY: adding config query
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,174: INFO/ForkPoolWorker-1] DISCOVERY: adding chaincodes/collection query
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] [2019-12-16 15:27:31,206: ERROR/ForkPoolWorker-1] Task substrapp.tasks.tasks.prepare_tuple[3298e66712747719fd670cba24164479a4c778fa217856e441c98a6ca98ee770] raised unexpected: update testtuple 3298e66712747719fd670cba24164479a4c778fa217856e441c98a6ca98ee770 failed: cannot change status from waiting to doing
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] Traceback (most recent call last):
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] File "/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 382, in trace_task
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] R = retval = fun(*args, **kwargs)
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] File "/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 641, in __protected_call__
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] return self.run(*args, **kwargs)
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] File "/usr/src/app/substrapp/tasks/tasks.py", line 449, in prepare_tuple
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] log_start_tuple(tuple_type, subtuple['key'])
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] File "/usr/src/app/substrapp/ledger_utils.py", line 330, in log_start_tuple
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] _update_tuple_status(tuple_type, tuple_key, 'doing')
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] File "/usr/src/app/substrapp/ledger_utils.py", line 324, in _update_tuple_status
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] update_ledger(fcn=invoke_fcn, args=invoke_args, sync=True)
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] File "/usr/src/app/substrapp/ledger_utils.py", line 107, in _wrapper
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] return fn(*args, **kwargs)
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] File "/usr/src/app/substrapp/ledger_utils.py", line 233, in update_ledger
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] return _invoke_ledger(*args, **kwargs)
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] File "/usr/src/app/substrapp/ledger_utils.py", line 212, in _invoke_ledger
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] response = call_ledger('invoke', fcn=fcn, args=args, kwargs=params)
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] File "/usr/src/app/substrapp/ledger_utils.py", line 196, in call_ledger
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] raise exception_class.from_response(response)
[backend-org-1-substra-backend-worker-859bdb6d6-q9c6g worker] substrapp.ledger_utils.LedgerResponseError: update testtuple 3298e66712747719fd670cba24164479a4c778fa217856e441c98a6ca98ee770 failed: cannot change status from waiting to doing
It would be really helpful when debugging.
Could be set either by the backend(s) or the chaincode.
When I run skaffold dev
in the substra-backend repo, I have the following error:
Step 8/16 : RUN pip3 install -r requirements.txt
---> Running in e5731adf43f3
Collecting git+git://github.com/hyperledger/fabric-sdk-py.git@df19cf51ff4f21507869184901988c094658367a (from -r requirements.txt (line 31))
Cloning git://github.com/hyperledger/fabric-sdk-py.git (to revision df19cf51ff4f21507869184901988c094658367a) to /tmp/pip-req-build-ydgheoes
Running command git clone -q git://github.com/hyperledger/fabric-sdk-py.git /tmp/pip-req-build-ydgheoes
fatal: unable to connect to github.com:
github.com[0: 140.82.114.4]: errno=Connection timed out
ERROR: Command errored out with exit status 128: git clone -q git://github.com/hyperledger/fabric-sdk-py.git /tmp/pip-req-build-ydgheoes Check the logs for full command output.
I am on Ubuntu 18.04 (inside a VM).
When (outside of substra-backend repo) I run:
git clone git://github.com/hyperledger/fabric-sdk-py.git
it also fails (Connection timed out).
But
git clone https://github.com/hyperledger/fabric-sdk-py.git
works.
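A workaround (assuming the requirement is the git+git line shown in the build log) is to pin the dependency over https in requirements.txt, using the same commit:

```
git+https://github.com/hyperledger/fabric-sdk-py.git@df19cf51ff4f21507869184901988c094658367a
```

pip accepts the `git+https://` scheme with an `@<commit>` pin, so the build no longer depends on outbound git:// access.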
I'm trying to start all the workers/schedulers for the 2 orgs setup.
If I run the command:
DJANGO_SETTINGS_MODULE=backend.settings.dev BACKEND_ORG=owkin BACKEND_DEFAULT_PORT=8000 BACKEND_PEER_PORT_EXTERNAL=9051 celery -E -A backend worker -l info -B -n owkin -Q owkin,scheduler,celery --hostname owkin.scheduler
the logs get attached to my terminal and I can't run the commands to start the following instances. Thus I decided to pass the --detach
argument, which works, but when trying to launch a new celery worker I get an error saying:
ERROR: Pidfile (celeryd.pid) already exists. Seems we're already running? (pid: 4815) Error in atexit._run_exitfuncs: Traceback (most recent call last): File "/usr/lib/python3.6/multiprocessing/util.py", line 319, in _exit_function p.join() File "/usr/lib/python3.6/multiprocessing/process.py", line 122, in join assert self._parent_pid == os.getpid(), 'can only join a child process' AssertionError: can only join a child process
I could run the command with the --pidfile= flag (with no path), but is that the right way to go?
Thanks!
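One possible answer, untested here: give each detached worker its own pidfile and logfile rather than an empty --pidfile=, so the workers don't fight over the default celeryd.pid. Celery's file-path arguments support node-name expansion (%n), e.g.:

```
DJANGO_SETTINGS_MODULE=backend.settings.dev BACKEND_ORG=owkin \
celery -E -A backend worker -l info -B -Q owkin,scheduler,celery \
  --hostname owkin.scheduler --detach \
  --pidfile=/tmp/celery-%n.pid --logfile=/tmp/celery-%n.log
```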
Currently, several tools don't support the skaffold setup:
generate_assets.py
populate.py (missing documentation)
Spoiler alert: this seems to be a Firefox-only issue, so you know...
I am getting a 403 - Forbidden: Request for an Unsupported Host Name (webpage title) when I try to access http://substra-backend.node-1.com/, but everything is fine on http://substra-backend.node-2.com/. Frontend 1 & 2 are working as expected, and curl http://substra-backend.node-1.com/ works perfectly. CLI login on this node is OK. I am not finding errors in the logs...
The webpage displayed is:
Unknown Host Request Forbidden
Your request to this server is for a Host Name that is unknown to this server or unsupported by this server.
Additional Information:
You are seeing this message because a request for a Web Site or Domain Name was directed to this server, but this server has not been configured to support requests for that Web Site or Domain Name. Possible causes are (1) the domain name you were requesting has been incorrectly configured to point to the IP address of this server, (2) a Host File on your system/network has been incorrectly configured to direct requests for the domain name you specified to the IP address of this server, or (3) another web site on the internet has been incorrectly configured to redirect requests for the domain name you specified to the IP address of this server. It may also be possible that the incorrect configuration is not accidental but rather intentional designed with a malevolent intent. You should contact the holder or the administrator of the domain name you were requesting for investigation and resolution.
GET
Request headers:
Host: substra-backend.node-1.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive
Cookie: csrftoken=CRw0ntGtzVEeIhHWW9PMPQeGXqZoR1orD8996ZSwnodIohqLTTFPvxTRadENrwAw; sessionid=lgip14h9bko6vq3jlm6kxdpuxpbfymmj
Upgrade-Insecure-Requests: 1
If-Modified-Since: Mon, 21 Mar 2005 22:15:26 GMT
If-None-Match: "5a6-3f2da10a25b80"
Cache-Control: max-age=0
GET
Response headers (note: the actual HTTP status is 304 (Not Modified), not the 403 displayed in the webpage title):
HTTP/1.1 304 Not Modified
Date: Mon, 10 Feb 2020 14:14:55 GMT
Connection: keep-alive
Keep-Alive: timeout=30
Server: Apache/2
ETag: "5a6-3f2da10a25b80"
Expires: Mon, 10 Feb 2020 15:14:55 GMT
Cache-Control: max-age=3600
Vary: Host
Accept-Ranges: bytes
Age: 0
Configuration:
Do you have any clue how to make it work with Firefox? Any specific log to check?
While adding a bunch of tuples individually, while a whole other bunch of tuples was being executed, I got the following error in the worker's log. The traceback is incomplete (a keyboard mishap and the logs were gone), but I hope this is enough to investigate:
[2020-01-16 10:09:16,738: ERROR/ForkPoolWorker-1] exception calling callback for <Future at 0x7f79d456f630 state=finished raised _MultiThreadedRendezvous>
Traceback (most recent call last):
File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.6/dist-packages/aiogrpc/utils.py", line 126, in _next
return next(self._iterator)
File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 416, in __next__
return self._next()
File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 703, in _next
raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.CANCELLED
details = "Locally cancelled by application!"
debug_error_string = "None"
>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
callback(self)
File "/usr/lib/python3.6/asyncio/futures.py", line 417, in _call_set_state
dest_loop.call_soon_threadsafe(_set_state, destination, source)
File "/usr/lib/python3.6/asyncio/base_events.py", line 637, in call_soon_threadsafe
self._check_closed()
File "/usr/lib/python3.6/asyncio/base_events.py", line 377, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
[2020-01-16 10:09:16,743: ERROR/ForkPoolWorker-1] exception calling callback for <Future at 0x7f79d458abe0 state=finished raised _MultiThreadedRendezvous>
Traceback (most recent call last):
File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.6/dist-packages/aiogrpc/utils.py", line 126, in _next
return next(self._iterator)
File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 416, in __next__
return self._next()
File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 703, in _next
raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.CANCELLED
details = "Locally cancelled by application!"
debug_error_string = "None"
>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
callback(self)
File "/usr/lib/python3.6/asyncio/futures.py", line 417, in _call_set_state
dest_loop.call_soon_threadsafe(_set_state, destination, source)
File "/usr/lib/python3.6/asyncio/base_events.py", line 637, in call_soon_threadsafe
Hello,
It looks like a MVCC READ CONFLICT
occurred when registering the success of a training task. Attached are the logs.
logs.txt
For context, I had submitted two composite_traintuples
and an aggregatetuple
whose inmodels
were the two composite_traintuples
. The error occurred for one of the two composite_traintuples
.
I am not sure I understand why we get it here. Is it because, when logging the success of this task and updating the status of its child (the aggregatetuple
), it reads an older version of the other parent (the other composite_traintuple
), whose status was updated at the same time?
Couldn't we add a retry mechanism to fix this?
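Such a retry could look like the following sketch, where invoke stands in for the actual chaincode call and the retryable status codes are assumed from the tracebacks in these issues (MVCC_READ_CONFLICT, PHANTOM_READ_CONFLICT); the function name and backoff policy are illustrative:

```python
import time

# Fabric validation codes that indicate a transient read conflict
# (assumed from the tracebacks seen in the worker logs).
RETRYABLE = ("MVCC_READ_CONFLICT", "PHANTOM_READ_CONFLICT")

def invoke_with_retry(invoke, max_attempts=5, delay=0.2):
    """Call invoke(), retrying with linear backoff on read conflicts.

    Non-retryable errors, and the final failed attempt, are re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return invoke()
        except Exception as exc:
            retryable = any(code in str(exc) for code in RETRYABLE)
            if not retryable or attempt == max_attempts:
                raise
            time.sleep(delay * attempt)  # back off before retrying
```

A retry only helps because these conflicts are transient: re-reading the key after the competing transaction commits yields a consistent version.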
The download_assets mission (PR) changes the way the URL is used and exposed to the user.
Indeed, the chosen solution, hiding the ledger URL and replacing it with the node URL, makes the URL field nearly useless.
For instance, if we want to download from node 1 an asset owned by node 2 at this URL:
http://substrabac.node2/dataset/key/download
the implemented feature proxies the URL to http://substrabac.node1/dataset/node2key/download
and hides from the user the fact that the asset comes from node 2.
We can still know who owns the asset by looking at the owner
attribute, if there is one.
Moreover, a solution like transforming http://substrabac.node2/dataset/key/download
into http://substrabac.node1/proxy?url=http://substrabac.node2/dataset/node2key/download
wasn't chosen, so when we look at the list of assets on a node, all URLs are proxied implicitly:
On node 1
dataset
[
[
{
"objectiveKey": "3d70ab46d710dacb0f48cb42db4874fac14e048a0d415e266aad38c09591ee71",
"description": {
"hash": "15863c2af1fcfee9ca6f61f04be8a0eaaf6a45e4d50c421788d450d198e580f1",
"storageAddress": "http://owkin.substrabac:8000/data_manager/8dd01465003a9b1e01c99c904d86aa518b3a5dd9dc8d40fe7d075c726ac073ca/description/"
},
"key": "8dd01465003a9b1e01c99c904d86aa518b3a5dd9dc8d40fe7d075c726ac073ca",
"name": "ISIC 2018",
"opener": {
"hash": "8dd01465003a9b1e01c99c904d86aa518b3a5dd9dc8d40fe7d075c726ac073ca",
"storageAddress": "http://owkin.substrabac:8000/data_manager/8dd01465003a9b1e01c99c904d86aa518b3a5dd9dc8d40fe7d075c726ac073ca/opener/"
},
"owner": "chu-nantesMSP",
"permissions": {
"process": {
"public": true,
"authorizedIDs": []
}
},
"type": "Images"
},
{
"objectiveKey": "3d70ab46d710dacb0f48cb42db4874fac14e048a0d415e266aad38c09591ee71",
"description": {
"hash": "258bef187a166b3fef5cb86e68c8f7e154c283a148cd5bc344fec7e698821ad3",
"storageAddress": "http://owkin.substrabac:8000/data_manager/ce9f292c72e9b82697445117f9c2d1d18ce0f8ed07ff91dadb17d668bddf8932/description/"
},
"key": "ce9f292c72e9b82697445117f9c2d1d18ce0f8ed07ff91dadb17d668bddf8932",
"name": "Simplified ISIC 2018",
"opener": {
"hash": "ce9f292c72e9b82697445117f9c2d1d18ce0f8ed07ff91dadb17d668bddf8932",
"storageAddress": "http://owkin.substrabac:8000/data_manager/ce9f292c72e9b82697445117f9c2d1d18ce0f8ed07ff91dadb17d668bddf8932/opener/"
},
"owner": "owkinMSP",
"permissions": {
"process": {
"public": true,
"authorizedIDs": []
}
},
"type": "Images"
}
]
]
algo
[
[
{
"key": "0acc5180e09b6a6ac250f4e3c172e2893f617aa1c22ef1f379019d20fe44142f",
"name": "Neural Network",
"content": {
"hash": "0acc5180e09b6a6ac250f4e3c172e2893f617aa1c22ef1f379019d20fe44142f",
"storageAddress": "http://owkin.substrabac:8000/algo/0acc5180e09b6a6ac250f4e3c172e2893f617aa1c22ef1f379019d20fe44142f/file/"
},
"description": {
"hash": "b9463411a01ea00869bdffce6e59a5c100a4e635c0a9386266cad3c77eb28e9e",
"storageAddress": "http://owkin.substrabac:8000/algo/0acc5180e09b6a6ac250f4e3c172e2893f617aa1c22ef1f379019d20fe44142f/description/"
},
"owner": "chu-nantesMSP",
"permissions": {
"process": {
"public": true,
"authorizedIDs": []
}
}
},
{
"key": "9c3d8777e11fd72cbc0fd672bec3a0848f8518b4d56706008cc05f8a1cee44f9",
"name": "Random Forest",
"content": {
"hash": "9c3d8777e11fd72cbc0fd672bec3a0848f8518b4d56706008cc05f8a1cee44f9",
"storageAddress": "http://owkin.substrabac:8000/algo/9c3d8777e11fd72cbc0fd672bec3a0848f8518b4d56706008cc05f8a1cee44f9/file/"
},
"description": {
"hash": "4acea40c4b51996c88ef279c5c9aa41ab77b97d38c5ca167e978a98b2e402675",
"storageAddress": "http://owkin.substrabac:8000/algo/9c3d8777e11fd72cbc0fd672bec3a0848f8518b4d56706008cc05f8a1cee44f9/description/"
},
"owner": "chu-nantesMSP",
"permissions": {
"process": {
"public": true,
"authorizedIDs": []
}
}
},
{
"key": "7c9f9799bf64c10002381583a9ffc535bc3f4bf14d6f0c614d3f6f868f72a9d5",
"name": "Logistic regression",
"content": {
"hash": "7c9f9799bf64c10002381583a9ffc535bc3f4bf14d6f0c614d3f6f868f72a9d5",
"storageAddress": "http://owkin.substrabac:8000/algo/7c9f9799bf64c10002381583a9ffc535bc3f4bf14d6f0c614d3f6f868f72a9d5/file/"
},
"description": {
"hash": "124a0425b746d7072282d167b53cb6aab3a31bf1946dae89135c15b0126ebec3",
"storageAddress": "http://owkin.substrabac:8000/algo/7c9f9799bf64c10002381583a9ffc535bc3f4bf14d6f0c614d3f6f868f72a9d5/description/"
},
"owner": "owkinMSP",
"permissions": {
"process": {
"public": true,
"authorizedIDs": []
}
}
}
]
]
On node 2
dataset
[
[
{
"objectiveKey": "3d70ab46d710dacb0f48cb42db4874fac14e048a0d415e266aad38c09591ee71",
"description": {
"hash": "15863c2af1fcfee9ca6f61f04be8a0eaaf6a45e4d50c421788d450d198e580f1",
"storageAddress": "http://chunantes.substrabac:8001/data_manager/8dd01465003a9b1e01c99c904d86aa518b3a5dd9dc8d40fe7d075c726ac073ca/description/"
},
"key": "8dd01465003a9b1e01c99c904d86aa518b3a5dd9dc8d40fe7d075c726ac073ca",
"name": "ISIC 2018",
"opener": {
"hash": "8dd01465003a9b1e01c99c904d86aa518b3a5dd9dc8d40fe7d075c726ac073ca",
"storageAddress": "http://chunantes.substrabac:8001/data_manager/8dd01465003a9b1e01c99c904d86aa518b3a5dd9dc8d40fe7d075c726ac073ca/opener/"
},
"owner": "chu-nantesMSP",
"permissions": {
"process": {
"public": true,
"authorizedIDs": []
}
},
"type": "Images"
},
{
"objectiveKey": "3d70ab46d710dacb0f48cb42db4874fac14e048a0d415e266aad38c09591ee71",
"description": {
"hash": "258bef187a166b3fef5cb86e68c8f7e154c283a148cd5bc344fec7e698821ad3",
"storageAddress": "http://chunantes.substrabac:8001/data_manager/ce9f292c72e9b82697445117f9c2d1d18ce0f8ed07ff91dadb17d668bddf8932/description/"
},
"key": "ce9f292c72e9b82697445117f9c2d1d18ce0f8ed07ff91dadb17d668bddf8932",
"name": "Simplified ISIC 2018",
"opener": {
"hash": "ce9f292c72e9b82697445117f9c2d1d18ce0f8ed07ff91dadb17d668bddf8932",
"storageAddress": "http://chunantes.substrabac:8001/data_manager/ce9f292c72e9b82697445117f9c2d1d18ce0f8ed07ff91dadb17d668bddf8932/opener/"
},
"owner": "owkinMSP",
"permissions": {
"process": {
"public": true,
"authorizedIDs": []
}
},
"type": "Images"
}
]
]
algo
[
[
{
"key": "0acc5180e09b6a6ac250f4e3c172e2893f617aa1c22ef1f379019d20fe44142f",
"name": "Neural Network",
"content": {
"hash": "0acc5180e09b6a6ac250f4e3c172e2893f617aa1c22ef1f379019d20fe44142f",
"storageAddress": "http://chunantes.substrabac:8001/algo/0acc5180e09b6a6ac250f4e3c172e2893f617aa1c22ef1f379019d20fe44142f/file/"
},
"description": {
"hash": "b9463411a01ea00869bdffce6e59a5c100a4e635c0a9386266cad3c77eb28e9e",
"storageAddress": "http://chunantes.substrabac:8001/algo/0acc5180e09b6a6ac250f4e3c172e2893f617aa1c22ef1f379019d20fe44142f/description/"
},
"owner": "chu-nantesMSP",
"permissions": {
"process": {
"public": true,
"authorizedIDs": []
}
}
},
{
"key": "9c3d8777e11fd72cbc0fd672bec3a0848f8518b4d56706008cc05f8a1cee44f9",
"name": "Random Forest",
"content": {
"hash": "9c3d8777e11fd72cbc0fd672bec3a0848f8518b4d56706008cc05f8a1cee44f9",
"storageAddress": "http://chunantes.substrabac:8001/algo/9c3d8777e11fd72cbc0fd672bec3a0848f8518b4d56706008cc05f8a1cee44f9/file/"
},
"description": {
"hash": "4acea40c4b51996c88ef279c5c9aa41ab77b97d38c5ca167e978a98b2e402675",
"storageAddress": "http://chunantes.substrabac:8001/algo/9c3d8777e11fd72cbc0fd672bec3a0848f8518b4d56706008cc05f8a1cee44f9/description/"
},
"owner": "chu-nantesMSP",
"permissions": {
"process": {
"public": true,
"authorizedIDs": []
}
}
},
{
"key": "7c9f9799bf64c10002381583a9ffc535bc3f4bf14d6f0c614d3f6f868f72a9d5",
"name": "Logistic regression",
"content": {
"hash": "7c9f9799bf64c10002381583a9ffc535bc3f4bf14d6f0c614d3f6f868f72a9d5",
"storageAddress": "http://chunantes.substrabac:8001/algo/7c9f9799bf64c10002381583a9ffc535bc3f4bf14d6f0c614d3f6f868f72a9d5/file/"
},
"description": {
"hash": "124a0425b746d7072282d167b53cb6aab3a31bf1946dae89135c15b0126ebec3",
"storageAddress": "http://chunantes.substrabac:8001/algo/7c9f9799bf64c10002381583a9ffc535bc3f4bf14d6f0c614d3f6f868f72a9d5/description/"
},
"owner": "owkinMSP",
"permissions": {
"process": {
"public": true,
"authorizedIDs": []
}
}
}
]
]
As we can see, the URL becomes nearly useless, as we can always infer it from the key and the URL of the node we want to download the asset from (user/front API), or from the key and the URL of the asset owner's node (server/back API).
This raises questions about the mapping between a node ID
and its node URL
, which could be stored in the ledger, for instance.
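The implicit proxying shown in the listings above amounts to keeping the asset path and swapping in the local node's host, as the two storageAddress listings illustrate. A minimal sketch of that rewrite (the function name and local_base parameter are illustrative, not the actual implementation):

```python
from urllib.parse import urlparse

def proxify(storage_address, local_base):
    """Rewrite a storageAddress so it points at the local backend,
    which then proxies the download to the owning node transparently."""
    path = urlparse(storage_address).path  # keep /algo/<key>/file/ etc.
    return local_base.rstrip("/") + path
```

For example, the owkin address from the node 1 listing maps onto the corresponding chunantes address from the node 2 listing, which is why the URL carries no information beyond the key and the local node.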
Feel free to comment :) !
Hello there,
One of the daemonsets deployed when you use the "gpu" feature flag (daemonset-nvidia-plugin.yaml) uses the wrong API version, which results in this error: no matches for kind "DaemonSet" in version "extensions/v1beta1"
I suggest removing the feature and pointing to the NVIDIA device plugin documentation: https://github.com/NVIDIA/k8s-device-plugin
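For anyone hitting this error before the feature is removed, the issue is that DaemonSet was dropped from extensions/v1beta1 (removed in Kubernetes 1.16) and is now served from apps/v1, which also requires an explicit selector. A fragment of what the manifest header would need to look like (metadata name taken as an assumption, not from the chart):

```yaml
# DaemonSet under apps/v1; extensions/v1beta1 no longer serves this kind.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin
```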
HTTP errors occurring during the algo fetch aren't reported in the logs, which makes it hard to understand what is going on.
For instance, in this issue, the only error shown in the logs is hash doesn't match [...]
which is unhelpful: the real source of the problem is the HTTP 403 error that occurred prior to the hash computation.
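A sketch of the kind of fix meant here: check and log the HTTP status before computing the hash, so a 403 surfaces as a 403 rather than as a misleading hash mismatch. Here http_get stands in for the real download call and the function name is illustrative:

```python
import hashlib
import logging

logger = logging.getLogger(__name__)

def fetch_asset(url, expected_hash, http_get):
    """Download an asset, reporting HTTP failures explicitly instead of
    letting them surface later as a hash mismatch."""
    status, content = http_get(url)
    if status != 200:
        # Log the real cause before raising, so it appears in the logs.
        logger.error("GET %s failed with HTTP %s", url, status)
        raise RuntimeError(f"GET {url} failed with HTTP {status}")
    digest = hashlib.sha256(content).hexdigest()
    if digest != expected_hash:
        raise RuntimeError(f"hash doesn't match for {url}")
    return content
```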
It seems that there is first an error in the chaincode, and that this error is then not properly caught in the backend, raising a new exception.
It can be reproduced using this end to end test: Substra/substra-tests#106
Traceback from server
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] INFO - 2020-05-06 14:20:48,103 - substrapp.ledger_utils - smartcontract invoke:updateDataSample; elaps=93.40ms; error=LedgerBadRequest
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] ERROR - 2020-05-06 14:20:48,165 - django.request - Internal Server Error: /data_sample/bulk_update/
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] Traceback (most recent call last):
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/views/datasample.py", line 250, in bulk_update
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] data = ledger.update_datasample(args)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/ledger.py", line 148, in update_datasample
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] return _create_asset('updateDataSample', args=args)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/ledger.py", line 93, in _create_asset
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] return __create_asset(fcn, args=args, sync=True, **extra_kwargs)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/ledger.py", line 88, in __create_asset
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] return invoke_ledger(fcn=fcn, args=args, sync=sync, **extra_kwargs)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/ledger_utils.py", line 147, in _wrapper
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] return fn(*args, **kwargs)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/ledger_utils.py", line 326, in invoke_ledger
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] return _invoke_ledger(*args, **kwargs)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/ledger_utils.py", line 310, in _invoke_ledger
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] response = call_ledger('invoke', fcn=fcn, args=args, kwargs=params)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/ledger_utils.py", line 285, in call_ledger
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] return _call_ledger(call_type, fcn, *args, **kwargs)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/ledger_utils.py", line 275, in _call_ledger
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] _raise_for_status(response)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/ledger_utils.py", line 116, in _raise_for_status
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] raise exception_class.from_response_dict(response)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] substrapp.ledger_utils.LedgerBadRequest: problem when reading json arg: {"hashes": "6ce482fae0cf23dc654a18667b2a194ce7e7c1191e8385777d591003f98cd7fd", "dataManagerKeys": "f5df98681ebb1d4737f4707eb2ba379a49e513be33b2ae30f19c48fbdfcb7df9"}, error is: json: cannot unmarshal string into Go struct field inputUpdateDataSample.hashes of type []string
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend]
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] During handling of the above exception, another exception occurred:
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend]
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] Traceback (most recent call last):
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/local/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] response = get_response(request)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py", line 115, in _get_response
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] response = self.process_exception_by_middleware(e, request)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py", line 113, in _get_response
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] response = wrapped_callback(request, *callback_args, **callback_kwargs)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/local/lib/python3.6/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] return view_func(*args, **kwargs)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/local/lib/python3.6/site-packages/rest_framework/viewsets.py", line 114, in view
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] return self.dispatch(request, *args, **kwargs)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 505, in dispatch
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] response = self.handle_exception(exc)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 465, in handle_exception
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] self.raise_uncaught_exception(exc)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 476, in raise_uncaught_exception
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] raise exc
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 502, in dispatch
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] response = handler(request, *args, **kwargs)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] File "/usr/src/app/substrapp/views/datasample.py", line 252, in bulk_update
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] return Response({'message': str(e.msg)}, status=e.st)
[backend-org-1-substra-backend-server-545687975c-v5bg2 substra-backend] AttributeError: 'LedgerBadRequest' object has no attribute 'st'
Traceback from SDK:
File "/Users/samlesu/.virtualenvs/sb/lib/python3.7/site-packages/substra/sdk/client.py", line 787, in link_dataset_with_data_samples
data=data,
File "/Users/samlesu/.virtualenvs/sb/lib/python3.7/site-packages/substra/sdk/utils.py", line 170, in wrapper
return f(*args, **kwargs)
File "/Users/samlesu/.virtualenvs/sb/lib/python3.7/site-packages/substra/sdk/rest_client.py", line 192, in request
**request_kwargs,
File "/Users/samlesu/.virtualenvs/sb/lib/python3.7/site-packages/substra/sdk/rest_client.py", line 170, in _request
return self.__request(request_name, url, **request_kwargs)
File "/Users/samlesu/.virtualenvs/sb/lib/python3.7/site-packages/substra/sdk/rest_client.py", line 156, in __request
raise exceptions.InternalServerError.from_request_exception(e)
substra.sdk.exceptions.InternalServerError: 500 Server Error: Internal Server Error for url: http://substra-backend.node-1.com/data_sample/bulk_update/
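The root cause of the chaincode error is visible in the LedgerBadRequest message: hashes and dataManagerKeys are sent as plain JSON strings, while the Go struct expects []string. A hedged sketch of the normalization the backend could apply before invoking the ledger (helper name is hypothetical):

```python
import json

def build_update_datasample_args(hashes, data_manager_keys):
    """Serialize the updateDataSample arg with both fields as JSON arrays
    ([]string on the Go side), wrapping lone strings into lists."""
    if isinstance(hashes, str):
        hashes = [hashes]
    if isinstance(data_manager_keys, str):
        data_manager_keys = [data_manager_keys]
    return json.dumps({"hashes": hashes, "dataManagerKeys": data_manager_keys})
```

Separately, the server traceback shows the second bug: the except handler in datasample.py reads e.st, which LedgerBadRequest does not define, so the intended 4xx response turns into the 500 seen from the SDK.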
I tried logging in through the substra client using a node-to-node login, with both a valid and an invalid password. This means using the /api-token-auth
endpoint.
In both cases, I get this 500 response:
Requests error status 500: ValueError at /api-token-auth/
save() prohibited to prevent data loss due to unsaved related object 'user'.
Request Method: POST
Request URL: http://substra-backend.node-1.com/api-token-auth/
Django Version: 2.1.11
Python Executable: /usr/local/bin/python
Python Version: 3.6.10
Python Path: ['/usr/src/app', '/usr/local/lib/python36.zip', '/usr/local/lib/python3.6', '/usr/local/lib/python3.6/lib-dynload', '/usr/local/lib/python3.6/site-packages', '/usr/src/app', '/usr/src/app', '/usr/src/app/libs']
Server time: Tue, 7 Jan 2020 14:05:39 +0000
Installed Applications:
['django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'django.contrib.sites',
'django_celery_results',
'rest_framework_swagger',
'rest_framework',
'rest_framework.authtoken',
'rest_framework_simplejwt.token_blacklist',
'substrapp',
'node',
'users',
'corsheaders',
'events',
'node-register']
Installed Middleware:
['corsheaders.middleware.CorsMiddleware',
'django.middleware.security.SecurityMiddleware',
'django.contrib.sessions.middleware.SessionMiddleware',
'django.middleware.common.CommonMiddleware',
'django.middleware.csrf.CsrfViewMiddleware',
'django.contrib.auth.middleware.AuthenticationMiddleware',
'django.contrib.auth.middleware.RemoteUserMiddleware',
'django.contrib.messages.middleware.MessageMiddleware',
'django.middleware.clickjacking.XFrameOptionsMiddleware',
'libs.SQLPrintingMiddleware.SQLPrintingMiddleware',
'libs.HealthCheckMiddleware.HealthCheckMiddleware']
Traceback:
File "/usr/local/lib/python3.6/site-packages/django/db/models/query.py" in get_or_create
486. return self.get(**lookup), False
File "/usr/local/lib/python3.6/site-packages/django/db/models/query.py" in get
399. self.model._meta.object_name
During handling of the above exception (Token matching query does not exist.), another exception occurred:
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/exception.py" in inner
34. response = get_response(request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py" in _get_response
126. response = self.process_exception_by_middleware(e, request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py" in _get_response
124. response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/usr/local/lib/python3.6/site-packages/django/views/decorators/csrf.py" in wrapped_view
54. return view_func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/django/views/generic/base.py" in view
68. return self.dispatch(request, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py" in dispatch
483. response = self.handle_exception(exc)
File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py" in handle_exception
443. self.raise_uncaught_exception(exc)
File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py" in dispatch
480. response = handler(request, *args, **kwargs)
File "/usr/src/app/backend/views.py" in post
124. token, created = Token.objects.get_or_create(user=user)
File "/usr/local/lib/python3.6/site-packages/django/db/models/manager.py" in manager_method
82. return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/django/db/models/query.py" in get_or_create
488. return self._create_object_from_params(lookup, params)
File "/usr/local/lib/python3.6/site-packages/django/db/models/query.py" in _create_object_from_params
522. obj = self.create(**params)
File "/usr/local/lib/python3.6/site-packages/django/db/models/query.py" in create
413. obj.save(force_insert=True, using=self.db)
File "/usr/local/lib/python3.6/site-packages/rest_framework/authtoken/models.py" in save
35. return super(Token, self).save(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/django/db/models/base.py" in save
670. "unsaved related object '%s'." % field.name
Exception Type: ValueError at /api-token-auth/
Exception Value: save() prohibited to prevent data loss due to unsaved related object 'user'.
Request information:
USER: AnonymousUser
GET: No GET data
POST:
username = 'MyOrg1MSP'
password = 'selfSecret1'
FILES: No FILES data
COOKIES: No cookie data
META:
BACKEND_DB_NAME = 'substra'
BACKEND_DB_PWD = 'postgres'
BACKEND_DB_USER = 'postgres'
BACKEND_DEFAULT_PORT = '8000'
BACKEND_ORG = 'MyOrg1'
BACKEND_ORG_1_POSTGRESQL_PORT = 'tcp://10.99.253.201:5432'
BACKEND_ORG_1_POSTGRESQL_PORT_5432_TCP = 'tcp://10.99.253.201:5432'
BACKEND_ORG_1_POSTGRESQL_PORT_5432_TCP_ADDR = '10.99.253.201'
BACKEND_ORG_1_POSTGRESQL_PORT_5432_TCP_PORT = '5432'
BACKEND_ORG_1_POSTGRESQL_PORT_5432_TCP_PROTO = 'tcp'
BACKEND_ORG_1_POSTGRESQL_SERVICE_HOST = '10.99.253.201'
BACKEND_ORG_1_POSTGRESQL_SERVICE_PORT = '5432'
BACKEND_ORG_1_POSTGRESQL_SERVICE_PORT_POSTGRESQL = '5432'
BACKEND_ORG_1_RABBITMQ_PORT = 'tcp://10.108.182.80:4369'
BACKEND_ORG_1_RABBITMQ_PORT_15672_TCP = 'tcp://10.108.182.80:15672'
BACKEND_ORG_1_RABBITMQ_PORT_15672_TCP_ADDR = '10.108.182.80'
BACKEND_ORG_1_RABBITMQ_PORT_15672_TCP_PORT = '15672'
BACKEND_ORG_1_RABBITMQ_PORT_15672_TCP_PROTO = 'tcp'
BACKEND_ORG_1_RABBITMQ_PORT_25672_TCP = 'tcp://10.108.182.80:25672'
BACKEND_ORG_1_RABBITMQ_PORT_25672_TCP_ADDR = '10.108.182.80'
BACKEND_ORG_1_RABBITMQ_PORT_25672_TCP_PORT = '25672'
BACKEND_ORG_1_RABBITMQ_PORT_25672_TCP_PROTO = 'tcp'
BACKEND_ORG_1_RABBITMQ_PORT_4369_TCP = 'tcp://10.108.182.80:4369'
BACKEND_ORG_1_RABBITMQ_PORT_4369_TCP_ADDR = '10.108.182.80'
BACKEND_ORG_1_RABBITMQ_PORT_4369_TCP_PORT = '4369'
BACKEND_ORG_1_RABBITMQ_PORT_4369_TCP_PROTO = 'tcp'
BACKEND_ORG_1_RABBITMQ_PORT_5672_TCP = 'tcp://10.108.182.80:5672'
BACKEND_ORG_1_RABBITMQ_PORT_5672_TCP_ADDR = '10.108.182.80'
BACKEND_ORG_1_RABBITMQ_PORT_5672_TCP_PORT = '5672'
BACKEND_ORG_1_RABBITMQ_PORT_5672_TCP_PROTO = 'tcp'
BACKEND_ORG_1_RABBITMQ_SERVICE_HOST = '10.108.182.80'
BACKEND_ORG_1_RABBITMQ_SERVICE_PORT = '4369'
BACKEND_ORG_1_RABBITMQ_SERVICE_PORT_AMQP = '5672'
BACKEND_ORG_1_RABBITMQ_SERVICE_PORT_DIST = '25672'
BACKEND_ORG_1_RABBITMQ_SERVICE_PORT_EPMD = '4369'
BACKEND_ORG_1_RABBITMQ_SERVICE_PORT_STATS = '15672'
BACKEND_ORG_1_SUBSTRA_BACKEND_FLOWER_PORT = 'tcp://10.97.215.88:5555'
BACKEND_ORG_1_SUBSTRA_BACKEND_FLOWER_PORT_5555_TCP = 'tcp://10.97.215.88:5555'
BACKEND_ORG_1_SUBSTRA_BACKEND_FLOWER_PORT_5555_TCP_ADDR = '10.97.215.88'
BACKEND_ORG_1_SUBSTRA_BACKEND_FLOWER_PORT_5555_TCP_PORT = '5555'
BACKEND_ORG_1_SUBSTRA_BACKEND_FLOWER_PORT_5555_TCP_PROTO = 'tcp'
BACKEND_ORG_1_SUBSTRA_BACKEND_FLOWER_SERVICE_HOST = '10.97.215.88'
BACKEND_ORG_1_SUBSTRA_BACKEND_FLOWER_SERVICE_PORT = '5555'
BACKEND_ORG_1_SUBSTRA_BACKEND_FLOWER_SERVICE_PORT_HTTP = '5555'
BACKEND_ORG_1_SUBSTRA_BACKEND_SERVER_PORT = 'tcp://10.99.149.30:8000'
BACKEND_ORG_1_SUBSTRA_BACKEND_SERVER_PORT_8000_TCP = 'tcp://10.99.149.30:8000'
BACKEND_ORG_1_SUBSTRA_BACKEND_SERVER_PORT_8000_TCP_ADDR = '10.99.149.30'
BACKEND_ORG_1_SUBSTRA_BACKEND_SERVER_PORT_8000_TCP_PORT = '8000'
BACKEND_ORG_1_SUBSTRA_BACKEND_SERVER_PORT_8000_TCP_PROTO = 'tcp'
BACKEND_ORG_1_SUBSTRA_BACKEND_SERVER_SERVICE_HOST = '10.99.149.30'
BACKEND_ORG_1_SUBSTRA_BACKEND_SERVER_SERVICE_PORT = '8000'
BACKEND_ORG_1_SUBSTRA_BACKEND_SERVER_SERVICE_PORT_HTTP = '8000'
BACKEND_PEER_PORT = 'internal'
CELERY_BROKER_URL = 'amqp://rabbitmq:rabbitmq@backend-org-1-rabbitmq:5672//'
CONTENT_LENGTH = '39'
CONTENT_TYPE = 'application/x-www-form-urlencoded'
DATABASE_HOST = 'backend-org-1-postgresql'
DEFAULT_DOMAIN = 'http://substra-backend.node-1.com'
DJANGO_SETTINGS_MODULE = 'backend.settings.server.dev'
GATEWAY_INTERFACE = 'CGI/1.1'
GPG_KEY = '0D96DF4D4110E5C43FBFB17F2D347EA6AA65421D'
GRPC_MAX_RECEIVE_MESSAGE_LENGTH = '0'
GRPC_MAX_SEND_MESSAGE_LENGTH = '0'
GRPC_SSL_CIPHER_SUITES = 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384'
HOME = '/root'
HOSTNAME = 'backend-org-1-substra-backend-server-5f4bc85f8-2pz82'
HTTP_ACCEPT = 'application/json;version=0.0'
HTTP_ACCEPT_ENCODING = 'gzip, deflate'
HTTP_HOST = 'substra-backend.node-1.com'
HTTP_USER_AGENT = 'python-requests/2.22.0'
HTTP_X_FORWARDED_FOR = '192.168.65.3'
HTTP_X_FORWARDED_HOST = 'substra-backend.node-1.com'
HTTP_X_FORWARDED_PORT = '80'
HTTP_X_FORWARDED_PROTO = 'http'
HTTP_X_ORIGINAL_URI = '/api-token-auth/'
HTTP_X_REAL_IP = '192.168.65.3'
HTTP_X_REQUEST_ID = '29e880b1643d1386411caecab8207997'
HTTP_X_SCHEME = 'http'
KUBERNETES_PORT = 'tcp://10.96.0.1:443'
KUBERNETES_PORT_443_TCP = 'tcp://10.96.0.1:443'
KUBERNETES_PORT_443_TCP_ADDR = '10.96.0.1'
KUBERNETES_PORT_443_TCP_PORT = '443'
KUBERNETES_PORT_443_TCP_PROTO = 'tcp'
KUBERNETES_SERVICE_HOST = '10.96.0.1'
KUBERNETES_SERVICE_PORT = '443'
KUBERNETES_SERVICE_PORT_HTTPS = '443'
LANG = 'C.UTF-8'
LEDGER_CONFIG_FILE = '/conf/MyOrg1/substra-backend/conf.json'
MEDIA_ROOT = '/tmp/org-1/medias/'
NETWORK_ORG_1_PEER_1_CA_PORT = 'tcp://10.98.200.198:7054'
NETWORK_ORG_1_PEER_1_CA_PORT_7054_TCP = 'tcp://10.98.200.198:7054'
NETWORK_ORG_1_PEER_1_CA_PORT_7054_TCP_ADDR = '10.98.200.198'
NETWORK_ORG_1_PEER_1_CA_PORT_7054_TCP_PORT = '7054'
NETWORK_ORG_1_PEER_1_CA_PORT_7054_TCP_PROTO = 'tcp'
NETWORK_ORG_1_PEER_1_CA_SERVICE_HOST = '10.98.200.198'
NETWORK_ORG_1_PEER_1_CA_SERVICE_PORT = '7054'
NETWORK_ORG_1_PEER_1_CA_SERVICE_PORT_HTTP = '7054'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_PORT = 'tcp://10.109.205.170:80'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_PORT_443_TCP = 'tcp://10.109.205.170:443'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_PORT_443_TCP_ADDR = '10.109.205.170'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_PORT_443_TCP_PORT = '443'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_PORT_443_TCP_PROTO = 'tcp'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_PORT_80_TCP = 'tcp://10.109.205.170:80'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_PORT_80_TCP_ADDR = '10.109.205.170'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_PORT_80_TCP_PORT = '80'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_PORT_80_TCP_PROTO = 'tcp'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_SERVICE_HOST = '10.109.205.170'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_SERVICE_PORT = '80'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_SERVICE_PORT_HTTP = '80'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_CONTROLLER_SERVICE_PORT_HTTPS = '443'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_DEFAULT_BACKEND_PORT = 'tcp://10.111.238.187:80'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_DEFAULT_BACKEND_PORT_80_TCP = 'tcp://10.111.238.187:80'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_DEFAULT_BACKEND_PORT_80_TCP_ADDR = '10.111.238.187'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_DEFAULT_BACKEND_PORT_80_TCP_PORT = '80'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_DEFAULT_BACKEND_PORT_80_TCP_PROTO = 'tcp'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_DEFAULT_BACKEND_SERVICE_HOST = '10.111.238.187'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_DEFAULT_BACKEND_SERVICE_PORT = '80'
NETWORK_ORG_1_PEER_1_NGINX_INGRESS_DEFAULT_BACKEND_SERVICE_PORT_HTTP = '80'
NETWORK_ORG_1_PEER_1_PORT = 'tcp://10.98.102.223:7051'
NETWORK_ORG_1_PEER_1_PORT_7051_TCP = 'tcp://10.98.102.223:7051'
NETWORK_ORG_1_PEER_1_PORT_7051_TCP_ADDR = '10.98.102.223'
NETWORK_ORG_1_PEER_1_PORT_7051_TCP_PORT = '7051'
NETWORK_ORG_1_PEER_1_PORT_7051_TCP_PROTO = 'tcp'
NETWORK_ORG_1_PEER_1_PORT_7053_TCP = 'tcp://10.98.102.223:7053'
NETWORK_ORG_1_PEER_1_PORT_7053_TCP_ADDR = '10.98.102.223'
NETWORK_ORG_1_PEER_1_PORT_7053_TCP_PORT = '7053'
NETWORK_ORG_1_PEER_1_PORT_7053_TCP_PROTO = 'tcp'
NETWORK_ORG_1_PEER_1_SERVICE_HOST = '10.98.102.223'
NETWORK_ORG_1_PEER_1_SERVICE_PORT = '7051'
NETWORK_ORG_1_PEER_1_SERVICE_PORT_EVENT = '7053'
NETWORK_ORG_1_PEER_1_SERVICE_PORT_REQUEST = '7051'
PATH = '/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
PATH_INFO = '/api-token-auth/'
PWD = '/usr/src/app'
PYTHONUNBUFFERED = '1'
PYTHON_GET_PIP_SHA256 = 'b86f36cc4345ae87bfd4f10ef6b2dbfa7a872fbff70608a1e43944d283fd0eee'
PYTHON_GET_PIP_URL = 'https://github.com/pypa/get-pip/raw/ffe826207a010164265d9cc807978e3604d18ca0/get-pip.py'
PYTHON_PIP_VERSION = '19.3.1'
PYTHON_VERSION = '3.6.10'
QUERY_STRING = ''
REMOTE_ADDR = '10.1.1.123'
REMOTE_HOST = ''
REQUEST_METHOD = 'POST'
SCRIPT_NAME = ''
SERVER_NAME = 'backend-org-1-substra-backend-server-5f4bc85f8-2pz82'
SERVER_PORT = '8000'
SERVER_PROTOCOL = 'HTTP/1.1'
SERVER_SOFTWARE = 'WSGIServer/0.2'
SHLVL = '0'
TZ = 'UTC'
_ = '/usr/local/bin/python'
wsgi.errors = <_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'>
wsgi.file_wrapper = ''
wsgi.input = <django.core.handlers.wsgi.LimitedStream object at 0x7f3257f0e160>
wsgi.multiprocess = False
wsgi.multithread = True
wsgi.run_once = False
wsgi.url_scheme = 'http'
wsgi.version = '(1, 0)'
Settings:
Using settings module backend.settings.server.dev
ABSOLUTE_URL_OVERRIDES = {}
ADMINS = []
ALLOWED_HOSTS = ['*']
APPEND_SLASH = True
AUTHENTICATION_BACKENDS = ['django.contrib.auth.backends.ModelBackend', 'node.authentication.NodeBackend']
AUTH_PASSWORD_VALIDATORS = '********************'
AUTH_USER_MODEL = 'auth.User'
BASE_DIR = '/usr/src/app/backend'
CACHES = {'default': {'BACKEND': 'django.core.cache.backends.locmem.LocMemCache'}}
CACHE_MIDDLEWARE_ALIAS = 'default'
CACHE_MIDDLEWARE_KEY_PREFIX = '********************'
CACHE_MIDDLEWARE_SECONDS = 600
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_BROKER_URL = "('amqp://rabbitmq:rabbitmq@backend-org-1-rabbitmq:5672//',)"
CELERY_RESULT_BACKEND = 'django-db'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_MAX_RETRIES = 1
CELERY_TASK_RETRY_DELAY_SECONDS = 0
CELERY_TASK_SERIALIZER = 'json'
CELERY_TASK_TRACK_STARTED = True
CELERY_WORKER_CONCURRENCY = 1
CORS_ALLOW_CREDENTIALS = True
CORS_ALLOW_HEADERS = "('accept', 'accept-encoding', 'authorization', 'content-type', 'dnt', 'origin', 'user-agent', 'x-csrftoken', 'x-requested-with', 'token')"
CORS_ORIGIN_ALLOW_ALL = True
CSRF_COOKIE_AGE = 31449600
CSRF_COOKIE_DOMAIN = None
CSRF_COOKIE_HTTPONLY = False
CSRF_COOKIE_NAME = 'csrftoken'
CSRF_COOKIE_PATH = '/'
CSRF_COOKIE_SAMESITE = 'Lax'
CSRF_COOKIE_SECURE = False
CSRF_FAILURE_VIEW = 'django.views.csrf.csrf_failure'
CSRF_HEADER_NAME = 'HTTP_X_CSRFTOKEN'
CSRF_TRUSTED_ORIGINS = []
CSRF_USE_SESSIONS = False
DATABASES = {'default': {'ENGINE': 'django.db.backends.postgresql_psycopg2', 'NAME': 'substra', 'USER': 'postgres', 'PASSWORD': '********************', 'HOST': 'backend-org-1-postgresql', 'PORT': 5432, 'ATOMIC_REQUESTS': False, 'AUTOCOMMIT': True, 'CONN_MAX_AGE': 0, 'OPTIONS': {}, 'TIME_ZONE': None, 'TEST': {'CHARSET': None, 'COLLATION': None, 'NAME': None, 'MIRROR': None}}}
DATABASE_ROUTERS = []
DATA_UPLOAD_MAX_MEMORY_SIZE = 2621440
DATA_UPLOAD_MAX_NUMBER_FIELDS = 10000
DATETIME_FORMAT = 'N j, Y, P'
DATETIME_INPUT_FORMATS = ['%Y-%m-%d %H:%M:%S', '%Y-%m-%d %H:%M:%S.%f', '%Y-%m-%d %H:%M', '%Y-%m-%d', '%m/%d/%Y %H:%M:%S', '%m/%d/%Y %H:%M:%S.%f', '%m/%d/%Y %H:%M', '%m/%d/%Y', '%m/%d/%y %H:%M:%S', '%m/%d/%y %H:%M:%S.%f', '%m/%d/%y %H:%M', '%m/%d/%y']
DATE_FORMAT = 'N j, Y'
DATE_INPUT_FORMATS = ['%Y-%m-%d', '%m/%d/%Y', '%m/%d/%y', '%b %d %Y', '%b %d, %Y', '%d %b %Y', '%d %b, %Y', '%B %d %Y', '%B %d, %Y', '%d %B %Y', '%d %B, %Y']
DEBUG = True
DEBUG_PROPAGATE_EXCEPTIONS = False
DECIMAL_SEPARATOR = '.'
DEFAULT_CHARSET = 'utf-8'
DEFAULT_CONTENT_TYPE = 'text/html'
DEFAULT_DOMAIN = 'http://substra-backend.node-1.com'
DEFAULT_EXCEPTION_REPORTER_FILTER = 'django.views.debug.SafeExceptionReporterFilter'
DEFAULT_FILE_STORAGE = 'django.core.files.storage.FileSystemStorage'
DEFAULT_FROM_EMAIL = 'webmaster@localhost'
DEFAULT_INDEX_TABLESPACE = ''
DEFAULT_PORT = '8000'
DEFAULT_TABLESPACE = ''
DISALLOWED_USER_AGENTS = []
EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
EMAIL_HOST = 'localhost'
EMAIL_HOST_PASSWORD = '********************'
EMAIL_HOST_USER = ''
EMAIL_PORT = 25
EMAIL_SSL_CERTFILE = None
EMAIL_SSL_KEYFILE = '********************'
EMAIL_SUBJECT_PREFIX = '[Django] '
EMAIL_TIMEOUT = None
EMAIL_USE_LOCALTIME = False
EMAIL_USE_SSL = False
EMAIL_USE_TLS = False
EXPIRY_TOKEN_LIFETIME = '********************'
FILE_CHARSET = 'utf-8'
FILE_UPLOAD_DIRECTORY_PERMISSIONS = None
FILE_UPLOAD_HANDLERS = ['django.core.files.uploadhandler.MemoryFileUploadHandler', 'django.core.files.uploadhandler.TemporaryFileUploadHandler']
FILE_UPLOAD_MAX_MEMORY_SIZE = 2621440
FILE_UPLOAD_PERMISSIONS = None
FILE_UPLOAD_TEMP_DIR = None
FIRST_DAY_OF_WEEK = 0
FIXTURE_DIRS = []
FORCE_SCRIPT_NAME = None
FORMAT_MODULE_PATH = None
FORM_RENDERER = 'django.forms.renderers.DjangoTemplates'
IGNORABLE_404_URLS = []
INSTALLED_APPS = ['django.contrib.admin', 'django.contrib.auth', 'django.contrib.contenttypes', 'django.contrib.sessions', 'django.contrib.messages', 'django.contrib.staticfiles', 'django.contrib.sites', 'django_celery_results', 'rest_framework_swagger', 'rest_framework', 'rest_framework.authtoken', 'rest_framework_simplejwt.token_blacklist', 'substrapp', 'node', 'users', 'corsheaders', 'events', 'node-register']
INTERNAL_IPS = []
LANGUAGES = [('af', 'Afrikaans'), ('ar', 'Arabic'), ('ast', 'Asturian'), ('az', 'Azerbaijani'), ('bg', 'Bulgarian'), ('be', 'Belarusian'), ('bn', 'Bengali'), ('br', 'Breton'), ('bs', 'Bosnian'), ('ca', 'Catalan'), ('cs', 'Czech'), ('cy', 'Welsh'), ('da', 'Danish'), ('de', 'German'), ('dsb', 'Lower Sorbian'), ('el', 'Greek'), ('en', 'English'), ('en-au', 'Australian English'), ('en-gb', 'British English'), ('eo', 'Esperanto'), ('es', 'Spanish'), ('es-ar', 'Argentinian Spanish'), ('es-co', 'Colombian Spanish'), ('es-mx', 'Mexican Spanish'), ('es-ni', 'Nicaraguan Spanish'), ('es-ve', 'Venezuelan Spanish'), ('et', 'Estonian'), ('eu', 'Basque'), ('fa', 'Persian'), ('fi', 'Finnish'), ('fr', 'French'), ('fy', 'Frisian'), ('ga', 'Irish'), ('gd', 'Scottish Gaelic'), ('gl', 'Galician'), ('he', 'Hebrew'), ('hi', 'Hindi'), ('hr', 'Croatian'), ('hsb', 'Upper Sorbian'), ('hu', 'Hungarian'), ('ia', 'Interlingua'), ('id', 'Indonesian'), ('io', 'Ido'), ('is', 'Icelandic'), ('it', 'Italian'), ('ja', 'Japanese'), ('ka', 'Georgian'), ('kab', 'Kabyle'), ('kk', 'Kazakh'), ('km', 'Khmer'), ('kn', 'Kannada'), ('ko', 'Korean'), ('lb', 'Luxembourgish'), ('lt', 'Lithuanian'), ('lv', 'Latvian'), ('mk', 'Macedonian'), ('ml', 'Malayalam'), ('mn', 'Mongolian'), ('mr', 'Marathi'), ('my', 'Burmese'), ('nb', 'Norwegian Bokmål'), ('ne', 'Nepali'), ('nl', 'Dutch'), ('nn', 'Norwegian Nynorsk'), ('os', 'Ossetic'), ('pa', 'Punjabi'), ('pl', 'Polish'), ('pt', 'Portuguese'), ('pt-br', 'Brazilian Portuguese'), ('ro', 'Romanian'), ('ru', 'Russian'), ('sk', 'Slovak'), ('sl', 'Slovenian'), ('sq', 'Albanian'), ('sr', 'Serbian'), ('sr-latn', 'Serbian Latin'), ('sv', 'Swedish'), ('sw', 'Swahili'), ('ta', 'Tamil'), ('te', 'Telugu'), ('th', 'Thai'), ('tr', 'Turkish'), ('tt', 'Tatar'), ('udm', 'Udmurt'), ('uk', 'Ukrainian'), ('ur', 'Urdu'), ('vi', 'Vietnamese'), ('zh-hans', 'Simplified Chinese'), ('zh-hant', 'Traditional Chinese')]
LANGUAGES_BIDI = ['he', 'ar', 'fa', 'ur']
LANGUAGE_CODE = 'en-us'
LANGUAGE_COOKIE_AGE = None
LANGUAGE_COOKIE_DOMAIN = None
LANGUAGE_COOKIE_NAME = 'django_language'
LANGUAGE_COOKIE_PATH = '/'
LEDGER = {'name': 'MyOrg1', 'core_peer_mspconfigpath': '/var/hyperledger/msp', 'channel_name': 'mychannel', 'chaincode_name': 'mycc', 'chaincode_version': '1.0', 'client': {'name': 'user', 'org': 'MyOrg1', 'state_store': '/tmp/hfc-cvs', 'key_path': '********************', 'cert_path': '/var/hyperledger/msp/signcerts/cert.pem', 'msp_id': 'MyOrg1MSP'}, 'peer': {'name': 'peer', 'host': 'network-org-1-peer-1.org-1', 'port': {'internal': 7051, 'external': 7051}, 'docker_core_dir': '/var/hyperledger/fabric_cfg', 'tlsCACerts': '/var/hyperledger/ca/cacert.pem', 'clientKey': '********************', 'clientCert': '/var/hyperledger/tls/client/pair/tls.crt', 'grpcOptions': {'grpc-max-send-message-length': 15, 'grpc.ssl_target_name_override': 'network-org-1-peer-1.org-1'}}, 'requestor': <hfc.fabric.user.User object at 0x7f327278e550>, 'hfc': <function get_hfc_client at 0x7f3276035598>}
LEDGER_CALL_RETRY = True
LEDGER_CONFIG_FILE = '/conf/MyOrg1/substra-backend/conf.json'
LEDGER_MAX_RETRY_TIMEOUT = 5
LEDGER_SYNC_ENABLED = True
LOCALE_PATHS = []
LOGGING = {'version': 1, 'disable_existing_loggers': False, 'formatters': {'verbose': {'format': '%(levelname)s %(asctime)s %(module)s %(process)d %(thread)d %(message)s'}, 'simple': {'format': '%(levelname)s - %(asctime)s - %(name)s - %(message)s'}}, 'filters': {'require_debug_false': {'()': 'django.utils.log.RequireDebugFalse'}}, 'handlers': {'mail_admins': {'level': 'ERROR', 'filters': ['require_debug_false'], 'class': 'django.utils.log.AdminEmailHandler'}, 'console': {'level': 'DEBUG', 'class': 'logging.StreamHandler', 'formatter': 'simple'}, 'error_file': {'level': 'INFO', 'filename': '/usr/src/app/backend.log', 'class': 'logging.handlers.RotatingFileHandler', 'maxBytes': 1048576, 'backupCount': 2, 'formatter': 'verbose'}}, 'loggers': {'django.request': {'handlers': ['mail_admins', 'error_file'], 'level': 'INFO', 'propagate': False}, 'events': {'handlers': ['console'], 'level': 'DEBUG', 'propagate': True}}}
LOGGING_CONFIG = 'logging.config.dictConfig'
LOGIN_REDIRECT_URL = '/accounts/profile/'
LOGIN_URL = '/accounts/login/'
LOGOUT_REDIRECT_URL = None
MANAGERS = []
MEDIA_ROOT = '/tmp/org-1/medias/'
MEDIA_URL = '/media/'
MESSAGE_STORAGE = 'django.contrib.messages.storage.fallback.FallbackStorage'
MIDDLEWARE = ['corsheaders.middleware.CorsMiddleware', 'django.middleware.security.SecurityMiddleware', 'django.contrib.sessions.middleware.SessionMiddleware', 'django.middleware.common.CommonMiddleware', 'django.middleware.csrf.CsrfViewMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', 'django.contrib.auth.middleware.RemoteUserMiddleware', 'django.contrib.messages.middleware.MessageMiddleware', 'django.middleware.clickjacking.XFrameOptionsMiddleware', 'libs.SQLPrintingMiddleware.SQLPrintingMiddleware', 'libs.HealthCheckMiddleware.HealthCheckMiddleware']
MIGRATION_MODULES = {}
MONTH_DAY_FORMAT = 'F j'
NUMBER_GROUPING = 0
ORG = 'MyOrg1'
ORG_NAME = 'MyOrg1'
PASSWORD_HASHERS = '********************'
PASSWORD_RESET_TIMEOUT_DAYS = '********************'
PEER_PORT = 7051
PREPEND_WWW = False
PROJECT_ROOT = '/usr/src/app'
REST_FRAMEWORK = {'TEST_REQUEST_DEFAULT_FORMAT': 'json', 'DEFAULT_RENDERER_CLASSES': ('rest_framework.renderers.JSONRenderer', 'rest_framework.renderers.BrowsableAPIRenderer'), 'DEFAULT_AUTHENTICATION_CLASSES': ['users.authentication.SecureJWTAuthentication', 'libs.expiryTokenAuthentication.ExpiryTokenAuthentication', 'libs.sessionAuthentication.CustomSessionAuthentication'], 'DEFAULT_PERMISSION_CLASSES': ['rest_framework.permissions.IsAuthenticated'], 'UNICODE_JSON': False, 'DEFAULT_VERSIONING_CLASS': 'libs.versioning.AcceptHeaderVersioningRequired', 'ALLOWED_VERSIONS': ('0.0',), 'DEFAULT_VERSION': '0.0'}
ROOT_URLCONF = 'backend.urls'
SECRET_FILE = '********************'
SECRET_KEY = '********************'
SECURE_BROWSER_XSS_FILTER = False
SECURE_CONTENT_TYPE_NOSNIFF = False
SECURE_HSTS_INCLUDE_SUBDOMAINS = False
SECURE_HSTS_PRELOAD = False
SECURE_HSTS_SECONDS = 0
SECURE_PROXY_SSL_HEADER = None
SECURE_REDIRECT_EXEMPT = []
SECURE_SSL_HOST = None
SECURE_SSL_REDIRECT = False
SERVER_EMAIL = 'root@localhost'
SESSION_CACHE_ALIAS = 'default'
SESSION_COOKIE_AGE = 1209600
SESSION_COOKIE_DOMAIN = None
SESSION_COOKIE_HTTPONLY = True
SESSION_COOKIE_NAME = 'sessionid'
SESSION_COOKIE_PATH = '/'
SESSION_COOKIE_SAMESITE = 'Lax'
SESSION_COOKIE_SECURE = False
SESSION_ENGINE = 'django.contrib.sessions.backends.db'
SESSION_EXPIRE_AT_BROWSER_CLOSE = False
SESSION_FILE_PATH = None
SESSION_SAVE_EVERY_REQUEST = False
SESSION_SERIALIZER = 'django.contrib.sessions.serializers.JSONSerializer'
SETTINGS_MODULE = 'backend.settings.server.dev'
SHORT_DATETIME_FORMAT = 'm/d/Y P'
SHORT_DATE_FORMAT = 'm/d/Y'
SIGNING_BACKEND = 'django.core.signing.TimestampSigner'
SILENCED_SYSTEM_CHECKS = []
SIMPLE_JWT = {'ACCESS_TOKEN_LIFETIME': '********************', 'AUTH_HEADER_TYPES': ('JWT',)}
SITE_HOST = 'substra-backend.MyOrg1.xyz'
SITE_ID = 1
SITE_PORT = '8000'
STATICFILES_DIRS = []
STATICFILES_FINDERS = ['django.contrib.staticfiles.finders.FileSystemFinder', 'django.contrib.staticfiles.finders.AppDirectoriesFinder']
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.StaticFilesStorage'
STATIC_ROOT = None
STATIC_URL = '/static/'
SUBSTRA_FOLDER = '/substra'
TASK = {'CAPTURE_LOGS': True, 'CLEAN_EXECUTION_ENVIRONMENT': True, 'CACHE_DOCKER_IMAGES': False}
TEMPLATES = [{'BACKEND': 'django.template.backends.django.DjangoTemplates', 'DIRS': [], 'APP_DIRS': True, 'OPTIONS': {'context_processors': ['django.template.context_processors.debug', 'django.template.context_processors.request', 'django.contrib.auth.context_processors.auth', 'django.contrib.messages.context_processors.messages']}}]
TEST_NON_SERIALIZED_APPS = []
TEST_RUNNER = 'django.test.runner.DiscoverRunner'
THOUSAND_SEPARATOR = ','
TIME_FORMAT = 'P'
TIME_INPUT_FORMATS = ['%H:%M:%S', '%H:%M:%S.%f', '%H:%M']
TIME_ZONE = 'UTC'
TRUE_VALUES = {'TRUE', 1, '1', 'on', 'yes', 'true', 'ON', 'YES', 'True', 'Y', 'On', 'y', 't', 'T'}
USE_I18N = True
USE_L10N = True
USE_THOUSAND_SEPARATOR = False
USE_TZ = True
USE_X_FORWARDED_HOST = False
USE_X_FORWARDED_PORT = False
WSGI_APPLICATION = 'backend.wsgi.application'
X_FRAME_OPTIONS = 'SAMEORIGIN'
YEAR_MONTH_FORMAT = 'F Y'
You're seeing this error because you have DEBUG = True in your
Django settings file. Change that to False, and Django will
display a standard page generated by the handler for this status code.
Traceback:
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] [2019-12-09 21:33:15,334: ERROR/ForkPoolWorker-1] exception calling callback for <Future at 0x7f50303a4908 state=finished raised _Rendezvous>
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] Traceback (most recent call last):
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] result = self.fn(*self.args, **self.kwargs)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/local/lib/python3.6/dist-packages/aiogrpc/utils.py", line 126, in _next
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] return next(self._iterator)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 392, in __next__
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] return self._next()
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 561, in _next
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] raise self
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] status = StatusCode.CANCELLED
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] details = "Locally cancelled by application!"
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] debug_error_string = "None"
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] >
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker]
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] During handling of the above exception, another exception occurred:
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker]
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] Traceback (most recent call last):
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] callback(self)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/lib/python3.6/asyncio/futures.py", line 417, in _call_set_state
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] dest_loop.call_soon_threadsafe(_set_state, destination, source)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/lib/python3.6/asyncio/base_events.py", line 637, in call_soon_threadsafe
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] self._check_closed()
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/lib/python3.6/asyncio/base_events.py", line 377, in _check_closed
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] raise RuntimeError('Event loop is closed')
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] RuntimeError: Event loop is closed
[backend-org-1-substra-backend-server-6b867c94c-p7t5q substra-backend]
[backend-org-1-substra-backend-server-6b867c94c-p7t5q substra-backend]
[backend-org-1-substra-backend-server-6b867c94c-p7t5q substra-backend] [SQL Queries for] /composite_traintuple/7d3419fa32def5678c555ce985305b12faa1e5c2cbc9cff449f4d5a17e3d0b84/
[backend-org-1-substra-backend-server-6b867c94c-p7t5q substra-backend]
[backend-org-1-substra-backend-server-6b867c94c-p7t5q substra-backend] [0.003] SELECT authtoken_token.key, authtoken_token.user_id, authtoken_token.created, auth_user.id, auth_user.password, auth_user.last_login, auth_user.is_superuser, auth_user.username, auth_user.first_name, auth_user.last_name, auth_user.email, auth_user.is_staff, auth_user.is_active, auth_user.date_joined FROM authtoken_token INNER JOIN auth_user ON (authtoken_token.user_id = auth_user.id) WHERE authtoken_token.key = '41c18852d55e6dfdc751f75c7f84bdb850a7f0cc'
[backend-org-1-substra-backend-server-6b867c94c-p7t5q substra-backend]
[backend-org-1-substra-backend-server-6b867c94c-p7t5q substra-backend] [TOTAL TIME: 0.003 seconds (1 queries)]
[backend-org-1-substra-backend-server-6b867c94c-p7t5q substra-backend] [09/Dec/2019 21:33:15] "GET /composite_traintuple/7d3419fa32def5678c555ce985305b12faa1e5c2cbc9cff449f4d5a17e3d0b84/ HTTP/1.1" 200 1806
[backend-org-1-substra-backend-server-6b867c94c-p7t5q substra-backend] [09/Dec/2019 21:33:16] "GET /readiness HTTP/1.1" 200 2
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] [2019-12-09 21:33:17,333: WARNING/ForkPoolWorker-1] Function invoke_ledger failed (<class 'substrapp.ledger_utils.LedgerTimeout'>): waitForEvent timed out. retrying in 2s
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] [2019-12-09 21:33:17,386: INFO/ForkPoolWorker-1] DISCOVERY: adding channel peers query
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] [2019-12-09 21:33:17,386: INFO/ForkPoolWorker-1] DISCOVERY: adding config query
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] [2019-12-09 21:33:17,387: INFO/ForkPoolWorker-1] DISCOVERY: adding chaincodes/collection query
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] [2019-12-09 21:33:17,468: ERROR/ForkPoolWorker-1] Task substrapp.tasks.tasks.prepare_tuple[7d3419fa32def5678c555ce985305b12faa1e5c2cbc9cff449f4d5a17e3d0b84] raised unexpected: cannot update traintuple 7d3419fa32def5678c555ce985305b12faa1e5c2cbc9cff449f4d5a17e3d0b84 - status already doing
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] Traceback (most recent call last):
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 382, in trace_task
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] R = retval = fun(*args, **kwargs)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 641, in __protected_call__
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] return self.run(*args, **kwargs)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/src/app/substrapp/tasks/tasks.py", line 428, in prepare_tuple
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] log_start_tuple(tuple_type, subtuple['key'])
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/src/app/substrapp/ledger_utils.py", line 98, in _wrapper
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] return fn(*args, **kwargs)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/src/app/substrapp/ledger_utils.py", line 320, in log_start_tuple
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] sync=True)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/src/app/substrapp/ledger_utils.py", line 98, in _wrapper
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] return fn(*args, **kwargs)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/src/app/substrapp/ledger_utils.py", line 210, in invoke_ledger
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] response = call_ledger('invoke', fcn=fcn, args=args, kwargs=params)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] File "/usr/src/app/substrapp/ledger_utils.py", line 187, in call_ledger
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] raise exception_class.from_response(response)
[backend-org-2-substra-backend-worker-56c79d57bd-tdv9b worker] substrapp.ledger_utils.LedgerResponseError: cannot update traintuple 7d3419fa32def5678c555ce985305b12faa1e5c2cbc9cff449f4d5a17e3d0b84 - status already doing
Seen while executing the test (random failure): tests/test_execution_compute_plan.py::test_compute_plan_aggregate_composite_traintuples
This issue was first noticed by @maeldebon (see Substra/substra#112 and Substra/substra#113)
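One possible mitigation sketch (hypothetical helper, not the repo's actual ledger_utils code, with TimeoutError and RuntimeError standing in for LedgerTimeout and LedgerResponseError): when an invoke is retried after a waitForEvent timeout, a "status already doing" response on the retry means the first attempt actually reached the ledger and can be treated as success:

```python
def invoke_with_retry(invoke, max_retries=1):
    """Retry a ledger invoke after a timeout, but treat a
    'status already doing' error on a retry as success: the first
    attempt reached the ledger even though waiting for the event
    timed out. Hypothetical sketch, not the repo's real code.
    """
    for attempt in range(max_retries + 1):
        try:
            return invoke()
        except TimeoutError:
            # stand-in for LedgerTimeout: retry unless exhausted
            if attempt == max_retries:
                raise
        except RuntimeError as exc:
            # stand-in for LedgerResponseError
            if attempt > 0 and "already doing" in str(exc):
                return None  # the first, timed-out attempt succeeded
            raise
```

This keeps genuine "already doing" conflicts on a first attempt fatal, and only downgrades them to success when they follow a timeout-triggered retry.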
The composite traintuple serializer requires the out_trunk_model_permission field to contain a "public" value, but then ignores it and replaces it with False.
We should simply not require the value at all.
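A minimal plain-Python sketch of the proposed change (the real code is a DRF serializer, omitted here; the "authorized_ids" key is an assumption about the payload shape): "public" is no longer required from the client and is always forced to False:

```python
def validate_out_trunk_model_permissions(data):
    """Validate an out_trunk_model_permissions payload.

    'public' is not required anymore: the backend overwrites it with
    False anyway. Hypothetical sketch; 'authorized_ids' is assumed to
    be the other key the serializer expects.
    """
    if "authorized_ids" not in data:
        raise ValueError("authorized_ids is required")
    # Force the value the backend actually uses, ignoring client input.
    return {"public": False, "authorized_ids": list(data["authorized_ids"])}
```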
Setup:
The worker in node 1 encounters an error: it tries to delete an image that doesn't exist:
ERROR 2020-05-27 19:38:56,487 substrapp.tasks.tasks 696 140190742099776 404 Client Error: Not Found ("reference does not exist")
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 261, in _raise_for_status
response.raise_for_status()
File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.35/images/substra/algo_dc32f8dd?force=True&noprune=False
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/src/app/substrapp/tasks/tasks.py", line 983, in remove_algo_images
client.images.remove(algo_docker, force=True)
File "/usr/local/lib/python3.6/dist-packages/docker/models/images.py", line 463, in remove
self.client.api.remove_image(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/docker/utils/decorators.py", line 19, in wrapped
return f(self, resource_id, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/docker/api/image.py", line 495, in remove_image
return self._result(res, True)
File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 267, in _result
self._raise_for_status(response)
File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 263, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/usr/local/lib/python3.6/dist-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.NotFound: 404 Client Error: Not Found ("reference does not exist")
(After a quick investigation: it looks like we're not catching the correct exception type: a NotFound is raised instead of an ImageNotFound.)
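In docker-py, ImageNotFound is a subclass of NotFound, and a 404 whose explanation does not mention a missing image (here: "reference does not exist") is raised as a plain NotFound. Catching the broader NotFound therefore covers both cases. A sketch with stand-in classes mirroring that hierarchy (not the real docker.errors module):

```python
class NotFound(Exception):
    """Stand-in for docker.errors.NotFound."""

class ImageNotFound(NotFound):
    """Stand-in for docker.errors.ImageNotFound (subclass of NotFound)."""

def remove_image_safely(remove, image):
    """Try to remove an image; ignore it if the daemon no longer has it.

    Catching NotFound (the parent class) also covers ImageNotFound,
    whereas catching only ImageNotFound lets a plain NotFound escape,
    as in the traceback above. Hypothetical helper for illustration.
    """
    try:
        remove(image)
    except NotFound:
        pass  # image already gone: nothing to clean up
```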
The unicity of an objective is currently based only on the description content. This allows objectives with the same metrics to be uploaded multiple times so that they are associated with different datasets.
However, it prevents the creation of an objective with the same description but a different metrics archive.
We could have the unicity check include both the description and the metrics: this would keep the same flexibility for re-uploading a metrics archive with another description, but would also allow uploading multiple metrics archives with the same description.
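If the check were extended this way, the unicity key could hash both archives together. A sketch (hypothetical function, not the repo's actual key scheme):

```python
import hashlib

def objective_key(description: bytes, metrics: bytes) -> str:
    """Hypothetical unicity key covering both the description and the
    metrics archive, instead of the description alone.

    Hashing the two digests separately (rather than concatenating raw
    bytes) avoids ambiguity about where one archive ends and the other
    begins.
    """
    h = hashlib.sha256()
    h.update(hashlib.sha256(description).digest())
    h.update(hashlib.sha256(metrics).digest())
    return h.hexdigest()
```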
Follow-up to #124
Currently, some important information isn't displayed in the logs. Not having access to this information makes it harder to troubleshoot errors and bugs. In particular, some of the algo exceptions on prod are being swallowed (every exception until the final retry), so we potentially lose critical troubleshooting data.
Not available from the logs:
Option 1
Restore celery logs: #213
Option 2
Maybe celery logging gives out too much information. In that case, we could explore writing log messages ourselves (e.g. logger.info(f'Starting task {task_id}'), etc.)
Option 3
(your idea here?)
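For Option 2, a minimal sketch of a hand-rolled wrapper (hypothetical helper; the real tasks are Celery tasks, and the logger name is an assumption). Logging the failure before re-raising means every intermediate exception reaches the logs, not just the final retry:

```python
import logging

logger = logging.getLogger("substrapp.tasks")

def run_with_logging(task_id, fn, *args, **kwargs):
    """Log start, success, and failure of a task explicitly, so
    exceptions on intermediate retries are not swallowed.
    Hypothetical wrapper for illustration.
    """
    logger.info("Starting task %s", task_id)
    try:
        result = fn(*args, **kwargs)
    except Exception:
        # logger.exception includes the full traceback
        logger.exception("Task %s failed", task_id)
        raise
    logger.info("Task %s succeeded", task_id)
    return result
```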
Traceback:
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] Traceback (most recent call last):
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] File "/usr/src/app/substrapp/tasks/tasks.py", line 460, in compute_task
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] res = do_task(subtuple, tuple_type)
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] File "/usr/src/app/substrapp/tasks/tasks.py", line 540, in do_task
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] local_volume = client.volumes.get(volume_id=volume_id)
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] File "/usr/local/lib/python3.6/dist-packages/docker/models/volumes.py", line 76, in get
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] return self.prepare_model(self.client.api.inspect_volume(volume_id))
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] File "/usr/local/lib/python3.6/dist-packages/docker/api/volume.py", line 114, in inspect_volume
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] return self._result(self._get(url), True)
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 235, in _result
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] self._raise_for_status(response)
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 231, in _raise_for_status
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] raise create_api_error_from_http_exception(e)
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] File "/usr/local/lib/python3.6/dist-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] raise cls(e, response=response, explanation=explanation)
[backend-org-2-substra-backend-worker-655dc75d9b-h8fxq worker] docker.errors.NotFound: 404 Client Error: Not Found ("get local-acbe3fabadb14a63d29e2ae9e23388d4b7e05037f0528ef12562281ef1f034c8-MyOrg2: no such volume")
To reproduce, use substra-tests:
pytest tests/test_execution_compute_plan.py::test_compute_plan
While trying to run the tests on the demo env, lots of calls fail with the following traceback:
│ substra-backend Traceback (most recent call last): │
│ substra-backend File "/usr/local/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner │
│ substra-backend response = get_response(request) │
│ substra-backend File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py", line 126, in _get_response │
│ substra-backend response = self.process_exception_by_middleware(e, request) │
│ substra-backend File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py", line 124, in _get_response │
│ substra-backend response = wrapped_callback(request, *callback_args, **callback_kwargs) │
│ substra-backend File "/usr/local/lib/python3.6/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view │
│ substra-backend return view_func(*args, **kwargs) │
│ substra-backend File "/usr/local/lib/python3.6/site-packages/rest_framework/viewsets.py", line 103, in view │
│ substra-backend return self.dispatch(request, *args, **kwargs) │
│ substra-backend File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 483, in dispatch │
│ substra-backend response = self.handle_exception(exc) │
│ substra-backend File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 443, in handle_exception │
│ substra-backend self.raise_uncaught_exception(exc) │
│ substra-backend File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 480, in dispatch │
│ substra-backend response = handler(request, *args, **kwargs) │
│ substra-backend File "./node/views/node.py", line 18, in list │
│ substra-backend nodes = query_ledger(fcn=self.ledger_query_call) │
│ substra-backend File "./substrapp/ledger_utils.py", line 98, in _wrapper │
│ substra-backend return fn(*args, **kwargs) │
│ substra-backend File "./substrapp/ledger_utils.py", line 192, in query_ledger │
│ substra-backend return call_ledger('query', fcn=fcn, args=args) │
│ substra-backend File "./substrapp/ledger_utils.py", line 123, in call_ledger │
│ substra-backend with get_hfc() as (loop, client): │
│ substra-backend File "/usr/local/lib/python3.6/contextlib.py", line 81, in __enter__ │
│ substra-backend return next(self.gen) │
│ substra-backend File "./substrapp/ledger_utils.py", line 113, in get_hfc │
│ substra-backend loop, client = LEDGER['hfc']() │
│ substra-backend File "./backend/settings/deps/ledger.py", line 103, in get_hfc_client │
│ substra-backend update_client_with_discovery(client, results) │
│ substra-backend File "./backend/settings/deps/ledger.py", line 123, in update_client_with_discovery │
│ substra-backend peer_info = msp[0] │
│ substra-backend IndexError: list index out of range
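A defensive sketch for the failing line (hypothetical helper mirroring the msp[0] access in update_client_with_discovery): failing with an explicit error when discovery returns no peers for an MSP is far easier to diagnose than a bare IndexError:

```python
def first_peer_info(msp):
    """Return the first peer entry from a discovery MSP result.

    Raises a descriptive error when the list is empty, whereas the
    bare msp[0] in the traceback above raises IndexError.
    Hypothetical helper for illustration.
    """
    if not msp:
        raise RuntimeError(
            "Discovery returned no peers for this MSP; "
            "check that the peers have joined the channel")
    return msp[0]
```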
The field out_trunk_model_permissions is serialized differently depending on the view:
- PermissionsSerializer for the compute plan view
- PrivatePermissionsSerializer for the single view
This field should be consistent in both views.
At first sight it seems we forgot to update the compute plan view in this commit.
We should probably add another requirements file (for example requirements-dev.txt) that would include the dependencies of scripts used for local development (start.py, populate.py, etc.).
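A sketch of how the split could look, using pip's `-r` include mechanism (the extra package name is purely illustrative, not an actual dependency of these scripts):

```
# requirements-dev.txt (hypothetical layout)
-r requirements.txt   # everything the backend itself needs
# plus packages used only by local-development scripts (start.py, populate.py, ...):
termcolor             # illustrative example only
```

Developers would then run `pip install -r requirements-dev.txt`, while production images keep installing only `requirements.txt`.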
On macOS Catalina:
substra-backend --> branch master --> skaffold dev --> breaks at step 4
It cannot fetch this resource, which blocks the install: http://archive.ubuntu.com/ubuntu/pool/main/p/publicsuffix/publicsuffix_20180223.1310-1_all.deb
Any suggestions please? :'(
Step 4/13 : RUN apt-get install -y git curl netcat
---> Running in 7f0fd0bf027b
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
git-man krb5-locales less libbsd0 libcurl3-gnutls libcurl4 libedit2
liberror-perl libgssapi-krb5-2 libk5crypto3 libkeyutils1 libkrb5-3
libkrb5support0 libnghttp2-14 libpsl5 librtmp1 libssl1.0.0 libx11-6
libx11-data libxau6 libxcb1 libxdmcp6 libxext6 libxmuu1 multiarch-support
netcat-traditional openssh-client publicsuffix xauth
Suggested packages:
gettext-base git-daemon-run | git-daemon-sysvinit git-doc git-el git-email
git-gui gitk gitweb git-cvs git-mediawiki git-svn krb5-doc krb5-user
keychain libpam-ssh monkeysphere ssh-askpass
The following NEW packages will be installed:
curl git git-man krb5-locales less libbsd0 libcurl3-gnutls libcurl4 libedit2
liberror-perl libgssapi-krb5-2 libk5crypto3 libkeyutils1 libkrb5-3
libkrb5support0 libnghttp2-14 libpsl5 librtmp1 libssl1.0.0 libx11-6
libx11-data libxau6 libxcb1 libxdmcp6 libxext6 libxmuu1 multiarch-support
netcat netcat-traditional openssh-client publicsuffix xauth
0 upgraded, 32 newly installed, 0 to remove and 6 not upgraded.
Need to get 8949 kB of archives.
After this operation, 50.7 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/main amd64 multiarch-support amd64 2.27-3ubuntu1 [6916 B]
Get:2 http://archive.ubuntu.com/ubuntu bionic/main amd64 libxau6 amd64 1:1.0.8-1 [8376 B]
Get:3 http://archive.ubuntu.com/ubuntu bionic/main amd64 libbsd0 amd64 0.8.7-1 [41.5 kB]
Get:4 http://archive.ubuntu.com/ubuntu bionic/main amd64 libxdmcp6 amd64 1:1.1.2-3 [10.7 kB]
Get:5 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libxcb1 amd64 1.13-2~ubuntu18.04 [45.5 kB]
Get:6 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libx11-data all 2:1.6.4-3ubuntu0.2 [113 kB]
Get:7 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libx11-6 amd64 2:1.6.4-3ubuntu0.2 [569 kB]
Get:8 http://archive.ubuntu.com/ubuntu bionic/main amd64 libxext6 amd64 2:1.3.3-1 [29.4 kB]
Get:9 http://archive.ubuntu.com/ubuntu bionic/main amd64 less amd64 487-0.1 [112 kB]
Get:10 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 krb5-locales all 1.16-2ubuntu0.1 [13.5 kB]
Get:11 http://archive.ubuntu.com/ubuntu bionic/main amd64 libedit2 amd64 3.1-20170329-1 [76.9 kB]
Get:12 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libkrb5support0 amd64 1.16-2ubuntu0.1 [30.9 kB]
Get:13 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libk5crypto3 amd64 1.16-2ubuntu0.1 [85.6 kB]
Get:14 http://archive.ubuntu.com/ubuntu bionic/main amd64 libkeyutils1 amd64 1.5.9-9.2ubuntu2 [8720 B]
Get:15 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libkrb5-3 amd64 1.16-2ubuntu0.1 [279 kB]
Get:16 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libgssapi-krb5-2 amd64 1.16-2ubuntu0.1 [122 kB]
Get:17 http://archive.ubuntu.com/ubuntu bionic/main amd64 libpsl5 amd64 0.19.1-5build1 [41.8 kB]
Get:18 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libssl1.0.0 amd64 1.0.2n-1ubuntu5.3 [1088 kB]
Get:19 http://archive.ubuntu.com/ubuntu bionic/main amd64 libxmuu1 amd64 2:1.1.2-2 [9674 B]
Get:20 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 openssh-client amd64 1:7.6p1-4ubuntu0.3 [614 kB]
Get:21 http://archive.ubuntu.com/ubuntu bionic/main amd64 publicsuffix all 20180223.1310-1 [97.6 kB]
Err:21 http://archive.ubuntu.com/ubuntu bionic/main amd64 publicsuffix all 20180223.1310-1
Undetermined Error [IP: 91.189.88.31 80]
Get:22 http://archive.ubuntu.com/ubuntu bionic/main amd64 xauth amd64 1:1.0.10-1 [24.6 kB]
Get:23 http://archive.ubuntu.com/ubuntu bionic/main amd64 libnghttp2-14 amd64 1.30.0-1ubuntu1 [77.8 kB]
Get:24 http://archive.ubuntu.com/ubuntu bionic/main amd64 librtmp1 amd64 2.4+20151223.gitfa8646d.1-1 [54.2 kB]
Get:25 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libcurl4 amd64 7.58.0-2ubuntu3.8 [214 kB]
Get:26 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 curl amd64 7.58.0-2ubuntu3.8 [159 kB]
Get:27 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libcurl3-gnutls amd64 7.58.0-2ubuntu3.8 [213 kB]
Get:28 http://archive.ubuntu.com/ubuntu bionic/main amd64 liberror-perl all 0.17025-1 [22.8 kB]
Get:29 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 git-man all 1:2.17.1-1ubuntu0.5 [803 kB]
Get:30 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 git amd64 1:2.17.1-1ubuntu0.5 [3912 kB]
Get:31 http://archive.ubuntu.com/ubuntu bionic/universe amd64 netcat-traditional amd64 1.10-41.1 [61.7 kB]
Get:32 http://archive.ubuntu.com/ubuntu bionic/universe amd64 netcat all 1.10-41.1 [3436 B]
Fetched 8851 kB in 5s (1666 kB/s)
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/p/publicsuffix/publicsuffix_20180223.1310-1_all.deb Undetermined Error [IP: 91.189.88.31 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
FATA[0174] exiting dev mode because first build failed: build failed: build failed: building [substrafoundation/celeryworker]: build artifact: unable to stream build output: The command '/bin/sh -c apt-get install -y git curl netcat' returned a non-zero code: 100
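The apt output itself suggests a common cause: `apt-get install` is running against a stale package index. A usual Dockerfile mitigation (a sketch of standard practice, not necessarily the fix for this particular transient mirror error) is to chain the update and the install in a single layer:

```dockerfile
# Refresh the index in the same RUN as the install, so a cached "update"
# layer cannot go stale relative to the Ubuntu mirror.
RUN apt-get update \
    && apt-get install -y --no-install-recommends git curl netcat \
    && rm -rf /var/lib/apt/lists/*
```

If the error persists, it may simply be a flaky mirror or a local network/proxy issue, in which case retrying `skaffold dev` can be enough.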
When I try to use the leaderboard command on a key which doesn't exist, I get the following error:
Error: Request failed: InternalServerError: 500 Server Error: Internal Server Error for url: http://substra-backend.node-1.com/objective/foo/leaderboard/?sort=desc
Maybe we could have a more specific error explaining that the key doesn't exist?
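A hedged sketch of the suggested behaviour: map a missing objective key to an explicit 404 instead of an opaque 500 (the names and control flow here are illustrative, not the backend's actual view code):

```python
class ObjectiveNotFound(Exception):
    """Raised when the requested objective key is not on the ledger."""


def leaderboard_status(objective_key: str, known_keys: set) -> int:
    """Return the HTTP status the leaderboard endpoint could answer with."""
    try:
        if objective_key not in known_keys:
            raise ObjectiveNotFound(objective_key)
        return 200  # leaderboard found and serialized
    except ObjectiveNotFound:
        return 404  # explicit "objective key not found" rather than a 500
```

With this kind of mapping, `GET /objective/foo/leaderboard/` on an unknown key would answer 404 with a clear message.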
While executing tuples that output pretty large models (1GB), I ran into the following issue:
ERROR 2020-02-05 15:30:50,114 substrapp.tasks.tasks 15 140710911403840 [00-01-0004-969dae2]
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 261, in _raise_for_status
response.raise_for_status()
File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 409 Client Error: Conflict for url: http+docker://localhost/v1.35/containers/create?name=compositeTraintuple_12031be8_train
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 385, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 650, in __protected_call__
return self.run(*args, **kwargs)
File "/usr/src/app/substrapp/tasks/tasks.py", line 530, in compute_task
max_retries=int(getattr(settings, 'CELERY_TASK_MAX_RETRIES')))
File "/usr/local/lib/python3.6/dist-packages/celery/app/task.py", line 704, in retry
raise_with_context(exc)
File "/usr/src/app/substrapp/tasks/tasks.py", line 524, in compute_task
res = do_task(subtuple, tuple_type)
File "/usr/src/app/substrapp/tasks/tasks.py", line 590, in do_task
org_name
File "/usr/src/app/substrapp/tasks/tasks.py", line 743, in _do_task
environment=environment
File "/usr/src/app/substrapp/tasks/utils.py", line 252, in compute_docker
client.containers.run(**task_args)
File "/usr/local/lib/python3.6/dist-packages/docker/models/containers.py", line 803, in run
detach=detach, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/docker/models/containers.py", line 861, in create
resp = self.client.api.create_container(**create_kwargs)
File "/usr/local/lib/python3.6/dist-packages/docker/api/container.py", line 430, in create_container
return self.create_container_from_config(config, name)
File "/usr/local/lib/python3.6/dist-packages/docker/api/container.py", line 441, in create_container_from_config
return self._result(res, True)
File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 267, in _result
self._raise_for_status(response)
File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 263, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/usr/local/lib/python3.6/dist-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 409 Client Error: Conflict ("Conflict. The container name "/compositeTraintuple_12031be8_train" is already in use by container "3d5cb300703d3dcec4321ee83612d7c0cee2a83743faf070dfa54e969461f8e2". You have to remove (or rename) that container to be able to reuse that name.")
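The error message offers two ways out: remove the stale container or rename the new one. A minimal sketch of the "rename" option as a pure helper (the real fix might instead remove the leftover container via the Docker API before retrying):

```python
def safe_container_name(base: str, existing: set) -> str:
    """Return a container name not present in `existing`, suffixing a counter
    when the base name (e.g. "compositeTraintuple_12031be8_train") is taken.
    Illustrative sketch only, not the backend's actual retry logic."""
    if base not in existing:
        return base
    i = 1
    while f"{base}_{i}" in existing:
        i += 1
    return f"{base}_{i}"
```

Either way, a retried task should not collide with a container left behind by its previous attempt.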
hlf-k8s creates configuration files at <substra folder>/conf/config/conf-<org>.json. However, the backend's start.py script looks for files at <substra folder>/conf/<org>/substra-backend/conf.json. As a result, the backend's start.py script doesn't launch any backend.
Hello there,
It would be nice to document somewhere in the Helm chart that passwords need to be at least 20 characters long. Ideally, having a validation schema would help operators a lot (and make my life a little bit easier).
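Helm supports a values.schema.json file next to values.yaml that is validated automatically on install and upgrade. A sketch of what the 20-character rule could look like (the value path backend.password is hypothetical; the chart's actual keys would need to be used):

```json
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "backend": {
      "type": "object",
      "properties": {
        "password": {
          "type": "string",
          "minLength": 20,
          "description": "Backend user password; must be at least 20 characters."
        }
      }
    }
  }
}
```

With this in place, `helm install` fails fast with a schema error instead of the backend rejecting the password later.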
When a training task pod is evicted by Kubernetes, the backend waits forever for the pod to either complete or fail.
However, since "Evicted" is not a failure condition that is currently supported in the backend, the pod stays in the Evicted state forever and the tuple stays in the doing state forever (see also the related issue in the substra repository).
It is possible to create an aggregatetuple without passing any --in-model-key in the command.
The in_models_keys field of an aggregatetuple isn't required and has a minimum length of 0.
As the sole purpose of an aggregatetuple is to aggregate at least 2 models together, in_models_keys should be required and should have a minimum length of 2.
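A minimal sketch of the proposed constraint (the function name is hypothetical; this is not the actual serializer code):

```python
def validate_in_models_keys(in_models_keys):
    """Enforce the proposed rule: the field is required and must reference
    at least 2 models, since aggregating fewer makes no sense."""
    if in_models_keys is None:
        raise ValueError("in_models_keys is required")
    if len(in_models_keys) < 2:
        raise ValueError("an aggregatetuple must aggregate at least 2 models")
    return in_models_keys
```

In the real backend this check would live in the aggregatetuple input validation (serializer or CLI), rejecting the request before anything is registered on the ledger.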
While inspecting the logs of the worker I stumbled upon:
[2020-02-05 16:01:41,804: ERROR/ForkPoolWorker-1] _GatheringFuture exception was never retrieved
future: <_GatheringFuture finished exception=CancelledError()>
concurrent.futures._base.CancelledError
INFO 2020-02-05 16:01:57,298 substrapp.ledger_utils 15 140710911403840 smartcontract invoke:logSuccessTrain; elaps=16270.34ms; error=None
[2020-02-05 16:01:57,524: ERROR/ForkPoolWorker-1] _GatheringFuture exception was never retrieved
future: <_GatheringFuture finished exception=CancelledError()>
concurrent.futures._base.CancelledError
For a composite traintuple, if the head and the trunk out models are identical, the saving fails.
That's because not only do the two models have the same composite traintuple key (expected), they ALSO have the same value.
The hash is computed from the traintuple key and the value, so the hashes are the same for the head and the trunk model, which leads to a pkhash conflict.
There was an aborted attempt at fixing this issue.
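The collision can be reproduced in a few lines; the fix sketched here (salting the hash with the model type) is one hypothetical option, not the repository's actual implementation:

```python
import hashlib


def model_pkhash(traintuple_key: str, model_bytes: bytes, model_type: str = "") -> str:
    """Illustrative pkhash. With the default model_type="", identical head and
    trunk model bytes under the same traintuple key hash to the same value,
    which is exactly the conflict described above. Salting with "head"/"trunk"
    disambiguates the two models."""
    data = model_type.encode() + traintuple_key.encode() + model_bytes
    return hashlib.sha256(data).hexdigest()
```

With model_type left empty, head and trunk produce the same digest; passing distinct types yields distinct pkhashes even for byte-identical models.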
The following fields should be optional:
I ran into a particular issue.
The event app crashes because of a closed socket. I didn't manage to set up a local environment to reproduce it easily.
Here are some logs:
On peer side
2020-01-22 09:20:21.257 UTC [comm.grpc.server] 1 -> INFO 1083 streaming call completed grpc.service=protos.Deliver grpc.method=Deliver grpc.peer_address=10.1.1.1:43510 grpc.peer_subject="CN=user,OU=peer,O=Hyperledger,ST=North Carolina, C=US" error="context finished before block retrieved: context canceled" grpc.code=Unknown grpc.call_duration=30.002261971s
On backend side
Traceback (most recent call last):
File "/usr/local/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/local/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "./events/apps.py", line 134, in wait
loop.run_until_complete(stream)
File "/usr/local/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "/usr/local/lib/python3.6/site-packages/hfc/fabric/channel/channel_eventhub.py", line 545, in handle_stream
async for event in stream:
File "/usr/local/lib/python3.6/site-packages/aiogrpc/utils.py", line 138, in __anext__
return await asyncio.shield(self._next_future, loop=self._loop)
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.6/site-packages/aiogrpc/utils.py", line 126, in _next
return next(self._iterator)
File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 392, in __next__
return self._next()
File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 561, in _next
raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Socket closed"
debug_error_string = "{"created":"@1579684585.868278838","description":"Error received from peer ipv4:194.167.143.126:443","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"Socket closed","grpc_status":14}"
I found some information suggesting that we may need to change some gRPC parameters.
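For "Socket closed" disconnects like the one above, the gRPC channel arguments people usually tune are the keepalive ones. A hedged sketch of values that could be passed as channel `options` when the fabric-sdk-py/aiogrpc channel is created (the option names are standard gRPC core channel args; the numbers are guesses, not tested values):

```python
# Keepalive channel arguments (standard gRPC core option names).
# The values below are illustrative guesses, not validated settings.
GRPC_KEEPALIVE_OPTIONS = [
    ("grpc.keepalive_time_ms", 60000),           # send a keepalive ping every 60 s
    ("grpc.keepalive_timeout_ms", 20000),        # drop the connection if no ack in 20 s
    ("grpc.keepalive_permit_without_calls", 1),  # ping even when no RPC is in flight
    ("grpc.http2.max_pings_without_data", 0),    # do not limit pings on idle streams
]

# These would be passed as `options=GRPC_KEEPALIVE_OPTIONS` to
# grpc.secure_channel(...), or to whatever SDK call creates the event channel.
```

The peer side would likely need matching keepalive enforcement settings, otherwise aggressive client pings can themselves get the connection dropped.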