This issue consolidates the discussion about the reasons for a reload configuration endpoint and its API design from https://basecamp.com/1791706/projects/15357148/messages/92126983 so that both can be discussed on the call.
The primary reason for this endpoint at the moment is to trigger the renewal of a device's TLS certificate at a defined point in time when the device is not in use. The API could also be used to trigger software updates or other configuration changes, or to support monitoring data export, e.g. Prometheus.
Issue
- Some devices may not be able to replace the existing certificate without affecting the device's primary operation (e.g. having to restart the application to install the certificate, leading to loss of control and video).
- All possible steps should be taken to avoid this, such as being able to reload just the NMOS module (e.g. on SIGHUP). Apache, for example, has a feature called 'graceful' that loads a new config without a full restart (https://httpd.apache.org/docs/2.4/stopping.html).
- Some devices may be unable to generate a new key pair (RSA or ECDSA) without affecting the device's primary operation.
- All possible steps should be taken to avoid this, such as using less resource-intensive key algorithms and appropriate hardware acceleration.
Ultimately, if a certificate expires, the device should perform the renewal regardless of whether this will affect the primary operation of the device.
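As an illustration of the SIGHUP-style reload mentioned in the list above, here is a minimal Python sketch. It assumes a POSIX system and a hypothetical `load_context` callable that builds whatever TLS state the device serves with; neither the class nor its method names come from any NMOS specification. The signal handler only sets a flag, so the device can apply the reload at a moment when it is not in use.

```python
import signal


class ReloadableTLS:
    """Holds the active TLS state and swaps it only when the device chooses to."""

    def __init__(self, load_context):
        # load_context is a hypothetical callable returning freshly loaded TLS state.
        self._load_context = load_context
        self.context = load_context()
        self.reload_pending = False

    def request_reload(self, signum=None, frame=None):
        # Signal handlers should do as little as possible: just record the request.
        self.reload_pending = True

    def apply_pending_reload(self):
        # Call this from the main loop at a point where a reload is not disruptive.
        if not self.reload_pending:
            return False
        self.context = self._load_context()
        self.reload_pending = False
        return True


def install_sighup_handler(tls):
    # SIGHUP is only available on POSIX platforms.
    signal.signal(signal.SIGHUP, tls.request_reload)
```

Deferring the actual reload to `apply_pending_reload` is one way to keep the rest of the application running while only the TLS state is replaced.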
Practical Examples
- mbed TLS (https://tls.mbed.org/) cannot replace a certificate without a reload, and generating an RSA key causes an exception. Generating ECDSA keys has not been tested.
Proposal
The proposal is to add a new endpoint, /reload-config, to the NMOS specs.
The /reload-config endpoint will cause the device to trigger the renewal process for all of its certificates. During this time the primary operation of the device may be affected.
The /reload-config endpoint must require authentication, as the effect of calling it is disruptive to the operation of the device.
Prometheus has a similar feature for triggering the reload of a new configuration using an API: sending an HTTP POST request to the /-/reload endpoint (https://prometheus.io/docs/prometheus/latest/configuration/configuration/).
If the certificate renewal operation is unsuccessful, the device should carry on using the original certificate if it is still valid, and the operation should be retried at an appropriate time.
An automatic or manual check should be performed after issuing the command to confirm that the certificate has been renewed.
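The fallback behaviour for a failed renewal could be sketched roughly as below. The function name, the `retry_delay` parameter and the choice to cap the retry time at the certificate's expiry are assumptions for illustration; the proposal only says the retry should happen "at an appropriate time".

```python
from datetime import datetime, timedelta, timezone


def plan_after_failed_renewal(not_after, now, retry_delay=timedelta(hours=1)):
    """Return (keep_current_certificate, retry_at) after a failed renewal.

    not_after is the expiry time of the certificate currently in use.
    """
    if now < not_after:
        # Still valid: keep serving with it and retry later, but no later than expiry.
        return True, min(now + retry_delay, not_after)
    # Already expired: nothing usable to fall back on, so retry straight away.
    return False, now
```

For example, a device whose certificate expires in 30 days would keep it and retry after `retry_delay`, while a device whose certificate has already expired would retry immediately.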
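One possible automatic check is to compare the fingerprint of the certificate the device presents before and after the request. The sketch below uses only the Python standard library; the function names are illustrative, and `fetch_fingerprint` deliberately does not validate the certificate chain, since the aim here is only to detect that the certificate changed.

```python
import hashlib
import ssl


def pem_fingerprint(pem):
    # SHA-256 over the DER encoding, so PEM whitespace differences do not matter.
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha256(der).hexdigest()


def fetch_fingerprint(host, port=443):
    # Retrieves the certificate the server currently presents, without validating it.
    pem = ssl.get_server_certificate((host, port))
    return pem_fingerprint(pem)


def certificate_renewed(fingerprint_before, fingerprint_after):
    return fingerprint_before != fingerprint_after
```

A controller could call `fetch_fingerprint` before sending the request and again after a suitable delay, then treat an unchanged fingerprint as a sign that the renewal has not happened yet.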
HTTP POST
https://<hostname>/x-nmos/<TBC>/<version>/reload-config
Response:
HTTP 202: The request has been received but not yet acted upon
HTTP 403: Forbidden, the client does not have the required access rights to perform this action
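As a rough sketch of how a device might decide between these responses, assuming bearer tokens as the authentication scheme (the proposal only says authentication is required, not how) and a hypothetical `schedule_renewal` callback that queues the work. The 405 response for non-POST methods is also an assumption, not part of the proposal.

```python
from http import HTTPStatus


def handle_reload_config(method, bearer_token, allowed_tokens, schedule_renewal):
    """Decide the response status for a request to /reload-config."""
    if method != "POST":
        # Assumption: other methods are rejected; the proposal does not cover this.
        return HTTPStatus.METHOD_NOT_ALLOWED
    if bearer_token not in allowed_tokens:
        return HTTPStatus.FORBIDDEN
    # Only queue the work: 202 means received but not yet acted upon.
    schedule_renewal()
    return HTTPStatus.ACCEPTED
```

Returning 202 before the renewal runs lets the device defer the disruptive part to a time when it is not in use.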
Outstanding Questions:
- Is this API really required, or could an alternative workaround be found? If it is only needed to allow key generation on low-power devices, EST server-side key generation could be used, although support for the server-side key generation endpoint in EST implementations is limited.
If it is decided that the endpoint is required, further design decisions are needed:
- Should a Node poll an endpoint on a controller, or should the controller send a request to an endpoint on the Node? The preference seems to be push based from a controller, discussion here: https://basecamp.com/1791706/projects/15357148/messages/92126983
- If running on Nodes, should it be its own API or be added to an existing API spec, e.g. the Node API?
- How would you discover this endpoint?
- What are the security considerations of an API that can restart your application?
On the next call I would like to come to a decision as to whether this API is required.
Could anyone with practical examples of why this API is required please get in contact, and I can add them anonymously to the list of practical examples.