nixos / nixops
NixOps is a tool for deploying to NixOS machines in a network or cloud.
Home Page: https://nixos.org/nixops
License: GNU Lesser General Public License v3.0
It would be nice to be able to query the value of a NixOS configuration option, similar to nixos-option, e.g.
charon option machine1.services.logstash.enable
which would show the value of services.logstash.enable on machine1. Perhaps there could even be a variant that queries it for all machines.
Obviously, it's not desirable to store LUKS encryption keys on the target machine. But not having them there breaks unattended reboots. We should be able to support unattended boots like this:
EC2 can briefly return "instance ID does not exist" errors after an instance has been created. Charon should handle this properly.
Example:
$ charon -s apache-ec2-multizone.json deploy
creating EC2 instance ‘backend1’ (AMI ‘ami-732c1407’, type ‘m1.small’, region ‘eu-west-1’)...
creating EC2 instance ‘backend2’ (AMI ‘ami-d9409fb0’, type ‘m1.small’, region ‘us-east-1’)...
creating EC2 instance ‘proxy’ (AMI ‘ami-732c1407’, type ‘m1.small’, region ‘eu-west-1’)...
waiting for IP address of ‘backend2’... [pending] error: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidInstanceID.NotFound</Code><Message>The instance ID 'i-e1e51ba9' does not exist</Message></Error></Errors><RequestID>bb416482-01c1-433d-9214-6d24cda529ea</RequestID></Response>
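One way to handle this would be to retry the lookup for a short while before giving up. The following is only a sketch of that idea, not Charon's actual code: InstanceNotFound and retry_not_found are hypothetical names, and the real EC2 error would have to be mapped onto the exception by checking for the InvalidInstanceID.NotFound code.

```python
import time


class InstanceNotFound(Exception):
    """Hypothetical stand-in for the EC2 'InvalidInstanceID.NotFound' error."""


def retry_not_found(fetch, attempts=5, delay=1.0, sleep=time.sleep):
    """Call fetch() until it stops raising InstanceNotFound.

    The delay doubles after every failed attempt; the last failure is
    re-raised so a genuinely missing instance is still reported.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except InstanceNotFound:
            if attempt == attempts - 1:
                raise
            sleep(delay)
            delay *= 2
```

The sleep parameter is injectable only so that the helper can be exercised without actually waiting.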
Some actions, like resizing an EC2 instance, produce a warning to stop the instance before redeploying. It would be nice if charon could stop the machine for us and then resize it.
When trying to connect to an IPv6 address, the nc invocation that checks whether TCP port 22 accepts connections reports the following:
.Error: Couldn't resolve host "2001:610:685:1:216:3eff:fe01:00fe"
.Error: Couldn't resolve host "2001:610:685:1:216:3eff:fe01:00fe"
.Error: Couldn't resolve host "2001:610:685:1:216:3eff:fe01:00fe"
.Error: Couldn't resolve host "2001:610:685:1:216:3eff:fe01:00fe"
and charon never gets past this phase.
Apparently, netcat does not understand IPv6 addresses.
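A possible workaround would be to do the port check without nc. As a rough sketch (port_open is a hypothetical helper, not something Charon provides), Python's socket.create_connection resolves the host through getaddrinfo, so it accepts IPv4 literals, IPv6 literals and hostnames alike:

```python
import socket


def port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds.

    socket.create_connection resolves the host with getaddrinfo, so both
    IPv4 and IPv6 literals (and hostnames) are accepted.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False
```

For example, port_open("2001:610:685:1:216:3eff:fe01:00fe") would attempt the connection over IPv6, provided the local machine has IPv6 connectivity.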
Similar to the current charon ssh command, it would be nice to have an scp feature to easily transfer files to and from a machine.
If I run "charon destroy" in a directory without a network.json file, I get the following error:
[HEAD_HG batch]$ charon destroy
error: [Errno 2] No such file or directory: '........./network.json'
and it leaves a network.json.lock file lying around.
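A fix could check for the state file before taking the lock, and always remove the lock file on the way out. This is only a sketch under the assumption that the lock is a plain "<state file>.lock" file; locked_state is a hypothetical name and Charon's real locking code may look quite different.

```python
import os
from contextlib import contextmanager


@contextmanager
def locked_state(path):
    """Hold '<path>.lock' while the state file at path is in use.

    The existence check happens before the lock file is created, and the
    lock file is removed on exit even if the body raises.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(f"state file not found: {path}")
    lock_path = path + ".lock"
    # O_EXCL makes creation fail if the lock file already exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield path
    finally:
        os.close(fd)
        os.remove(lock_path)
```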
The current output of "charon info" is not useful for automated processing, so some simple ASCII (or JSON / XML) output would be useful.
Currently Charon performs LUKS and filesystem creation. That's annoying because it requires the necessary tools to be present in the base image (unless we copy them over beforehand using nix-copy-closure). It would be better to move this to an Upstart task executed before mountall. Then it's part of the system closure, and it simplifies Charon.
Elastic IPs can be taken by any instance at any point. If another instance grabs one, charon deploy on the original instance with that elastic IP goes wrong (the SSH host key differs). charon deploy --check does not fix it. I needed to remove elasticIP from the JSON to fix the situation.
It would be nice to have a flag like
charon deploy --extra foo.nix
that adds foo.nix to the model for this deployment only. We could even have options on the command line, e.g.
charon deploy --extra-option 'webserver.services.httpd.enable = false'
to disable Apache for this one deployment.
It should be possible to specify the EC2 account to use (i.e. the EC2 access keys) in the deployment model.
e.g. SystemTypes, Create User. Check which others are missing.
For some cases, it might be desirable to specify network (or some attribute of network, e.g. network.description) in multiple files and use mkOverride to determine which gets priority. Right now a list is created, which doesn't work in all cases.
It would be nice to be able to refer to instance state, like instance-id / region / etc from within the charon network specification.
Charon currently sets up VPN connections over SSH for EC2 machines in different regions. It would be very useful if this were possible between arbitrary machines, e.g., EC2 machines in the same region, or non-EC2 ("physical") machines.
To ensure that we don't accidentally make unencrypted connections, Charon should generate VPN IP addresses and use those in /etc/hosts and other places.
Charon could use a Nix profile to store all previous top-level machine configurations. Then we could roll back all or some machines to a previous configuration.
Charon's deployment.* options used to be part of NixOS and so included in the configuration.nix(5) manpage, but now they're part of the Charon distribution. So a separate manpage should be generated.
Steps to reproduce:
It would be useful to have the ability to change the logical name of a deployed machine in the state file. For instance, in a network consisting of a primary and a secondary (hot standby) database server, if the primary machine fails, we could fail over like this:
This is also useful to support machine name changes in the logical model without destroying/creating machines.
Nix builders should be able to call Nix to build things. This is essential if we want to use Nix as a "low-level" build tool (i.e. as a Make replacement), since then we need to support derivations that unpack a source distribution containing a Nix expression to do the rest of the build.
Currently "charon deploy --dry-run" just runs "nix-build --dry-run", so it shows what derivations would be built. But it should also show what machines/volumes would be created, and so on.
Charon needs automated tests. Only problem is that testing the VirtualBox and EC2 backends is hard in a Hydra job.
Sometimes when you make a mistake in one of the Nix expressions, you can't track down its origin, which can be very annoying. It would be nice to have a --show-trace option that can help you a bit.
Implementing this is not so hard, I think: just pass this parameter to any nix-build invocation.
VirtualBox is pretty but buggy. A QEMU/KVM backend would be nicer.
When activation fails on one machine, we should still let the other machines continue. Interrupting activation of a configuration can give an inconsistent system.
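Roughly, the activation loop could collect failures instead of aborting on the first one. A minimal sketch, assuming a per-machine activate callable (both activate_all and activate are hypothetical names used only for illustration):

```python
def activate_all(machines, activate):
    """Run activate(name) for every machine, collecting failures.

    Returns a dict mapping machine name to the exception it raised; an
    empty dict means every activation succeeded.
    """
    failures = {}
    for name in machines:
        try:
            activate(name)
        except Exception as exc:
            # Keep going so the remaining machines still get activated;
            # the caller reports the failures at the end.
            failures[name] = exc
    return failures
```

The caller would then print the failed machines and exit with a non-zero status if the returned dict is not empty.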
Add option to specify a persistent disk (like EBS), with size and maybe (?) FS type as configuration options.
It is possible for the Nix expressions defining the Charon network to change between the first and second evaluation, for instance if you edit the expressions while running "charon deploy" in the background. Example scenario (which happened to me):
Kinda tricky to prevent this. We could at least check after the second evaluation that the timestamps on the network expressions haven't changed.
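The timestamp check could look something like the sketch below. The helper names are hypothetical, and it assumes the list of files that make up the network expressions is known up front, which may not hold if they import other files.

```python
import os


def snapshot_mtimes(paths):
    """Record the modification time (in nanoseconds) of each file."""
    return {p: os.stat(p).st_mtime_ns for p in paths}


def changed_since(snapshot):
    """Return the sorted paths whose mtime differs from the snapshot."""
    return sorted(p for p, t in snapshot.items() if os.stat(p).st_mtime_ns != t)
```

The snapshot would be taken before the first evaluation, and the deployment aborted if changed_since returns a non-empty list after the second one.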
It might happen sometimes that an EC2 EBS volume gets manually (outside of charon) detached. It would be nice if charon could recover from this by re-attaching the volumes that are supposed to be there, assuming they are still available. Perhaps a stop/start is needed to do this cleanly.
We already lock the JSON file if we rewrite it, but this doesn't prevent multiple simultaneous invocations of "charon deploy" from messing with each other. So an exclusive lock should be acquired globally.
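On Unix, one way to get such a lock is flock on a dedicated lock file, held for the whole run. This is just a sketch: deployment_lock and the lock file location are assumptions, not what Charon currently does.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def deployment_lock(lock_path):
    """Hold an exclusive, non-blocking flock on lock_path.

    Raises BlockingIOError if another invocation already holds the lock.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

Unlike a lock file that has to be deleted, the kernel drops a flock automatically when the process exits, so a crashed deploy cannot leave a stale lock behind.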
If a volume is detached from its original machine, it should not be deleted when the original machine is terminated.
When deployment.targetEnv = "adhoc"; is specified, charon reports this option as invalid.
Maybe Charon should (by default) ask for interactive confirmation when it tries to do something destructive (like deleting an EBS volume). Of course, for non-interactive use there should be a flag to force the answer to yes or no.
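The prompt logic itself is small. A sketch, where confirm and its force parameter are hypothetical and would be wired up to whatever command-line flags get chosen:

```python
def confirm(question, force=None, ask=input):
    """Return True if a destructive action may proceed.

    force=True/False answers without prompting (non-interactive use);
    force=None asks the user and treats anything but y/yes as "no".
    """
    if force is not None:
        return force
    answer = ask(f"{question} [y/N] ").strip().lower()
    return answer in ("y", "yes")
```

Defaulting to "no" on an empty answer keeps an accidental Enter from deleting anything.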
Might be nice if Charon could automatically allocate/deallocate elastic IPs.
If, in the process of activating a new configuration, any Upstart job fails, it should be reported to the user. And maybe an automatic rollback should be started.
Encrypted disk support means that we must get the key from somewhere, possibly a remote location. So this should be pluggable.
Of course this becomes extra tricky if the disk is mounted in the initrd. It might be an option to get the key from the EC2 userdata.
If the state file of a Charon deployment is lost, it would be nice if it could be reconstructed automatically given the UUID of the network.
A related feature would be to enumerate all known machine instances in all backends (along with their network UUIDs).
So you could recover like this:
$ charon discover-machines
ec2 33bced96-5f26-11e1-b9d7-9630d48abec1 foo
ec2 33bced96-5f26-11e1-b9d7-9630d48abec1 bar
$ charon -s ./state.json recover 33bced96-5f26-11e1-b9d7-9630d48abec1
Attaching an elastic IP while the machine hasn't started yet seems to fail randomly. So wait until it's started.
This is likely to fail when a server is stopped and started, especially when part of the Nix store is on ephemeral storage, which is lost on stop/start.
The Perl bindings are a problem on some platforms (in particular Cygwin and cross-compiled environments), so it would be nice if there was a configure flag to disable building them.
However, the Perl bindings are now used all over the place (e.g. in download-using-manifests), so we'd need some fallback code to handle the absence of the bindings. For instance, functions such as isValidPath() and queryPathHash() can be emulated slowly by calling nix-store.