colmena's Introduction

Colmena

Colmena is a simple, stateless NixOS deployment tool modeled after NixOps and morph, written in Rust. It's a thin wrapper over Nix commands like nix-instantiate and nix-copy-closure, and supports parallel deployment.

Now with 100% more flakes! See Tutorial with Flakes below.

$ colmena apply --on @tag-a
[INFO ] Enumerating nodes...
[INFO ] Selected 7 out of 45 hosts.
  (...) ✅ 0s Build successful
  sigma 🕗 7s copying path '/nix/store/h6qpk8rwm3dh3zsl1wlj1jharzf8aw9f-unit-haigha-agent.service' to 'ssh://[email protected]'...
  theta ✅ 7s Activation successful
  gamma 🕘 8s Starting...
  alpha ✅ 1s Activation successful
epsilon 🕗 7s copying path '/nix/store/fhh4rfixny8b21l6jqzk7nqwxva5k20h-nixos-system-epsilon-20.09pre-git' to 'ssh://[email protected]'...
   beta 🕗 7s removing obsolete file /boot/kernels/z28ayg10kpnlrz0s2qrb9pzv82lc20s2-initrd-linux-5.4.89-initrd
  kappa ✅ 2s Activation successful

Installation

colmena is included in Nixpkgs beginning with 21.11.

Use the following command to enter a shell environment with the colmena command:

nix-shell -p colmena

Unstable Version

To install the latest development version to your user profile:

nix-env -if https://github.com/zhaofengli/colmena/tarball/main

Alternatively, if you have a local clone of the repo:

nix-env -if default.nix

A public binary cache is available at https://colmena.cachix.org, courtesy of Cachix. This binary cache contains unstable versions of Colmena built by GitHub Actions.

Tutorial

See Tutorial with Flakes for usage with Nix Flakes.

Colmena should work with your existing NixOps and morph configurations with minimal modification. Here is a sample hive.nix with two nodes, with some common configurations applied to both nodes:

{
  meta = {
    # Override to pin the Nixpkgs version (recommended). This option
    # accepts one of the following:
    # - A path to a Nixpkgs checkout
    # - The Nixpkgs lambda (e.g., import <nixpkgs>)
    # - An initialized Nixpkgs attribute set
    nixpkgs = <nixpkgs>;

    # You can also override Nixpkgs by node!
    nodeNixpkgs = {
      node-b = ./another-nixos-checkout;
    };

    # If your Colmena host has nix configured to allow for remote builds
    # (for nix-daemon, your user being included in trusted-users)
    # you can set a machines file that will be passed to the underlying
    # nix-store command during derivation realization as a builders option.
    # For example, if you support multiple organizations each with their own
    # build machine(s) you can ensure that builds only take place on your
    # local machine and/or the machines specified in this file.
    # machinesFile = ./machines.client-a;
  };

  defaults = { pkgs, ... }: {
    # This module will be imported by all hosts
    environment.systemPackages = with pkgs; [
      vim wget curl
    ];

    # By default, Colmena will replace unknown remote profiles
    # (unknown means the profile isn't in the Nix store on the
    # host running Colmena) during apply (with the default goal,
    # boot, and switch).
    # If you share a hive with others, or use multiple machines,
    # and are not careful to always commit/push/pull changes,
    # you can accidentally overwrite a remote profile, so in those
    # scenarios you might want to change this default to false.
    # deployment.replaceUnknownProfiles = true;
  };

  host-a = { name, nodes, ... }: {
    # The name and nodes parameters are supported in Colmena,
    # allowing you to reference configurations in other nodes.
    networking.hostName = name;
    time.timeZone = nodes.host-b.config.time.timeZone;

    boot.loader.grub.device = "/dev/sda";
    fileSystems."/" = {
      device = "/dev/sda1";
      fsType = "ext4";
    };
  };

  host-b = {
    # Like NixOps and morph, Colmena will attempt to connect to
    # the remote host using the attribute name by default. You
    # can override it like:
    deployment.targetHost = "host-b.mydomain.tld";

    # It's also possible to override the target SSH port.
    # For further customization, use the SSH_CONFIG_FILE
    # environment variable to specify a ssh_config file.
    deployment.targetPort = 1234;

    # Override the default for this target host
    deployment.replaceUnknownProfiles = false;

    # You can filter hosts by tags with --on @tag-a,@tag-b.
    # In this example, you can deploy to hosts with the "web" tag using:
    #    colmena apply --on @web
    # You can use globs in tag matching as well:
    #    colmena apply --on '@infra-*'
    deployment.tags = [ "web" "infra-lax" ];

    time.timeZone = "America/Los_Angeles";

    boot.loader.grub.device = "/dev/sda";
    fileSystems."/" = {
      device = "/dev/sda1";
      fsType = "ext4";
    };
  };
}

The full set of options can be found in the manual. Run colmena build in the same directory to build the configuration, or do colmena apply to build and deploy it to all nodes.

Tutorial with Flakes

To use with Nix Flakes, create outputs.colmena in your flake.nix.

Here is a short example:

{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
  };
  outputs = { nixpkgs, ... }: {
    colmena = {
      meta = {
        nixpkgs = import nixpkgs {
          system = "x86_64-linux";
        };
      };

      # Also see the non-Flakes hive.nix example above.
      host-a = { name, nodes, pkgs, ... }: {
        boot.isContainer = true;
        time.timeZone = nodes.host-b.config.time.timeZone;
      };
      host-b = {
        deployment = {
          targetHost = "somehost.tld";
          targetPort = 1234;
          targetUser = "luser";
        };
        boot.isContainer = true;
        time.timeZone = "America/Los_Angeles";
      };
    };
  };
}

The full set of options can be found in the manual. Run colmena build in the same directory to build the configuration, or do colmena apply to build and deploy it to all nodes.

Manual

Read the Colmena Manual.

Environment Variables

  • SSH_CONFIG_FILE: Path to a ssh_config file

Current Limitations

  • It's required to use SSH keys to log into the remote hosts, and interactive authentication will not work.
  • Error reporting is lacking.

Licensing

Colmena is available under the MIT License.

colmena's People

Contributors

bjornfor, blaggacao, cprussin, dminuoso, emilylange, fooker, glenn-m, i1i1, janik-haag, jasonrm, justinas, ldicarlo, lheckemann, lovesegfault, minhuw, neverbehave, nrdxp, oddlama, otavio, phaer, pkel, sumnerevans, thinkchaos, whentze, zhaofengli

colmena's Issues

remote lost configuration (back to clean install)

I am not sure whether this is a Colmena-specific issue, but since I couldn't find anything on the web, I am asking here.

These days I install NixOS by booting an empty VPS with the NixOS ISO, running nixos-generate-config, uncommenting the lines that enable GRUB and SSH, allowing password login for root over SSH, and then installing. Once rebooted, I have a minimal NixOS install with a root password set.

Then I use colmena to get the system to the point where I need it to be.

However, over the past few days I noticed that the VPS I deployed with colmena would 'reset' to the initial minimal install I had before colmena (users gone, services not available, etc.). This happens daily. Initially I thought it was because I had enabled daily garbage collection, but the problem persists after I turned GC off.

As I am new to NixOS, I hope someone can point me to where to look if this is not a colmena bug.

Nixpkgs config not propagated (`nixpkgs.system`, possibly more)

I have an aarch64-linux machine in my hive. Recently updated to master of colmena and it tries to build for x86_64-linux instead, resulting in a cryptic error as it tries to execute x86_64 perl/bash:

[ERROR] Deployment to robotaki failed. Last 8 lines of logs:
[ERROR] /nix/store/2l0hln4hzpqv65r2wamkxd9gcvslg6kv-nixos-system-robotaki-21.05pre-git/bin/switch-to-configuration: line 3: use: command not found
[ERROR] /nix/store/2l0hln4hzpqv65r2wamkxd9gcvslg6kv-nixos-system-robotaki-21.05pre-git/bin/switch-to-configuration: line 4: use: command not found
[ERROR] /nix/store/2l0hln4hzpqv65r2wamkxd9gcvslg6kv-nixos-system-robotaki-21.05pre-git/bin/switch-to-configuration: line 5: use: command not found
[ERROR] /nix/store/2l0hln4hzpqv65r2wamkxd9gcvslg6kv-nixos-system-robotaki-21.05pre-git/bin/switch-to-configuration: line 6: use: command not found
[ERROR] /nix/store/2l0hln4hzpqv65r2wamkxd9gcvslg6kv-nixos-system-robotaki-21.05pre-git/bin/switch-to-configuration: line 7: use: command not found
[ERROR] /nix/store/2l0hln4hzpqv65r2wamkxd9gcvslg6kv-nixos-system-robotaki-21.05pre-git/bin/switch-to-configuration: line 8: syntax error near unexpected token `('
[ERROR] /nix/store/2l0hln4hzpqv65r2wamkxd9gcvslg6kv-nixos-system-robotaki-21.05pre-git/bin/switch-to-configuration: line 8: `use Sys::Syslog qw(:standard :macros);'

Introduced somewhere between ee52032 (I used that specific commit... for a while) and c6ac931. I will try to find time to bisect tomorrow.

Minimal repro

shell.nix

{ pkgs ? import <nixpkgs> { } }:
let
  myColmena = import
    (fetchTarball {
      url = "https://github.com/zhaofengli/colmena/archive/c6ac93152cbfe012013e994c5d1108e5008742d6.tar.gz";
      sha256 = "0zljn06yszzy1ghzfd3hyzxwfr9b26iydfgyqwag7h0d8bg2mgjr";
    })
    { };
in
pkgs.mkShell {
  buildInputs = [ myColmena ];
}

hive.nix

let
  nixpkgs = import <nixpkgs>;
in
{
  network = {
    nixpkgs = nixpkgs {
      system = "aarch64-linux";
    };
  };

  alpha = { config, pkgs, ... }: {
    boot.loader.grub.devices = [ "/dev/foo" ];
    fileSystems."/" = {
      device = "/dev/foo";
      fsType = "ext4";
    };

    system.activationScripts.testingOverlays = builtins.throw pkgs.system;
  };
}

Result:

$ colmena build --on alpha
<...snip...>
[ERROR] Evaluation of alpha failed. Logs:
[ERROR] error: x86_64-linux

See also:

Using nixosSystem? Flake future?

Hi. It seems mainstream NixOS is heading toward using nixpkgs.lib.nixosSystem, and standard ways of defining nixos hosts in a flake.nix: https://nixos.wiki/wiki/Flakes#Using_nix_flakes_with_NixOS

Any thoughts on integrating with that world with colmena? I'm keen to keep my configuration as close to "mainstream" as I can, while using remote deploys (some of my hosts are definitely too weak to build themselves).

It seems mainstream NixOS is getting remote deploys too: nixos-rebuild --flake .#mymachine --target-host mymachine-hostname --build-host localhost switch

Eval stopped when containers.*.config is defined.

I am defining a NixOS container to be deployed to a VPS by colmena. However, evaluation stops whenever containers.*.config is defined.

Error message:
evaluation failed: nix was killed by signal 9

No trace available.

Don't require meta.nixpkgs if meta.nodeNixpkgs is set for the target host(s)

I think it's a little weird to require nixpkgs to be set (because it requires a system set which may change at any point, depending on where you're deploying from). Sure, right now, I may be using x86_64-linux, but I may want to use aarch64-linux or some other arch at any time.

I would expect, if all of the target hosts have a matching nodeNixpkgs entry, that meta.nixpkgs wouldn't be required.

Even better would be combining this with a deployment.nixpkgs option that replaces the meta.nodeNixpkgs option (EDIT: opened #55 for this).

I may be able to work on this, if this is desired by more than just me.

Avoid transferring unchanged secrets

My secrets are protected with a mechanism that requires human actions to unlock.
Running colmena apply repeatedly while fiddling with the configuration is frustrating, as it wants to send secrets every time (and to all hosts, if I don't limit it with --on foo).

This annoyance is pushing me either to avoid the colmena secrets mechanism or to write secrets in plaintext, neither of which is a good idea.

I understand and appreciate that colmena stores no client side state, and thus can't just decide on the client side whether or not to upload a secret.

However, could we do something like a keyHash in addition to keyCommand, which could e.g. return a hash of the encrypted key (not leaking anything via easy hashes), store that on the destination host, and then only run keyCommand if the keyHash doesn't match the previous value?

That would make colmena "semi-stateful" on the destination host, but it would be more like a cache; remove it, and all that happens is that some extra work is done.
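The decision that the proposed keyHash would drive can be sketched as a plain Nix expression. Everything here is hypothetical: needsUpload, keyHash, and remoteHash are illustrative names rather than Colmena options, and a real implementation would have to read the stored hash from the destination host at deployment time instead of at evaluation time.

```nix
let
  # Hypothetical helper, not part of Colmena: keyCommand only needs to run
  # when the host has no stored hash or the stored hash differs.
  needsUpload = { keyHash, remoteHash ? null }:
    remoteHash == null || remoteHash != keyHash;

  # Stand-in for a hash of the *encrypted* key material.
  keyHash = builtins.hashString "sha256" "ciphertext-of-secret";
  oldHash = builtins.hashString "sha256" "ciphertext-of-old-secret";
in
{
  firstDeploy = needsUpload { inherit keyHash; };                        # true
  unchanged   = needsUpload { inherit keyHash; remoteHash = keyHash; };  # false
  rotated     = needsUpload { inherit keyHash; remoteHash = oldHash; };  # true
}
```

Deleting the stored hash on the host simply falls back to the firstDeploy case, which matches the "more like a cache" behavior described above.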

Build each node individually

Currently we evaluate and build a group of nodes at a time. If one node in a given chunk depends on derivations that take a long time to build (e.g., kernels), the deployment for the whole chunk is bottlenecked as a result. Let's still evaluate nodes in chunks, but build the system profiles individually in parallel.

More: #36 (comment)

Magic Rollback

Hello,

I'm just now learning of colmena, and it looks great! I'm just opening this issue because I was originally looking for a flakeless alternative to deploy-rs, and having looked at it made me see its magic rollback feature.

How would you feel about implementing something like that for colmena?

I think the process would be to do something like:

  1. upload the closure
  2. run something akin to nixos-rebuild test; (sleep 300 && rollback) &
  3. then open a new ssh connection (without ControlMaster, so with ssh -S none)
  4. have the new ssh connection kill the sleep 300 && rollback command

I think that by using grub's robustness mechanisms it'd even be possible to have a similar mechanism working for when the system needs to be rebooted, which would be an awesome feature that AFAICT no other deployment mechanism has, but that'd certainly be a lot more work to implement.

nix-eval-jobs patch causes build failure with nixos-21.11 nixpkgs

If you use a flake of colmena and do inputs.nixpkgs.follows = "nixpkgs";, it fails to build:

error: builder for '/nix/store/2sikwzzkkzggxi7snbml0sc7sv4wkzn2-nix-eval-jobs-0.0.1-colmena.drv' failed with exit code 1;
       last 10 log lines:
       > unpacking source archive /nix/store/wndp1lpb7g9gs3fmh4pjmqw80pjiw606-source
       > source root is source
       > patching sources
       > applying patch /nix/store/mgnkwc0sbk8x3jn4x7gny013mp4gsvfh-1e0f309fefc9b2d597f8475a74c82ce29c189152.patch
       > patching file src/nix-eval-jobs.cc
       > Hunk #2 FAILED at 37.
       > Hunk #3 succeeded at 96 with fuzz 1 (offset -7 lines).
       > Hunk #4 succeeded at 235 (offset -14 lines).
       > Hunk #5 succeeded at 288 (offset -14 lines).
       > 1 out of 5 hunks FAILED -- saving rejects to file src/nix-eval-jobs.cc.rej
       For full logs, run 'nix log /nix/store/2sikwzzkkzggxi7snbml0sc7sv4wkzn2-nix-eval-jobs-0.0.1-colmena.drv'.

In general, I like to run my systems on stable nixpkgs as much as possible.

Raw IPv6 addresses in targetHost don't seem to be handled properly

[ERROR] Deployment to <redacted> failed. Last 5 lines of logs:
[ERROR] sh: -c: option requires an argument
[ERROR] sh: -c: option requires an argument
[ERROR] error: --- Error --- nix-copy-closure
[ERROR] don't know how to open Nix store 'ssh://root@2a10:4a80:407:3:f816:3eff:fe68:e112?compress=true'
[ERROR]

Seems like nix-copy-closure might want this in a different format?
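For reference, RFC 3986 requires IPv6 literals in a URI to be wrapped in square brackets. A minimal sketch of that formatting in Nix follows; bracketHost and storeUri are hypothetical helpers, and whether nix-copy-closure accepts the bracketed form is an assumption that this report does not confirm.

```nix
let
  # Hypothetical helper: a colon in the host is taken to mean a raw IPv6
  # literal, which is then wrapped in brackets. A host that is already
  # bracketed would be wrapped twice, so this is only a sketch.
  bracketHost = host:
    if builtins.match ".*:.*" host != null then "[${host}]" else host;

  storeUri = user: host: "ssh://${user}@${bracketHost host}";
in
{
  # "ssh://root@[2a10:4a80:407:3:f816:3eff:fe68:e112]"
  v6 = storeUri "root" "2a10:4a80:407:3:f816:3eff:fe68:e112";

  # "ssh://root@host-b.mydomain.tld"
  name = storeUri "root" "host-b.mydomain.tld";
}
```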

Allow configuring system per-node

I like a lot of the ideas in colmena, and as it's heading toward flake support I'm growing more interested, but there's a show-stopper for me: I need to deploy to a set of nodes with heterogeneous architectures (aarch64 & x86_64).
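Going by the README's description of meta.nodeNixpkgs in this same document (it accepts "an initialized Nixpkgs attribute set"), one possible shape for a mixed-architecture hive is sketched below. The node names are placeholders, the sketch is untested, and whether the Colmena version current at the time of this request honored a per-node system this way is an assumption.

```nix
{
  meta = {
    # Default package set for nodes without an override.
    nixpkgs = import <nixpkgs> { system = "x86_64-linux"; };

    # Per-node override: a package set initialized for another platform.
    nodeNixpkgs = {
      arm-node = import <nixpkgs> { system = "aarch64-linux"; };
    };
  };

  x86-node = { ... }: {
    boot.isContainer = true;
  };

  arm-node = { ... }: {
    boot.isContainer = true;
  };
}
```

Building the aarch64-linux closure from an x86_64-linux machine still needs a remote builder or emulation on the deploying host; the hive only selects which package set each node is evaluated against.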

`colmena build` fails when `meta.nixpkgs = import-from-derivation ...`

Hi, I'm using import-from-derivation to obtain my nixpkgs copy (to be able to inject version info from git into .version-suffix -- I haven't switched to flakes) for meta.nixpkgs = ..., in hive.nix.

However that breaks colmena build:

$ colmena build --show-trace
[INFO ] Enumerating nodes...
trace:  nixpkgsPinnedSrc: /nix/store/n31jsavkyn5ykkqmrf77140lqbfgz90n-nixpkgs-source
error: while evaluating the attribute 'meta.machinesFile' at /tmp/.tmptFR5Yd:474:3:
while evaluating the attribute 'meta' at /tmp/.tmptFR5Yd:294:7:
while evaluating 'mkNixpkgs' at /tmp/.tmptFR5Yd:298:27, called from /tmp/.tmptFR5Yd:339:6:
while evaluating the attribute 'nixpkgs' at /home/bf/colmena-issue-repro/hive.nix:3:5:
cannot import '/nix/store/n31jsavkyn5ykkqmrf77140lqbfgz90n-nixpkgs-source', since path '/nix/store/ln3aj7prhxb8y9yim5fg25lrslgyp9yv-nixpkgs-source.drv' is not valid, at /home/bf/colmena-issue-repro/hive.nix:25:10
[ERROR] -----
[ERROR] Operation failed with error: Nix exited with error code: 1

But if I take the Nix expressions from colmena and build it with nix-build (I hope I did it right), then it works:

$ nix-build ./build-with-colmena.nix
trace:  nixpkgsPinnedSrc: /nix/store/n31jsavkyn5ykkqmrf77140lqbfgz90n-nixpkgs-source
/nix/store/hckmh391x4frcib469jvfrgwlbk6pprz-colmena-hive

And since the above command successfully realized the missing .drv file from the error above, subsequent colmena build now works:

$ colmena build --show-trace
[INFO ] Enumerating nodes...
trace:  nixpkgsPinnedSrc: /nix/store/n31jsavkyn5ykkqmrf77140lqbfgz90n-nixpkgs-source
trace:  nixpkgsPinnedSrc: /nix/store/n31jsavkyn5ykkqmrf77140lqbfgz90n-nixpkgs-source
[INFO ] Selected all 1 nodes.
host1 ✅ 0s Built "/nix/store/qriwcxwmd2pazsj271y90xg789by036l-nixos-system-nixos-21.05.git.c4cbbed"

I have tried a bit of debugging in colmena (patching the source) but haven't found the root cause for this difference. Why does colmena build not work with IFD here, when the underlying Nix expressions do? Does it pass some option to nix-instantiate that disables IFD? Where? How to disable it?

The files referenced above are here:

hive.nix

{
  meta = {
    nixpkgs =
      let
        fetchGitWithVersionSuffix = { url, ref, rev }:
          let
            base = builtins.fetchGit { inherit url ref rev; };
            basePkgs = import base { config = {}; overlays = []; };
          in
            basePkgs.runCommandLocal "nixpkgs-source" { } ''
              mkdir -p "$out"
              (shopt -s dotglob; cp -r "${base}/"* "$out")
              echo ".git.${base.shortRev}" > "$out/.version-suffix"
            '';
        # Using builtins.fetchGit works, fetchGitWithVersionSuffix only works
        # with `nix-build ./build-with-colmena.nix`, not with `colmena build`.
        nixpkgsPinnedSrc = fetchGitWithVersionSuffix {
        #nixpkgsPinnedSrc = builtins.fetchGit {
          url = "https://github.com/NixOS/nixpkgs.git";
          ref = "refs/heads/release-21.05";
          rev = "c4cbbed186cc00066fd1998b5d915fe37f197135";
        };
        pinnedPkgs = builtins.trace" nixpkgsPinnedSrc: ${nixpkgsPinnedSrc}" (
                     #builtins.trace" nixpkgsPinnedSrc.drv: ${nixpkgsPinnedSrc.drvPath}"
        (import nixpkgsPinnedSrc { config = {}; overlays = []; }));
      in
        pinnedPkgs;
  };

  host1 =
    { pkgs, ... }:
    {
      fileSystems."/".device = "/dev/sda";
      boot.loader.grub.devices = [ "/dev/sda" ];
    };
}

build-with-colmena.nix

let
  colmenaSrc = builtins.fetchGit {
    url = "https://github.com/zhaofengli/colmena";
    ref = "refs/heads/release-0.2.x";
    rev = "e95dc850f3e715219cf9e0651cd53f63b79e0e11";
  };
  hiveEvaluated = import "${colmenaSrc}/src/nix/eval.nix" {
    rawHive = import ./hive.nix;
  };
in
  hiveEvaluated

Tip: after one successful build, remove the fetched nixpkgs source (and .drv) to be able to reproduce the colmena build issue: nix-store --delete /nix/store/n31jsavkyn5ykkqmrf77140lqbfgz90n-nixpkgs-source /nix/store/ln3aj7prhxb8y9yim5fg25lrslgyp9yv-nixpkgs-source.drv.

`colmena build` doesn't report instructions to the user with `requireFile`

$ colmena build --on node-a,node-b
[INFO ] Enumerating nodes...
[INFO ] Selected all 2 nodes.
   (...) ❌ 1s Build failed: Nix exited with error code: 1
[ERROR] Build of 2 nodes failed. Last 10 lines of logs:
[ERROR] cannot build derivation '/nix/store/1khfxa0dmx97wv77qq1zcn9rfycl3l18-citrix-workspace-21.1.0.14_fish-completions.drv': 1 dependencies couldn't be built
[ERROR] cannot build derivation '/nix/store/4q5valvk93ik876k68r05lllss5v7bh9-man-paths.drv': 1 dependencies couldn't be built
[ERROR] cannot build derivation '/nix/store/78pr4yq09i2f931f8f0vivdqf0aahfcw-man-paths.drv': 1 dependencies couldn't be built
[ERROR] cannot build derivation '/nix/store/51ydcyabqskp1kjvbglpbdnf994zjnjh-system-path.drv': 1 dependencies couldn't be built
[ERROR] cannot build derivation '/nix/store/dmg1invb23034fsi7llm0h6k1xiz73aj-system-path.drv': 1 dependencies couldn't be built
[ERROR] cannot build derivation '/nix/store/jq7wnzqwj4cqmh6x0d7ff7478p1vlsfg-nixos-system-entadono-20.09pre-git.drv': 1 dependencies couldn't be built
[ERROR] cannot build derivation '/nix/store/rwwrsry5c9k01dyshi8v8xwz5hsvrabn-nixos-system-hp-20.09pre-git.drv': 1 dependencies couldn't be built
[ERROR] cannot build derivation '/nix/store/v5f169c76497djjhj4ib1lc6ab7z73jg-colmena-hive.drv': 1 dependencies couldn't be built
[ERROR] error: build of '/nix/store/v5f169c76497djjhj4ib1lc6ab7z73jg-colmena-hive.drv' failed
[ERROR]

It looks like colmena doesn't report instructions to the user with requireFile unless you use the -v option:

   (...) | ***
   (...) | In order to use Citrix Workspace, you need to comply with the Citrix EULA and download
   (...) | the 64-bit binaries, .tar.gz from:
   (...) | 
   (...) | https://www.citrix.com/de-de/downloads/workspace-app/linux/workspace-app-for-linux-latest.html
   (...) | 
   (...) | (if you do not find version 21.1.0.14 there, try at
   (...) | https://www.citrix.com/downloads/workspace-app/
   (...) | 
   (...) | Once you have downloaded the file, please use the following command and re-run the
   (...) | installation:
   (...) | 
   (...) | nix-prefetch-url file://$PWD/linuxx64-21.1.0.14.tar.gz
   (...) | 
   (...) | ***

I'm using the ee52032 build of colmena.

Question: meta.nixpkgs + config.allowUnfree

Consider the following configuration in hive.nix:

  meta = {
    nixpkgs = import (builtins.fetchTarball {
      name = "nixos-unstable-2020-12-08";
      url = "https://github.com/nixos/nixpkgs/archive/78dc359abf8217da2499a6da0fcf624b139d7ac3.tar.gz";
      sha256 = "0wgfvkwxj8vvy100dccffb6igbqljvhgyxdk8c9gk4k2zlkygz45";
    }) { config.allowUnfree = true; };
  };

Should it be necessary to specify nixpkgs.config.allowUnfree = true; in machine configurations?

colmena build should create a result symlink, like nix-build etc

As I'm finetuning a config, I often want to observe the created results. Right now it seems my choices are to either

  1. do a full apply -- which I'm not confident enough to do
  2. run colmena build -v, hope to catch a /nix/store/*-nixos-system-* path in the verbose log. And that path is not mentioned if there was no change, so sometimes that just leaves me eyeballing for nothing!
  3. run colmena build -v, catch a /nix/store/*-colmena-hive path in the verbose log, cat that file, and copy-paste a path referred in there

nix-build creates a result link by default. Could colmena build do the same, please? And please also make the contents immediately explorable, without needing to copy-paste a path name.

deployment.keys.*.{user,group} typos are silently ignored

I just used systemd-networkd as a user/group name, because the program is called that -- but the user account and group are actually named systemd-network, without the d. Colmena silently kept making the files owned by root:root until I got the spelling right.
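Until Colmena checks this itself, a guard can be written with ordinary NixOS assertions. The sketch below is untested and assumes that config.deployment.keys is visible inside node modules and that every key owner is declared through users.users and users.groups; accounts created imperatively on the host would trip it.

```nix
{
  defaults = { config, lib, ... }: {
    # Fail evaluation when a key names a user or group that the NixOS
    # configuration does not declare (for example "systemd-networkd").
    assertions = lib.mapAttrsToList (name: key: {
      assertion =
        config.users.users ? ${key.user}
        && config.users.groups ? ${key.group};
      message =
        "deployment.keys.${name}: unknown owner '${key.user}:${key.group}'";
    }) config.deployment.keys;
  };
}
```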

Using the incorrect value for `meta.nixpkgs` can be fatal

Hi.

I was in the process of pinning nixpkgs for my cluster of NixOS nodes and ran into an issue where they all downgraded to NixOS 20.03post-git (Markhor), along with really outdated packages. I couldn't figure out what had gone wrong, as niv was reporting the correct version of nixpkgs.

The offending line:

- meta.nixpkgs = import (import ./sources.nix).nixpkgs;
+ meta.nixpkgs = import (import ./sources.nix).nixpkgs { };

I imagine during evaluation there was confusion about what nixpkgs actually was, and so it did the "safe" thing of using a stable version.

Is there anything which can be implemented to prevent this from happening?

Option to remove secrets

Would it be possible to add some mechanism to remove secrets when they are removed from the configuration?

Consider two configured secrets like this:

keys = {
  "test-secret1" = {
    keyCommand = [ "pass" "show" "nixos-secrets/ahorn/borg/passphrase1" ];
    destDir = "/var/src/colmena-keys";
  };
  "test-secret2" = {
    keyCommand = [ "pass" "show" "nixos-secrets/ahorn/borg/passphrase2" ];
    destDir = "/var/src/colmena-keys";
  };
};

This results in /var/src/colmena-keys/test-secret1 and /var/src/colmena-keys/test-secret2 being created. If I remove one of those and re-deploy the configuration, though, the file containing the secret will still be present on that host.

It would be nice to have an option to "clear" the secrets directory (in this case /var/src/colmena-keys) before copying all secrets, so that only keys present in the configuration will be present after a rebuild.

The default, temporary location is not a solution for this problem as-is, because it requires uploading the keys again after a reboot. I have also considered adding something like

system.activationScripts.clean-secrets-dir =
  ''
    rm -rf /var/src/colmena-keys/*
  '';

But I'm not sure if that would run before or after the copying of the secrets.

Incorrect build time reporting

$ colmena build
[INFO ] Enumerating nodes...
[INFO ] Selected all 10 nodes.
[...]
    netboot ✅ 0s Built "/nix/store/cw76vdvwbw93bdgrlwgzq83hxrqj28zv-nixos-system-netboot-21.05.4116.46251a79f75"
     indigo ✅ 0s Built "/nix/store/nyhsl87mfxxm3xd2zxz91cdjfly1q5aq-nixos-system-indigo-21.05.4116.46251a79f75"

Even though a kernel was built for indigo, taking >30min, they all say 0s. This doesn't seem like intended behaviour :)

That aside, very nice piece of work, I think this will do nicely as a nixops replacement ❤️

Evaluate & Build on target machine

I have a large number of machines, and evaluating all of them on my laptop is too slow.
I also don't want to download every package my servers need just to upload it again over my slow local connection.
Is there a way to do the evaluation on the target host?
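Later Colmena manuals describe a deployment.buildOnTarget option. Assuming your version has it, a node could opt in as sketched below; the node name is a placeholder. As far as the manual describes it, evaluation still runs on the machine invoking Colmena, so this addresses the download-then-upload half of the request rather than evaluation speed.

```nix
{
  big-server = { ... }: {
    # Assumed option (check your Colmena version): realize this node's
    # system profile on the node itself instead of on the deploying host.
    deployment.buildOnTarget = true;

    boot.isContainer = true;
  };
}
```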

nixpkgs.config.allowUnfree unrecognized

I have a host that uses the nonfree broadcom-sta wifi driver. I don't actually use the wifi, but nixos-install slapped that in the hardware-config.nix, so my /etc/nixos/configuration.nix says nixpkgs.config.allowUnfree = true;.

I copied the /etc/nixos contents to my colmena client host in directory host/foo/ and wrote a hive.nix like this:

{
  foo = import ./host/foo/configuration.nix;
}

and everything works, but colmena warns:

foo | trace: warning: The following Nixpkgs configuration keys set in meta.nixpkgs will be ignored: allowUnfreePredicate

It still understands the nixpkgs.config line, because if I comment out the allowUnfree=true, the build fails in the usual "refusing to evaluate" way. The colmena warning seems like a false positive.

Nixpkgs fails to build on unstable

Hi, I'm using colmena with a flake to build a node that uses the nixos-unstable branch of nixpkgs (commit is NixOS/nixpkgs@689b76b), but it's failing to build.

In the start of the log I see a warning with

capucho-nixos | trace: warning: The following Nixpkgs configuration keys set in meta.nixpkgs will be ignored: path

Which leads me to believe that the issue might be caused by ignoring the path attribute introduced in NixOS/nixpkgs#153594

Here's the log of running colmena build --show-trace --verbose
log.txt

Flake support

I may have overlooked something, but I couldn't find any references on how to use flakes. Moreover, the nixpkgs meta suggests the inputs are hardcoded.

Are there plans to add flake support?

Keys should be uploaded after activation, not before

Running into an issue with new machines because I have a module that creates a user and group. Seems like the keys are uploaded before activation, which means the chown can fail:

[ERROR] Deployment to [redacted] failed. Last 3 lines of logs:
[ERROR] sh: -c: option requires an argument
[ERROR] chown: invalid user: ‘vault-agent:vault-agent’
[ERROR]
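Later Colmena manuals describe an uploadAt setting on each key. Assuming that option exists in your version, the ordering problem above could be avoided as sketched here; the key name and the pass entry are placeholders.

```nix
{
  new-machine = { ... }: {
    # The account that will own the key is created during activation.
    users.groups.vault-agent = { };
    users.users.vault-agent = {
      isSystemUser = true;
      group = "vault-agent";
    };

    deployment.keys."vault-token" = {
      keyCommand = [ "pass" "show" "example/vault-token" ];
      user = "vault-agent";
      group = "vault-agent";
      # Assumed option: defer the upload until activation has created
      # the user and group, so that chown can succeed.
      uploadAt = "post-activation";
    };
  };
}
```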

Infinite recursion when calling package

Given a hive.nix

{
  somehost = { pkgs, ... }: pkgs.callPackage ./somehost { };
}

and the value of somehost/default.nix being

{ }

Colmena will panic with

$ colmena build 
[INFO ] Enumerating nodes...
error: infinite recursion encountered

       at /nix/store/s7z0ga4asah3zj5h3wkcmkkkmjmkapg7-nixpkgs-21.05pre289080.3eac120c3d2/nixpkgs/lib/modules.nix:305:28:

          304|         builtins.addErrorContext (context name)
          305|           (args.${name} or config._module.args.${name})
             |                            ^
          306|       ) (lib.functionArgs f);
(use '--show-trace' to show detailed location information)
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: NixFailure { exit_code: 1 }', src/command/apply.rs:122:50
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
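A likely explanation, not confirmed in this report: each node is a NixOS module, and the module system has to inspect the module's result before it can compute pkgs. Here the entire result comes from pkgs.callPackage, so evaluating the module demands pkgs first. A sketch that sidesteps the recursion by importing the directory as a module instead:

```nix
{
  # somehost/default.nix is loaded as a NixOS module, so pkgs is only
  # demanded lazily by the options that actually use it.
  somehost = { ... }: {
    imports = [ ./somehost ];
  };
}
```

The Rust panic after the Nix error is a separate problem: the failed evaluation is unwrapped instead of being reported.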

Panic when building a nix file with an unexpected type

Given hive.nix

{ pkgs, ... }: { }

Colmena will panic with

$ colmena build 
[INFO ] Enumerating nodes...
error: value is a function while a set was expected
(use '--show-trace' to show detailed location information)
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: NixFailure { exit_code: 1 }', src/command/apply.rs:122:50
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
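For comparison, the README's hive.nix example evaluates to an attribute set at the top level, with module arguments such as pkgs taken by each node rather than by the hive. A minimal sketch of that shape, with a placeholder node name:

```nix
{
  meta = {
    nixpkgs = <nixpkgs>;
  };

  # Arguments such as pkgs belong to each node, not to the hive itself.
  somehost = { pkgs, ... }: {
    boot.isContainer = true;
  };
}
```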

Use the same SSH session for key uploads

This is a rather minor issue, but I've observed a new SSH session will be initiated for every key uploaded. When you have a lot of keys to upload and a YubiKey which doesn't authenticate you instantly it takes a bit longer. Would be good to just use the same session for each host.

package for nixpkgs please

colmena is my favourite deploy tool for NixOS! I would love for it to be packaged in nixpkgs. Any blockers now that there has been a stable release?

add --no-substitute option

Nix itself accepts the option --no-substitute, while colmena's similar option is called --no-substitutes. This is unnecessarily confusing, and renaming the option would resolve this.

deployment.keys.*.mode documented but not implemented

The README (https://github.com/zhaofengli/colmena/blame/main/README.md#L206) documents mode = "0640"; # Default: 0600, but there is no instance of mode in the source code, and trying to set it gives this error:

[INFO ] Enumerating nodes...
error: The option `deployment.keys.foo.key.mode' defined in `<unknown-file>' does not exist.
(use '--show-trace' to show detailed location information)
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: NixFailure { exit_code: 1 }', src/command/apply.rs:122:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
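
The Nix JSON output quoted in the KeySource report further down this page lists a permissions field next to user and group, which suggests the option may be called permissions rather than mode. A minimal sketch under that assumption (the node name somehost and the key contents are placeholders, not taken from the report):

```nix
{
  # "somehost" is a hypothetical node name used only for illustration.
  somehost = { ... }: {
    deployment.keys.foo = {
      # Placeholder secret; keyFile or keyCommand could be used instead.
      text = "example secret";

      # Assumed option name, inferred from the "permissions" field in the
      # evaluated JSON shown elsewhere on this page.
      permissions = "0640";
    };
  };
}
```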

doas support, in addition to --sudo

Hi. I've switched from sudo to the minimal & likely more secure doas. I no longer have sudo installed at all. I'd still like to use colmena apply-local. Please support a way of specifying the command to run to become root.

Overlays are applied both from nixpkgs.overlays and from ~/.config/nixpkgs/overlays.nix

Hello,

In the process of migrating to colmena, I discovered a weirdness around the handling of overlays: when passing nixpkgs by path to colmena (meta.nixpkgs = ./nixpkgs;), it imports the ~/.config/nixpkgs/overlays.nix overlays.

I'd have expected the default arguments to the nixpkgs lambda to be overlays = [];, so that the impurity of home-local overlays wouldn't come up.

Would that make sense to you, or do you think the current behavior is the correct one? If it is, a warning somewhere might be appropriate: I spent 45 minutes today debugging why my overlay was being applied twice (once from ~/.config and once from nixpkgs.overlays), and only figured it out after failing to create a reproducer for the issue.

Anyway, a simple workaround is to set meta.nixpkgs = import ./nixpkgs { overlays = []; };, which is the way I'm going to go for now, but I'm thinking changing the default might make colmena more accessible to new users that wouldn't expect user-local overlays to be applied to deployments :)
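
For reference, the workaround described above could look like this in a complete hive.nix. This is only a sketch: the node name somehost and the no-op overlay are placeholders, and ./nixpkgs is the checkout path from the report.

```nix
{
  meta = {
    # Import Nixpkgs explicitly with an empty overlay list, so that
    # ~/.config/nixpkgs/overlays.nix is not picked up implicitly.
    nixpkgs = import ./nixpkgs { overlays = [ ]; };
  };

  # "somehost" is a hypothetical node name.
  somehost = { pkgs, ... }: {
    # Overlays meant for the deployment are declared here instead.
    nixpkgs.overlays = [
      (final: prev: { }) # placeholder overlay that changes nothing
    ];
  };
}
```

With this layout, the only overlays in play should be the ones listed under nixpkgs.overlays.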

make `colmena apply dry-activate` verbose by default?

From the nixos-rebuild manual:

dry-activate
Build the new configuration, but instead of activating it, show what changes would be performed by the activation (i.e. by nixos-rebuild test). For instance, this command will print which systemd units would be restarted. The list of changes is not guaranteed to be complete.

I'm not sure how much use dry-activate is without passing the --verbose flag. Maybe dry-activate should just always imply --verbose?

Excessive output when terminal isn't tall enough

When too many lines of progress indication are being printed, scrollback will explode. It looks mostly fine if you're not scrolling back:

[Screenshot: terminal output when not scrolling back]

but if you make the terminal taller or scroll up, you'll see a problem:

[Screenshot: scrollback filled with progress output]

This can drown out valuable output from the build phase or similar.

A simple way to reproduce this is to run colmena apply keys with the following hive.nix, in a terminal less than 20 lines tall:

with import <nixpkgs> {};
lib.genAttrs (map (n: "host${toString n}") (lib.range 1 20)) (name: {
  deployment.targetHost = "dummy.invalid";
  boot.isContainer = true;
  deployment.keys.block-deployment = {
    keyCommand = ["sleep" "infinity"];
  };
})

apply option to reboot

Reboots (switch-to-configuration boot followed by a reboot) can be desirable for applying major changes such as stdenv rebuilds or kernel updates. Currently, the only way I'm aware of to do this with colmena is to run colmena apply boot and then colmena exec -- reboot. This feels a bit unwieldy, so a reboot action would be nice to have.

Is there a reason this doesn't already exist which I'm missing?

KeySource does not deserialize

With the following hive definition:

{
  alpha = { pkgs, ... }: {
    deployment.targetHost = "";
    deployment.keys.textual = {
      text = "textual_content";
    };

    imports = [ <...> ];
  };
}

Colmena panics with:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("invalid type: null, expected a sequence", line: 1, column: 188)', src/nix/hive.rs:55:52

The problem seems to be with the new KeySource enum. Nix JSON output contains explicit null values for unset options:

{
  "alpha": {
    "textual": {
      "destDir": "/run/keys",
      "group": "root",
      "keyCommand": null,
      "keyFile": null,
      "permissions": "0600",
      "text": "textual_content",
      "user": "root"
    }
  }
}

Serde seems to prefer the first variant alphabetically in the source code, in this case KeySource::Command. Since the JSON object contains a key for it, Serde decides that this is the variant to deserialize. It then fails because the type does not match. This presumably would not happen if the field were omitted altogether rather than explicitly set to null.

I have tried various combinations of #[serde(untagged)], #[serde(flatten)], etc., but could not arrive at a solution short of writing something like a custom deserializer.

Support for extensions of colmena / provisioning?

Hello,

One of the features of nixops is the ability to provision NixOS environments on e.g. Google Cloud, Amazon AWS, libvirt… In nixops 1.*, the code was deemed too monolithic, so to encourage users to contribute new provisioning environments, nixops 2.* adopts a more modular architecture with plugins.

I was wondering two things:

  • Do you plan to implement provisioning directly in colmena?
  • Do you think it would be possible to have some kind of plugin interface for colmena, so that code for provisioning machines could be developed as plugins? Such a plugin system would keep colmena itself effectively stateless, putting the (probably needed) state in the hands of plugins.

Thanks,
Rémy

Evaluate whole configurations?

Hello,

I'd like to be able to evaluate the whole colmena configuration using colmena eval. Currently, it seems it's not working:

$ cat eval.nix

{ nodes, pkgs, lib, ... }:
lib.attrsets.mapAttrs (k: v: v.config) nodes

$ colmena eval eval.nix

error: attribute 'cycle' missing

       at /nix/store/h5rmlfv5gnpjgc10xf6n6hkw0dvb997p-source/nixos/modules/tasks/filesystems.nix:230:119:

          229|       { assertion = ! (fileSystems' ? cycle);
          230|         message = "The ‘fileSystems’ option can't be topologically sorted: mountpoint dependency path ${ls " -> " fileSystems'.cycle} loops to ${ls ", " fileSystems'.loops}";
             |                                                                                                                       ^
          231|       }
[ERROR] -----
[ERROR] Operation failed with error: Nix exited with error code: 1
[…]

The flake.nix is below. My guess is that the problem is on the Nix side, so it might not be fixable.

Rémy

flake.nix:

{
  description = "Sisyphe system configuration";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-21.11";
    colmena.url = "github:zhaofengli/colmena";
  };

  outputs = { self, nixpkgs, colmena }: {
    colmena = {
      meta = { nixpkgs = import nixpkgs { system = "x86_64-linux"; }; };

      test = { name, nodes, pkgs, ... }: {
        deployment = { };
      };
    };

    devShell.x86_64-linux = with import nixpkgs { system = "x86_64-linux"; };
      mkShell {
        buildInputs = [ colmena.packages.x86_64-linux.colmena ];
      };
  };
}
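
One possible workaround, assuming only a handful of options are actually needed, is to select specific attributes instead of returning each node's entire config. Serializing the whole config forces every value in it, including the assertion message at filesystems.nix:230 that refers to fileSystems'.cycle even when no cycle exists. The sketch below narrows the eval.nix above to a single option; networking.hostName is just an illustrative choice.

```nix
# Narrowed variant of eval.nix: return one option per node rather than
# the whole config, so unrelated values are never forced.
{ nodes, pkgs, lib, ... }:
lib.attrsets.mapAttrs (name: node: node.config.networking.hostName) nodes
```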

override deployment.targetUser in CLI

Could we have a CLI argument or environment variable to override deployment.targetUser? This would be useful when different people need to deploy with different remote usernames, in which case hard-coding the user in the Nix configuration isn't helpful.
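
Until such an option exists, one possible stopgap for a non-flake hive.nix is to read the username from the environment at evaluation time. This is only a sketch: DEPLOY_USER is a made-up variable name that Colmena itself knows nothing about, somehost is a placeholder node, and the approach relies on impure evaluation (under pure flake evaluation, builtins.getEnv returns an empty string).

```nix
let
  # DEPLOY_USER is a hypothetical environment variable chosen for this
  # example; it is not defined or read by Colmena.
  envUser = builtins.getEnv "DEPLOY_USER";
in
{
  # "somehost" is a hypothetical node name.
  somehost = { ... }: {
    # Fall back to root when the variable is unset or empty.
    deployment.targetUser = if envUser != "" then envUser else "root";
  };
}
```

Each person deploying would then set DEPLOY_USER in their own shell before running colmena apply.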

Cannot deploy to a host without internet

If for some reason the deployment target doesn't have an internet connection but does have binary caches enabled, copying the system closure will fail rather than falling back to copying directly.

Support users other than root

When using deployment.targetUser, I run into two sets of errors.

Firstly:

$ colmena apply            
[INFO ] Enumerating nodes...
[INFO ] Selected all 1 nodes.
d9294fd26f ✅ 0s Build successful
d9294fd26f ❌ 10s Failed: Nix exited with error code: 1
[ERROR] Deployment to d9294fd26f failed. Last 10 lines of logs:
[ERROR] error: cannot add path '/nix/store/1fy42hwn9zid6nac4izmn570961g0xwg-nixos.conf' because it lacks a valid signature
[ERROR] error (ignored): error: unexpected end-of-file
[ERROR] error (ignored): error: interrupted by the user
[ERROR] error: cannot add path '/nix/store/1gs1vc505irbay4vih9razci1j0dkhxi-etc-os-release' because it lacks a valid signature
[ERROR] error (ignored): error: unexpected end-of-file
[ERROR] error (ignored): error: interrupted by the user
[ERROR] error: cannot add path '/nix/store/45qg1fk7yxrmjcip7vid8mp7wd67y3r4-linux-5.10.36-modules-shrunk' because it lacks a valid signature
[ERROR] error (ignored): error: writing to file: Broken pipe
[ERROR] error: unexpected end-of-file
[ERROR] 

This can be resolved by adding the following to the machine's configuration.nix:

nix.trustedUsers = [ "root" "@wheel" ];

Secondly:

$ colmena apply
[INFO ] Enumerating nodes...
[INFO ] Selected all 1 nodes.
d9294fd26f ✅ 1s Build successful
d9294fd26f ❌ 5s Failed: Nix exited with error code: 1
[ERROR] Deployment to d9294fd26f failed. Last 10 lines of logs:
[ERROR] copying path '/nix/store/cian0pqqfs2dwlcmi4kbkxz099qsk6xh-unit-systemd-fsck-.service' to 'ssh://[email protected]'...
[ERROR] copying path '/nix/store/l7rf3l3y4rf3g1pcbdz1wnkz995xikza-initrd-linux-5.10.36' to 'ssh://[email protected]'...
[ERROR] copying path '/nix/store/5z8c9v7yncsbs2bihb11w8kxfqjbqnrm-unit-dbus.service' to 'ssh://[email protected]'...
[ERROR] copying path '/nix/store/5hfykn7mjfmwg5nrma3j43019jpxb9ri-unit-dbus.service' to 'ssh://[email protected]'...
[ERROR] copying path '/nix/store/58mk0ybrb8qj27lv194lp1swbpk1r402-user-units' to 'ssh://[email protected]'...
[ERROR] copying path '/nix/store/7v0b32q87vfjn0zjlvn7mn2d4f3508mi-system-units' to 'ssh://[email protected]'...
[ERROR] copying path '/nix/store/qdsxvfn0vq3rr2zbka51bxngpva1s46b-etc' to 'ssh://[email protected]'...
[ERROR] copying path '/nix/store/jnv3wsh2m3fv03vml5ix8pm98h4nld5v-nixos-system-d9294fd26f-21.05pre289080.3eac120c3d2' to 'ssh://[email protected]'...
[ERROR] error: creating symlink from '/nix/var/nix/profiles/system-20-link.tmp-26914-1695977995' to '/nix/store/jnv3wsh2m3fv03vml5ix8pm98h4nld5v-nixos-system-d9294fd26f-21.05pre289080.3eac120c3d2': Permission denied
[ERROR]

Lots of permission errors which I'm not entirely sure how to fix. It looks as if colmena isn't using sudo to link files.
