janma / nomad-driver-nspawn

A Nomad task driver for systemd-nspawn

License: MIT License

Go 89.13% HCL 7.91% Shell 0.94% Makefile 2.01%

nomad-driver-nspawn's Issues

Failed to parse UID "auto": Invalid argument

Errors in task logs:

Jul 03, '23 00:25:31 -0700 | Driver Failure | rpc error: code = Unknown desc = systemd-nspawn failed to start task
Jul 03, '23 00:25:31 -0700 | Driver | Failed to parse UID "auto": Invalid argument

I basically used a modified version of the nginx example:

job "nginx" {
  datacenters = ["dc1"]
  type = "service"
  group "linux" {
    count = 1
    network {
      port "http" {
        static = "8080"
        to = "80"
      }
    }
    task "nginx" {
      driver = "nspawn"
      config {
        image = "/home/sergei/projs/nomad/Nginx/image"
        resolv_conf = "copy-host"
        command = ["/bin/bash", "-c", "nginx && tail -f /var/log/nginx/access.log " ]
        boot = false
        process_two = true
        ports = ["http"]
      }
    }
  }
}

I can't quite figure out which parameter the UID belongs to.

Can CSI work with systemd nspawn?

I'm just wondering if it's possible to get CSI working with nspawn. Based on my limited knowledge of how CSI works, it should be possible, since all that would need to be done is to bind mount the temporary directory created by the CSI driver to the correct place in the container.

Do you have any plans to implement this?

Support Nomad's generic volume/volume_mount syntax

Support something like this:

  group "example" {
    count = 1
    volume "nix-store" {
      type      = "host"
      source    = "nix-store"
      read_only = true
    }
    task "example" {
      driver = "nspawn"
      volume_mount {
        volume      = "nix-store"
        destination = "/nix/store"
      }
      config {
        image = "debian"
      }
    }
  }

It would be implemented the same as the current bind and bind_read_only syntax. It sounds easy to add.
Happy to send a PR for this

https://www.nomadproject.io/docs/job-specification/volume
https://www.nomadproject.io/docs/job-specification/volume_mount

I guess we can only implement the host driver, as I don't know whether CSI integrates with nspawn.
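
The mapping suggested above, from volume mounts to the existing bind and bind_read_only behaviour, could be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: the mount struct and the bindArgs function are hypothetical names, and the only thing taken as given is that systemd-nspawn accepts --bind=SRC:DST and --bind-ro=SRC:DST.

```go
package main

import (
	"fmt"
	"strings"
)

// mount is a hypothetical stand-in for the per-task mount information
// a driver would receive for each volume_mount block.
type mount struct {
	HostPath string // path on the host (resolved from the host volume)
	TaskPath string // destination inside the container
	ReadOnly bool
}

// bindArgs translates mounts into systemd-nspawn bind flags.
// Assumption: paths contain no ':' characters; escaping is not handled here.
func bindArgs(mounts []mount) []string {
	args := make([]string, 0, len(mounts))
	for _, m := range mounts {
		flag := "--bind"
		if m.ReadOnly {
			flag = "--bind-ro"
		}
		args = append(args, fmt.Sprintf("%s=%s:%s", flag, m.HostPath, m.TaskPath))
	}
	return args
}

func main() {
	mounts := []mount{{HostPath: "/nix/store", TaskPath: "/nix/store", ReadOnly: true}}
	fmt.Println(strings.Join(bindArgs(mounts), " "))
}
```

Keeping the translation in one small function would let the same code path serve both the driver-specific bind options and the generic volume_mount syntax.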

When nspawn fails immediately, it does not get logged

When starting a job that exits immediately (e.g. due to a wrong nginx config), the driver thinks the job has failed because it doesn't show up in machinectl:

    2020-07-11T14:18:13.777+0200 [ERROR] client.driver_mgr.nomad-driver-nspawn: failed to get machine information: driver=nspawn @module=nspawn error="timed out while getting machine properties: No machine 'example-6e0636df-97ce-5a06-b354-ce79a7b7c63a' known" timestamp=2020-07-11T14:18:13.776+0200

This is probably because the service is already gone at that point and therefore no longer shows up in machinectl.

Instead, I would expect logs to show up when viewing the alloc in Nomad, but they are empty because the allocation is never considered successfully created:

ID        Node ID   Task Group  Version  Desired  Status  Created    Modified
694b6326  430116cf  example     0        run      failed  2m10s ago  1m39s

nomad alloc logs 694b6326
Error reading file: Unexpected response code: 404 (task "example" not started yet. No logs available)

This makes it very hard to figure out why an nspawn command failed.
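
One way a launcher could keep the output of a process that dies right away is to capture stderr itself and check for an early exit before querying machinectl. The sketch below only illustrates that idea and does not reflect how this driver is implemented: watchStart and defaultGrace are hypothetical names, and a plain sh command stands in for systemd-nspawn.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"time"
)

// defaultGrace is an arbitrary, illustrative startup window.
const defaultGrace = 2 * time.Second

// watchStart starts the command and waits up to grace for it to exit.
// It returns whether the process exited within that window, the stderr it
// produced, and the error reported by Start or Wait.
func watchStart(name string, args []string, grace time.Duration) (bool, string, error) {
	var stderr bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stderr = &stderr
	if err := cmd.Start(); err != nil {
		return false, "", err
	}

	// Buffered so the goroutine can finish even if nobody reads the result.
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	select {
	case err := <-done:
		// Wait has returned, so the stderr copy is complete and safe to read.
		return true, stderr.String(), err
	case <-time.After(grace):
		// Still running after the window: treat the task as started.
		return false, "", nil
	}
}

func main() {
	// Stand-in for a container whose command fails on a bad config.
	exited, out, err := watchStart("sh", []string{"-c", "echo 'bad config' >&2; exit 1"}, defaultGrace)
	if exited {
		fmt.Printf("exited early (%v): %s", err, out)
	}
}
```

With something along these lines, the captured text could be attached to the task event or written to the task's log file, so that the failure reason is visible even though the machine never registered.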

Implement https://github.com/arianvp/nomad-driver-systemd behaviour into this project?

Hey! Just found your project! Cool!

I've been working on something related, https://github.com/arianvp/nomad-driver-systemd (though it's a bit abandoned and I was planning to work on it again): scheduling systemd units using Nomad. Systemd units have all the same isolation primitives as systemd-nspawn containers, and there is also https://systemd.io/PORTABLE_SERVICES/, a new feature in systemd for container-like workloads.

I was wondering: would you accept a PR that adds support for running regular systemd services and portable services to this project? I think the projects are similar enough that fragmentation doesn't make sense here.

Timed out while getting machine addresses error

Hi there,

Thanks for this Nomad plugin. I wanted to try out this driver, but I get the above error when allocating a job.
I can see where in the driver's code the error is raised, but I can't work out why it fails to get the machine address.
Could you advise me on why that is happening?

I have nspawn installed and was able to successfully boot and play around with an nspawn container outside of Nomad.
I'm running all this on a fresh Ubuntu 20.04 instance. Is there anything on my machine I could check that would provide more info?

Nomad job

job "debian" {
  type = "service"
    
  datacenters = ["dc1"]

  group "debian" {
    task "debian" {
      driver = "nspawn"
      config {
        image = "debian-buster"
        resolv_conf = "copy-host"
        image_download {
          url = "https://nspawn.org/storage/debian/buster/tar/image.tar.xz"
        }
      }

      resources {
        cpu    = 2000
        memory = 1024
      }
    }
  }
}

Result of `nomad alloc status`

Recent Events:
Time                       Type             Description
2020-09-05T21:59:05+02:00  Alloc Unhealthy  Unhealthy because of failed task
2020-09-05T21:59:05+02:00  Not Restarting   Error was unrecoverable
2020-09-05T21:59:05+02:00  Driver Failure   rpc error: code = Unknown desc = timed out while getting machine addresses:
2020-09-05T21:59:05+02:00  Driver           buster login:
2020-09-05T21:59:05+02:00  Driver           Failed to set alternative interface name 've-debian-b594aee4-72b5-aa4d-12b8-3178280d1d08' to 've-debian-buTh1', ignoring: Operation not supported
2020-09-05T21:58:32+02:00  Driver           Downloading image
2020-09-05T21:58:32+02:00  Task Setup       Building Task Directory

"read_only and user_namespacing may not be combined" - is that really true?

Hello there,

Thanks for creating this nspawn driver for nomad! I've been playing around with it this weekend :)

I tried to enable read_only / volatile in a container and it refused to start, due to this validation rule:

nomad-driver-nspawn/nspawn/nspawn.go, line 224 in 0217d36:

    return fmt.Errorf("volatile and user_namespacing may not be combined")

I've looked online and couldn't find any info about this, so I'm curious about your experience with these flags interacting with -U. I've not tried volatile because I lack suitable containers, but running systemd-nspawn directly with -U --read-only seems to work fine. Has there been a recent change in systemd that made this combination work?

Thanks,

xkxx

Support restarting Nomad without restarting nspawn containers

It seems that restarting the Nomad service (for example when upgrading Nomad or reloading its configuration) restarts jobs run by the nspawn driver. Docker jobs stay alive and are not restarted when Nomad restarts.

I observed the following errors in logs:

2020-10-26T19:28:26.592+0100 [ERROR] client.alloc_runner.task_runner: error recovering task; cleaning up: alloc_id=2fb194a7-5964-07f6-e9da-b3c09abfb3a5 task= error="rpc error: code = Unknown desc = failed to decode driver config: EOF" task_id=2fb194a7-5964-07f6-e9da-b3c09abfb3a5//0af76d99
2020-10-26T19:28:26.592+0100 [WARN] client.alloc_runner.task_runner: error destroying unrecoverable task: alloc_id=2fb194a7-5964-07f6-e9da-b3c09abfb3a5 task= error="rpc error: code = Unknown desc = task not found for given id" task_id=2fb194a7-5964-07f6-e9da-b3c09abfb3a5//0af76d99

The failed jobs are then reallocated and run fine; however, it's undesirable that they are restarted.
Would it be hard to support that?
