
guardian's Introduction

Guardian

Note: This repository should be imported as code.cloudfoundry.org/guardian.

A simple single-host OCI container manager.

Developing and Deploying Guardian

For details on how to get started with developing or deploying Guardian, please check out the Garden-runC release repo.

Components

  • Gardeners Question Time (GQT): A venerable British radio programme. And also a test suite.
  • Gardener: Orchestrates the other components. Implements the Cloud Foundry Garden API.
  • RunDMC: A tiny wrapper around RunC to manage a collection of RunC containers.
  • Kawasaki: It's an amazing networker.

guardian's People

Contributors

ameowlia, booleancat, callisto13, craigfurman, danail-branekov, dependabot[bot], dsabeti, gcapizzi, geofffranks, georgethebeatle, glestaris, goonzoid, jrussett, julz, karampok, marcpaquette, mariash, missingroberto, mnitchev, ostenbom, pppepito86, rosenhouse, sunjaybhatia, tas-runtime-bot, teddyking, totherme, tscolari, vito, winkingturtle-vmw, yulianedyalkova


guardian's Issues

Is there any way I can access the containers from the host on which the containers are running?

Todos

Please try the following before submitting the issue:

  • Upgrade Concourse if you are observing a failure in a test
  • Use latest BOSH stemcell if the problem occurred in a BOSH VM
  • Use Go 1.6 if the issue has to do with running unit tests locally

Description

I'm trying to access the apps on the diego-cell on which the app containers are running.
Say I'm trying to access
10.0.16.5:61000, where 10.0.16.5 is the diego-cell's eth0 IP.

From another node in the same subnet, say a VM with IP 10.0.16.4,
I can wget 10.0.16.5:61000.
It seems the ip:port cannot be reached only from the host the containers are running on.

Could you please help?
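A quick way to compare reachability from different vantage points (another VM vs. inside a container on the cell) is a tiny probe like this hypothetical Go sketch; the address is the one from the report above and is just a placeholder:

// reachcheck.go — minimal sketch: dial a host ip:port and report whether it
// is reachable from wherever this is run (another VM, the cell, a container).
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	addr := "10.0.16.5:61000" // placeholder taken from the report above

	conn, err := net.DialTimeout("tcp", addr, 3*time.Second)
	if err != nil {
		fmt.Printf("%s is NOT reachable from here: %v\n", addr, err)
		return
	}
	defer conn.Close()
	fmt.Printf("%s is reachable from here\n", addr)
}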

Logging and/or test output

Provide any output you think may help in understanding the issue further. Guardian
stores its log files in /var/vcap/sys/log/garden when run through its BOSH
release.

Steps to reproduce

Outline the steps to test or reproduce the PR here. Please also provide the
following information if applicable:

  • Guardian release version
  • Linux kernel version
  • Concourse version
  • Go version

Is there a roadmap of guardian?

Hi,
It is very nice to learn that the Garden project is going to transition from its own backend for Linux containers to using runC from the Open Container Initiative.
I am developing the OCI testing project OCT, which aims to test whether a container runtime is compliant with opencontainers/specs. The project is on the 1.0 roadmap of OCI now (see OCI/roadmap).
So, my question is: is there a roadmap for guardian?
If there is, we plan to add support for Docker (using guardian to run OCI bundles) to OCT (see OCT/roadmap).
If you have any information about guardian's plans, please let me know.
Thanks very much.

Enhancement: add -m comment --comment ${container-handle} to iptables rules

Since the instance IDs used in iptables rules no longer match the names of directories in the container depot, it would be really cool if the rules themselves pointed to the container handle, to make it easier to associate a rule with its container.

A comment can be up to 256 characters long and, IIRC, that's longer than the maximum supported length of a container handle.
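Purely as an illustration of the shape of such a rule (not guardian's actual rule-building code; the chain and handle names are made up), the extra match could be appended to the argument list like this:

// buildRuleArgs — hypothetical helper showing an instance-chain rule tagged
// with the container handle via the iptables comment match.
package main

import (
	"fmt"
	"strings"
)

func buildRuleArgs(instanceChain, containerHandle string) []string {
	return []string{
		"-A", instanceChain,
		"--jump", "ACCEPT",
		// -m comment --comment <handle>: makes `iptables -S` output easy to
		// associate with the container the rule belongs to.
		"-m", "comment", "--comment", containerHandle,
	}
}

func main() {
	args := buildRuleArgs("w--instance-abcd1234", "my-container-handle")
	fmt.Println("iptables " + strings.Join(args, " "))
}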

Hosts file should not contain IPv6 loopback if IPv6 is not enabled

Description

Since upgrading to Garden-runC 1.9.4 we've observed that a few test frameworks now fail (notably, Selenium) on our BOSH deployed Concourse instances. They fail with the following error:

Errno::EAFNOSUPPORT:
       Address family not supported by protocol - socket(2) for "::1" port 9515

It looks like it's finding the IPv6 loopback entry and trying to use it, even though it's not supported by the kernel. We've submitted a patch to have Selenium swallow this error, but it seems more correct to not put an unsupported address in /etc/hosts in the first place; today it's Selenium, who knows what else may be confused.

Steps to reproduce

  • Guardian release version: 1.9.4
  • Linux kernel version: 4.4.0
  • Concourse version: 3.5.1 release candidate (we haven't shipped a version with 1.9.4 yet)
  • Go version: 1.9

Here's our steps:

  1. BOSH-deployed Concourse or just Garden

  2. Run the following in a container (this is what Selenium does, distilled):

    require "socket"
    
    Socket.getaddrinfo('localhost', 8080).map { |e| e[3] }.each do |i|
      TCPServer.new(i, 0).tap { |s| p [i, s] }.close
    end
  3. No errors should be raised.

The tl;dr is that Socket.getaddrinfo('localhost', 8080) is what returns the ::1 address, which it then tries to listen on.
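One way the hosts-file writer could guard against this (a sketch only, not guardian's actual code; helper names are hypothetical) is to check whether the kernel has IPv6 enabled before emitting the ::1 entry:

// hostsentries.go — hypothetical sketch: only include the IPv6 loopback line
// in /etc/hosts when the running kernel actually has IPv6 support.
package main

import (
	"fmt"
	"os"
)

// ipv6Enabled reports whether /proc/net/if_inet6 exists, a common way to
// detect that the kernel has IPv6 support compiled in and enabled.
func ipv6Enabled() bool {
	_, err := os.Stat("/proc/net/if_inet6")
	return err == nil
}

func hostsEntries(containerIP, handle string) string {
	entries := "127.0.0.1\tlocalhost\n"
	if ipv6Enabled() {
		entries += "::1\tlocalhost ip6-localhost ip6-loopback\n"
	}
	entries += fmt.Sprintf("%s\t%s\n", containerIP, handle)
	return entries
}

func main() {
	fmt.Print(hostsEntries("172.17.0.2", "98664ae95097"))
}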

/cc @topherbullock

Guardian should not report a container's age if configured with a runtime plugin that does not report a created time

Description

The Created time field runc outputs as part of its container state is not part of the runtime spec. If guardian is run with a runtime plugin that does not output it (e.g. winc), then the Age field calculated by guardian is wildly incorrect and also does not change as the container ages, as it is the maximum possible int64 value.

Steps to reproduce

  • Start guardian with a runtime plugin that does not report the Created time for a container (e.g. winc)
  • Create a container
  • Get the metrics for that container
  • See the container Age is 9223372036854775807 (max int64)

Suggested Fix

Instead of just taking the unmarshaled value from the runtime plugin output here, be defensive and check whether the time.Time value is zero. If it is, just do not report an Age, since we cannot get it right; see the sketch below.
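A minimal sketch of that defensive check (type and function names here are illustrative, not guardian's actual types):

// age.go — hypothetical sketch: only compute an Age when the runtime plugin
// actually reported a Created time.
package main

import (
	"fmt"
	"time"
)

type containerMetrics struct {
	Age *time.Duration // nil means "unknown", rather than a garbage value
}

func metricsFor(created time.Time) containerMetrics {
	if created.IsZero() {
		// The runtime plugin (e.g. winc) did not report a Created time, so we
		// cannot compute an age; report nothing instead of max int64.
		return containerMetrics{}
	}
	age := time.Since(created)
	return containerMetrics{Age: &age}
}

func main() {
	if m := metricsFor(time.Time{}); m.Age == nil {
		fmt.Println("no Created time: Age omitted")
	}
	if m := metricsFor(time.Now().Add(-5 * time.Minute)); m.Age != nil {
		fmt.Printf("Age: %v\n", m.Age.Round(time.Second))
	}
}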

ip: command not found

Description

Upgraded to concourse v3.9.1 on my AWS cluster using Amazon Linux instances and now the workers are erroring on startup.

It looks like it might be a PATH error, because ip IS installed and available in the normal path at /sbin/ip.

But the PATH IS being copied: https://github.com/cloudfoundry/guardian/blob/master/kawasaki/iptables/global_chains.go#L246

The ip: command not found error is escaping a bash subshell (because the subshell isn't using errexit): https://github.com/cloudfoundry/guardian/blob/master/kawasaki/iptables/global_chains.go#L105

Then the result isn't being quoted so the iptables command explodes: https://github.com/cloudfoundry/guardian/blob/master/kawasaki/iptables/global_chains.go#L111

So that causes the fatal Bad argument `ACCEPT'

Logging

From the worker log (stdout+stderr):

Exit trace for group:
garden exited with error: Exit trace for group:
garden-runc exited with error: bulk starter: setting up default chains: iptables: setup-global-chains: + set -o nounset
+ set -o errexit
+ shopt -s nullglob
+ filter_input_chain=w--input
+ filter_forward_chain=w--forward
+ filter_default_chain=w--default
+ filter_instance_prefix=w--instance-
+ nat_prerouting_chain=w--prerouting
+ nat_postrouting_chain=w--postrouting
+ nat_instance_prefix=w--instance-
+ iptables_bin=/opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables
+ case "${ACTION}" in
+ setup_filter
+ teardown_filter
+ teardown_deprecated_rules
++ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -S INPUT
+ rules='-P INPUT ACCEPT'
+ echo '-P INPUT ACCEPT'
+ grep ' -j garden-dispatch'
+ sed -e 's/--icmp-type any/--icmp-type 255\/255/'
+ sed -e s/-A/-D/ -e 's/\s\+$//'
+ xargs --no-run-if-empty --max-lines=1 /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w
++ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -S FORWARD
+ rules='-P FORWARD ACCEPT'
+ echo '-P FORWARD ACCEPT'
+ grep ' -j garden-dispatch'
+ sed -e s/-A/-D/ -e 's/\s\+$//'
+ xargs --no-run-if-empty --max-lines=1 /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w
+ sed -e 's/--icmp-type any/--icmp-type 255\/255/'
+ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -F garden-dispatch
+ true
+ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -X garden-dispatch
+ true
++ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -S w--forward
+ rules=
+ true
+ echo ''
+ grep '\-g w--instance-'
+ sed -e s/-A/-D/ -e 's/\s\+$//'
+ xargs --no-run-if-empty --max-lines=1 /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w
+ sed -e 's/--icmp-type any/--icmp-type 255\/255/'
++ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -S
+ rules='-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT'
+ echo '-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT'
+ grep '^-A w--instance-'
+ sed -e 's/--icmp-type any/--icmp-type 255\/255/'
+ sed -e s/-A/-D/ -e 's/\s\+$//'
+ xargs --no-run-if-empty --max-lines=1 /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w
++ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -S
+ rules='-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT'
+ echo '-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT'
+ grep '^-N w--instance-'
+ sed -e s/-N/-X/ -e 's/\s\+$//'
+ sed -e 's/--icmp-type any/--icmp-type 255\/255/'
+ xargs --no-run-if-empty --max-lines=1 /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w
++ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -S FORWARD
+ rules='-P FORWARD ACCEPT'
+ echo '-P FORWARD ACCEPT'
+ grep ' -j w--forward'
+ sed -e s/-A/-D/ -e 's/\s\+$//'
+ xargs --no-run-if-empty --max-lines=1 /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w
+ sed -e 's/--icmp-type any/--icmp-type 255\/255/'
+ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -F w--forward
+ true
+ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -F w--default
+ true
++ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -S INPUT
+ rules='-P INPUT ACCEPT'
+ echo '-P INPUT ACCEPT'
+ sed -e s/-A/-D/ -e 's/\s\+$//'
+ sed -e 's/--icmp-type any/--icmp-type 255\/255/'
+ grep ' -j w--input'
+ xargs --no-run-if-empty --max-lines=1 /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w
+ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -F w--input
+ true
+ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -X w--input
+ true
++ ip route show
bash: line 94: ip: command not found
++ grep default
++ head -1
++ cut '-d ' -f5
+ default_interface=
+ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -N w--input
+ /opt/concourse/worker/3.9.1/assets/iptables/sbin/iptables -w -I w--input -i --jump ACCEPT
Bad argument `ACCEPT'
Try `iptables -h' or 'iptables --help' for more information.


baggageclaim exited with nil
beacon exited with nil

2018/03/01 00:14:43 failed to forward remote connection: dial tcp 127.0.0.1:7777: connect: connection refused

Steps to reproduce

Outline the steps to test or reproduce the PR here. Please also provide the
following information if applicable:

  • Concourse version: v3.9.1
  • Guardian version: ?
  • Linux kernel version: 4.4.41-36.55.amzn1.x86_64
  • Linux version: Amazon Linux AMI 2016.09
  • AMI name: amzn-ami-hvm-2016.09.1.20170119-x86_64-gp2

Hosts should contain both ipv4 AND ipv6 loopback

(coming from concourse-land)

Right now, when I spin up concourse (which ofc uses guardian as the backend) my container /etc/hosts file appears to be blown away and replaced with something like this:

127.0.0.1	localhost
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
172.17.0.2	98664ae95097

Some of the concourse folks believe this is either guardian mapping a hosts file or some default somewhere in guardian.

It'd be great if both 127.0.0.1 localhost and ::1 localhost were provided, for those of us running tests that have to use the ::1 IPv6 loopback. 🙂

Auto-detect MTU?

GCE machines have an MTU of 1460. Guardian/Garden-runC default to 1500, so network packets in containers would randomly be dropped, causing things to hang.

Is there any way the MTU could be auto-detected, removing the need for the 1500 default?
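A rough sketch of how such auto-detection could work (the interface selection is simplified; this is not guardian code):

// mtudetect.go — hypothetical sketch: pick a default MTU by looking at the
// host's up, non-loopback interfaces instead of hard-coding 1500.
package main

import (
	"fmt"
	"net"
)

func detectMTU(fallback int) int {
	ifaces, err := net.Interfaces()
	if err != nil {
		return fallback
	}
	mtu := 0
	for _, iface := range ifaces {
		if iface.Flags&net.FlagLoopback != 0 || iface.Flags&net.FlagUp == 0 {
			continue
		}
		if mtu == 0 || iface.MTU < mtu {
			mtu = iface.MTU // e.g. 1460 on GCE rather than the 1500 default
		}
	}
	if mtu == 0 {
		return fallback
	}
	return mtu
}

func main() {
	fmt.Println("detected MTU:", detectMTU(1500))
}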

Default port-pool overlaps with linux default ephemeral port range

I see that Guardian defaults to using a port range from 60000-65000:

Network struct {
    PortPoolStart  uint32 `long:"port-pool-start" default:"60000" description:"Start of the ephemeral port range used for mapped container ports."`
    PortPoolSize  uint32 `long:"port-pool-size"  default:"5000"  description:"Size of the port pool used for mapped container ports."`
}

[ source code ]

I also see that Linux uses part of that range for the local ephemeral port range:

$ uname -a
Linux 4f4b699f-80f6-4a63-880b-f0917d519000 4.4.0-59-generic #80~14.04.1-Ubuntu SMP Fri Jan 6 18:02:02 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

$ cat /proc/sys/net/ipv4/ip_local_port_range
32768	60999

We don't want those two ranges to overlap, correct?
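To make the overlap concrete, here is a small sketch that compares the configured port pool with the kernel's ephemeral range (paths and defaults taken from this report; not guardian code):

// portoverlap.go — hypothetical sketch: warn when the configured port pool
// overlaps the kernel's local ephemeral port range.
package main

import (
	"fmt"
	"os"
)

func main() {
	poolStart, poolSize := uint32(60000), uint32(5000) // guardian defaults above

	data, err := os.ReadFile("/proc/sys/net/ipv4/ip_local_port_range")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read ephemeral port range:", err)
		os.Exit(1)
	}

	var ephemLow, ephemHigh uint32
	if _, err := fmt.Sscanf(string(data), "%d %d", &ephemLow, &ephemHigh); err != nil {
		fmt.Fprintln(os.Stderr, "cannot parse ephemeral port range:", err)
		os.Exit(1)
	}

	poolEnd := poolStart + poolSize - 1
	if poolStart <= ephemHigh && ephemLow <= poolEnd {
		fmt.Printf("port pool %d-%d overlaps ephemeral range %d-%d\n",
			poolStart, poolEnd, ephemLow, ephemHigh)
	} else {
		fmt.Println("no overlap")
	}
}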

Support configuring set of certificates in `/etc/ssl/certs`

Description

On the Concourse team we've had a long-running issue (concourse/concourse#1027) around adding certificates to resource containers. We need this to support custom docker registries, proxy servers, and various other super-enterprisey-custom-cert-requirin' use cases that are better served by a more general approach than special-casing each resource container's image / creation process.

After pursuing an approach of automatically bind-mounting certificates into all containers, we're wondering whether this would be better served as a feature of Garden itself.

A Modest Proposal

This could work similarly to how kawasaki writes/modifies /etc/hosts and /etc/resolv.conf based on the network config passed to the Garden server. ServerCommand could have a separate namespace for Certificates, allowing a set of cert files from the host to be specified and propagated into created containers; a rough sketch follows.
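Sketching what such an option group might look like, in the same go-flags style as the Network struct quoted in the port-pool issue above (the flag and field names are only a suggestion, not an existing guardian API):

// Hypothetical Certificates option group for ServerCommand; none of these
// flags exist in guardian today.
Certificates struct {
    CertPaths []string `long:"cert-path" description:"Host path to a CA certificate to copy into /etc/ssl/certs in every created container; may be given multiple times."`
}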

Support for pre-3.19 kernels?

As far as I know the only thing requiring 3.19 is the use of the execveat syscall used by nstar.c for StreamIn/StreamOut. Unfortunately the use of 3.19+ is not yet widespread: lots of infrequently-updated-but-commonly-required distributions are still a ways behind. For example, CentOS 7.2 is on 3.10, without a blessed way to upgrade.

Is there an alternative approach to solving the security issue that execveat was used to fix? Or, if the syscall is not present, could we (/ should we) fall back on the less-secure-but-working code path?

The answer to these questions may be no, just wanted to raise this as an issue that affects Concourse/Guardian adoption.
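For the fallback question, here is a hedged, Linux-only sketch of how the kernel could be probed for execveat at startup, using the raw syscall number via golang.org/x/sys/unix (this is an assumption about how detection could be done, not guardian code):

// execveatprobe.go — hypothetical sketch: detect whether the running kernel
// (3.19+) provides execveat(2), so a pre-3.19 host could fall back to the
// older StreamIn/StreamOut code path instead of failing outright.
package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// execveatSupported invokes execveat with deliberately invalid arguments.
// A kernel lacking the syscall returns ENOSYS; any other errno means the
// syscall table entry exists (the call still fails, just for another reason).
func execveatSupported() bool {
	_, _, errno := unix.Syscall6(unix.SYS_EXECVEAT, ^uintptr(0), 0, 0, 0, 0, 0)
	return errno != unix.ENOSYS
}

func main() {
	if execveatSupported() {
		fmt.Println("execveat available: use the secure nstar path")
	} else {
		fmt.Println("execveat unavailable: would need the less-secure fallback")
	}
}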

Dead Run/Attach consumers stick around and siphon process output into the abyss

Description

Concourse's ATCs can go away and reattach to builds that were in flight. However, Garden seems to retain some state from the original attaching of the process (via either Run or Attach), so the re-attached stream loses 50% of the output, which makes for very odd-looking build logs. For example:

https://ci.concourse.ci/teams/main/pipelines/main/jobs/topgun/builds/330

All we need is to not have this permanent output loss; we're OK with losing a window of log output during the detach + reattach, we just don't want build logs to be butchered after reattaching. http.CloseNotifier may be of use here - we use it in Concourse for closing event streams for dead clients, and it seems pretty straightforward.
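For reference, this is roughly the kind of http.CloseNotifier usage we have in mind (a simplified sketch, not Garden's actual streaming code; in newer Go, r.Context().Done() serves the same purpose):

// streamhandler.go — hypothetical sketch: stop pumping process output to an
// HTTP client once that client goes away, so a dead consumer does not keep
// siphoning output that a later re-attach will never see.
package main

import (
	"fmt"
	"net/http"
	"time"
)

func streamHandler(w http.ResponseWriter, r *http.Request) {
	var closed <-chan bool
	if cn, ok := w.(http.CloseNotifier); ok {
		closed = cn.CloseNotify()
	}
	flusher, _ := w.(http.Flusher)

	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()

	for i := 0; ; i++ {
		select {
		case <-closed:
			// Client disconnected: stop consuming output here so it stays
			// available for whoever attaches next.
			return
		case <-ticker.C:
			fmt.Fprintf(w, "%d\n", i)
			if flusher != nil {
				flusher.Flush()
			}
		}
	}
}

func main() {
	http.HandleFunc("/stream", streamHandler)
	http.ListenAndServe(":8080", nil)
}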

Logging and/or test output

(n/a)

Steps to reproduce

  • Spawn a long-running process printing monotonically increasing numbers (1, 2, 3, 4, 5, 6, ...).
  • Kill the thing that spawned and attached to the process.
  • Run another thing that attaches to the process.
  • You will see something like 10, 12, 14, 16, .... Ideally you'd see something like 10, 11, 12, 13, 14, ....

Huge log line on start

Description

The following log line is super noisy and probably overkill. Maybe it could just be moved to debug level if it's useful to y'all? A sketch of what that might look like follows.
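Something like this is roughly what we had in mind, using lager's Debug level (a sketch only; the actual call site in guardian and the lager import path for the version in use may differ):

// Hypothetical helper: emitting the base bundle specs at Debug keeps this
// huge payload out of the default (Info-level) log stream.
package main

import (
	"os"

	"code.cloudfoundry.org/lager"
)

func logBaseBundles(logger lager.Logger, privileged, unprivileged interface{}) {
	logger.Debug("base-bundles", lager.Data{
		"privileged":   privileged,
		"unprivileged": unprivileged,
	})
}

func main() {
	logger := lager.NewLogger("guardian")
	// Only a DEBUG-threshold sink will actually print the Debug line.
	logger.RegisterSink(lager.NewWriterSink(os.Stdout, lager.DEBUG))
	logBaseBundles(logger, "...privileged spec...", "...unprivileged spec...")
}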

Logging and/or test output

concourse-worker_1  | {"timestamp":"1496353667.616831779","source":"guardian","message":"guardian.base-bundles","log_level":1,"data":{"privileged":{"Spec":{"ociVersion":"1.0.0-rc3","platform":{"os":"linux","arch":"amd64"},"process":{"consoleSize":{"height":0,"width":0},"user":{"uid":0,"gid":0},"args":["/tmp/garden-init"],"cwd":"/","capabilities":{"bounding":["CAP_AUDIT_CONTROL","CAP_AUDIT_READ","CAP_AUDIT_WRITE","CAP_BLOCK_SUSPEND","CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_DAC_READ_SEARCH","CAP_FOWNER","CAP_FSETID","CAP_IPC_LOCK","CAP_IPC_OWNER","CAP_KILL","CAP_LEASE","CAP_LINUX_IMMUTABLE","CAP_MAC_ADMIN","CAP_MAC_OVERRIDE","CAP_MKNOD","CAP_NET_ADMIN","CAP_NET_BIND_SERVICE","CAP_NET_BROADCAST","CAP_NET_RAW","CAP_SETGID","CAP_SETFCAP","CAP_SETPCAP","CAP_SETUID","CAP_SYS_ADMIN","CAP_SYS_BOOT","CAP_SYS_CHROOT","CAP_SYS_MODULE","CAP_SYS_NICE","CAP_SYS_PACCT","CAP_SYS_PTRACE","CAP_SYS_RAWIO","CAP_SYS_RESOURCE","CAP_SYS_TIME","CAP_SYS_TTY_CONFIG","CAP_SYSLOG","CAP_WAKE_ALARM"],"inheritable":["CAP_AUDIT_CONTROL","CAP_AUDIT_READ","CAP_AUDIT_WRITE","CAP_BLOCK_SUSPEND","CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_DAC_READ_SEARCH","CAP_FOWNER","CAP_FSETID","CAP_IPC_LOCK","CAP_IPC_OWNER","CAP_KILL","CAP_LEASE","CAP_LINUX_IMMUTABLE","CAP_MAC_ADMIN","CAP_MAC_OVERRIDE","CAP_MKNOD","CAP_NET_ADMIN","CAP_NET_BIND_SERVICE","CAP_NET_BROADCAST","CAP_NET_RAW","CAP_SETGID","CAP_SETFCAP","CAP_SETPCAP","CAP_SETUID","CAP_SYS_ADMIN","CAP_SYS_BOOT","CAP_SYS_CHROOT","CAP_SYS_MODULE","CAP_SYS_NICE","CAP_SYS_PACCT","CAP_SYS_PTRACE","CAP_SYS_RAWIO","CAP_SYS_RESOURCE","CAP_SYS_TIME","CAP_SYS_TTY_CONFIG","CAP_SYSLOG","CAP_WAKE_ALARM"],"permitted":["CAP_AUDIT_CONTROL","CAP_AUDIT_READ","CAP_AUDIT_WRITE","CAP_BLOCK_SUSPEND","CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_DAC_READ_SEARCH","CAP_FOWNER","CAP_FSETID","CAP_IPC_LOCK","CAP_IPC_OWNER","CAP_KILL","CAP_LEASE","CAP_LINUX_IMMUTABLE","CAP_MAC_ADMIN","CAP_MAC_OVERRIDE","CAP_MKNOD","CAP_NET_ADMIN","CAP_NET_BIND_SERVICE","CAP_NET_BROADCAST","CAP_NET_RAW","CAP_SETGID","CAP_SETFCAP","CAP_SETPCAP","CAP_SETUID","CAP_SYS_ADMIN","CAP_SYS_BOOT","CAP_SYS_CHROOT","CAP_SYS_MODULE","CAP_SYS_NICE","CAP_SYS_PACCT","CAP_SYS_PTRACE","CAP_SYS_RAWIO","CAP_SYS_RESOURCE","CAP_SYS_TIME","CAP_SYS_TTY_CONFIG","CAP_SYSLOG","CAP_WAKE_ALARM"]}},"root":{"path":""},"mounts":[{"destination":"/sys","type":"sysfs","source":"sysfs","options":["nosuid","noexec","nodev","ro"]},{"destination":"/dev/shm","type":"tmpfs","source":"tmpfs"},{"destination":"/dev/pts","type":"devpts","source":"devpts","options":["nosuid","noexec","newinstance","ptmxmode=0666","mode=0620"]},{"destination":"/tmp/garden-init","type":"bind","source":"/worker-state/3.1.0/assets/bin/init","options":["bind"]},{"destination":"/proc","type":"proc","source":"proc","options":["nosuid","noexec","nodev"]}],"linux":{"resources":{"devices":[{"allow":false,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":3,"access":"rwm"},{"allow":true,"type":"c","major":5,"minor":0,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":8,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":9,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":5,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":7,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":7,"access":"rwm"},{"allow":true,"type":"c","major":10,"minor":229,"access":"rwm"}]},"namespaces":[{"type":"network"},{"type":"pid"},{"type":"uts"},{"type":"ipc"},{"type":"mount"}],"devices":[{"path":"/dev/fuse","type":"c","major":10,"minor":229,"fileMode":438}]}}},"unprivileged":{"Spec":{"ociVersion":"1.0.
0-rc3","platform":{"os":"linux","arch":"amd64"},"process":{"consoleSize":{"height":0,"width":0},"user":{"uid":0,"gid":0},"args":["/tmp/garden-init"],"cwd":"/","capabilities":{"bounding":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_MKNOD","CAP_NET_RAW","CAP_SETGID","CAP_SETUID","CAP_SETFCAP","CAP_SETPCAP","CAP_NET_BIND_SERVICE","CAP_SYS_CHROOT","CAP_KILL","CAP_AUDIT_WRITE"],"effective":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_MKNOD","CAP_NET_RAW","CAP_SETGID","CAP_SETUID","CAP_SETFCAP","CAP_SETPCAP","CAP_NET_BIND_SERVICE","CAP_SYS_CHROOT","CAP_KILL","CAP_AUDIT_WRITE"],"inheritable":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_MKNOD","CAP_NET_RAW","CAP_SETGID","CAP_SETUID","CAP_SETFCAP","CAP_SETPCAP","CAP_NET_BIND_SERVICE","CAP_SYS_CHROOT","CAP_KILL","CAP_AUDIT_WRITE"],"permitted":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_MKNOD","CAP_NET_RAW","CAP_SETGID","CAP_SETUID","CAP_SETFCAP","CAP_SETPCAP","CAP_NET_BIND_SERVICE","CAP_SYS_CHROOT","CAP_KILL","CAP_AUDIT_WRITE"],"ambient":["CAP_CHOWN","CAP_DAC_OVERRIDE","CAP_FSETID","CAP_FOWNER","CAP_MKNOD","CAP_NET_RAW","CAP_SETGID","CAP_SETUID","CAP_SETFCAP","CAP_SETPCAP","CAP_NET_BIND_SERVICE","CAP_SYS_CHROOT","CAP_KILL","CAP_AUDIT_WRITE"]}},"root":{"path":""},"mounts":[{"destination":"/sys","type":"sysfs","source":"sysfs","options":["nosuid","noexec","nodev","ro"]},{"destination":"/dev/shm","type":"tmpfs","source":"tmpfs"},{"destination":"/dev/pts","type":"devpts","source":"devpts","options":["nosuid","noexec","newinstance","ptmxmode=0666","mode=0620"]},{"destination":"/tmp/garden-init","type":"bind","source":"/worker-state/3.1.0/assets/bin/init","options":["bind"]},{"destination":"/proc","type":"proc","source":"proc","options":["nosuid","noexec","nodev"]}],"linux":{"uidMappings":[{"hostID":4294967294,"containerID":0,"size":1},{"hostID":1,"containerID":1,"size":4294967293}],"gidMappings":[{"hostID":4294967294,"containerID":0,"size":1},{"hostID":1,"containerID":1,"size":4294967293}],"resources":{"devices":[{"allow":false,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":3,"access":"rwm"},{"allow":true,"type":"c","major":5,"minor":0,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":8,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":9,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":5,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":7,"access":"rwm"},{"allow":true,"type":"c","major":1,"minor":7,"access":"rwm"},{"allow":true,"type":"c","major":10,"minor":229,"access":"rwm"}]},"namespaces":[{"type":"network"},{"type":"pid"},{"type":"uts"},{"type":"ipc"},{"type":"mount"},{"type":"user"}],"devices":[{"path":"/dev/fuse","type":"c","major":10,"minor":229,"fileMode":438}],"seccomp":{"defaultAction":"SCMP_ACT_ERRNO","architectures":["SCMP_ARCH_X86_64","SCMP_ARCH_X86","SCMP_ARCH_X32"],"syscalls":[{"names":["accept"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["accept4"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["access"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["alarm"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["bind"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["brk"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["capget"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["capset"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["chdir"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["chmod"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["chown"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["chown32"],"action":"SCMP_A
CT_ALLOW","args":[]},{"names":["clock_getres"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["clock_gettime"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["clock_nanosleep"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["close"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["connect"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["copy_file_range"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["creat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["dup"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["dup2"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["dup3"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["epoll_create"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["epoll_create1"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["epoll_ctl"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["epoll_ctl_old"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["epoll_pwait"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["epoll_wait"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["epoll_wait_old"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["eventfd"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["eventfd2"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["execve"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["execveat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["exit"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["exit_group"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["faccessat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fadvise64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fadvise64_64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fallocate"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fanotify_mark"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fchdir"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fchmod"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fchmodat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fchown"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fchown32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fchownat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fcntl"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fcntl64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fdatasync"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fgetxattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["flistxattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["flock"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fork"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fremovexattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fsetxattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fstat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fstat64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fstatat64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fstatfs"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fstatfs64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["fsync"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["ftruncate"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["ftruncate64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["futex"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["futimesat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getcpu"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getcwd"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getdents"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getdents64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getegid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getegid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["geteuid"],"action":"
SCMP_ACT_ALLOW","args":[]},{"names":["geteuid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getgid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getgid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getgroups"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getgroups32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getitimer"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getpeername"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getpgid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getpgrp"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getpid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getppid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getpriority"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getrandom"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getresgid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getresgid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getresuid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getresuid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getrlimit"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["get_robust_list"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getrusage"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getsid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getsockname"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getsockopt"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["get_thread_area"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["gettid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["gettimeofday"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getuid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getuid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["getxattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["inotify_add_watch"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["inotify_init"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["inotify_init1"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["inotify_rm_watch"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["io_cancel"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["ioctl"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["io_destroy"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["io_getevents"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["ioprio_get"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["ioprio_set"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["io_setup"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["io_submit"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["ipc"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["kill"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["lchown"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["lchown32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["lgetxattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["link"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["linkat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["listen"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["listxattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["llistxattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["_llseek"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["lremovexattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["lseek"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["lsetxattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["lstat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["lstat64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["madvise"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["memfd_create"],"action":"SCMP_ACT_ALLOW","args
":[]},{"names":["mincore"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mkdir"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mkdirat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mknod"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mknodat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mlock"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mlock2"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mlockall"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mmap"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mmap2"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mprotect"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mq_getsetattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mq_notify"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mq_open"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mq_timedreceive"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mq_timedsend"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mq_unlink"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["mremap"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["msgctl"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["msgget"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["msgrcv"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["msgsnd"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["msync"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["munlock"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["munlockall"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["munmap"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["nanosleep"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["newfstatat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["_newselect"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["open"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["openat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["pause"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["personality"],"action":"SCMP_ACT_ALLOW","args":[{"index":0,"value":0,"valueTwo":0,"op":"SCMP_CMP_EQ"}]},{"names":["personality"],"action":"SCMP_ACT_ALLOW","args":[{"index":0,"value":8,"valueTwo":0,"op":"SCMP_CMP_EQ"}]},{"names":["personality"],"action":"SCMP_ACT_ALLOW","args":[{"index":0,"value":4294967295,"valueTwo":0,"op":"SCMP_CMP_EQ"}]},{"names":["pipe"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["pipe2"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["poll"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["ppoll"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["prctl"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["pread64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["preadv"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["prlimit64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["pselect6"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["pwrite64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["pwritev"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["read"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["readahead"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["readlink"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["readlinkat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["readv"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["recv"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["recvfrom"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["recvmmsg"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["recvmsg"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["remap_file_pages"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["removexattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["rename"],"action":"SCMP_ACT_ALLOW","ar
gs":[]},{"names":["renameat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["renameat2"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["restart_syscall"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["rmdir"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["rt_sigaction"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["rt_sigpending"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["rt_sigprocmask"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["rt_sigqueueinfo"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["rt_sigreturn"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["rt_sigsuspend"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["rt_sigtimedwait"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["rt_tgsigqueueinfo"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_getaffinity"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_getattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_getparam"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_get_priority_max"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_get_priority_min"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_getscheduler"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_rr_get_interval"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_setaffinity"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_setattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_setparam"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_setscheduler"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sched_yield"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["seccomp"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["select"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["semctl"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["semget"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["semop"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["semtimedop"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["send"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sendfile"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sendfile64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sendmmsg"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sendmsg"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sendto"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setfsgid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setfsgid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setfsuid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setfsuid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setgid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setgid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setgroups"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setgroups32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setitimer"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setpgid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setpriority"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setregid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setregid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setresgid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setresgid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setresuid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setresuid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setreuid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setreuid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setrlimit"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["set_robust_list"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["s
etsid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setsockopt"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["set_thread_area"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["set_tid_address"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setuid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setuid32"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["setxattr"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["shmat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["shmctl"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["shmdt"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["shmget"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["shutdown"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sigaltstack"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["signalfd"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["signalfd4"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sigreturn"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["socket"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["socketcall"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["socketpair"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["splice"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["stat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["stat64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["statfs"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["statfs64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["symlink"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["symlinkat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sync"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sync_file_range"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["syncfs"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["sysinfo"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["syslog"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["tee"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["tgkill"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["time"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["timer_create"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["timer_delete"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["timerfd_create"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["timerfd_gettime"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["timerfd_settime"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["timer_getoverrun"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["timer_gettime"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["timer_settime"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["times"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["tkill"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["truncate"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["truncate64"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["ugetrlimit"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["umask"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["uname"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["unlink"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["unlinkat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["utime"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["utimensat"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["utimes"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["vfork"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["vmsplice"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["wait4"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["waitid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["waitpid"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["write"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["writev"],"
action":"SCMP_ACT_ALLOW","args":[]},{"names":["arch_prctl"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["modify_ldt"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["chroot"],"action":"SCMP_ACT_ALLOW","args":[]},{"names":["clone"],"action":"SCMP_ACT_ALLOW","args":[{"index":0,"value":2080505856,"valueTwo":0,"op":"SCMP_CMP_MASKED_EQ"}]}]},"maskedPaths":["/proc/kcore","/proc/latency_stats","/proc/timer_stats","/proc/sched_debug"]}}}}}

Steps to reproduce

Start Guardian. :)

  • Guardian release version: 1.6.0
  • Linux kernel version: n/a
  • Concourse version: n/a
  • Go version: n/a

Garden exception during bosh shutdown

Versions:

  • observed on: bosh-vsphere-esxi-ubuntu-trusty-go_agent/3421.6 and 3363.14
  • garden-runc 1.0.4

When deploying a BOSH release, I frequently get:


Error 450001: Action Failed get_task: Task 4ca73da5-0b63-489d-5369-632a15f6461c result: Stopping Monitored Services: Stopping services
'[garden]' errored

And garden.stderr.log always contains

PC=0x460251 m=0

goroutine 0 [idle]:
runtime.futex(0xee5490, 0x0, 0x0, 0x0, 0x7ffc00000000, 0x0, 0x0, 0x0, 0x7ffcab765698, 0x40f8b2, ...)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/sys_linux_amd64.s:387 +0x21
runtime.futexsleep(0xee5490, 0x0, 0xffffffffffffffff)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/os_linux.go:45 +0x62
runtime.notesleep(0xee5490)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/lock_futex.go:145 +0x82
runtime.stopm()
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/proc.go:1594 +0xad
runtime.findrunnable(0xc42001c000, 0x0)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/proc.go:2021 +0x228
runtime.schedule()
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/proc.go:2120 +0x14c
runtime.goexit0(0xc42008e340)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/proc.go:2257 +0x180
runtime.mcall(0x7ffcab765840)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/asm_amd64.s:240 +0x5b

goroutine 1 [chan receive, 942 minutes]:
code.cloudfoundry.org/guardian/guardiancmd.(*GuardianCommand).Execute(0xc42020edc0, 0xc420243860, 0x0, 0x1a, 0x0, 0x0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/guardian/guardiancmd/command.go:236 +0xb8
main.main()
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/guardian/cmd/guardian/main.go:22 +0xe4

goroutine 17 [syscall, 942 minutes, locked to thread]:
runtime.goexit()
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/asm_amd64.s:2086 +0x1

goroutine 5 [syscall]:
os/signal.signal_recv(0xeca0c0)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/sigqueue.go:116 +0x157
os/signal.loop()
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.1
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/os/signal/signal_unix.go:28 +0x41

goroutine 18 [select]:
github.com/tedsuo/ifrit/sigmon.sigmon.Run(0xc420208e60, 0x2, 0x2, 0xec2c80, 0xc42020edc0, 0xc4202025a0, 0xc420202600, 0x0, 0x0)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/sigmon/sigmon.go:35 +0x317
github.com/tedsuo/ifrit/sigmon.(*sigmon).Run(0xc4201edd10, 0xc4202025a0, 0xc420202600, 0x0, 0x0)
	<autogenerated>:1 +0x82
github.com/tedsuo/ifrit.(*process).run(0xc4201eeb80)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:71 +0x49
created by github.com/tedsuo/ifrit.Background
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:49 +0x14e

goroutine 19 [chan receive, 942 minutes]:
github.com/tedsuo/ifrit.(*process).Wait.func1(0xc4201eeb80, 0xc4202026c0)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:83 +0x44
created by github.com/tedsuo/ifrit.(*process).Wait
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:85 +0x67

goroutine 34 [select, 942 minutes, locked to thread]:
runtime.gopark(0xa694e8, 0x0, 0xa0b341, 0x6, 0x18, 0x2)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/proc.go:259 +0x13a
runtime.selectgoImpl(0xc42025a730, 0x0, 0x18)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/select.go:423 +0x11d9
runtime.selectgo(0xc42025a730)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/select.go:238 +0x1c
runtime.ensureSigM.func1()
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/signal1_unix.go:304 +0x2f3
runtime.goexit()
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/asm_amd64.s:2086 +0x1

goroutine 35 [semacquire]:
sync.runtime_Semacquire(0xc4202afd5c)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/sema.go:47 +0x30
sync.(*WaitGroup).Wait(0xc4202afd50)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/sync/waitgroup.go:131 +0x97
code.cloudfoundry.org/garden/server.(*GardenServer).Stop(0xc4202a60b0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden/server/server.go:209 +0x346
code.cloudfoundry.org/guardian/guardiancmd.(*GuardianCommand).Run(0xc42020edc0, 0xc42024c180, 0xc42024c1e0, 0x0, 0x0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/guardian/guardiancmd/command.go:342 +0xd86
github.com/tedsuo/ifrit.(*process).run(0xc420200040)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:71 +0x49
created by github.com/tedsuo/ifrit.Background
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:49 +0x14e

goroutine 36 [chan receive, 942 minutes]:
github.com/tedsuo/ifrit.(*process).Wait.func1(0xc420200040, 0xc42024c2a0)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:83 +0x44
created by github.com/tedsuo/ifrit.(*process).Wait
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:85 +0x67

goroutine 7 [select]:
code.cloudfoundry.org/guardian/metrics.PeriodicMetronNotifier.Start.func1(0xec8a00, 0xc420028128, 0xecee00, 0xc42024d620, 0xdf8475800, 0xecee00, 0xc42024c300, 0xecd300, 0xf01098, 0xecd060, ...)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/guardian/metrics/periodic_metron_notifier.go:68 +0x4b0
created by code.cloudfoundry.org/guardian/metrics.PeriodicMetronNotifier.Start
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/guardian/metrics/periodic_metron_notifier.go:82 +0x277

goroutine 39 [select]:
github.com/cloudfoundry/dropsonde/metricbatcher.New.func1(0xc420201000)
	/var/vcap/packages/guardian/src/github.com/cloudfoundry/dropsonde/metricbatcher/metricbatcher.go:34 +0x13e
created by github.com/cloudfoundry/dropsonde/metricbatcher.New
	/var/vcap/packages/guardian/src/github.com/cloudfoundry/dropsonde/metricbatcher/metricbatcher.go:42 +0x155

goroutine 40 [select]:
github.com/cloudfoundry/dropsonde/runtime_stats.(*RuntimeStats).Run(0xc4202b7ca0, 0x0)
	/var/vcap/packages/guardian/src/github.com/cloudfoundry/dropsonde/runtime_stats/runtime_stats.go:33 +0x1bd
created by github.com/cloudfoundry/dropsonde.initialize
	/var/vcap/packages/guardian/src/github.com/cloudfoundry/dropsonde/dropsonde.go:100 +0x255

goroutine 43 [select]:
code.cloudfoundry.org/garden/server/bomberman.(*Bomberman).manageBombs(0xc4201eed80)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden/server/bomberman/bomberman.go:68 +0x6bc
created by code.cloudfoundry.org/garden/server/bomberman.New
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden/server/bomberman/bomberman.go:43 +0x182

goroutine 1838799 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423ba4008)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838717 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc4240de000)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 45 [chan receive, 942 minutes]:
github.com/tedsuo/ifrit.(*process).Wait.func1(0xc4201eeb80, 0xc4203de420)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:83 +0x44
created by github.com/tedsuo/ifrit.(*process).Wait
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:85 +0x67

goroutine 1838720 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423ba4010)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838810 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc4236d2090)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838797 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423c0ce30)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838759 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc422e6a000)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838795 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc4236d3eb0)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838815 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423252f70)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838806 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423c0c000)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838812 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423c0db58)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838822 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423c0c030)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838794 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc4236d3ea8)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838719 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423c0c050)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838796 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc4240de030)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838850 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc4236d2010)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838819 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423c0c010)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838712 [sleep, 1 minutes]:
time.Sleep(0xdf8475800)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/runtime/time.go:59 +0xe1
code.cloudfoundry.org/garden/server/streamer.(*Streamer).Stop.func1(0xc4202b9e60, 0xc42071efd0, 0x3)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden/server/streamer/streamer.go:103 +0x33
created by code.cloudfoundry.org/garden/server/streamer.(*Streamer).Stop
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden/server/streamer/streamer.go:108 +0x8b

goroutine 1838813 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423c0db60)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838798 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423ba4000)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838818 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423c0d9b8)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838801 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc424109bd0)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838811 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423252000)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838718 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc4240de010)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838817 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc424106000)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838721 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc423c0c020)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838800 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc424108720)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838814 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc4241092c0)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838816 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc4236d2000)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838579 [runnable]:
syscall.readlen(0x8, 0xc420e0ac48, 0x8, 0x3, 0xc4218041c0, 0xe)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/syscall/zsyscall_linux_amd64.go:1084
syscall.forkExec(0xc423c0a560, 0xf, 0xc422b0b700, 0x2, 0x2, 0xc420e0ade0, 0xa65c08, 0xc41fea7a41, 0xc422ca8d00)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/syscall/exec_unix.go:202 +0x33e
syscall.StartProcess(0xc423c0a560, 0xf, 0xc422b0b700, 0x2, 0x2, 0xc420e0ade0, 0x2, 0x4, 0x455d50, 0xc423bee580)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/syscall/exec_unix.go:240 +0x64
os.startProcess(0xc423c0a560, 0xf, 0xc422b0b700, 0x2, 0x2, 0xc420e0af88, 0xc4213de5b0, 0xd, 0xd)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/os/exec_posix.go:45 +0x1a3
os.StartProcess(0xc423c0a560, 0xf, 0xc422b0b700, 0x2, 0x2, 0xc420e0af88, 0x0, 0x0, 0x0)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/os/doc.go:28 +0x64
os/exec.(*Cmd).Start(0xc420db8b00, 0xc422b0b700, 0xc423c0a540)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/os/exec/exec.go:358 +0x3c9
os/exec.(*Cmd).Run(0xc420db8b00, 0xa, 0xc420e0b060)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/os/exec/exec.go:276 +0x2b
code.cloudfoundry.org/garden-shed/docker_drivers/aufs.(*Loop).Unmount.func1(0xc423c0a430, 0xc420e0b0d8)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden-shed/docker_drivers/aufs/loop.go:36 +0x16c
github.com/eapache/go-resiliency/retrier.(*Retrier).Run(0xc4202006c0, 0xc422b0b440, 0x7, 0xc4203fecc0)
	/var/vcap/packages/guardian/src/github.com/eapache/go-resiliency/retrier/retrier.go:42 +0x32
code.cloudfoundry.org/garden-shed/docker_drivers/aufs.(*Loop).Unmount(0xc4201ff160, 0xc42128fd50, 0x66, 0x0, 0x0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden-shed/docker_drivers/aufs/loop.go:43 +0x25d
code.cloudfoundry.org/garden-shed/docker_drivers/aufs.(*QuotaedDriver).Put(0xc420264230, 0xc422b4c580, 0x40, 0x5e3a31, 0xc420c106e0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden-shed/docker_drivers/aufs/aufs.go:91 +0x19e
code.cloudfoundry.org/garden-shed/layercake.(*Docker).Remove(0xc4201ff2a0, 0xec52c0, 0xc422e6a230, 0x5e3ed1, 0xc420c106e0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden-shed/layercake/docker.go:53 +0x60
code.cloudfoundry.org/garden-shed/layercake.(*AufsCake).Remove(0xc420288cc0, 0xec52c0, 0xc422e6a230, 0xc4204ff180, 0x0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden-shed/layercake/aufs.go:130 +0x62
code.cloudfoundry.org/garden-shed/layercake/cleaner.(*OvenCleaner).removeRecursively(0xc4201ff360, 0xecee00, 0xc42040c840, 0xecf5a0, 0xc420288cc0, 0xec52c0, 0xc422e6a230, 0x0, 0x0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden-shed/layercake/cleaner/cleaner.go:83 +0x5aa
code.cloudfoundry.org/garden-shed/layercake/cleaner.(*OvenCleaner).GC(0xc4201ff360, 0xecee00, 0xc4237332c0, 0xecf5a0, 0xc420288cc0, 0x0, 0x0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden-shed/layercake/cleaner/cleaner.go:53 +0x240
code.cloudfoundry.org/garden-shed/rootfs_provider.(*CakeOrdinator).GC(0xc4202642a0, 0xecee00, 0xc423733200, 0x0, 0x0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden-shed/rootfs_provider/cakeordinator.go:92 +0xa6
code.cloudfoundry.org/guardian/gardener.(*Gardener).Create(0xc42026ba40, 0xc4245d4980, 0x39, 0x45d964b800, 0xc422cb3560, 0x24, 0x0, 0x0, 0x0, 0x0, ...)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/guardian/gardener/gardener.go:214 +0x378
code.cloudfoundry.org/garden/server.(*GardenServer).handleCreate(0xc4202a60b0, 0xecb800, 0xc421f7cea0, 0xc4212705a0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden/server/request_handling.go:85 +0x3a6
code.cloudfoundry.org/garden/server.(*GardenServer).(code.cloudfoundry.org/garden/server.handleCreate)-fm(0xecb800, 0xc421f7cea0, 0xc4212705a0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden/server/server.go:75 +0x48
net/http.HandlerFunc.ServeHTTP(0xc4202afd80, 0xecb800, 0xc421f7cea0, 0xc4212705a0)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/net/http/server.go:1726 +0x44
github.com/bmizerany/pat.(*PatternServeMux).ServeHTTP(0xc4201fca40, 0xecb800, 0xc421f7cea0, 0xc4212705a0)
	/var/vcap/packages/guardian/src/github.com/bmizerany/pat/mux.go:109 +0x5de
code.cloudfoundry.org/garden/server.New.func1(0xecb800, 0xc421f7cea0, 0xc4212705a0)
	/var/vcap/packages/guardian/src/code.cloudfoundry.org/garden/server/server.go:112 +0x4d
net/http.HandlerFunc.ServeHTTP(0xc4202b7b40, 0xecb800, 0xc421f7cea0, 0xc4212705a0)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/net/http/server.go:1726 +0x44
net/http.serverHandler.ServeHTTP(0xc420260700, 0xecb800, 0xc421f7cea0, 0xc4212705a0)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/net/http/server.go:2202 +0x7d
net/http.(*conn).serve(0xc4213c0e80, 0xecc500, 0xc4245d47c0)
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/net/http/server.go:1579 +0x4b7
created by net/http.(*Server).Serve
	/var/vcap/data/packages/golang_1.7.1/13854b532e3b7411e6abeb2e683c530b8114723b/src/net/http/server.go:2293 +0x44d

goroutine 1838834 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc424108000)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

goroutine 1838836 [select]:
github.com/tedsuo/ifrit.(*process).Signal.func1(0xc420200040, 0xeca0c0, 0xc4236d2250)
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:92 +0x112
created by github.com/tedsuo/ifrit.(*process).Signal
	/var/vcap/packages/guardian/src/github.com/tedsuo/ifrit/process.go:96 +0x53

rax    0xca
rbx    0xee4730
rcx    0x460253
rdx    0x0
rdi    0xee5490
rsi    0x0
rbp    0x7ffcab765668
rsp    0x7ffcab765620
r8     0x0
r9     0x0
r10    0x0
r11    0x286
r12    0x0
r13    0xd3
r14    0xa65c0b
r15    0x3
rip    0x460251
rflags 0x286
cs     0x33
fs     0x0
gs     0x0

Support setting up iptables with clamp mss to pmtu

Description

As discussed in concourse/concourse#618 our network setup on OpenStack has problems with PMTU discovery.

Inserting iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu improves network speed a lot, as fewer packets have to be retransmitted with the reduced MSS.

I think this is generally the right approach as the worker knows the path MTU and uses NAT to forward the container traffic.

I suggest adding an option to the BOSH manifest to get this rule onto all the workers automatically. However, I don't see a problem with having the rule as a default.

  • Guardian release version: 0.7.0
  • Concourse version: 2.0.2

iptables FORWARD chain DROP policy breaks workers

Re-opening this issue here as it seems this regards guardian.

I've come across the issue on machines (where workers are installed) that have a default DROP policy for the FORWARD and INPUT chains and whose last rule in the chain is a REJECT.

When starting the worker, the w--forward rule is added at the end of the FORWARD chain, which means that packets will be dropped before they reach that rule. However, even if the w--forward rule is moved up, any NEW connections will be dropped (since the w--default rule doesn't pick up packets with ctstate NEW).

Initial INPUT and FORWARD chains:

Chain INPUT (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    (...)
    0     0 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0            limit: avg 5/min burst 5 LOG flags 0 level 7 prefix "iptables INPUT denied: "
    0     0 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0            limit: avg 5/min burst 5 LOG flags 0 level 7 prefix "iptables FORWARD denied: "
    0     0 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable

After the concourse worker is started:

Chain INPUT (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    (...)
    0     0 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0            limit: avg 5/min burst 5 LOG flags 0 level 7 prefix "iptables INPUT denied: "
    0     0 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable
    0     0 w--input   all  --  w+     *       0.0.0.0/0            0.0.0.0/0

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0            limit: avg 5/min burst 5 LOG flags 0 level 7 prefix "iptables FORWARD denied: "
    0     0 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable
    0     0 w--forward  all  --  w+     *       0.0.0.0/0            0.0.0.0/0

Chain w--default (0 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED

Chain w--forward (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     all  --  eth1   *       0.0.0.0/0            0.0.0.0/0
    0     0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain w--input (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     all  --  eth1   *       0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0

After moving the w--forward and w--input rules up and trying to run a job, it's easy to see that requests are being blocked:

Apr 21 10:01:14 n0 kernel: iptables FORWARD denied: IN=wbrdg-c0a80000 OUT=eth1 MAC=66:b8:b0:60:92:a4:82:52:67:81:a5:88:08:00 SRC=192.168.0.2 DST=10.34.104.11 LEN=66 TOS=0x00 PREC=0x00 TTL=63 ID=58362 DF PROTO=UDP SPT=38584 DPT=53 LEN=46

So my first question is: shouldn't the w--default rule be ctstate NEW,RELATED,ESTABLISHED?
This will not completely solve the problem though, since inbound forwarding is also blocked (after adding ctstate NEW to the w--default rule):

Apr 21 10:08:48 n0 kernel: iptables FORWARD denied: IN=eth1 OUT=wbrdg-c0a80004 MAC=00:0c:29:42:14:01:00:10:db:ff:10:00:08:00 SRC=10.34.104.11 DST=192.168.0.6 LEN=150 TOS=0x00 PREC=0x00 TTL=62 ID=39046 PROTO=UDP SPT=53 DPT=38440 LEN=130

As far as I can tell the following config solves the forwarding problem, but I'm not an iptables expert, so I'm not really sure whether it makes sense or is the right solution.

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
  884 66730 w--forward  all  --  w+     *       0.0.0.0/0            0.0.0.0/0
 1512 2063K w--forward  all  --  *      w+      0.0.0.0/0            0.0.0.0/0
    0     0 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0            limit: avg 5/min burst 5 LOG flags 0 level 7 prefix "iptables FORWARD denied: "
    0     0 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable

Chain w--default (7 references)
 pkts bytes target     prot opt in     out     source               destination
  884 66730 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate NEW,RELATED,ESTABLISHED

Chain w--forward (2 references)
 pkts bytes target     prot opt in     out     source               destination
 1512 2063K ACCEPT     all  --  eth1   *       0.0.0.0/0            0.0.0.0/0
    (... forwarding rules for each container instance ...)

So in summary:

  1. With a default DROP FORWARD policy, concourse-generated rules are not sufficient to enable traffic forwarding between host and containers
  2. If the last rule of the INPUT/FORWARD chains is a REJECT, concourse-generated rules will never be reached.

Not sure if something can be done about the second issue, but the first one definitely seems to be a problem.

  • concourse 2.7.3 (binary release)
  • Linux 4.9.0
  • iptables v1.4.21

/dev/null file descriptor leak

Over a duration of ~2 days, all 3 workers behind our Concourse pipeline eventually started hitting an open fd limit.

This resulted in a string of interesting errors: https://ci.concourse.ci/builds?since=135962 - some of which may be due to missing err checks beforehand (some complain that a file just doesn't exist, but the file probably actually failed to be created and Guardian didn't short-circuit)

The lsof -p of each worker is here: https://gist.github.com/vito/21a2a4ac9678b227a0cb37d7bb54f0f3

Looks like all of the leaked fds are /dev/null.

Path MTU discovery

Todos

Please try the following before submitting the issue:

  • Upgrade Concourse if you are observing a failure in a test
    • N/A
  • Use latest BOSH stemcell if the problem occurred in a BOSH VM
  • Use Go 1.6 if the issue has to do with running unit tests locally
    • N/A

Description

We run Cloud Foundry and use SAP/ipsec-release to secure unencrypted traffic between routers and cells. After upgrading to garden-runc-release 1.4.0 we experienced an outage whereby requests to applications that generated HTTP responses larger than approximately 9k would hang after returning the response headers.

We eventually narrowed it down to the change in container MTUs from garden-runc-release 1.3.0 (af8d612 and cloudfoundry/garden-runc-release@a9b22fd). We deploy to AWS, where some instance types get a host interface MTU of 9001, so with that release our containers' MTUs changed from 1500 to 9001. This meant that they generated response packets that were too large after IPsec transport headers had been added.

Now, we understand that we're deviating a bit from the norm by running IPsec, and that it's not Guardian's responsibility to know about it. We can also work around it by setting garden.network_mtu = 1500, but it feels fragile because we have to hardcode the value irrespective of what the host's actual MTU is.

However, it feels like this should Just Work with path MTU discovery. I'm not clear why it doesn't - perhaps you can shed some light? There was a similar discussion in #51 previously.

Logging and/or test output

tcpdump on a cell shows ICMP unreachable packets being generated from the cell to itself. I'm unsure if anything is acting upon these though:

  14:11:22.727890 IP (tos 0xc0, ttl 64, id 19465, offset 0, flags [none], proto ICMP (1), length 576)
      cell-0.node.dc1.cf.internal > cell-0.node.dc1.cf.internal: ICMP router-0.node.dc1.cf.internal unreachable - need to frag (mtu 9001), length 556
          IP (tos 0x0, ttl 63, id 1347, offset 0, flags [DF], proto ESP (50), length 9048)
      cell-0.node.dc1.cf.internal > router-0.node.dc1.cf.internal: ESP(spi=REDACTED,seq=0x170), length 9028
  14:11:24.225646 IP (tos 0xc0, ttl 64, id 19581, offset 0, flags [none], proto ICMP (1), length 576)
      cell-0.node.dc1.cf.internal > cell-0.node.dc1.cf.internal: ICMP router-0.node.dc1.cf.internal unreachable - need to frag (mtu 9001), length 556
          IP (tos 0x0, ttl 63, id 22649, offset 0, flags [DF], proto ESP (50), length 9048)
      cell-0.node.dc1.cf.internal > router-0.node.dc1.cf.internal: ESP(spi=REDACTED,seq=0x174), length 9028
  14:11:24.227976 IP (tos 0xc0, ttl 64, id 19582, offset 0, flags [none], proto ICMP (1), length 576)
      cell-0.node.dc1.cf.internal > cell-0.node.dc1.cf.internal: ICMP router-0.node.dc1.cf.internal unreachable - need to frag (mtu 9001), length 556
          IP (tos 0x0, ttl 63, id 22652, offset 0, flags [DF], proto ESP (50), length 9048)
      cell-0.node.dc1.cf.internal > router-0.node.dc1.cf.internal: ESP(spi=REDACTED,seq=0x177), length 9028

Steps to reproduce

  1. Deploy CF with:
  2. Deploy the dora application
  3. Request a response larger than the MTU size: curl <dora_url>/largetext/20

Please also provide the following information if applicable:

  • Guardian release version: 1.4.0 (with diego-release 1.12.0 and cf-release 256)
  • Linux kernel version: 4.4.0-72-generic (BOSH stemcell 3363.19 bosh-aws-xen-hvm-ubuntu-trusty-go_agent)
  • Concourse version: N/A
  • Go version: N/A, whatever is used by garden-runc-release

Hangs when attaching to processes (possibly already-exited ones)

Description

Had a Concourse deploy that rolled the workers (not stemcell). When they came back, re-attaching to processes would hang. I'm not sure whether these processes had exited while Garden was down.
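
For context, "re-attaching" here means the Garden client Attach call. A minimal sketch of the kind of call that hangs follows; the address, container handle and process ID are placeholders, and it assumes the current string-based process IDs in the garden client API:

package main

import (
    "fmt"
    "os"

    "code.cloudfoundry.org/garden"
    "code.cloudfoundry.org/garden/client"
    "code.cloudfoundry.org/garden/client/connection"
)

func main() {
    gdn := client.New(connection.New("tcp", "127.0.0.1:7777")) // hypothetical address

    container, err := gdn.Lookup("some-container-handle") // hypothetical handle
    if err != nil {
        panic(err)
    }

    // Re-attach to a previously spawned process; this is the call that appears
    // to hang when the process may have exited while Garden was down.
    proc, err := container.Attach("some-process-id", garden.ProcessIO{
        Stdout: os.Stdout,
        Stderr: os.Stderr,
    })
    if err != nil {
        panic(err)
    }

    status, err := proc.Wait()
    if err != nil {
        panic(err)
    }
    fmt.Println("process exited with status", status)
}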

Logging and/or test output

Here's a stack dump: https://gist.githubusercontent.com/vito/508c434965b3d5c16ca337102a9cb37e/raw/cdc39744c647640ee6fca998eaa30b940cd9dc4c/hanging-gardens

Running it through Swirly, the following sticks out:

screenshot from 2016-08-06 19-51-04

These seem to correlate to the hanging builds. They were stuck for >1h before I noticed (the >80 mins ones), and then I tried aborting one of the stuck builds and bounced the ATCs and observed it happen again (the ~10min one).

container leak

Description

For some weeks we have been seeing Concourse containers not being reaped.
Concourse 2.1.0, garden-runc/0.8.0, bosh-openstack-kvm-ubuntu-trusty-go_agent/3263.3
The symptom is insufficient subnets remaining in the pool, similar to concourse/concourse#293.
It takes several days for the leak to reach the state where workers run out of containers (250 containers per worker).
The effect can be seen when comparing fly containers with fly workers: fly workers shows many more containers than fly containers.
When you bosh ssh into the worker and use gaol list, you can see the "zombie" containers.
Trying to gaol shell into a "zombie" container gives an error:

root@7402fe78-8b6d-4c6e-97fb-e809337980ea:~# /tmp/gaol shell fdd75cce-c0ae-46ee-729b-6f9374525ab9
error: hijack: Backend error: Exit status: 500, message: {"Type":"","Message":"unable to find user root: no matching entries in passwd file","Handle":""}

gaol properties of a "dead" container look like this:

garden.grace-time       300000000000
garden.network.host-ip  10.254.0.93
kawasaki.container-interface    wui37gmn4uj7-1
kawasaki.bridge-interface       wbrdg-0afe005c
kawasaki.dns-servers
kawasaki.mtu    1500
concourse:volume-mounts {"2b3475a7-1e37-42c5-7a4b-e49e68074495":"/tmp/build/get"}
garden.network.container-ip     10.254.0.94
garden.network.external-ip      10.1.6.17
garden.state    created
concourse:resource-result       {
  "version": {
    "digest": "sha256:4e9752e8f15bff07872664e7e206678e45dd84bccce02383f9b29a2bd1501864"
  },
  "metadata": [
    {
      "name": "image",
      "value": "sha256:4c07d"
    }
  ]
}

kawasaki.host-interface wui37gmn4uj7-0
kawasaki.iptable-prefix w--

So it looks like all zombie containers are concourse:resource-result containers, i.e. containers which should have done a git clone.

Logging and/or test output

The last log entries of a "dead" container:

{"timestamp":"1475624427.399276257","source":"guardian","message":"guardian.run.exec.finished","log_level":1,"data":{"handle":"a9645f41-a32f-4fa7-58e4-98838a3191d8","id":"a9645f41-a32f-4fa7-58e4-98838a3191d8","path":"/opt/resource/in","session":"6796.2"}}
{"timestamp":"1475624427.399291754","source":"guardian","message":"guardian.run.finished","log_level":1,"data":{"handle":"a9645f41-a32f-4fa7-58e4-98838a3191d8","path":"/opt/resource/in","session":"6796"}}
{"timestamp":"1475624427.399307251","source":"guardian","message":"guardian.api.garden-server.run.spawned","log_level":1,"data":{"handle":"a9645f41-a32f-4fa7-58e4-98838a3191d8","id":"337a97b6-3343-48a4-6d62-37b032f7729b","session":"3.1.52629","spec":{"Path":"/opt/resource/in","Dir":"","User":"root","Limits":{},"TTY":null}}}
{"timestamp":"1475624637.809005260","source":"guardian","message":"guardian.api.garden-server.run.exited","log_level":1,"data":{"handle":"a9645f41-a32f-4fa7-58e4-98838a3191d8","id":"337a97b6-3343-48a4-6d62-37b032f7729b","session":"3.1.52629","status":0}}

In comparison, here is a container which was cleaned up (the same steps are executed, except that in the case of the zombie container the reaper did not kick in):

{"timestamp":"1475653406.035202265","source":"guardian","message":"guardian.run.exec.finished","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","id":"96692456-981b-454e-5180-7d841ca488a8","path":"landscape/documentation/concourse/tasks/smoke/test.sh","session":"9625.2"}}
{"timestamp":"1475653406.035219669","source":"guardian","message":"guardian.run.finished","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","path":"landscape/documentation/concourse/tasks/smoke/test.sh","session":"9625"}}
{"timestamp":"1475653406.035238743","source":"guardian","message":"guardian.api.garden-server.run.spawned","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","id":"e25b4989-2ac0-44cf-6334-f5e893d60068","session":"3.1.58572","spec":{"Path":"landscape/documentation/concourse/tasks/smoke/test.sh","Dir
":"/tmp/build/f541ec31","User":"root","Limits":{},"TTY":{}}}}
{"timestamp":"1475653432.606876373","source":"guardian","message":"guardian.api.garden-server.run.exited","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","id":"e25b4989-2ac0-44cf-6334-f5e893d60068","session":"3.1.58572","status":0}}
{"timestamp":"1475653929.393552065","source":"guardian","message":"guardian.api.garden-server.reaping","log_level":1,"data":{"grace-time":"5m0s","handle":"96692456-981b-454e-5180-7d841ca488a8","session":"3.1"}}
{"timestamp":"1475653929.393662930","source":"guardian","message":"guardian.destroy.start","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9724"}}
{"timestamp":"1475653929.394035816","source":"guardian","message":"guardian.destroy.started","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9725"}}
{"timestamp":"1475653929.394057512","source":"guardian","message":"guardian.destroy.state.started","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9725.1"}}
{"timestamp":"1475653929.404460430","source":"guardian","message":"guardian.destroy.state.finished","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9725.1"}}
{"timestamp":"1475653929.404501915","source":"guardian","message":"guardian.destroy.state","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9725","state":{"Pid":28253,"Status":"created"}}}
{"timestamp":"1475653929.404526949","source":"guardian","message":"guardian.destroy.delete.started","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9725.2"}}
{"timestamp":"1475653929.515482426","source":"guardian","message":"guardian.destroy.delete.finished","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9725.2"}}
{"timestamp":"1475653929.515547752","source":"guardian","message":"guardian.destroy.destroy.started","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9725.3"}}
{"timestamp":"1475653929.517130852","source":"guardian","message":"guardian.destroy.destroy.finished","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9725.3"}}
{"timestamp":"1475653929.517160177","source":"guardian","message":"guardian.destroy.finished","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9725"}}
{"timestamp":"1475653929.535522699","source":"guardian","message":"guardian.create.containerizer-create.watch.done","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9623.1.4"}}
{"timestamp":"1475653929.726925850","source":"guardian","message":"guardian.volume-plugin.destroying.layer-already-deleted-skipping","log_level":1,"data":{"error":"could not find image: no such id: 76051528cd636f7704dea82da6cdd438a8e12de7cf99008ae715724c265cf0d2","graphID":"96692456-981b-454e-5180-7d841ca488a8","handle":"96692456-981b-454e-5180-7d841ca488a8","id":"96692456-981b-454e-5180-7d841ca488a8","session":"9728"}}
{"timestamp":"1475653929.727013826","source":"guardian","message":"guardian.destroy.finished","log_level":1,"data":{"handle":"96692456-981b-454e-5180-7d841ca488a8","session":"9724"}}

Looks like the question is why the guardian destroy
https://github.com/cloudfoundry/garden/blob/c7ed40f0b983c8d082dcdfc3dcd5adfa1020195f/server/request_handling.go#L128
does not kick in for the containers that do not run any processes anymore.

Steps to reproduce

The problem is hard to reproduce and the leak is slow (~2 containers per hour not being reaped)

  • Guardian release version garden-runc/0.8.0
  • Linux kernel version bosh-openstack-kvm-ubuntu-trusty-go_agent/3263.3
  • Concourse version 2.1.0
  • Go version 1.6.1

run ginkgo test error

Hi guys, I tried to run the ginkgo tests, but they just failed, and the error log is:

Running Suite: Gqt Suite
========================
Random Seed: 1445845654
Will run 1 of 1 specs

[d][guardian] spawned /tmp/gexec_artifacts826867777/g190076076/guardian (pid: 25141)
[o][guardian] {"timestamp":"1445845659.613978386","source":"guardian","message":"guardian.runner.command.starting","log_level":0,"data":{"argv":["mountpoint","-q","/tmp/cgroups-1"],"session":"2.1"}}
[o][guardian] {"timestamp":"1445845659.615394592","source":"guardian","message":"guardian.runner.command.failed","log_level":2,"data":{"argv":["mountpoint","-q","/tmp/cgroups-1"],"error":"exit status 1","exit-status":1,"session":"2.1","stderr":"","stdout":"","took":"1.406194ms"}}
[o][guardian] {"timestamp":"1445845659.615459681","source":"guardian","message":"guardian.runner.command.starting","log_level":0,"data":{"argv":["mount","-t","tmpfs","-o","uid=0,gid=0,mode=0755","cgroup","/tmp/cgroups-1"],"session":"2.2"}}
[o][guardian] {"timestamp":"1445845659.621389866","source":"guardian","message":"guardian.runner.command.succeeded","log_level":0,"data":{"argv":["mount","-t","tmpfs","-o","uid=0,gid=0,mode=0755","cgroup","/tmp/cgroups-1"],"exit-status":0,"session":"2.2","took":"5.922984ms"}}
[o][guardian] {"timestamp":"1445845659.621483088","source":"guardian","message":"guardian.setup-cgroup.started","log_level":1,"data":{"path":"/tmp/cgroups-1/cpu","session":"7","type":"cpu"}}
[o][guardian] {"timestamp":"1445845659.621526241","source":"guardian","message":"guardian.runner.command.starting","log_level":0,"data":{"argv":["mountpoint","-q","/tmp/cgroups-1/cpu"],"session":"2.3"}}
[o][guardian] {"timestamp":"1445845659.623045683","source":"guardian","message":"guardian.runner.command.failed","log_level":2,"data":{"argv":["mountpoint","-q","/tmp/cgroups-1/cpu"],"error":"exit status 1","exit-status":1,"session":"2.3","stderr":"","stdout":"","took":"1.508992ms"}}
[o][guardian] {"timestamp":"1445845659.623255491","source":"guardian","message":"guardian.runner.command.starting","log_level":0,"data":{"argv":["mount","-n","-t","cgroup","-o","cpu","cgroup","/tmp/cgroups-1/cpu"],"session":"2.4"}}
[o][guardian] {"timestamp":"1445845659.624535322","source":"guardian","message":"guardian.setup-cgroup.mount-cgroup-cmd.received-data","log_level":1,"data":{"data":"mount: cgroup is already mounted or /tmp/cgroups-1/cpu busy\n","path":"/tmp/cgroups-1/cpu","session":"7.1","type":"cpu"}}
[o][guardian] {"timestamp":"1445845659.624782324","source":"guardian","message":"guardian.setup-cgroup.mount-cgroup-cmd.received-data","log_level":1,"data":{"data":"       cgroup is already mounted on /sys/fs/cgroup/systemd\n       cgroup is already mounted on /sys/fs/cgroup/cpuset\n","path":"/tmp/cgroups-1/cpu","session":"7.1","type":"cpu"}}
[o][guardian] {"timestamp":"1445845659.624815464","source":"guardian","message":"guardian.setup-cgroup.mount-cgroup-cmd.received-data","log_level":1,"data":{"data":"       cgroup is already mounted on /sys/fs/cgroup/cpu,cpuacct\n       cgroup is already mounted on /sys/fs/cgroup/memory\n       cgroup is already mounted on /sys/fs/cgroup/devices\n       cgroup is already mounted on /sys/fs/cgroup/freezer\n       cgroup is already mounted on /sys/fs/cgroup/net_cls\n       cgroup is already mounted on /sys/fs/cgroup/blkio\n       cgroup is already mounted on /sys/fs/cgroup/perf_event\n       cgroup is already mounted on /sys/fs/cgroup/hugetlb\n       cgroup is already mounted on /tmp/cgroups-1\n","path":"/tmp/cgroups-1/cpu","session":"7.1","type":"cpu"}}
[o][guardian] {"timestamp":"1445845659.624980450","source":"guardian","message":"guardian.runner.command.failed","log_level":2,"data":{"argv":["mount","-n","-t","cgroup","-o","cpu","cgroup","/tmp/cgroups-1/cpu"],"error":"exit status 32","exit-status":32,"session":"2.4","stderr":"mount: cgroup is already mounted or /tmp/cgroups-1/cpu busy\n       cgroup is already mounted on /sys/fs/cgroup/systemd\n       cgroup is already mounted on /sys/fs/cgroup/cpuset\n       cgroup is already mounted on /sys/fs/cgroup/cpu,cpuacct\n       cgroup is already mounted on /sys/fs/cgroup/memory\n       cgroup is already mounted on /sys/fs/cgroup/devices\n       cgroup is already mounted on /sys/fs/cgroup/freezer\n       cgroup is already mounted on /sys/fs/cgroup/net_cls\n       cgroup is already mounted on /sys/fs/cgroup/blkio\n       cgroup is already mounted on /sys/fs/cgroup/perf_event\n       cgroup is already mounted on /sys/fs/cgroup/hugetlb\n       cgroup is already mounted on /tmp/cgroups-1\n","stdout":"","took":"1.713413ms"}}
[o][guardian] {"timestamp":"1445845659.625034809","source":"guardian","message":"guardian.setup-cgroup.mount-cgroup-failed","log_level":2,"data":{"error":"exit status 32","path":"/tmp/cgroups-1/cpu","session":"7","type":"cpu"}}
[o][guardian] {"timestamp":"1445845659.625054836","source":"guardian","message":"guardian.setup-cgroup.finished","log_level":1,"data":{"path":"/tmp/cgroups-1/cpu","session":"7","type":"cpu"}}
[o][guardian] {"timestamp":"1445845659.625111818","source":"guardian","message":"guardian.failed-to-start-server","log_level":3,"data":{"error":"exit status 32","trace":"goroutine 1 [running]:\ngithub.com/pivotal-golang/lager.(*logger).Fatal(0xc8200f60c0, 0x940ef0, 0x16, 0x7f716ab35f08, 0xc8200fc3e0, 0x0, 0x0, 0x0)\n\t/home/tony/Documents/go/src/github.com/pivotal-golang/lager/logger.go:131 +0xc5\nmain.main()\n\t/home/tony/Documents/go/src/github.com/cloudfoundry-incubator/guardian/cmd/guardian/main.go:228 +0x8c2\n"}}
[e][guardian] panic: exit status 32
[e][guardian] 
[e][guardian] goroutine 1 [running]:
[e][guardian] github.com/pivotal-golang/lager.(*logger).Fatal(0xc8200f60c0, 0x940ef0, 0x16, 0x7f716ab35f08, 0xc8200fc3e0, 0x0, 0x0, 0x0)
[e][guardian]   /home/tony/Documents/go/src/github.com/pivotal-golang/lager/logger.go:152 +0x698
[e][guardian] main.main()
[e][guardian]   /home/tony/Documents/go/src/github.com/cloudfoundry-incubator/guardian/cmd/guardian/main.go:228 +0x8c2
[e][guardian] 
[e][guardian] goroutine 17 [syscall, locked to thread]:
[e][guardian] runtime.goexit()
[e][guardian]   /usr/local/go/src/runtime/asm_amd64.s:1696 +0x1
[e][guardian] 
[e][guardian] goroutine 5 [syscall]:
[e][guardian] os/signal.loop()
[e][guardian]   /usr/local/go/src/os/signal/signal_unix.go:22 +0x18
[e][guardian] created by os/signal.init.1
[e][guardian]   /usr/local/go/src/os/signal/signal_unix.go:28 +0x37
β€’ Failure in Spec Setup (BeforeEach) [3.639 seconds]
Net
/home/tony/Documents/go/src/github.com/cloudfoundry-incubator/guardian/gqt/net_test.go:120
  should have a (dynamically assigned) IP address [BeforeEach]
  /home/tony/Documents/go/src/github.com/cloudfoundry-incubator/guardian/gqt/net_test.go:63

  Expected error:
      <*url.Error | 0xc820212b10>: {
          Op: "Post",
          URL: "http://api/containers",
          Err: {
              Op: "dial",
              Net: "unix",
              Source: nil,
              Addr: {
                  Name: "/tmp/garden_1.sock",
                  Net: "unix",
              },
              Err: {Syscall: "connect", Err: 0x2},
          },
      }
      Post http://api/containers: dial unix /tmp/garden_1.sock: connect: no such file or directory
  not to have occurred

  /home/tony/Documents/go/src/github.com/cloudfoundry-incubator/guardian/gqt/net_test.go:29
------------------------------


Summarizing 1 Failure:

[Fail] Net [BeforeEach] should have a (dynamically assigned) IP address 
/home/tony/Documents/go/src/github.com/cloudfoundry-incubator/guardian/gqt/net_test.go:29

Ran 1 of 1 Specs in 3.639 seconds
FAIL! -- 0 Passed | 1 Failed | 0 Pending | 0 Skipped --- FAIL: TestGqt (3.64s)
FAIL

Ginkgo ran 1 suite in 5.036839752s
Test Suite Failed

Even though I run these tests from scratch, I get this error; I don't know why cgroup is already mounted.
If I run this command: cat /proc/mounts | grep cgroup, I get the following output:

tmpfs /sys/fs/cgroup tmpfs rw,seclabel,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /tmp/cgroups-1 tmpfs rw,seclabel,relatime,mode=755 0 0

I did try to run this command: umount /tmp/cgroup-1, but it didn't help. I am wondering how I can run this test case correctly.

Provide a way to share hosts certificates in containers

Concourse users would like a way to share their hosts' common CA certificates inside of containers. Currently, even though they install them on the VM or configure BOSH to install certs for them, they have to include the common CA certificates in every image their containers run.

We think that sharing certificates from the host might be a feature provided by Garden. In the same way Garden manages /etc/hosts inside of containers, it could also manage /etc/ssl/certs. And since it is a common problem for both Concourse and Diego, it would be nice if that feature was baked into the backend.
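
Until something like that is baked into the backend, one client-side workaround (just a sketch, assuming the host's certificates live under /etc/ssl/certs and that the caller creates containers through the Garden API directly) is to bind-mount the host directory read-only at container creation time:

package main

import (
    "fmt"

    "code.cloudfoundry.org/garden"
    "code.cloudfoundry.org/garden/client"
    "code.cloudfoundry.org/garden/client/connection"
)

func main() {
    gdn := client.New(connection.New("tcp", "127.0.0.1:7777")) // hypothetical address

    // Share the host's CA bundle into the container, read-only.
    container, err := gdn.Create(garden.ContainerSpec{
        BindMounts: []garden.BindMount{{
            SrcPath: "/etc/ssl/certs", // path on the host (assumption)
            DstPath: "/etc/ssl/certs", // path inside the container
            Mode:    garden.BindMountModeRO,
            Origin:  garden.BindMountOriginHost,
        }},
    })
    if err != nil {
        panic(err)
    }
    fmt.Println("created container", container.Handle())
}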

Related Concourse issue: concourse/concourse#1027

Use of tcpdump in a buildpack container

We'd like to offer our cf ssh users the possibility to use tools like tcpdump in their (buildpack) containers.

This currently seems to be impossible: the tcpdump binary is (intentionally) missing in the rootfs - but after copying it to the container, the vcap user lacks the necessary privileges:

$ ./tcpdump -s 0 -w pcap.out -i wip0f8df9rl11-1 tcp port 8080
tcpdump: wip0f8df9rl11-1: You don't have permission to capture on that device
(socket: Operation not permitted)

I found this thread on the mailing list which explains that the required privilege (i.e. CAP_NET_ADMIN) is intentionally dropped for security reasons. However, the thread is quite old and refers to garden-linux.

How does the situation look with guardian/runc? Is there a possibility of offering the required privileges to end users?

When looking at the source code, it looks like root in the container obtains the privilege; the CF doc states otherwise:

CF drops the following Linux capabilities for all container processes. Every dropped capability limits what actions the root user can perform.
...
CAP_NET_ADMIN

(But this anyway only applies to the root user and would not help our vcap users much.)
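
As a quick diagnostic, one way to see which capabilities a container process actually holds is to read the Cap* masks from /proc/self/status; a small sketch (not Guardian code, just an illustration) run inside the container:

package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
)

func main() {
    // Print the capability bitmasks of the current process.
    // CAP_NET_ADMIN is bit 12 and CAP_NET_RAW is bit 13; if those bits
    // are zero in CapEff, tcpdump cannot capture on an interface.
    f, err := os.Open("/proc/self/status")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    scanner := bufio.NewScanner(f)
    for scanner.Scan() {
        line := scanner.Text()
        if strings.HasPrefix(line, "Cap") {
            fmt.Println(line) // CapInh, CapPrm, CapEff, CapBnd, ...
        }
    }
}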

wbrdg interfaces being removed from nodes

Todos

Please try the following before submitting the issue:

  • Upgrade Concourse if you are observing a failure in a test
  • Use latest BOSH stemcell if the problem occurred in a BOSH VM
  • Use Go 1.6 if the issue has to do with running unit tests locally

Description

Something appears to be removing arbitrary wbrdg- devices associated with containers created via diego + garden-runc. Some time after an app is pushed and becomes accessible, it turns inaccessible with 502 gateway timeouts. The container is still running and healthy, but it has no network access. Digging deeper shows that the wbrdg- device associated with the container is no longer present on the host VM.

Logging and/or test output

dmesg output:

[1106633.564037] wbrdg-0afe0040: port 1(w9d6409ch3e1-0) entered forwarding state
[1106759.581682] device w9d6409ch3e1-0 left promiscuous mode
[1106759.581711] wbrdg-0afe0040: port 1(w9d6409ch3e1-0) entered disabled state

https://gist.github.com/geofffranks/9e621a9c559d2d49dfcf7c41bb8f76ad has veritas + ip addr output on the cell node

Based on the ip addr output in the gist, it looks like something removed the wbrdg-0afe00f0 interface from the cell node.

Steps to reproduce

This is occurring on a BOSH-Lite VM running in Azure (rather than virtualbox)
cf-249
diego 1.2.0
garden 1.0.4
BOSH director version is 1.3262.9.0 (257.9)
warden-cpi version 29
bosh-warden-ubuntu stemcell v3312.17 (i'm in the process of trying this on v3363.1)

Essentially the steps to reproduce are to push a few apps, wait a period of time, and wait for one of them to become unresponsive. It's possible that this happened in conjunction with stopping/deleting other apps, as I most recently noticed this after having deleted a bunch of apps related to cf acceptance tests.

docker 1.9 'dind' support

Description

I have some systems still requiring images in the docker 1.9 image format (there was a breaking change between 1.9 and 1.10). So I modified the docker-image-resource repo to use docker 1.9 instead of docker 1.12. Things work ok until you try to do a docker build, and then the failure below occurs.

Logging and/or test output

I get the following error when using docker 1.9:

[8] System error: write /sys/fs/cgroup/devices/0c1e319c-49fe-4383-7e87-fd53d98c8890/docker/72d46daae1df3a409aae40009b4849273fd1e9a91971a737404f1d4122f8ce4c/devices.allow: operation not permitted

Steps to reproduce

Using concourse-ci 2.0.2, run up a dind container which contains docker 1.9 (example here: https://github.com/SillyMoo/docker-image-resource/tree/1.9). Calls to docker build will fail with cgroup errors not present with docker 1.12.

iptables setup issue

If I run the iptables setup script on CentOS 7, I get the error iptables: command not found; the root cause is that the PATH env is wrong. Should we add a PATH export at the beginning of the setup script?

Guardian fails to StreamIn when given StreamInSpec.Path = "/"

Todos

Please try the following before submitting the issue:

  • Upgrade Concourse if you are observing a failure in a test
    Not seeing this in Concourse. Seeing it when executing a CLI by hand from our local workstation.
  • Use latest BOSH stemcell if the problem occurred in a BOSH VM
    We are running Garden inside BOSH lite. It is running the latest version.
  • Use Go 1.6 if the issue has to do with running unit tests locally
    Not an issue with tests.

Description

We are using Garden as a container runtime for executing Cloud Native Application Bundles. Part of the specification for these bundles is that parameters may be passed to the invocation container as files. The location of the files is specified by the bundle author and may include files at the root path, like /somefile.json.

We made an attempt to use the Garden container.StreamIn API to copy these parameter files into the container. Unfortunately, when we provide a path of / with our tarstream, we get the following error:

mkdir_p_as 0 0: No such file or directory

Our best guess is that this is caused by this section of code in nstar: https://github.com/cloudfoundry/guardian/blob/master/rundmc/nstar/nstar.c#L58-L59.

Steps to reproduce

Outline the steps to test or reproduce the PR here. Please also provide the
following information if applicable:

  • Garden RunC Release Version: 1.18.3
  • BOSH Warden Stemcell: Xenial v250.17
  • Concourse version: NA
  • Go version: 1.11.4

Use the Garden client API container.StreamIn feature, passing any simple tar as TarStream, and / as Path. You will see the error shown above returned to the client and printed in the server logs.
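
For reference, a minimal sketch of the failing call (the address, container handle and tar contents are placeholders; this assumes the current Garden client API):

package main

import (
    "archive/tar"
    "bytes"

    "code.cloudfoundry.org/garden"
    "code.cloudfoundry.org/garden/client"
    "code.cloudfoundry.org/garden/client/connection"
)

func main() {
    gdn := client.New(connection.New("tcp", "127.0.0.1:7777")) // hypothetical address

    container, err := gdn.Lookup("some-container-handle")
    if err != nil {
        panic(err)
    }

    // Build a trivial tar stream containing a single file.
    // Error handling for the tar writer is elided for brevity.
    buf := new(bytes.Buffer)
    tw := tar.NewWriter(buf)
    contents := []byte(`{"example": true}`)
    tw.WriteHeader(&tar.Header{Name: "somefile.json", Mode: 0644, Size: int64(len(contents))})
    tw.Write(contents)
    tw.Close()

    // Streaming into "/" fails with "mkdir_p_as 0 0: No such file or directory".
    err = container.StreamIn(garden.StreamInSpec{
        Path:      "/",
        User:      "root",
        TarStream: buf,
    })
    if err != nil {
        panic(err)
    }
}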

/cc @youreddy

GPU instances

Is it possible to map (nvidia or AMD) GPUs into the container using guardian/garden? If so, where can I find more information?

Configuring container's DNS search domains

I've been using Concourse and I've run into an issue with its containers' DNS config: I need to configure a search domain in order to be able to use just the hostname and not the FQDN of certain hosts.

Currently I can configure the DNS servers via CONCOURSE_GARDEN_DNS_SERVER; however, there seems to be no way to configure DNS search domains. Is there a way one can configure the search domains of containers?

I'm running Concourse with Docker, and tried using the dns-search argument when running the worker container; however, that seems to have no effect on the containers the worker creates.

Potentially useful info:

  • Linux 4.8.0-46-generic
  • Docker 1.13.0
  • Concourse 2.7.3

Feasibility of net_cls cgroups and classids for Processes

Background

As part of a larger effort around using sidecar proxies inside of containers, we are exploring various mechanisms for redirecting app traffic to a sidecar Envoy proxy.

In one scenario we're considering, we'd avoid acting on packets originating from the Envoy itself, while simultaneously modifying all other packets in the container. Since both the Envoy and the app are in the same container (hence same network namespace), we'd need some way to distinguish between the originating processes.

One approach that we've explored was to use the net_cls cgroup. This seems to work for us.

The requirements are:

  1. Different processes in the container belong to different net_cls cgroups
  2. Those cgroups have different values in their net_cls.classid file

With those pieces in place, we (as the networking plugin) could install iptables rules which match or don't match on the different classid values.

The next step in exploring feasibility for Cloud Foundry would be to understand how we could do this in Garden.

Imaginary implementation

So, just to pull a possible implementation out of thin air, perhaps the ProcessSpec gets a new field NetClassID. If that is nonzero, then guardian might (see the sketch after this list):

  • create a sub-cgroup of the container's cgroup, perhaps named using that classid
  • write the value of the NetClassID to the net_cls.classid file in the cgroup
  • move the process into the cgroup
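
A minimal sketch of what those three steps could look like on the cgroup-v1 filesystem (the paths, the sub-cgroup naming and the NetClassID field are all assumptions; this is not Guardian code):

package main

import (
    "fmt"
    "io/ioutil"
    "os"
    "path/filepath"
    "strconv"
)

// placeProcessInNetClsCgroup sketches the three steps above: create a
// sub-cgroup under the container's net_cls cgroup, write the class id,
// and move the given pid into it.
func placeProcessInNetClsCgroup(containerCgroup string, classID uint32, pid int) error {
    sub := filepath.Join(containerCgroup, fmt.Sprintf("classid-%d", classID))
    if err := os.MkdirAll(sub, 0755); err != nil {
        return err
    }
    classidValue := []byte(strconv.FormatUint(uint64(classID), 10))
    if err := ioutil.WriteFile(filepath.Join(sub, "net_cls.classid"), classidValue, 0644); err != nil {
        return err
    }
    // Moving a process is done by writing its pid to cgroup.procs.
    return ioutil.WriteFile(filepath.Join(sub, "cgroup.procs"), []byte(strconv.Itoa(pid)), 0644)
}

func main() {
    // Hypothetical container cgroup path, classid and pid.
    err := placeProcessInNetClsCgroup("/sys/fs/cgroup/net_cls/garden/some-container-handle", 0x00010001, os.Getpid())
    if err != nil {
        panic(err)
    }
}

A networking plugin could then distinguish the traffic with something like iptables -m cgroup --cgroup 0x10001 (assuming the xt_cgroup match is available on the kernel in use).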

The Questions

We may not need to do this at all, so we don't want to get too deep right now. But in order to help us weigh possible options, a superficial smell-test would be helpful:

Does this sound feasible in Garden & Guardian?

cc @christianang

Concourse 2.7.3 mounts /etc/hosts as UID 65534

Bug Report

/etc/hosts, for all its faults, is a great way to have non-global DNS for smoke tests. Between Concourse 2.7.0 and 2.7.3, it looks like a change was made that bind mounts it into containers as UID 65534, AKA nobody, AKA nfsnobody, depending on who you ask. This breaks the ability of Concourse tasks to use /etc/hosts as a quick and easy split-horizon DNS.

sysctl net.ipv4.tcp_keepalive_time setting needed

Description

Azure Load Balancer does not issue a reset when dropping a connection. The Linux tcp_keepalive_time needs to be below the Azure values (4 mins or 240 seconds) to avoid issues observed with applications running in Garden on a Diego cell. If the Azure LB disconnects, applications that are connected for longer than 4 minutes will eventually time out.

BOSH stemcells were updated to address this, but the values are not propagated to the Garden container. Stemcell settings added include:

net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 8
net.ipv4.tcp_keepalive_time = 120
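
Since those sysctls are not visible from inside the container, one per-application mitigation (just a sketch using the Go standard library, not something suggested in the original report) is to enable TCP keepalives explicitly on outbound connections, with a period below the Azure idle timeout:

package main

import (
    "fmt"
    "net"
    "net/http"
    "time"
)

func main() {
    // Send TCP keepalive probes every 2 minutes, well below the ~4 minute
    // Azure LB idle timeout, so reused connections stay alive.
    dialer := &net.Dialer{
        Timeout:   30 * time.Second,
        KeepAlive: 2 * time.Minute,
    }
    httpClient := &http.Client{
        Transport: &http.Transport{DialContext: dialer.DialContext},
    }

    resp, err := httpClient.Get("https://api.example.com/v2/info") // hypothetical endpoint
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    fmt.Println(resp.Status)
}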

Logging and/or test output

$ /sbin/sysctl net.ipv4.tcp_keepalive_time
sysctl: cannot stat /proc/sys/net/ipv4/tcp_keepalive_time: No such file or directory

These are the net.ipv4.tcp settings we see in the Garden container:
$ /sbin/sysctl -a | grep "net.ipv4.tcp"
net.ipv4.tcp_base_mss = 1024
net.ipv4.tcp_ecn = 2
net.ipv4.tcp_ecn_fallback = 1
net.ipv4.tcp_fwmark_accept = 0
net.ipv4.tcp_mtu_probing = 0
net.ipv4.tcp_probe_interval = 600
net.ipv4.tcp_probe_threshold = 8

Steps to reproduce

Our test environment is running Pivotal Cloud Foundry 1.9.4 on Azure. The 1.9.x release of PCF includes:
garden-runc 1.1.1

https://github.com/scottfrederick/http_client is a Go app that hits the Cloud Controller on demand, with connection reuse. After a period of inactivity a request hangs for 15 mins before timing out.

no README yet, but

  1. git clone
  2. modify the target URL https://github.com/scottfrederick/http_client/blob/master/http_client.go#L16
  3. go build
  4. cf push http_client -b https://github.com/cloudfoundry/go-buildpack

once running, you can just curl the app’s route and it will echo the /info endpoint of the configured env.

it needs master of go-buildpack to get golang 1.8 support

w--forward setup during container creation fails with "iptables: Index of insertion too big"

This is a copy of concourse/concourse#1157, where @vito directed me here. In summary, it seems that Guardian expects the iptables chain w--forward to be in a state that, in some cases, it is not in. In this particular case it happens when I start a git-resource container to check a remote git repository in Concourse. The details are as follows.


I run

  • the official concourse standalone binary v2.7.7 under systemd
  • the worker and server on the same node
  • under Debian Stretch

The pipeline is as follows

resource_types:
    - name: bitbucket-notify
      type: docker-image
      source:
          repository: ecometrica/concourse-resource-bitbucket


resources:
    - name: authserver
      type: git
      source:
          uri: [email protected]:jdelic/authserver.git
          private_key: {{bitbucket_access_key}}

    - name: bitbucket-notify
      type: bitbucket-notify
      source:
          client_id: {{bitbucket_client_id}}
          secret: {{bitbucket_client_secret}}
          repo: jdelic/authserver


jobs:
    - name: mypy
      plan:
          - get: authserver
            trigger: true
          - put: bitbucket-notify
            params:
                state: PENDING
                commit: authserver
          - task: unit
            file: .ci/concourse/mypy.yaml
            on_success:
                put: bitbucket-notify
                params:
                    state: SUCCESSFUL
                    repo: authserver
            on_success:
                put: bitbucket-notify
                params:
                    state: SUCCESSFUL
                    repo: authserver

The error happens reliably on a Debian Stretch Vagrant VM based off my salt config. The pipeline is registered as authserver with a bitbucket deployment SSH private key in {{bitbucket_access_key}}. When trying to run this, I get:

$ fly -t test set-pipeline -p authserver -c .ci\concourse\pipeline.yaml  -l creds.yaml
... 
$ fly -t test cr -r authserver/authserver
error: Unexpected Response
Status: 500 Internal Server Error
Body:

The server logs show the following:

May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.292508841","source":"guardian","message":"guardian.create.start","log_level":1,"data":{"handle":"","session":"112"}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.293031454","source":"guardian","message":"guardian.create.containerizer-create.start","log_level":1,"data":{"handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","session": "112.2"}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.306846619","source":"guardian","message":"guardian.create.containerizer-create.depot-create.started","log_level":1,"data":{"handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","session":"112.2.1"}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.307445288","source":"guardian","message":"guardian.create.containerizer-create.depot-create.finished","log_level":1,"data":{"handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","session":"112.2.1"}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.307856083","source":"guardian","message":"guardian.create.containerizer-create.create.creating","log_level":1,"data":{"bundle":"/srv/concourse-worker/depot/a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","bundlePath":"/srv/concourse-worker/depot/a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","id":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","logPath":"/srv/concourse-worker/depot/a93d3942-cfe8-419c-4ccf-aa8355e1eb6e/create.log","pidFilePath":"/srv/concourse-worker/depot/a93d3942-cfe8-419c-4ccf-aa8355e1eb6e/pidfile","runc":"/srv/concourse-worker/2.7.7/linux/bin/runc","session":"112.2.3"}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.327848196","source":"guardian","message":"guardian.create.containerizer-create.create.finished","log_level":1,"data":{"bundle":"/srv/concourse-worker/depot/a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","session":"112.2.3"}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.328340292","source":"guardian","message":"guardian.create.containerizer-create.finished","log_level":1,"data":{"handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","session":"112.2"}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.333292961","source":"guardian","message":"guardian.create.containerizer-create.watch.watching","log_level":1,"data":{"handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","session":"112.2.4"}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.337182283","source":"guardian","message":"guardian.create.network.started","log_level":1,"data":{"handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","session":"112.5","spec":""}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.337663889","source":"guardian","message":"guardian.create.network.config-create","log_level":1,"data":{"config":{"ContainerHandle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","HostIntf":"wfjkreuloj13-0","ContainerIntf":"wfjkreuloj13-1","IPTablePrefix":"w--","IPTableInstance":"fjkreuloj13","BridgeName":"wbrdg-0afe0000","BridgeIP":"10.254.0.1","ContainerIP":"10.254.0.2","ExternalIP":"10.0.2.15","Subnet":{"IP":"10.254.0.0","Mask":"/////A=="},"Mtu":1500,"PluginNameservers":null,"OperatorNameservers":[],"AdditionalNameservers":[]},"handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","session":"112.5","spec":""}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.382477999","source":"guardian","message":"guardian.iptables-runner.command.failed","log_level":2,"data":{"argv":["/srv/concourse-worker/2.7.7/linux/iptables/sbin/iptables","--wait","-I","w--forward","2","--in-interface","wbrdg-0afe0000","--source","10.254.0.2","--goto","w--instance-fjkreuloj13","-m","comment","--comment","a93d3942-cfe8-419c-4ccf-aa8355e1eb6e"],"error":"exit status 1","exit-status":1,"session":"1.547","stderr":"iptables: Index of insertion too big.\n","stdout":"","took":"1.348488ms"}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.382990360","source":"guardian","message":"guardian.create.network.finished","log_level":1,"data":{"handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","session":"112.5","spec":""}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.383230925","source":"guardian","message":"guardian.create.create-failed-cleaningup.start","log_level":1,"data":{"cause":"iptables: create-instance-chains: iptables: Index of insertion too big.\n","handle":"","session":"112.6"}}
May 14 13:48:16 test concourse_linux_amd64[842]: {"timestamp":"1494762496.383437395","source":"guardian","message":"guardian.create.create-failed-cleaningup.destroy.started","log_level":1,"data":{"cause":"iptables: create-instance-chains: iptables: Index of insertion too big.\n","handle":"a93d3942-cfe8-419c-4ccf-aa8355e1eb6e","session":"112.6.1"}}
...

Since I assume that that's an important part of debugging it, here is a list of the current iptables chains and the number of rules in each chain. I'm going to paste all of the FORWARD rules, since some of those have been set up by concourse. Garden is configured with --garden-network-pool 10.254.0.0/22:

Chain INPUT (policy DROP 0 packets, 0 bytes)
38 rules

Chain FORWARD (policy DROP 0 packets, 0 bytes)
    0     0 DOCKER-ISOLATION  all  --  *      *       0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0  
    0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0 
    0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
    0     0 ACCEPT     tcp  --  *      *       10.254.0.0/22        0.0.0.0/0            state NEW
    0     0 ACCEPT     udp  --  *      *       10.254.0.0/22        0.0.0.0/0            state NEW

Chain OUTPUT (policy DROP 0 packets, 0 bytes)
34 rules

Chain DOCKER (1 references)
0 rules

Chain DOCKER-ISOLATION (1 references)
1 rule

Chain w--default (0 references)
 pkts bytes target     prot opt in     out     source               destination 
    0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED

Chain w--forward (0 references)
0 rules

Chain w--input (0 references)
0 rules

It seems the process fails when inserting the rule into w--forward: iptables -I w--forward 2 can only succeed if the chain already contains at least one rule, and w--forward is currently empty (0 rules). Any ideas? What state should that chain be in?

{
   "timestamp" : "1494762496.382477999",
   "log_level" : 2,
   "message" : "guardian.iptables-runner.command.failed",
   "data" : {
      "session" : "1.547",
      "took" : "1.348488ms",
      "error" : "exit status 1",
      "argv" : [
         "/srv/concourse-worker/2.7.7/linux/iptables/sbin/iptables",
         "--wait",
         "-I",
         "w--forward",
         "2",
         "--in-interface",
         "wbrdg-0afe0000",
         "--source",
         "10.254.0.2",
         "--goto",
         "w--instance-fjkreuloj13",
         "-m",
         "comment",
         "--comment",
         "a93d3942-cfe8-419c-4ccf-aa8355e1eb6e"
      ],
      "exit-status" : 1,
      "stdout" : "",
      "stderr" : "iptables: Index of insertion too big.\n"
   },
   "source" : "guardian"
}

When I add a dummy rule to w--forward the resource seems to come up and execute correctly:

# insert a "never triggered" rule, so concourse can run `iptables -I w--forward 2`
$ iptables -I w--forward 1 -s 169.254.254.1 -d 169.254.254.1 -j ACCEPT
$ fly -t test cr -r authserver/authserver
error: check failed with exit status '128':
Identity added: /tmp/git-resource-private-key (/tmp/git-resource-private-key)
Cloning into '/tmp/git-resource-repo-cache'...
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

I have restarted concourse multiple times and flushed all iptables chains in between, and it never seems to set up w--forward with a single, first rule. This way iptables -I w--forward 2 will always fail on my box.

'volume graph is disabled' when creating with raw:// scheme

Description

Concourse uses BaggageClaim for all of our rootfs needs, and provides containers with a rootfs directly via raw:///path/to/rootfs, expecting Garden to not perform any layering and literally just use the given rootfs for the container. This worked up to 1.9.0, but now that we're on 1.11.0, we get a volume graph is disabled error, as our binaries don't configure Garden with a graph dir (since it shouldn't need one when BaggageClaim is being used for everything).

Maybe the handling of raw:// moved around recently when pulling in grootfs, and happens too late?

I see that it's implemented by *gardener.VolumeProvider, which is only used when a graph dir is configured: https://github.com/cloudfoundry/guardian/blob/master/guardiancmd/command_linux.go#L66-L72

The volume graph is disabled error comes from the NoopVolumizer: https://github.com/cloudfoundry/guardian/blob/a3d6fd2dc82486ac01fa22bd8dabb883bc9cd73d/gardener/noop_volumizer.go
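
For reference, this is roughly how Concourse hands Garden the rootfs. It is a minimal sketch using the Garden client library against a locally running gdn; the address and rootfs path are placeholders rather than our real configuration:

package main

import (
	"fmt"

	"code.cloudfoundry.org/garden"
	"code.cloudfoundry.org/garden/client"
	"code.cloudfoundry.org/garden/client/connection"
)

func main() {
	// Assumes gdn is listening on its default TCP address; adjust as needed.
	gardenClient := client.New(connection.New("tcp", "127.0.0.1:7777"))

	// BaggageClaim has already assembled the rootfs on disk; we hand Garden
	// the path via the raw:// scheme and expect no layering to happen.
	container, err := gardenClient.Create(garden.ContainerSpec{
		RootFSPath: "raw:///path/to/assembled/rootfs",
	})
	if err != nil {
		// On 1.11.0, with no graph dir configured, this now fails with
		// "volume graph is disabled".
		fmt.Println("create failed:", err)
		return
	}

	fmt.Println("created container:", container.Handle())
}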

Logging and/or test output

{"timestamp":"1516914276.571754694","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-base-resource-type.created","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2.2","volume":"c5f30cd8-8161-42dd-713e-cbf422f2b796"}}
{"timestamp":"1516914276.572278261","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-base-resource-type.release.released","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","id":[4,1],"name":"input-1","session":"133.3.2.2.2.4","volume":"c5f30cd8-8161-42dd-713e-cbf422f2b796"}}
{"timestamp":"1516914276.578666687","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-cow-volume-for-container.created-creating-volume","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2.3","volume":"0bf29339-5197-404c-6785-b61ab989bf22"}}
{"timestamp":"1516914276.579205990","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-cow-volume-for-container.acquire.acquired","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","id":[4,2],"name":"input-1","session":"133.3.2.2.3.1","volume":"0bf29339-5197-404c-6785-b61ab989bf22"}}
{"timestamp":"1516914276.581009388","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-cow-volume-for-container.creating-real-volume","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2.3","volume":"0bf29339-5197-404c-6785-b61ab989bf22"}}
{"timestamp":"1516914277.054263830","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-cow-volume-for-container.created","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2.3","volume":"0bf29339-5197-404c-6785-b61ab989bf22"}}
{"timestamp":"1516914277.054919958","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-cow-volume-for-container.release.released","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","id":[4,2],"name":"input-1","session":"133.3.2.2.3.4","volume":"0bf29339-5197-404c-6785-b61ab989bf22"}}
{"timestamp":"1516914277.054978371","source":"atc","message":"atc.create-build.do.aggregate.get.creating-container-in-garden","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2"}}
{"timestamp":"1516914277.060650349","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-container.created-creating-volume","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2.4","volume":"f905ac20-aba5-404d-516a-086129a111eb"}}
{"timestamp":"1516914277.061020613","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-container.acquire.acquired","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","id":[4,3],"name":"input-1","session":"133.3.2.2.4.1","volume":"f905ac20-aba5-404d-516a-086129a111eb"}}
{"timestamp":"1516914277.062445879","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-container.creating-real-volume","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2.4","volume":"f905ac20-aba5-404d-516a-086129a111eb"}}
{"timestamp":"1516914277.068384886","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-container.created","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2.4","volume":"f905ac20-aba5-404d-516a-086129a111eb"}}
{"timestamp":"1516914277.068813324","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-container.release.released","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","id":[4,3],"name":"input-1","session":"133.3.2.2.4.4","volume":"f905ac20-aba5-404d-516a-086129a111eb"}}
{"timestamp":"1516914277.072551966","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-container.created-creating-volume","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2.5","volume":"d060c14d-f5a1-4371-56e6-26951e6a72c5"}}
{"timestamp":"1516914277.072995901","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-container.acquire.acquired","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","id":[4,4],"name":"input-1","session":"133.3.2.2.5.1","volume":"d060c14d-f5a1-4371-56e6-26951e6a72c5"}}
{"timestamp":"1516914277.074348688","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-container.creating-real-volume","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2.5","volume":"d060c14d-f5a1-4371-56e6-26951e6a72c5"}}
{"timestamp":"1516914277.080519915","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-container.created","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","name":"input-1","session":"133.3.2.2.5","volume":"d060c14d-f5a1-4371-56e6-26951e6a72c5"}}
{"timestamp":"1516914277.080994606","source":"atc","message":"atc.create-build.do.aggregate.get.find-or-create-volume-for-container.release.released","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","id":[4,4],"name":"input-1","session":"133.3.2.2.5.4","volume":"d060c14d-f5a1-4371-56e6-26951e6a72c5"}}
{"timestamp":"1516914277.095414162","source":"atc","message":"atc.create-build.do.aggregate.get.failed-to-create-container-in-garden","log_level":2,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","error":"volume graph is disabled","name":"input-1","session":"133.3.2.2"}}
{"timestamp":"1516914277.095846176","source":"atc","message":"atc.create-build.do.aggregate.get.release.released","log_level":0,"data":{"container":"a46bf596-0085-451b-6cbb-e46753af5d8a","id":[5,1],"name":"input-1","session":"133.3.2.2.6"}}

Steps to reproduce

  1. Start Guardian without a graph dir.
  2. Create a container with a rootfs directly via raw:///absolute/path/to/folder.
  3. The creation will fail with volume graph is disabled. This used to work in 1.9.0.
  • Guardian release version: 1.11.0
  • Linux kernel version: n/a
  • Concourse version: 3.8.1 RC
  • Go version: n/a

Concourse containers hang towards end of lifecycle

Todos

Please try the following before submitting the issue:

  • Upgrade Concourse if you are observing a failure in a test (yep, on v2.4.0 with Garden runC 1.0.0)
  • Use latest BOSH stemcell if the problem occurred in a BOSH VM (yep, bosh-aws-xen-hvm-ubuntu-trusty-go_agent/3263.8)

Description

Concourse jobs "hang" -- a get OR a task step will have finished, and the next step doesn't proceed (or if it's the last step, the job doesn't resolve).

Logging and/or test output

ps aux when hijacked into a 'hung' container:

USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          1  0.0  0.0 116124  6820 ?        S<sl 12:30   0:00 /proc/self/exe init
root         18  0.7  0.0  20252  3260 ?        S<s  20:36   0:00 bash
root         24  0.0  0.0  17500  2188 ?        R<+  20:36   0:00 ps aux

Typical garden logs for a particular container handle -- our 'hanging' containers only get to line 29

Steps to reproduce

We have not been able to reliably reproduce the issue. We had been seeing various Concourse issues (including this one; we didn't dig deep enough into the garden logs to be sure it was the same problem, but it manifested the same way), and ended up dropping the database entirely.

After spinning up a new EBS volume with bosh deploy, the initial pipelines went green for the first ~12 hours, but after 12-24 hours we were seeing this 'hang' somewhat often again. We can't make it happen on demand, though.

On a job that is 'hung', if we trigger another (cancelling or not cancelling the 'hung' one), it hangs on "waiting for a suitable set of input versions" in the Concourse UI. Concourse tracker story

Container images

We are running some of these tasks on a custom-built CentOS container image. But the get steps that hang are just whatever Concourse provides, and we also have task steps that are hanging on the base Dockerhub ruby image.

garden.stderr.log -- Possibly unrelated?

Our garden.stderr.log files each have 10-20 lines of the following in them:

time="2016-11-01T09:50:07Z" level=error msg="Couldn't run auplink before unmount: exec: \"auplink\": executable file not found in $PATH"

They're coming in at a rate of ~1-4 per hour. Is this related?

Smarter DNS defaults

Users running Concourse often run into issues with DNS. For example, out of the box, Ubuntu 16.04 will default /etc/resolv.conf to something like:

# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1
search localdomain

Any nameserver referring to a local address won't be reachable by a container, as containers are in their own network namespace. So the container ends up without working DNS.

Docker filters local entries out, and then if the set is empty, adds Google's DNS servers by default (8.8.8.8 and 8.8.4.4):

https://docs.docker.com/engine/userguide/networking/default_network/configure-dns/

This seems like a reasonable approach to me.
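
To make the suggestion concrete, here's a rough sketch of the filter-then-fallback behaviour (my own illustration, not Garden or Docker code), assuming Google's resolvers as the fallback:

package main

import (
	"bufio"
	"fmt"
	"net"
	"os"
	"strings"
)

// pickNameservers reads resolv.conf-style lines, drops nameservers a container
// could never reach (loopback or link-local addresses), and falls back to
// Google's public resolvers if nothing usable remains.
func pickNameservers(f *os.File) []string {
	var usable []string
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) < 2 || fields[0] != "nameserver" {
			continue
		}
		ip := net.ParseIP(fields[1])
		if ip == nil || ip.IsLoopback() || ip.IsLinkLocalUnicast() {
			continue // e.g. the 127.0.1.1 entry written by resolvconf on Ubuntu 16.04
		}
		usable = append(usable, ip.String())
	}
	if len(usable) == 0 {
		usable = []string{"8.8.8.8", "8.8.4.4"}
	}
	return usable
}

func main() {
	f, err := os.Open("/etc/resolv.conf")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer f.Close()

	fmt.Println(pickNameservers(f))
}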

Typo in Kawasaki's iptables setup?

I'm debugging an issue very similar to #80, where my containers cannot reach the internet from Debian Stretch. In the process, I have encountered a perplexing iptables rule that seems to be a typo.

The rule in question is here, where a blanket forwarding rule is added to w--forward to allow all inbound (is that from the perspective of the container or host?) traffic. This rule doesn't make sense, however, as it's a rule that matches when the input interface is the default internet interface, but is added to the ${filter_forward_chain}, which is only consulted for traffic arriving on an input interface starting with ${GARDEN_NETWORK_INTERFACE_PREFIX}. Put another way, it's a rule that will never get triggered (except for host interfaces that begin with w, like wireless interfaces).

The rule would make sense with -o instead of -i, as that would allow all traffic coming from containers if it's destined for the internet. Unfortunately, the use of the words "inbound" and "outbound" is ambiguous in the context of FORWARD, so I can't fully understand the intended purpose of this rule. In any case, I (hex) edited my local copy of Concourse to change that -i to a -o and my containers can reach the internet now.

Am I understanding the purpose of this rule correctly? Or is this just a dead rule and I should be looking for a missing rule in the w--instance chains to let the containers on the internet?

iptables --wait option not supported in some linux flavors

Concourse web+worker 1.3.1 running on standard AWS AMI.

With the simple hello world concourse demo, every run gives me this immediately:

Snippet:

errors.errorString{s:\\\"iptables create-instance-chains: iptables v1.4.18: unknown option \\\\\\\"--wait\\\\\\\"\\\\nTry 'iptables -h' or 'iptables --help' for more information.

This might be a guardian thing - I'm seeing cloudfoundry-incubator/guardian/cmd/kawasaki/main.go:44 in the log. (As far as I can tell, the --wait flag was only added in iptables 1.4.20, so the 1.4.18 build on this AMI doesn't recognise it.)

Full:

runc start: exit status 1: process_linux.go:329: running prestart hook 0 caused "exit status 2: {\"timestamp\":\"1466369914.619743586\",\"source\":\"kawasaki\",\"message\":\"kawasaki.hook.start\",\"log_level\":1,\"data\":{\"config\":{\"HostIntf\":\"wmcnismvpm14-0\",\"ContainerIntf\":\"wmcnismvpm14-1\",\"IPTablePrefix\":\"w--\",\"IPTableInstance\":\"mcnismvpm14\",\"BridgeName\":\"wbrdg-0afe0000\",\"BridgeIP\":\"10.254.0.1\",\"ContainerIP\":\"10.254.0.2\",\"ExternalIP\":\"172.31.30.19\",\"Subnet\":{\"IP\":\"10.254.0.0\",\"Mask\":\"/////A==\"},\"Mtu\":1500,\"DNSServers\":null},\"pid\":908,\"session\":\"1\"}}\n{\"timestamp\":\"1466369914.773725510\",\"source\":\"kawasaki\",\"message\":\"kawasaki.hook.panicked\",\"log_level\":3,\"data\":{\"config\":{\"HostIntf\":\"wmcnismvpm14-0\",\"ContainerIntf\":\"wmcnismvpm14-1\",\"IPTablePrefix\":\"w--\",\"IPTableInstance\":\"mcnismvpm14\",\"BridgeName\":\"wbrdg-0afe0000\",\"BridgeIP\":\"10.254.0.1\",\"ContainerIP\":\"10.254.0.2\",\"ExternalIP\":\"172.31.30.19\",\"Subnet\":{\"IP\":\"10.254.0.0\",\"Mask\":\"/////A==\"},\"Mtu\":1500,\"DNSServers\":null},\"error\":\"\\u0026errors.errorString{s:\\\"iptables create-instance-chains: iptables v1.4.18: unknown option \\\\\\\"--wait\\\\\\\"\\\\nTry `iptables -h' or 'iptables --help' for more information.\\\\n\\\"}\",\"pid\":908,\"session\":\"1\",\"trace\":\"goroutine 1 [running]:\\ngithub.com/pivotal-golang/lager.(*logger).Fatal(0xc82000f5c0, 0x97e0e0, 0x8, 0x7f2251de0028, 0xc8201e1950, 0x0, 0x0, 0x0)\\n\\t/tmp/build/9674af12/garden-runc-release/src/github.com/pivotal-golang/lager/logger.go:131 +0xc5\\nmain.main.func1(0xc820199c08)\\n\\t/tmp/build/9674af12/garden-runc-release/src/github.com/cloudfoundry-incubator/guardian/cmd/kawasaki/main.go:44 +0x13c\\npanic(0x8a3540, 0xc8201e1920)\\n\\t/usr/local/go/src/runtime/panic.go:443 +0x4e9\\nmain.main()\\n\\t/tmp/build/9674af12/garden-runc-release/src/github.com/cloudfoundry-incubator/guardian/cmd/kawasaki/main.go:92 +0x171b\\n\"}}\npanic: iptables create-instance-chains: iptables v1.4.18: unknown option \"--wait\"\nTry `iptables -h' or 'iptables --help' for more information.\n [recovered]\n\tpanic: &errors.errorString{s:\"iptables create-instance-chains: iptables v1.4.18: unknown option \\\"--wait\\\"\\nTry `iptables -h' or 'iptables --help' for more information.\\n\"}\n\ngoroutine 1 [running]:\npanic(0x8a3540, 0xc8201e1950)\n\t/usr/local/go/src/runtime/panic.go:481 +0x3e6\ngithub.com/pivotal-golang/lager.(*logger).Fatal(0xc82000f5c0, 0x97e0e0, 0x8, 0x7f2251de0028, 0xc8201e1950, 0x0, 0x0, 0x0)\n\t/tmp/build/9674af12/garden-runc-release/src/github.com/pivotal-golang/lager/logger.go:152 +0x698\nmain.main.func1(0xc820199c08)\n\t/tmp/build/9674af12/garden-runc-release/src/github.com/cloudfoundry-incubator/guardian/cmd/kawasaki/main.go:44 +0x13c\npanic(0x8a3540, 0xc8201e1920)\n\t/usr/local/go/src/runtime/panic.go:443 +0x4e9\nmain.main()\n\t/tmp/build/9674af12/garden-runc-release/src/github.com/cloudfoundry-incubator/guardian/cmd/kawasaki/main.go:92 +0x171b\n"

Can't run on armv7l (32bit) - idmapper dependency bump needed

Description

gdn makes use of a version of idmapper that is quite old and doesn't include the latest fixes that landed in the repository.

In my use case, I'm trying to get a Concourse worker running on a 32-bit Raspberry Pi, and thus encounter the following problem when trying to get gdn running:

panic: integer overflow on token 4294967295 while parsing line "         0          0 4294967295"

goroutine 19 [running]:
code.cloudfoundry.org/idmapper.must(...)
	/home/pi/go/pkg/mod/code.cloudfoundry.org/[email protected]/max_valid_uid.go:69
code.cloudfoundry.org/idmapper.MustGetMaxValidUID(...)
	/home/pi/go/pkg/mod/code.cloudfoundry.org/[email protected]/max_valid_uid.go:15
code.cloudfoundry.org/guardian/guardiancmd.mustGetMaxValidUID(...)
	/home/pi/concourse-arm/src/garden-runc-release/src/guardian/guardiancmd/command_linux.go:212
code.cloudfoundry.org/guardian/guardiancmd.(*CommonCommand).idMappings(0x14405a0, 0x1ae74, 0x56a98, 0x16ec6c0, 0x8, 0x8, 0x76c2d8)
	/home/pi/concourse-arm/src/garden-runc-release/src/guardian/guardiancmd/command.go:633 +0x208
code.cloudfoundry.org/guardian/guardiancmd.(*CommonCommand).NewGardenFactory(0x14405a0, 0x1b724, 0x76efe004)
	/home/pi/concourse-arm/src/garden-runc-release/src/guardian/guardiancmd/command_linux.go:45 +0x1c
code.cloudfoundry.org/guardian/guardiancmd.(*CommonCommand).createWiring(0x14405a0, 0x951288, 0x164a9f0, 0x7, 0x81ec82, 0x8)
	/home/pi/concourse-arm/src/garden-runc-release/src/guardian/guardiancmd/command.go:215 +0x20
code.cloudfoundry.org/guardian/guardiancmd.(*ServerCommand).Run(0x169b6d0, 0x1492300, 0x1492340, 0x0, 0x0)
	/home/pi/concourse-arm/src/garden-runc-release/src/guardian/guardiancmd/server.go:236 +0x84
github.com/tedsuo/ifrit.(*process).run(0x149d6a0)
	/home/pi/go/pkg/mod/github.com/tedsuo/[email protected]/process.go:71 +0x38
created by github.com/tedsuo/ifrit.Background
	/home/pi/go/pkg/mod/github.com/tedsuo/[email protected]/process.go:49 +0xcc

This particular problem got fixed in this commit: cloudfoundry/idmapper@d11c131
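
To illustrate why this only bites on 32-bit hosts, here is a standalone sketch of the overflow (not the actual idmapper code):

package main

import (
	"fmt"
	"strconv"
)

func main() {
	// The last token of /proc/self/uid_map on a host with a full 32-bit mapping.
	token := "4294967295"

	// strconv.Atoi parses into the platform-sized int. On armv7l int is 32 bits,
	// so 4294967295 is out of range and parsing fails, which the old idmapper
	// turns into the panic shown above. On amd64 this succeeds.
	if _, err := strconv.Atoi(token); err != nil {
		fmt.Println("32-bit parse failed:", err)
	}

	// Parsing it as an unsigned 32-bit value works on every platform, which is
	// roughly what the newer idmapper does.
	id, err := strconv.ParseUint(token, 10, 32)
	fmt.Println(id, err)
}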

A go get code.cloudfoundry.org/idmapper on top of guardian produces the following, which fixes the problem:

diff --git a/go.mod b/go.mod
index d8317b4..fdc1c73 100644
--- a/go.mod
+++ b/go.mod
@@ -9,7 +9,7 @@ require (
 	code.cloudfoundry.org/debugserver v0.0.0-20170501225606-70715da12ee9
 	code.cloudfoundry.org/garden v0.0.0-20181019152342-f1dfd0e2a8c4
 	code.cloudfoundry.org/grootfs v0.0.0-20180525140952-c01568707fea
-	code.cloudfoundry.org/idmapper v0.0.0-20170928154045-bd44efed5494
+	code.cloudfoundry.org/idmapper v0.0.0-20190410122444-276ec50ea425
 	code.cloudfoundry.org/lager v0.0.0-20190115020142-54c4f2530dde
 	code.cloudfoundry.org/localip v0.0.0-20170223024724-b88ad0dea95c
 	github.com/BurntSushi/toml v0.3.1
@@ -53,7 +53,7 @@ require (
 	github.com/opencontainers/go-digest v1.0.0-rc1
 	github.com/opencontainers/image-spec v1.0.1
 	github.com/opencontainers/runc v1.0.0-rc2.0.20190403200919-029124da7af7
-	github.com/opencontainers/runtime-spec v0.0.0-20180909173843-eba862dc2470
+	github.com/opencontainers/runtime-spec v1.0.1
 	github.com/pkg/errors v0.8.1
 	github.com/poy/eachers v0.0.0-20181020210610-23942921fe77 // indirect
 	github.com/sirupsen/logrus v1.4.1
diff --git a/go.sum b/go.sum
index aef654b..8315728 100644
--- a/go.sum
+++ b/go.sum
@@ -12,6 +12,8 @@ code.cloudfoundry.org/grootfs v0.0.0-20180525140952-c01568707fea h1:ojmI2HIz+14p
 code.cloudfoundry.org/grootfs v0.0.0-20180525140952-c01568707fea/go.mod h1:+PBCOqTHFttBXXkhS/6pw7W54wx/bn/UgGGFZuOHZ3k=
 code.cloudfoundry.org/idmapper v0.0.0-20170928154045-bd44efed5494 h1:YfUhI5zVZqrE3uHUttOQjJhjzvkSgyneCMwdRyKl0do=
 code.cloudfoundry.org/idmapper v0.0.0-20170928154045-bd44efed5494/go.mod h1:Pwb21U3aIPsZBnwNHXwgSwxSNR78pWSJ6psLEdo5h0o=
+code.cloudfoundry.org/idmapper v0.0.0-20190410122444-276ec50ea425 h1:WEBKXXBe4nL87BLutZMe24Nkq4BLQp9c0aSBuAemMcs=
+code.cloudfoundry.org/idmapper v0.0.0-20190410122444-276ec50ea425/go.mod h1:n14+qevUQup0WW7HhMJMVWq9OnSjktDlP3tww/eUpMU=
 code.cloudfoundry.org/lager v0.0.0-20190115020142-54c4f2530dde h1:2P/1L5rJ11ZsoLVTGbLMJArFwMIA5jfNKQ8TWDDM+6k=
 code.cloudfoundry.org/lager v0.0.0-20190115020142-54c4f2530dde/go.mod h1:O2sS7gKP3HM2iemG+EnwvyNQK7pTSC6Foi4QiMp9sSk=
 code.cloudfoundry.org/localip v0.0.0-20170223024724-b88ad0dea95c h1:dO6i+2uQgR1ZHMwPu5wTTrJAXpG75VCnaMwFgFSOmgI=
@@ -113,6 +115,8 @@ github.com/opencontainers/runc v1.0.0-rc2.0.20190403200919-029124da7af7 h1:Vor0/
 github.com/opencontainers/runc v1.0.0-rc2.0.20190403200919-029124da7af7/go.mod h1:qT5XzbpPznkRYVz/mWwUaVBUv2rmF59PVA73FjuZG0U=
 github.com/opencontainers/runtime-spec v0.0.0-20180909173843-eba862dc2470 h1:dQgS6CgSB2mBQur4Cz7kaEtXNSw56ZlRb7ZsBT70hTA=
 github.com/opencontainers/runtime-spec v0.0.0-20180909173843-eba862dc2470/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
+github.com/opencontainers/runtime-spec v1.0.1 h1:wY4pOY8fBdSIvs9+IDHC55thBuEulhzfSgKeC1yFvzQ=
+github.com/opencontainers/runtime-spec v1.0.1/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
 github.com/pkg/errors v0.8.1 h1:iURUrRGxPUNPdy5/HRSm+Yj6okJ6UtLINN0Q9M4+h3I=
 github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
 github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
@@ -140,6 +144,7 @@ github.com/vishvananda/netns v0.0.0-20171111001504-be1fbeda1936 h1:J9gO8RJCAFlln
 github.com/vishvananda/netns v0.0.0-20171111001504-be1fbeda1936/go.mod h1:ZjcWmFBXmLKZu9Nxj3WKYEafiSqer2rnvPr0en9UNpI=
 golang.org/x/net v0.0.0-20180906233101-161cd47e91fd h1:nTDtHvHSdCn1m6ITfMRqtOd/9+7a3s8RBNOZ3eYZzJA=
 golang.org/x/net v0.0.0-20180906233101-161cd47e91fd/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
+golang.org/x/net v0.0.0-20190213061140-3a22650c66bd/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
 golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f h1:wMNYb4v58l5UBM7MYRLPG6ZhfOqbKu7X5eyFl8ZhKvA=
 golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
 golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=

Logging and/or test output

Provide any output you think may help understanding further the issue. Guardian
stores its log files in /var/vcap/sys/log/garden when run through its BOSH
release.

Steps to reproduce

Try to run gdn built from 44208d0 (master as of right now) on a 32-bit Raspberry Pi.

  • Guardian release version: master
  • Linux kernel version: Linux raspberrypi 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux
  • Concourse version: master
  • Go version: go version go1.12.4 linux/arm

Thanks!

CAP_AUDIT_READ runc error

When using concourse which uses cf guardian which uses runc (holy cow), I get this cryptic error:

runc start: exit status 1: unknown capability "CAP_AUDIT_READ"

I'm running on the standard AWS EC2 Ubuntu 14.04 LTS HVM AMI (what a word-full).

It seems as though this kernel does not have CAP_AUDIT_READ (that capability was only added in Linux 3.16, and this AMI ships an older kernel). It DOES have the following:

ubuntu@ip-172-31-18-217:~$ man capabilities | grep CAP_AUDIT -a2
       CAP_AUDIT_CONTROL (since Linux 2.6.11)
              Enable  and  disable  kernel auditing; change auditing filter rules; retrieve auditing
              status and filtering rules.

       CAP_AUDIT_WRITE (since Linux 2.6.11)
              Write records to kernel auditing log.

This effectively makes concourse on AWS dead in the water for me, since guardian only supports Ubuntu 14.04 LTS and this bug/whatever prevents me from running it on the Ubuntu 14.04 LTS AMI.
