
kube-mesos-framework's Introduction

Kubernetes-Mesos


Kubernetes-Mesos modifies Kubernetes to act as an Apache Mesos framework.

Features On Mesos

Kubernetes gains the following benefits when installed on Mesos:

  • Node-Level Auto-Scaling - Kubernetes minion nodes are created automatically, up to the size of the provisioned Mesos cluster.
  • Resource Sharing - Co-location of Kubernetes with other popular next-generation services on the same cluster (e.g. Hadoop, Spark, Chronos, Cassandra). Resources are allocated to the frameworks based on fairness and can be claimed or passed on depending on framework load.
  • Independence from special network infrastructure - Mesos can (but of course doesn't have to) run on networks which cannot assign a routable IP to every container. The Kubernetes-on-Mesos endpoint controller is specially modified to allow pods to communicate with services in such an environment.

For more information about how Kubernetes-Mesos is different from Kubernetes, see Architecture.

Release Status

Kubernetes-Mesos is alpha quality, still under active development, and not yet recommended for production systems.

For more information about development progress, see the known issues where backlog issues are tracked.

Conferences

  • MesosCon 2016 Asia: Kubernetes on Mesos: Not Just Another Mesos Framework - Klaus Ma, IBM
  • KubeCon 2016: Kubernetes on EGO -- Bringing Enterprise Resource Management and Scheduling to Kubernetes - Da Ma, IBM

Usage

This project combines concepts and technologies from two already-complex projects: Mesos and Kubernetes. It may help to familiarize yourself with the basics of each project before reading on:

To get up and running with Kubernetes-Mesos, follow:


kube-mesos-framework's People

Contributors

a-robinson, alex-mohr, artfulcoder, brendandburns, davidopp, davidwalter0, deads2k, derekwaynecarr, eparis, fabioy, gmarek, goltermann, huang195, j3ffml, jsafrane, k82cn, lavalamp, markturansky, mbforbes, mikedanese, mqliang, random-liu, resouer, smarterclayton, sttts, thockin, vishh, wojtek-t, yujuhong, zhengguoyong

kube-mesos-framework's Issues

heapster can't be launched

I deployed Heapster following https://github.com/kubernetes/heapster/blob/master/docs/influxdb.md.
But Heapster can't be launched. The error is described below:

[root@slave1 influxdb]# kubectl logs --previous heapster-2e2m2 --namespace=kube-system
I0217 04:00:01.293472 1 heapster.go:55] /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb:8086
I0217 04:00:01.293539 1 heapster.go:56] Heapster version 0.18.2
F0217 04:00:01.293593 1 heapster.go:62] open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory

Why can't km generate the serviceaccount secret and token? Please help.

[e2e] Failed to start DNS in ci/test-e2e.sh

FOUND IN VERSION (Details about the version, e.g. 0.7. Please list the other versions which this bug exist.)
[ master ]

PLATFORMS e.g. 'uname -a':
[ Linux dcosdemo02.eng.platformlab.ibm.com 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux ]

COMMANDS OR DAEMONS: -- list all related components
[ apiserver ]

DESCRIPTION: -- symptom of the problem a customer would see
[ TLS timeout when kubectl get pods/nodes, more log of apiserver: https://gist.github.com/k82cn/9e6395a4e6cfc3e9dbfbc186c1e13d0b ]

IMPACT: -- impact of problem in customer env (best/worse case scenarios)
[ e2e test cannot be finished. ]

EXPECTED BEHAVIOR:
[ ./ci/test-e2e.sh passed ]

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
[ ./ci/test-e2e.sh ]

Empty ServiceAccount volumes when using *mesos* provider

This issue is similar to: kubernetes/kubernetes#31062

PLATFORMS e.g. 'uname -a':
Linux 3.10.0-327.10.1.el7.x86_64 #1 SMP Tue Feb 16 17:03:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

DCOS 1.7

# docker version
Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        
 OS/Arch:      linux/amd64
# kubectl version
Client Version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.0-alpha.0.127+ab0b937c2efabe", GitCommit:"ab0b937c2efabedbb401753c8f232a14790af131", GitTreeState:"clean", BuildDate:"2016-09-05T13:40:17Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

Server Version: version.Info{Major:"", Minor:"", GitVersion:"v0.0.0-master+$Format:%h$", GitCommit:"$Format:%H$", GitTreeState:"not a git tree", BuildDate:"1970-01-01T00:00:00Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

COMMANDS OR DAEMONS: -- list all related components

km apiserver \
  --address=${KUBERNETES_MASTER_IP} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --service-cluster-ip-range=10.10.10.0/24 \
  --port=8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf \
  --secure-port=0 \
  --service-account-key-file=/tmp/ca.key \
  --client-ca-file=/tmp/ca.crt \
  --tls-private-key-file=/tmp/ca.crt \
  --admission-control=ServiceAccount,DefaultStorageClass,AlwaysAdmit \
  --v=1
./km controller-manager \
  --master=${KUBERNETES_MASTER_IP}:8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf  \
  --service-account-private-key-file=/tmp/ca.key \
  --root-ca-file=/tmp/ca.crt \
  --v=1
km scheduler \
  --address=${KUBERNETES_MASTER_IP} \
  --mesos-master=${MESOS_MASTER} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --mesos-user=root \
  --api-servers=${KUBERNETES_MASTER_IP}:8888 \
  --cluster-dns=${KUBERNETES_MASTER_IP} \
  --cluster-domain=cluster.local \
  --v=1

DESCRIPTION: -- symptom of the problem a customer would see
The directory `/var/run/secrets/kubernetes.io/serviceaccount/` inside a container is empty

even if docker inspect shows:

        "Mounts": [
            {
                "Source": "/var/lib/mesos/slave/slaves/daf2b55c-2f31-4bae-a702-9341b9b86b04-S0/frameworks/daf2b55c-2f31-4bae-a702-9341b9b86b04-0000/executors/ad6d20ef43cf1f6a_k8sm-executor/runs/26e93fbd-194e-4d1e-a6aa-54c9145bab8d/pods
/4493a04f-9463-11e6-ba22-0e6b70457fa1/volumes/kubernetes.io~secret/default-token-dunvp",
                "Destination": "/var/run/secrets/kubernetes.io/serviceaccount",
                "Mode": "ro",
                "RW": false,
                "Propagation": "rprivate"
            },

IMPACT: -- impact of problem in customer env (best/worse case scenarios)

Every pod which uses the k8s API, e.g. via the following function from ./pkg/client/restclient/config.go:

// InClusterConfig returns a config object which uses the service account
// kubernetes gives to pods. It's intended for clients that expect to be
// running inside a pod running on kubernetes. It will return an error if
// called from a process not running in a kubernetes environment.
func InClusterConfig() (*Config, error)

will fail or crash, because InClusterConfig checks whether the token and ca.crt files exist.

EXPECTED BEHAVIOR:
Volume /var/run/secrets/kubernetes.io/serviceaccount/ should be mounted inside the container and contain the secret files (ServiceAccountTokenKey, ServiceAccountRootCAKey).

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
Run Kubernetes as a Mesos framework in a DCOS 1.7+ environment (most likely it is also reproducible with pure Mesos).

Add version into released binaries.

FOUND IN VERSION (Details about the version, e.g. 0.7. Please list the other versions which this bug exist.)
[ master ]

PLATFORMS e.g. 'uname -a':
[ all ]

COMMANDS OR DAEMONS: -- list all related components
[ all ]

DESCRIPTION: -- symptom of the problem a customer would see
[ no released version ]

IMPACT: -- impact of problem in customer env (best/worse case scenarios)
[ cannot trace released binaries. ]

EXPECTED BEHAVIOR:
[ include the release version. ]

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
[ log of kubelet: Started kubelet v0.0.0-master+$Format:%h$ ]

Does kube-mesos-framework support GPUs on the Mesos Slave?

After starting the Mesos Slave with the options --isolation="cgroups/devices,gpu/nvidia,docker/runtime" --image_providers="DOCKER", the kube-mesos-framework scheduler log no longer prints any received offers; removing these options makes offers arrive normally again.

k8sm: there is no flag to specify the network plugin (CNI) path when running 'km scheduler'

FOUND IN VERSION (Details about the version, e.g. 0.7. Please list the other versions which this bug exist.)
[ all]

PLATFORMS e.g. 'uname -a':
[all ]

COMMANDS OR DAEMONS: -- list all related components
[scheduler kubelet]

DESCRIPTION: -- symptom of the problem a customer would see
[ k8sm:
When running 'km scheduler' with the flag "--kubelet-network-plugin=cni" and then deploying a pod, the kubelet component fails to start in the executor on the mesos-slave node where the pod was scheduled.
logs:
Failed running kubelet: failed to create kubelet: Network plugin "cni" not found.
Error: failed to create kubelet: Network plugin "cni" not found.

These flags work well with k8s and Mesos: when the kubelet component is launched with '--network-plugin=cni --network-plugin-dir=/etc/cni/net.d', the kubelet starts successfully and the deployed pod can use the network components of the customer environment. ]

IMPACT: -- impact of problem in customer env (best/worse case scenarios)
[ Without the flag, deployed pods cannot use the network components of the customer environment. ]

EXPECTED BEHAVIOR:
[ kube-mesos supports CNI ]

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
[ See DESCRIPTION]

[docs] Link to getting started guides is broken on README.md

FOUND IN VERSION (Details about the version, e.g. 0.7. Please list the other versions which this bug exist.)
[current master ]

PLATFORMS e.g. 'uname -a':
[na ]

COMMANDS OR DAEMONS: -- list all related components
[na]

DESCRIPTION: -- symptom of the problem a customer would see
The link to the getting started guides in the root README.md is currently broken. https://github.com/kubernetes-incubator/kube-mesos-framework/docs/getting-started-guides/mesos.md yields a 404, but I'm unsure where the actual link should point to. That's why I didn't submit a PR.

IMPACT: -- impact of problem in customer env (best/worse case scenarios)
The getting started guides cannot be accessed directly.

EXPECTED BEHAVIOR:
The link works.

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
Check the link in the README.md

build error

Hi Da Ma, I found a problem when building.

vendor/github.com/google/certificate-transparency/go/x509/x509.go:1461: undefined: elliptic.P224
make: *** [all] Error 1

Build environment:

go version : go1.6.3 linux/amd64
os info : 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Can you help me solve this problem?

Read kubelet/kube-proxy configuration from conf file

In the 0.8 design, we'll use the upstream kubelet/kube-proxy binaries. To avoid backward-compatibility issues with parameters, all kubelet/kube-proxy parameters are included in a conf file, which is then passed from the scheduler to the executor to launch them.

Example of agent.yaml:

kubelet:
  - "--apiserver=xxx"
kube-proxy:
  - "--apiserver=xxx"

Is the project still active?

Looks like there hasn't been any progress in the past 2-3 months. Is this project still active? Is there anything that we (the devs that have an interest in this project) can help with?

@k82cn

Reset history

I think we should consider resetting the history on this project. We started as a straight clone to ease the transition, but now that we've transitioned to a more permanent structure, there isn't a whole lot of benefit to keeping history on, for instance, all of vendor/.

Keeping the old history is hiding the real actors in this repo from people searching through.

@k82cn, what do you think?

The pods between hosts can't communicate

slave1:
docker0: 172.16.40.1/24
flannel0: 172.16.40.0/24

slave2:
docker0: 172.16.78.1/24
flannel0: 172.16.78.0/24

Description: Any pod with IP 172.16.40.x can reach 172.16.78.0 and 172.16.78.1 with ping.
Any pod with IP 172.16.78.x can reach 172.16.40.0 and 172.16.40.1 with ping.
But a pod with IP 172.16.40.x can't reach a pod with IP 172.16.78.x.

Make error when building the framework code

On CentOS 7.2 with Go 1.6.3, the build fails with the following errors. How can I fix this?

kube-mesos-framework]# make
+++ [0106 22:55:06] Generating bindata:
/root/kube-mesos-framework/test/e2e/framework/gobindata_util.go
+++ [0106 22:55:06] Building the toolchain targets:
github.com/kubernetes-incubator/kube-mesos-framework/hack/cmd/teststale
+++ [0106 22:55:06] Building go targets for linux/amd64:
cmd/k8sm-scheduler
cmd/k8sm-executor
cmd/k8sm-controller-manager
cmd/km

github.com/kubernetes-incubator/kube-mesos-framework/vendor/github.com/google/certificate-transparency/go/x509

vendor/github.com/google/certificate-transparency/go/x509/x509.go:342: undefined: elliptic.P224
vendor/github.com/google/certificate-transparency/go/x509/x509.go:355: undefined: elliptic.P224
vendor/github.com/google/certificate-transparency/go/x509/x509.go:1461: undefined: elliptic.P224
make: *** [all] Error 1

[e2e] Failed to start DNS in ci/test-e2e.sh

FOUND IN VERSION (Details about the version, e.g., 3.2, 4.1, 4.1SP1. Please list the other versions which this bug exist.)
[0.7 & master]

PLATFORMS e.g. 'uname -a':
[ Linux dcosdemo02.eng.platformlab.ibm.com 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux ]

COMMANDS OR DAEMONS: -- list all related components
[ k8sm-controller-manager ]

DESCRIPTION: -- symptom of the problem a customer would see
[ ci/test-e2e.sh hangs when creating DNS ]

IMPACT: -- impact of problem in customer env (best/worse case scenarios)
[ ci/test-e2e.sh hangs ]

EXPECTED BEHAVIOR:
[ DNS is started, and the other test cases continue. ]

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
[ ci/test-e2e.sh]

Deprecated framework

kube-mesos-framework is deprecated; this repo should be removed, as it is misleading users.

Can't spawn any container on mesos 1.x

FOUND IN VERSION (Details about the version, e.g. 0.7. Please list the other versions which this bug exist.)
master

PLATFORMS e.g. 'uname -a':

Linux iriln075 4.5.0-coreos-r1 #2 SMP Thu May 5 07:27:26 UTC 2016 x86_64 Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz GenuineIntel GNU/Linux

COMMANDS OR DAEMONS: -- list all related components

executor:

 /var/lib/mesos/slave/slaves/08447169-dda2-487a-934e-3700a19724f9-S1/frameworks/b48146c9-5646-4de7-8ad4-b4edfcbf8449-0000/executors/1b66ae598c8cebcf_k8sm-executor/runs/e85d47a9-e50d-4fc2-8874-4f856184d619/km executor --api-servers=10.1.0.73:8888 --v=0 --allow-privileged=true --suicide-timeout=20m0s --mesos-launch-grace-period=5m0s --cadvisor-port=4194 --sync-frequency=10s --enable-debugging-handlers=true --cluster-dns=10.10.10.10 --cluster-domain=cluster.local --hostname-override=10.1.0.75 --kubelet-cgroups= --cgroup-root=/mesos/e85d47a9-e50d-4fc2-8874-4f856184d619 --housekeeping_interval=10s --global_housekeeping_interval=1m0s

mesos slave:

ExecStart=/usr/bin/docker run \
        --net=host \
        --publish 5051:5051 \
        --privileged \
        --name mesos_slave_IRILN075 \
        -e MESOS_CONTAINERIZERS=docker,mesos \
        -e MESOS_DOCKER_SOCKET=/var/run/weave/weave.sock \
        -e MESOS_EXECUTOR_REGISTRATION_TIMEOUT=15mins \
        -e MESOS_HOSTNAME=${PRIVATEIP} \
        -e Mesos_IP=${PRIVATEIP} \
        -e MESOS_ISOLATOR=cgroups/cpu,cgroups/mem \
        -e MESOS_LOG_DIR=/var/log/mesos/slave \
        -e MESOS_MASTER=zk://10.1.0.73:2181/mesos \
        -e MESOS_PORT=5051 \
        -e MESOS_WORK_DIR=/var/lib/mesos/slave \
        -v /usr/bin/docker:/usr/bin/docker:ro \
        -v /var/run/weave/weave.sock:/var/run/weave/weave.sock \
        -v /var/run/docker.sock:/var/run/docker.sock \
        -v /sys:/sys \
        -v /lib64/libsystemd.so.0:/lib/libsystemd.so.0:ro  \
        -v /lib64/libgcrypt.so.20:/lib/libgcrypt.so.20:ro  \
        -v /lib64/libdevmapper.so.1.02:/lib/libdevmapper.so.1.02:ro \
        -v /lib64/libpthread.so.0:/lib/libpthread.so.0:ro  \
        -v /lib64/libsqlite3.so.0:/lib/libsqlite3.so.0:ro \
        -v /lib64/libudev.so.1:/lib/libudev.so.1:ro \
        mesosphere/mesos-slave:1.0.11.0.1-2.0.93.ubuntu1404

I also tried with different isolators for Mesos 1.0.

DESCRIPTION: -- symptom of the problem a customer would see

Logs from executor:

I1018 10:44:25.873247    8146 kubelet_node_status.go:76] Successfully registered node 10.1.0.75
E1018 10:44:25.892211    8146 manager.go:1007] Failed to create existing container: /system.slice/docker-4d0b27fcffaecf5bf77adf8c2c90ee3184ac3f59ac8d36520f1ae3f2662f3626.scope: failed to identify the read-write layer ID for container "4d0b27fcffaecf5bf77adf8c2c90ee3184ac3f59ac8d36520f1ae3f2662f3626". - open /var/lib/docker/image/overlay/layerdb/mounts/4d0b27fcffaecf5bf77adf8c2c90ee3184ac3f59ac8d36520f1ae3f2662f3626/mount-id: no such file or directory
W1018 10:44:25.893055    8146 container.go:352] Failed to create summary reader for "/system.slice/system-systemd\\x2dfsck.slice": none of the resources are being tracked.
E1018 10:44:25.894442    8146 manager.go:1007] Failed to create existing container: /system.slice/docker-6b502f8a43c6ab08608b162648dde06afa0915509429e359859052dc5688c70c.scope: failed to identify the read-write layer ID for container "6b502f8a43c6ab08608b162648dde06afa0915509429e359859052dc5688c70c". - open /var/lib/docker/image/overlay/layerdb/mounts/6b502f8a43c6ab08608b162648dde06afa0915509429e359859052dc5688c70c/mount-id: no such file or directory
E1018 10:44:25.895563    8146 manager.go:1007] Failed to create existing container: /system.slice/docker-cb69f656f3c77e38d9849c1207040550ce5c6c1501f4e47ca0fdf1df455c541c.scope: failed to identify the read-write layer ID for container "cb69f656f3c77e38d9849c1207040550ce5c6c1501f4e47ca0fdf1df455c541c". - open /var/lib/docker/image/overlay/layerdb/mounts/cb69f656f3c77e38d9849c1207040550ce5c6c1501f4e47ca0fdf1df455c541c/mount-id: no such file or directory
W1018 10:44:25.897602    8146 container.go:352] Failed to create summary reader for "/system.slice/systemd-fsck-root.service": none of the resources are being tracked.
E1018 10:44:25.898423    8146 manager.go:1007] Failed to create existing container: /system.slice/docker-abd5207ce0896c6bf4d40285fbb4dd721ac7803de095c6051d223f2ef5cb4446.scope: failed to identify the read-write layer ID for container "abd5207ce0896c6bf4d40285fbb4dd721ac7803de095c6051d223f2ef5cb4446". - open /var/lib/docker/image/overlay/layerdb/mounts/abd5207ce0896c6bf4d40285fbb4dd721ac7803de095c6051d223f2ef5cb4446/mount-id: no such file or directory
I1018 10:44:25.900884    8146 manager.go:290] Recovery completed
E1018 10:44:25.941027    8146 manager.go:1007] Failed to create existing container: /system.slice/docker-6b502f8a43c6ab08608b162648dde06afa0915509429e359859052dc5688c70c.scope: failed to identify the read-write layer ID for container "6b502f8a43c6ab08608b162648dde06afa0915509429e359859052dc5688c70c". - open /var/lib/docker/image/overlay/layerdb/mounts/6b502f8a43c6ab08608b162648dde06afa0915509429e359859052dc5688c70c/mount-id: no such file or directory
E1018 10:44:25.941672    8146 manager.go:1007] Failed to create existing container: /system.slice/docker-cb69f656f3c77e38d9849c1207040550ce5c6c1501f4e47ca0fdf1df455c541c.scope: failed to identify the read-write layer ID for container "cb69f656f3c77e38d9849c1207040550ce5c6c1501f4e47ca0fdf1df455c541c". - open /var/lib/docker/image/overlay/layerdb/mounts/cb69f656f3c77e38d9849c1207040550ce5c6c1501f4e47ca0fdf1df455c541c/mount-id: no such file or directory
E1018 10:44:25.942288    8146 manager.go:1007] Failed to create existing container: /system.slice/docker-4d0b27fcffaecf5bf77adf8c2c90ee3184ac3f59ac8d36520f1ae3f2662f3626.scope: failed to identify the read-write layer ID for container "4d0b27fcffaecf5bf77adf8c2c90ee3184ac3f59ac8d36520f1ae3f2662f3626". - open /var/lib/docker/image/overlay/layerdb/mounts/4d0b27fcffaecf5bf77adf8c2c90ee3184ac3f59ac8d36520f1ae3f2662f3626/mount-id: no such file or directory
E1018 10:44:25.942898    8146 manager.go:1007] Failed to create existing container: /system.slice/docker-abd5207ce0896c6bf4d40285fbb4dd721ac7803de095c6051d223f2ef5cb4446.scope: failed to identify the read-write layer ID for container "abd5207ce0896c6bf4d40285fbb4dd721ac7803de095c6051d223f2ef5cb4446". - open /var/lib/docker/image/overlay/layerdb/mounts/abd5207ce0896c6bf4d40285fbb4dd721ac7803de095c6051d223f2ef5cb4446/mount-id: no such file or directory
I1018 10:44:26.000418    8146 executor.go:687] Executor sending status update &StatusUpdate{FrameworkId:&FrameworkID{Value:*b48146c9-5646-4de7-8ad4-b4edfcbf8449-0000,XXX_unrecognized:[],},ExecutorId:&ExecutorID{Value:*1b66ae598c8cebcf_k8sm-executor,XXX_unrecognized:[],},SlaveId:&SlaveID{Value:*08447169-dda2-487a-934e-3700a19724f9-S1,XXX_unrecognized:[],},Status:&TaskStatus{TaskId:&TaskID{Value:*pod.d4c7c445-951f-11e6-b83e-c4346bb852d0,XXX_unrecognized:[],},State:*TASK_STARTING,Data:*[123 34 109 101 116 97 100 97 116 97 34 58 123 34 110 97 109 101 34 58 34 110 103 105 110 120 50 95 100 101 102 97 117 108 116 34 44 34 115 101 108 102 76 105 110 107 34 58 34 47 112 111 100 115 116 97 116 117 115 114 101 115 117 108 116 34 44 34 99 114 101 97 116 105 111 110 84 105 109 101 115 116 97 109 112 34 58 110 117 108 108 125 44 34 115 116 97 116 117 115 34 58 123 125 125],Message:*create-binding-success,SlaveId:&SlaveID{Value:*08447169-dda2-487a-934e-3700a19724f9-S1,XXX_unrecognized:[],},Timestamp:*1.476787466e+09,ExecutorId:nil,Healthy:nil,Source:nil,Reason:nil,Uuid:nil,Labels:nil,ContainerStatus:nil,XXX_unrecognized:[],},Timestamp:*1.476787466e+09,Uuid:*[214 179 175 144 149 31 17 230 168 240 196 52 107 183 172 0],LatestState:nil,XXX_unrecognized:[],}
I1018 10:44:26.025796    8146 executor.go:445] Executor statusUpdateAcknowledgement

IMPACT: -- impact of problem in customer env (best/worse case scenarios)
Can't spawn containers.

EXPECTED BEHAVIOR:
It should be possible to spawn containers.

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
Run the kube-mesos framework on Mesos 1.0.

Build e2e test cases for kube-mesos

When releasing, e2e test cases help build confidence; it's better to re-use the upstream ones as much as possible. Those e2e test cases will also be up-merged to master for the next release.

Build HTTPExtender service

In 0.8, we're going to use k8s-scheduler's HTTPExtender:

-----------------              ---------------
| k8s-scheduler |   <- http -> | kube-mesos  |
-----------------              ---------------
  1. Filter: filter related nodes based on Mesos's offer
  2. Prioritize: none for 0.8
  3. Bind: check whether the offer is OK; if it's OK, launch the Mesos tasks; otherwise, return error to k8s-scheduler (kubernetes/kubernetes#41447)

And here's the checklist in my mind (a rough sketch of the HTTP server follows the list):

  • HTTP server for k8s-scheduler to connect to
  • Filter callback handler: return the nodes that have enough offers for the Pod
  • Bind callback handler: check whether the offer is still available
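
A minimal sketch of that HTTP server, using hand-rolled request/response types instead of the real k8s scheduler-extender structs; all names, the port, and the offer bookkeeping below are illustrative assumptions:

package main

import (
    "encoding/json"
    "log"
    "net/http"
)

// FilterArgs and FilterResult stand in for the scheduler-extender payloads;
// the real types live in the k8s scheduler API.
type FilterArgs struct {
    Pod   string   `json:"pod"`
    Nodes []string `json:"nodes"`
}

type FilterResult struct {
    Nodes []string `json:"nodes"`
}

type BindArgs struct {
    Pod  string `json:"pod"`
    Node string `json:"node"`
}

// hasOffer would consult the Mesos offers cached by kube-mesos (placeholder here).
func hasOffer(node string) bool { return true }

func filterHandler(w http.ResponseWriter, r *http.Request) {
    var args FilterArgs
    if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    res := FilterResult{}
    for _, n := range args.Nodes {
        if hasOffer(n) { // keep only nodes backed by a usable offer
            res.Nodes = append(res.Nodes, n)
        }
    }
    json.NewEncoder(w).Encode(res)
}

func bindHandler(w http.ResponseWriter, r *http.Request) {
    var args BindArgs
    if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    if !hasOffer(args.Node) {
        // Offer gone: report an error so k8s-scheduler can retry.
        http.Error(w, "offer no longer available", http.StatusConflict)
        return
    }
    // Here kube-mesos would launch the Mesos task for the pod.
    w.WriteHeader(http.StatusOK)
}

func main() {
    http.HandleFunc("/filter", filterHandler)
    http.HandleFunc("/bind", bindHandler)
    log.Fatal(http.ListenAndServe(":9998", nil))
}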

libprocess https with basic auth possible?

FOUND IN VERSION (Details about the version, e.g. 0.7. Please list the other versions which this bug exist.)
[latest ]

PLATFORMS e.g. 'uname -a':
[ Linux * 3.10.0-514.21.1.el7.x86_64 #1 SMP Thu May 25 17:04:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux]

COMMANDS OR DAEMONS: -- list all related components
[km scheduler ]

EXPECTED BEHAVIOR:
In Spark you can set environment variables to affect libprocess, as shown below.
But HTTP seems to be hardcoded within http_transporter.go, with no option to add basic auth (a sketch of what such a client could look like follows the variable list).

export LIBPROCESS_SSL_ENABLED=true
export LIBPROCESS_SSL_SUPPORT_DOWNGRADE=false
export LIBPROCESS_SSL_CA_FILE=/etc/pki/ca-trust/source/anchors/cert.pem
export LIBPROCESS_SSL_KEY_FILE=/app/mesos/security/server.key
export LIBPROCESS_SSL_CERT_FILE=/app/mesos/security/server.crt
export LIBPROCESS_SSL_VERIFY_CERT=0
export LIBPROCESS_SSL_REQUIRE_CERT=0
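
As an illustration of what http_transporter.go would need to grow, a client that adds basic auth and a custom CA could be wrapped as below; this is a sketch only, not the existing kube-mesos transport code, and the paths, host, and credentials are placeholders:

package main

import (
    "crypto/tls"
    "crypto/x509"
    "io/ioutil"
    "log"
    "net/http"
)

// basicAuthTransport injects a basic-auth header into every request.
type basicAuthTransport struct {
    user, pass string
    base       http.RoundTripper
}

func (t *basicAuthTransport) RoundTrip(req *http.Request) (*http.Response, error) {
    req.SetBasicAuth(t.user, t.pass)
    return t.base.RoundTrip(req)
}

// newTLSClient builds an HTTPS client trusting the given CA and sending basic auth.
func newTLSClient(caFile, user, pass string) (*http.Client, error) {
    caPEM, err := ioutil.ReadFile(caFile) // e.g. the LIBPROCESS_SSL_CA_FILE above
    if err != nil {
        return nil, err
    }
    pool := x509.NewCertPool()
    pool.AppendCertsFromPEM(caPEM)
    base := &http.Transport{TLSClientConfig: &tls.Config{RootCAs: pool}}
    return &http.Client{Transport: &basicAuthTransport{user: user, pass: pass, base: base}}, nil
}

func main() {
    client, err := newTLSClient("/etc/pki/ca-trust/source/anchors/cert.pem", "framework", "secret")
    if err != nil {
        log.Fatal(err)
    }
    resp, err := client.Get("https://mesos-master.example.com:5050/state")
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()
    log.Println(resp.Status)
}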

Docker container port does not match nodePort: No endpoints visible

FOUND IN VERSION (Details about the version, e.g. 0.7. Please list the other versions which this bug exist.)
[ Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4", GitCommit:"3b417cc4ccd1b8f38ff9ec96bb50a81ca0ea9d56", GitTreeState:"clean", BuildDate:"2016-10-21T02:48:38Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"", Minor:"", GitVersion:"v0.0.0-master+$Format:%h$", GitCommit:"$Format:%H$", GitTreeState:"not a git tree", BuildDate:"1970-01-01T00:00:00Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}]

PLATFORMS e.g. 'uname -a':
[ Linux 3.10.0-327.36.1.el7.x86_64]

COMMANDS OR DAEMONS: -- list all related components
[ km apiserver, km controller-manager and km scheduler, docker 1.12.1]

DESCRIPTION: -- symptom of the problem a customer would see
[ No endpoints visible except k8sm-scheduler and kubernetes]

IMPACT: -- impact of problem in customer env (best/worse case scenarios)
[ DNS and Dashboard are not running on port shown by api server and nodePorts do not match docker container port]

EXPECTED BEHAVIOR:
[ targetPort and ipTable rules match docker container port]

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
[Install latest kubernetes-mesos from git and yum packages for Kubernetes on CentOs 7. Deploy skydns-rc, skydns-svc and dashboard]

After creating pods/services from dashboard 1.4.2 standard yaml, the docker container for kubernetes-dashboard is started on port 31005:
CONTAINER ID  IMAGE                                                         COMMAND                  CREATED         STATUS         PORTS                     NAMES
19bbd2b9fae1  gcr.io/google_containers/kubernetes-dashboard-amd64:v1.4.2   "/dashboard --port=90"   8 minutes ago   Up 8 minutes                             k8s_kubernetes-dashboard.5fdb8a_kubernetes-dashboard-266040510-cuvdg_kube-system_11bcc3a7-aa5e-11e6-99ee-005056b56e98_0039044e
8ef41f78432d  gcr.io/google_containers/pause-amd64:3.0                      "/pause"                 9 minutes ago   Up 9 minutes   0.0.0.0:31005->9090/tcp

iptables-save shows (tried both modes, this is iptable mode):
-A KUBE-SERVICES -d 10.10.10.253/32 -p tcp -m comment --comment "kube-system/kubernetes-dashboard:http cluster IP" -m tcp --dport 80 -j KUBE-SVC-3MCIRW5DISVHFVDN
-A KUBE-NODEPORTS -p tcp -m comment --comment "kube-system/kubernetes-dashboard:http" -m tcp --dport 32173 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "kube-system/kubernetes-dashboard:http" -m tcp --dport 32173 -j KUBE-SVC-3MCIRW5DISVHFVDN

I can access the dashboard on port 31005, but Kubernetes seems to expect it to run on port 32173. 32173 is also the port shown as nodePort when I look at the actual config in the dashboard.yaml. I've also run the debugging guide from Kubernetes: DNS and endpoints are not working; the rest seems to be okay. Services, pods, and iptables checked out successfully.

build error

FOUND IN VERSION (Details about the version, e.g. 0.7. Please list the other versions which this bug exist.)
[ 0.8]

PLATFORMS e.g. 'uname -a':
[[root@centos-node-1 kube-mesos-framework-release-0.8]# uname -a
Linux centos-node-1 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@centos-node-1 kube-mesos-framework-release-0.8]# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core) ]

COMMANDS OR DAEMONS: -- list all related components
[make release ]

DESCRIPTION: -- symptom of the problem a customer would see
[[root@centos-node-1 kube-mesos-framework-release-0.8]# make release
+++ [0207 14:57:52] Verifying Prerequisites....
+++ [0207 14:57:55] Building Docker image kube-build:build-bab9018bcb
+++ Docker build command failed for kube-build:build-bab9018bcb

Sending build context to Docker daemon 15.92 MB
Step 1/11 : FROM gcr.io/google_containers/kube-cross:v1.6.3-7
---> 33fe81337964
Step 2/11 : RUN touch /kube-build-image
---> Using cache
---> 2fe7b5c8972f
Step 3/11 : RUN chmod -R a+rwx /usr/local/go/pkg ${K8S_PATCHED_GOROOT}/pkg
---> Running in b0c49e972463
chmod: cannot access '/pkg': No such file or directory
The command '/bin/sh -c chmod -R a+rwx /usr/local/go/pkg ${K8S_PATCHED_GOROOT}/pkg' returned a non-zero code: 1

To retry manually, run:

docker build -t kube-build:build-bab9018bcb --pull=false /root/mesos+k8s/kubernetes/kube-mesos-framework-release-0.8/_output/images/kube-build:build-bab9018bcb ]

IMPACT: -- impact of problem in customer env (best/worse case scenarios)
[ ]

EXPECTED BEHAVIOR:
[ ]

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
[ ]

Create a SECURITY_CONTACTS file.

As per the email sent to kubernetes-dev[1], please create a SECURITY_CONTACTS
file.

The template for the file can be found in the kubernetes-template repository[2].
A description for the file is in the steering-committee docs[3], you might need
to search that page for "Security Contacts".

Please feel free to ping me on the PR when you make it, otherwise I will see when
you close this issue. :)

Thanks so much, let me know if you have any questions.

(This issue was generated from a tool, apologies for any weirdness.)

[1] https://groups.google.com/forum/#!topic/kubernetes-dev/codeiIoQ6QE
[2] https://github.com/kubernetes/kubernetes-template-project/blob/master/SECURITY_CONTACTS
[3] https://github.com/kubernetes/community/blob/master/committee-steering/governance/sig-governance-template-short.md

Start/Monitor kubelet/kube-proxy process

In the 0.8 design, the upstream kubelet/kube-proxy binaries will be used, and k8sm-executor will start and monitor them, e.g. restart them if they exit. We do not aim to provide "perfect" error handling in 0.8; systemd is also an option for that. A rough sketch of the supervision loop follows.
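
A minimal sketch of such a supervision loop, assuming the upstream binaries are already on the agent; the flags, paths, and restart policy are illustrative, not the actual k8sm-executor code:

package main

import (
    "log"
    "os/exec"
    "time"
)

// supervise starts the given command and restarts it whenever it exits.
// A small back-off avoids a tight crash loop; systemd would be an alternative.
func supervise(name string, args ...string) {
    for {
        cmd := exec.Command(name, args...)
        log.Printf("starting %s", name)
        if err := cmd.Start(); err != nil {
            log.Printf("%s failed to start: %v", name, err)
        } else if err := cmd.Wait(); err != nil {
            log.Printf("%s exited: %v", name, err)
        } else {
            log.Printf("%s exited cleanly", name)
        }
        time.Sleep(5 * time.Second) // back off before restarting
    }
}

func main() {
    // Placeholder flags; in practice these would come from the agent conf file.
    go supervise("kubelet", "--kubeconfig=/etc/kubernetes/kubelet.conf")
    go supervise("kube-proxy", "--kubeconfig=/etc/kubernetes/proxy.conf")
    select {} // block forever; the executor would also report Mesos task state here
}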

Re-architect mesos-framework to avoid vendoring kube

While the mesos-framework seems like it should be a composition of stock kube, it is built as a fork. The nature of the fork means that the functionality is incomplete with respect to kube. Even when living in tree, there was no parity. A cursory inspection revealed that these features didn't function properly at the time of the split:

  1. quota
  2. pod disruption budgets
  3. petsets
  4. scheduled jobs
  5. attach/detach
  6. CSRs
  7. garbage collection

This repo has made the dependency chain obvious, so it should be possible to identify true dependencies, but the only path to clear parity and maintainability is to layer mesos-framework on top of an already built kube ("bring your own kube") and manage the integration via configuration.

A design is needed to identify and demonstrate refactoring targets in kubernetes to enable this kind of integration. They can then be proposed and worked on in upstream kube.

[e2e] Update docker version in Mesos Slave container.

FOUND IN VERSION (Details about the version, e.g. 0.7. Please list the other versions which this bug exist.)
[ 0.7 ]

PLATFORMS e.g. 'uname -a':
[ Linux dcosdemo02.eng.platformlab.ibm.com 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux ]

COMMANDS OR DAEMONS: -- list all related components
[ e2e ]

DESCRIPTION: -- symptom of the problem a customer would see
[

I1015 00:52:58.545079     196 kubelet.go:2261] skipping pod synchronization - [container runtime is down]
E1015 00:52:58.565074     196 kubelet.go:2651] Container runtime sanity check failed: container runtime version is older than 1.21
I1015 00:53:03.545400     196 kubelet.go:2261] skipping pod synchronization - [container runtime is down]
E1015 00:53:03.565996     196 kubelet.go:2651] Container runtime sanity check failed: container runtime version is older than 1.21
E1015 00:53:04.622051     196 kubelet.go:2192] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": unable to find data for container /

]

IMPACT: -- impact of problem in customer env (best/worse case scenarios)
[ Cannot run e2e tests. ]

EXPECTED BEHAVIOR:
[ ]

HOW TO REPRODUCE/DETAILS OF PROBLEM: -- full details of the problem
[ ]

Determine the fate of kubecontainer.Option which is only needed by mesos

Hi guys,

We are now refactoring kubelet to use the new CRI, and we noticed that the mesos integration introduced a kubecontainer.Option to kube-runtime, which blocked the refactoring a little bit.

Because we consider the sandbox env to be docker-specific for now, we have to remove the runtime-level kubecontainer.Option and move the pod sandbox env to dockershim.

So we'd like to discuss: is kube-mesos-framework still relying on this option? If we move it to dockershim, how will mesos set the pod sandbox env?

ref: kubernetes/kubernetes#33561
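
For context, here is a generic sketch of the kind of runtime-level option hook under discussion, through which per-framework sandbox env could be injected; the names below are hypothetical and not the actual kubecontainer API:

package main

import "fmt"

// Runtime stands in for a container runtime configuration; the real
// kubecontainer types differ, this is only an illustration of the pattern.
type Runtime struct {
    sandboxEnv []string
}

// Option is a functional option applied to the runtime at construction time.
type Option func(*Runtime)

// WithSandboxEnv is the kind of hook an integration could use to inject
// extra sandbox environment variables.
func WithSandboxEnv(env ...string) Option {
    return func(r *Runtime) {
        r.sandboxEnv = append(r.sandboxEnv, env...)
    }
}

// NewRuntime builds a runtime and applies the supplied options.
func NewRuntime(opts ...Option) *Runtime {
    r := &Runtime{}
    for _, opt := range opts {
        opt(r)
    }
    return r
}

func main() {
    rt := NewRuntime(WithSandboxEnv("EXAMPLE_SANDBOX_VAR=abc"))
    fmt.Println(rt.sandboxEnv)
}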
