Comments (4)

t0yv0 avatar t0yv0 commented on August 30, 2024

Thank you for filing this issue, @JustASquid. Could you add a quick repro program to make it easier to reproduce this exact issue? Thank you.

from pulumi-eks.

rquitales avatar rquitales commented on August 30, 2024

@JustASquid, thank you for reporting this issue and providing detailed information. I was able to reproduce the problem using the example in our repository, Managed NodeGroups Example, with skipDefaultNodeGroup set to false. I confirmed the issue by deploying an nginx workload tainted to the default node group and a curlimage workload to the managed node group. The curlimage workload couldn't reach the nginx pod unless the workloads were switched between node groups.

You've correctly identified that the two different security groups are causing communication issues. However, it's actually the managed node group that's using the security group created by EKS, while the default node group uses the security group managed by Pulumi. When the EKS provider is set to not skip the default node group creation, we create a security group that only allows intra-node communication within that group. Reference: Security Group Configuration.

The EKS managed node group uses the default cluster security group created by AWS. Even if an additional security group is specified during cluster creation (which is used by the default node group), it won't be attached to the managed node group instances. To enable communication between these node groups, you need to use a custom launch template for the ManagedNodeGroup to specify the security group created by Pulumi.

Here’s a TypeScript example of setting this up:

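import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";
import * as eks from "@pulumi/eks";

const cluster = new eks.Cluster("cluster", {
  // ... (other configurations)
  skipDefaultNodeGroup: false,
});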

// Create Managed Node Group with custom launch template to use the security group that the default node group uses.

function createUserData(cluster: aws.eks.Cluster, extraArgs: string): pulumi.Output<string> {
    const userdata = pulumi
        .all([
            cluster.name,
            cluster.endpoint,
            cluster.certificateAuthority.data,
        ])
        .apply(([clusterName, clusterEndpoint, clusterCertAuthority]) => {
            return `MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash

/etc/eks/bootstrap.sh --apiserver-endpoint "${clusterEndpoint}" --b64-cluster-ca "${clusterCertAuthority}" "${clusterName}" ${extraArgs}
--==MYBOUNDARY==--`;
        });

    // Encode the user data as base64.
    return pulumi
        .output(userdata)
        .apply((ud) => Buffer.from(ud, "utf-8").toString("base64"));
}

const lt = new aws.ec2.LaunchTemplate("my-mng-lt", {
  // AMI IDs are region-specific; replace this with an EKS-optimized AMI for your region.
  imageId: "ami-0cfd96d646e5535a8",
  instanceType: "t3.medium",
  // This is where we define the SG to be used by the MNG.
  vpcSecurityGroupIds: [cluster.defaultNodeGroup!.nodeSecurityGroup.id],
  // This is required to enable instances to join the cluster.
  userData: createUserData(cluster.core.cluster, "--kubelet-extra-args --node-labels=mylabel=myvalue"),
});

const mng = new eks.ManagedNodeGroup("cluster-my-mng", {
  // ... (other configurations)
  cluster: cluster,
  launchTemplate: {
    id: lt.id,
    version: pulumi.interpolate`${lt.latestVersion}`,
  },
});
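
The hard-coded imageId in the launch template is region-specific. One way to avoid it is to read the recommended EKS-optimized AMI ID from the public SSM parameter that AWS publishes. The following is a minimal sketch, not part of pulumi-eks: the helper name eksAmiParameterName and the version "1.30" are placeholders chosen for illustration, and the path shown is the Amazon Linux 2 one, which matches the bootstrap.sh user data used here.

```typescript
// Hypothetical helper (not part of pulumi-eks): builds the name of the public
// SSM parameter that holds the recommended EKS-optimized Amazon Linux 2 AMI ID.
type Al2Variant = "amazon-linux-2" | "amazon-linux-2-arm64" | "amazon-linux-2-gpu";

function eksAmiParameterName(
    kubernetesVersion: string,
    variant: Al2Variant = "amazon-linux-2",
): string {
    // Guard against values such as "latest" or "v1.30" that would produce a bad path.
    if (!/^\d+\.\d+$/.test(kubernetesVersion)) {
        throw new Error(`expected a version like "1.30", got "${kubernetesVersion}"`);
    }
    return `/aws/service/eks/optimized-ami/${kubernetesVersion}/${variant}/recommended/image_id`;
}

// Assumed usage inside the Pulumi program (needs @pulumi/aws, not executed here):
//
//   const amiId = aws.ssm.getParameterOutput({
//       name: eksAmiParameterName("1.30"), // placeholder version; match your cluster
//   }).value;
//   const lt = new aws.ec2.LaunchTemplate("my-mng-lt", { imageId: amiId, /* ... */ });

console.log(eksAmiParameterName("1.30"));
```

The Kubernetes version passed to the helper should match the cluster's version, since the parameter resolves to an AMI built for that version.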

Alternatively, as you mentioned, you could skip creating the default node group, and everything should work as expected. Please let us know if this resolves your issue or if you need further assistance!

JustASquid avatar JustASquid commented on August 30, 2024

Thank you @rquitales, you are exactly right. I got the order back to front in my original post: it is indeed the MNGs that are using the EKS-created SG.

And I was able to work around the issue by skipping the default node group.

I do feel that this behavior is non-ideal though, as the path of least resistance when setting up a cluster is to use the default node group. It's easy to run into the case where it cannot communicate with any subsequent MNGs, and the problem can manifest in a very non-obvious way (In my case, no DNS resolution on the MNG nodes).

Going down the road of specifying a custom launch template is not trivial, not least because you need to fetch an AMI ID. So I guess the question is: is there a particular reason it works this way? Why not just have the default NG be assigned the cluster's EKS-created security group?

flostadler avatar flostadler commented on August 30, 2024

Another possible workaround is setting up the necessary security group rules to allow the different node groups to communicate, for example:

const cluster = new eks.Cluster("example-managed-nodegroups", {
  // ... (other configurations)
  skipDefaultNodeGroup: false,
});

export const clusterSecurityGroup = cluster.eksCluster.vpcConfig.clusterSecurityGroupId;

const eksClusterIngressRule = new aws.vpc.SecurityGroupIngressRule(
  `selfManagedNodeIngressRule`,
  {
      description: "Allow managed workloads to communicate with self managed nodes",
      fromPort: 0,
      toPort: 0,
      ipProtocol: "-1",
      securityGroupId: clusterSecurityGroup,
      referencedSecurityGroupId: cluster.nodeSecurityGroup.id,
  }
);

const nodeIngressRule = new aws.vpc.SecurityGroupIngressRule(
  `managedNodeIngressRule`,
  {
      description: "Allow self managed workloads to communicate with managed workloads",
      fromPort: 0,
      toPort: 0,
      ipProtocol: "-1",
      securityGroupId: cluster.nodeSecurityGroup.id,
      referencedSecurityGroupId: clusterSecurityGroup,
  }
);

I'm gonna check what it would take to add this to the component itself. I'm not necessarily concerned about security implications here because we already have open firewalls within the Pulumi-managed security group:

const nodeIngressRule = new aws.ec2.SecurityGroupRule(
    `${name}-eksNodeIngressRule`,
    {
        description: "Allow nodes to communicate with each other",
        type: "ingress",
        fromPort: 0,
        toPort: 0,
        protocol: "-1", // all
        securityGroupId: nodeSecurityGroup.id,
        self: true,
    },
    { parent, provider },
);

Equally, the EKS-managed security group also allows all communication within itself: https://docs.aws.amazon.com/eks/latest/userguide/sec-group-reqs.html
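
To make the reasoning concrete, here is a toy model of the situation. It is only a sketch and makes no AWS calls: the SecurityGroup shape, the canReach function, and the group IDs are invented for illustration, and it ignores ports, protocols, CIDR rules, and egress. It shows that two groups which each allow traffic only from themselves cannot talk to each other until each one references the other, which is what the two ingress rules do.

```typescript
// Toy model: a security group is an ID plus the set of source group IDs
// from which it accepts ingress. Everything here is hypothetical.
interface SecurityGroup {
    id: string;
    allowIngressFrom: Set<string>;
}

// Traffic from `source` reaches `target` only if `target` has a rule referencing `source`.
function canReach(source: SecurityGroup, target: SecurityGroup): boolean {
    return target.allowIngressFrom.has(source.id);
}

// Placeholder IDs. Each group starts with only a self-referencing rule.
const nodeSg: SecurityGroup = {
    id: "sg-pulumi-nodes",
    allowIngressFrom: new Set(["sg-pulumi-nodes"]),
};
const clusterSg: SecurityGroup = {
    id: "sg-eks-cluster",
    allowIngressFrom: new Set(["sg-eks-cluster"]),
};

// Before the workaround: neither direction is allowed.
const reachableBefore = canReach(clusterSg, nodeSg) || canReach(nodeSg, clusterSg);

// Counterpart of the two ingress rules: each group references the other.
clusterSg.allowIngressFrom.add(nodeSg.id);
nodeSg.allowIngressFrom.add(clusterSg.id);

// After the workaround: both directions are allowed.
const reachableAfter = canReach(clusterSg, nodeSg) && canReach(nodeSg, clusterSg);

console.log({ reachableBefore, reachableAfter });
```

Both rules are needed in this model because a connection from one node group to the other has to be accepted by the receiving group, whichever side initiates it.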
