Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat (Karpenter Add-on): v0.35 update, Node Role custom policies, and other fixes #947

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

youngjeong46
Copy link
Collaborator

@youngjeong46 youngjeong46 commented Mar 13, 2024

Issue #, if available: Fixes #859 #893 #945

Description of changes:

This PR makes the following changes in the Karpenter Add-on:

  • Update to v0.35
  • Ability to use a directory of IAM policy JSONs to add custom policies to the Karpenter Node Role, in addition to the required policies
  • Remove EKS CNI policy if the VPC CNI addon is in the cluster
  • Fixes minor error on instanceStorePolicy under EC2NodeClass not setting correctly when not provided.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jsamuel1
Copy link
Contributor

@youngjeong46 -- Rather than requiring unverified JSON for the extra node role policies, are we able to add an option to pass in an array of iam ManagedPolicy names (string name to feed to fromAwsManagedPolicyName) as another pathway?

// Set up Instance Profile
const instanceProfileName = md5.Md5.hashStr(stackName+region);
const karpenterInstanceProfile = new iam.CfnInstanceProfile(cluster, 'karpenter-instance-profile', {
const karpenterInstanceProfile = new iam.CfnInstanceProfile(clusterInfo.cluster, 'karpenter-instance-profile', {
roles: [karpenterNodeRole.roleName],
instanceProfileName: `KarpenterNodeInstanceProfile-${instanceProfileName}`,
path: '/'
});

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add here:
karpenterInstanceProfile.node.addDependency(karpenterNodeRole);
or else cdk will fail to delete the resource later.

Copy link
Contributor

@jsamuel1 jsamuel1 Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, a better fix would be to use iam.InstanceProfile, instead of the CfnInstanceProfile.
The name of the profile & cdk item need to both change as part of that sort of change, so you might need to add something to the md5 hash above to ensure unique.
eg.

        const instanceProfileName = md5.Md5.hashStr(stackName+region  + clusterInfo.cluster.clusterName);
        const karpenterInstanceProfile = new iam.CfnInstanceProfile(clusterInfo.cluster, 'karpenter-iam-instance-profile', {

],
const karpenterNodeRole = new iam.Role(clusterInfo.cluster, 'karpenter-node-role', {
assumedBy: new iam.ServicePrincipal(`ec2.${clusterInfo.cluster.stack.urlSuffix}`),
managedPolicies: DEFAULT_KARPENTER_NODE_ROLE_POLICIES,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we allow adding to the list of managed policies here, via the config of the addon?
eg. When using the CloudwatchInsights addon, you need to add:

    iam.ManagedPolicy.fromAwsManagedPolicyName("CloudWatchAgentServerPolicy"),
    iam.ManagedPolicy.fromAwsManagedPolicyName("AWSXrayWriteOnlyAccess")

Much easier to add 2 lines like that than exporting the entire policy and putting in a folder.

@shapirov103
Copy link
Collaborator

@youngjeong46 please take a look when you are back. Since it is addressing a few issues, potentially missing features and update to 0.35. the complexity of the karpenter addon is affecting maintainability of this addon.
It needs to be refactored and componentized. Separate classes for NodePool and NodeClass as well as moving out version specific logic to AlphaImplementation and BetaImplementation respectively. Let's discuss.

@thpham
Copy link
Contributor

thpham commented Jul 11, 2024

@youngjeong46 thank you very much for your effort, if you are doing a refactor would you be able to address the following bug as well: #1039

Thank you.

Copy link

This PR has been automatically marked as stale because it has been open 60 days
with no activity. Remove stale label or comment or this PR will be closed in 10 days

@github-actions github-actions bot added the stale label Oct 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants