Kustomize Manifests for RDS and S3

Overview

This Kustomize Manifest can be used to deploy Kubeflow Pipelines (KFP) and Katib with RDS and S3.

RDS

Amazon Relational Database Service (RDS) is a managed relational database service that facilitates several database management tasks such as database scaling, database backups, database software patching, OS patching, and more.

In the default Kubeflow installation, the KFP and Katib components each use their own MySQL pod to persist KFP data (such as experiments, pipelines, jobs, etc.) and Katib experiment observation logs, respectively.

Compared to the MySQL setup in the default installation, using RDS provides the following advantages:

  • Easier to configure availability: RDS provides high availability and failover support for DB instances using Multi-Availability Zone (Multi-AZ) deployments with a single standby DB instance, increasing the availability of the KFP and Katib services during unexpected network events
  • Easy to configure scalability: While the default Kubeflow installation uses an EBS-hosted Persistent Volume Claim that is AZ-bound and does not support automatic online resizing, RDS can be configured to handle availability and scaling needs
  • KFP and Katib data can be persisted beyond single Kubeflow installations: Using RDS decouples the KFP and Katib datastores from the Kubeflow deployment, allowing multiple Kubeflow installations to reuse the same RDS instance, provided that the KFP component versions store data in compatible formats.
  • Higher level of customizability and management: RDS provides management features to facilitate changing database instance types, updating SQL versions, and more.

S3

Amazon Simple Storage Service (S3) is an object storage service that is highly scalable, available, secure, and performant.

In the default Kubeflow installation, the KFP component uses the MinIO object storage service, which can be configured to store objects in S3. However, by default the installation hosts the object store service locally in the cluster. KFP stores data such as pipeline architectures and pipeline run artifacts in MinIO.

Configuring MinIO to read and write to S3 provides the following advantages:

  • Higher scalability and availability: S3 offers industry-leading scalability and availability and is more durable than the default MinIO object storage solution provided by Kubeflow.
  • KFP artifacts can be persisted beyond single Kubeflow installations: Using S3 decouples the KFP artifact store from the Kubeflow deployment, allowing multiple Kubeflow installations to access the same artifacts, provided that the KFP component versions store data in compatible formats.
  • Higher level of customizability and management: S3 provides management features to help optimize, organize, and configure access to your data to meet your specific business, organizational, and compliance requirements.

To get started with a Kubeflow installation backed by RDS and S3, follow the install steps below to configure and deploy the Kustomize manifest.

Install

The following steps show how to configure and deploy Kubeflow with supported AWS services.

Using only RDS or only S3

Steps relevant only to the RDS installation will be prefixed with [RDS].

Steps relevant only to the S3 installation will be prefixed with [S3].

Steps without any prefixing are necessary for all installations.

To install with only RDS or only S3, complete the steps relevant to your installation choice.

To install with both RDS and S3, complete all of the steps below.

1.0 Prerequisites

Follow the prerequisites section from this guide.

  1. Verify that you are in the root of this repository by running the pwd command. The path should be <PATH/kubeflow-manifests>
pwd 
  2. Create an OIDC provider for your cluster
    Important:
    Make sure you have an OIDC provider for your cluster and that it was added from eksctl >= 0.56. If you already have an OIDC provider in place, make sure it has the tag alpha.eksctl.io/cluster-name with the cluster name as its value. If you don't have the tag, you can add it via the AWS Console by navigating to IAM->Identity providers->Your OIDC->Tags.
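
    You can also check and add the tag from the command line. The following is a minimal sketch, assuming the AWS CLI is configured and $CLUSTER_NAME is set; the provider ARN is a placeholder you need to look up first:

    # List OIDC providers and note the ARN belonging to your cluster
    aws iam list-open-id-connect-providers

    # Inspect the tags on that provider
    aws iam list-open-id-connect-provider-tags --open-id-connect-provider-arn <YOUR OIDC PROVIDER ARN>

    # Add the tag eksctl expects if it is missing
    aws iam tag-open-id-connect-provider --open-id-connect-provider-arn <YOUR OIDC PROVIDER ARN> \
        --tags Key=alpha.eksctl.io/cluster-name,Value=$CLUSTER_NAME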

2.0 Setup RDS, S3 and configure Secrets

There are two ways to create the RDS and S3 resources before you deploy the Kubeflow manifests: either use the automated Python script we have provided by following the steps in section 2.1, or follow the manual setup steps in section 2.2 below.

2.1 Option 1: Automated Setup

This setup performs all of the manual steps in an automated fashion.
The script takes care of creating the S3 bucket, creating the S3 secret in Secrets Manager, setting up the RDS database, and creating the RDS secret in Secrets Manager. It also edits the required configuration files so that Kubeflow Pipelines is properly configured to use the RDS database during the Kubeflow installation.
The script also handles cases where the resources already exist, in which case it simply skips the corresponding step.

Note: The script will not delete any resources. If a resource already exists (for example a secret, a database with the same name, or the S3 bucket), it will skip creating that resource and use the existing one instead. This is done to prevent unwanted results such as accidental deletion. For instance, if a database with the same name already exists, the script will skip the database creation step. If that is expected in your scenario, this behavior is fine; if you simply forgot to change the database name used for creation, it gives you the chance to retry the script with the proper value. See python auto-rds-s3-setup.py --help for the list of parameters as well as their default values.

  1. Navigate to the tests/e2e directory
cd tests/e2e
  2. Install the script dependencies pip install -r requirements.txt
  3. Create an IAM user with permissions to get the bucket location and to read and write objects in the S3 bucket where you want to store the Kubeflow artifacts (a minimal policy sketch is shown after the setup command below). Use the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY of this IAM user below.
  4. Export values for CLUSTER_REGION, CLUSTER_NAME, S3_BUCKET, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY, then run the script
export CLUSTER_REGION=
export CLUSTER_NAME=
export S3_BUCKET=
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=

PYTHONPATH=.. python utils/rds-s3/auto-rds-s3-setup.py --region $CLUSTER_REGION --cluster $CLUSTER_NAME --bucket $S3_BUCKET --s3_aws_access_key_id $AWS_ACCESS_KEY_ID --s3_aws_secret_access_key $AWS_SECRET_ACCESS_KEY
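
For step 3 above, the exact IAM setup is up to you; the following is a minimal sketch, assuming $S3_BUCKET is already exported. The user name kubeflow-s3-user and the policy name kubeflow-s3-access are hypothetical examples, not names used by the script:

# Hypothetical example: create an IAM user scoped to the Kubeflow artifact bucket
aws iam create-user --user-name kubeflow-s3-user

# Attach an inline policy allowing bucket location lookup, listing, and object read/write
aws iam put-user-policy --user-name kubeflow-s3-user --policy-name kubeflow-s3-access --policy-document '{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow", "Action": ["s3:GetBucketLocation", "s3:ListBucket"], "Resource": "arn:aws:s3:::'$S3_BUCKET'"},
    {"Effect": "Allow", "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"], "Resource": "arn:aws:s3:::'$S3_BUCKET'/*"}
  ]
}'

# Generate the access key pair exported as AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
aws iam create-access-key --user-name kubeflow-s3-user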

Advanced customization

The script applies sensible default values for the DB user password, max storage, storage type, instance type, etc., but if you know what you are doing, you can tweak those preferences by passing different values.
You can learn more about the different parameters by running PYTHONPATH=.. python utils/rds-s3/auto-rds-s3-setup.py --help.

2.2 Option 2: Manual Setup

If you prefer to set up each component manually, you can follow this manual guide.

  1. [S3] Create an S3 Bucket

    Refer to the S3 documentation for steps on creating an S3 bucket. To complete the following steps you will need to keep track of the S3 bucket name.
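
    If you prefer the CLI, a minimal sketch for creating the bucket, assuming $S3_BUCKET and $CLUSTER_REGION are exported (note that us-east-1 does not accept a LocationConstraint):

    aws s3api create-bucket --bucket $S3_BUCKET --region $CLUSTER_REGION \
        --create-bucket-configuration LocationConstraint=$CLUSTER_REGION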

  2. [RDS] Create an RDS Instance

    Refer to the RDS documentation for steps on creating an RDS MySQL instance.

    When creating the RDS instance, for security and connectivity reasons we recommend that:

    • The RDS instance is in the same VPC as the cluster
    • The RDS instance subnets must belong to at least 2 private subnets within the VPC
    • The RDS instance security group is the same security group used by the EKS node instances

    To complete the following steps you will need to keep track of the following:

    • RDS database name (not to be confused with the DB identifier)
    • RDS database admin username
    • RDS database admin password
    • RDS database endpoint URL
    • RDS database port
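
    Once the instance is available, the endpoint and port can be read from the RDS console or retrieved with a CLI sketch like the following (the DB instance identifier is a placeholder):

    aws rds describe-db-instances --db-instance-identifier <YOUR DB INSTANCE IDENTIFIER> \
        --region $CLUSTER_REGION --query 'DBInstances[0].Endpoint.[Address,Port]' --output text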
  3. Create Secrets in AWS Secrets Manager

    1. [RDS] Create the RDS secret and configure the secret provider:

      1. Configure a secret, for example named rds-secret, with the RDS DB name, RDS endpoint URL, RDS DB port, and RDS DB credentials that were configured when following the steps in Create RDS Instance.
        • For example, if your database name is kubeflow, your endpoint URL is rm12abc4krxxxxx.xxxxxxxxxxxx.us-west-2.rds.amazonaws.com, your DB port is 3306, your DB username is admin, and your DB password is Kubefl0w, your secret should look like:
        • export RDS_SECRET=<your rds secret name>
          aws secretsmanager create-secret --name $RDS_SECRET --secret-string '{"username":"admin","password":"Kubefl0w","database":"kubeflow","host":"rm12abc4krxxxxx.xxxxxxxxxxxx.us-west-2.rds.amazonaws.com","port":"3306"}' --region $CLUSTER_REGION
          
      2. Rename the parameters.objects.objectName field in the rds secret provider configuration to the name of the secret.
        • One line command:
          yq e -i '.spec.parameters.objects |= sub("rds-secret",env(RDS_SECRET))' awsconfigs/common/aws-secrets-manager/rds/secret-provider.yaml
          
        • For example, if your secret name is rds-secret-new, the configuration would look like:
        • apiVersion: secrets-store.csi.x-k8s.io/v1alpha1
          kind: SecretProviderClass
          metadata:
            name: rds-secret
          
            ...
            
          parameters:
            objects: |
              - objectName: "rds-secret-new" # This line was changed
                objectType: "secretsmanager"
                jmesPath:
                    - path: "username"
                      objectAlias: "user"
                    - path: "password"
                      objectAlias: "pass"
                    - path: "host"
                      objectAlias: "host"
                    - path: "database"
                      objectAlias: "database"
                    - path: "port"
                      objectAlias: "port"
          
    2. [S3] Create the S3 secret and configure the secret provider:

      1. Configure a secret, for example named s3-secret, with your AWS credentials. These need to be long-term credentials from an IAM user, not temporary credentials.
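        • As a sketch, assuming your secret is named s3-secret and uses the accesskey and secretkey keys expected by the secret provider below (the credential values shown are placeholders for your IAM user's keys), the command could look like:
        • export S3_SECRET=<your s3 secret name>
          aws secretsmanager create-secret --name $S3_SECRET --secret-string '{"accesskey":"<YOUR_AWS_ACCESS_KEY_ID>","secretkey":"<YOUR_AWS_SECRET_ACCESS_KEY>"}' --region $CLUSTER_REGION
          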
      2. Rename the parameters.objects.objectName field in the s3 secret provider configuration to the name of the secret.
        • One line command:
          yq e -i '.spec.parameters.objects |= sub("s3-secret",env(S3_SECRET))' awsconfigs/common/aws-secrets-manager/s3/secret-provider.yaml
          
        • For example, if your secret name is s3-secret-new, the configuration would look like:
        • apiVersion: secrets-store.csi.x-k8s.io/v1alpha1
          kind: SecretProviderClass
          metadata:
            name: s3-secret
          
            ...
            
            parameters:
              objects: | 
                - objectName: "s3-secret-new" # This line was changed
                  objectType: "secretsmanager"
                  jmesPath:
                      - path: "accesskey"
                        objectAlias: "access"
                      - path: "secretkey"
                        objectAlias: "secret"           
          
  4. Install AWS Secrets & Configuration Provider with Kubernetes Secrets Store CSI driver

    1. Run the following commands to enable OIDC and create an iamserviceaccount with permissions to retrieve the secrets created in AWS Secrets Manager
    eksctl utils associate-iam-oidc-provider --region=$CLUSTER_REGION --cluster=$CLUSTER_NAME --approve
    
    eksctl create iamserviceaccount  --name kubeflow-secrets-manager-sa  --namespace kubeflow  --cluster $CLUSTER_NAME --attach-policy-arn  arn:aws:iam::aws:policy/AmazonSSMReadOnlyAccess --attach-policy-arn arn:aws:iam::aws:policy/SecretsManagerReadWrite --override-existing-serviceaccounts   --approve --region $CLUSTER_REGION
    
    2. Run these commands to install the AWS Secrets & Configuration Provider with the Kubernetes Secrets Store CSI driver
     kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.0.0/deploy/rbac-secretproviderclass.yaml
     kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.0.0/deploy/csidriver.yaml
     kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.0.0/deploy/secrets-store.csi.x-k8s.io_secretproviderclasses.yaml
     kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.0.0/deploy/secrets-store.csi.x-k8s.io_secretproviderclasspodstatuses.yaml
     kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.0.0/deploy/secrets-store-csi-driver.yaml
     kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.0.0/deploy/rbac-secretprovidersyncing.yaml
    
     kubectl apply -f https://raw.githubusercontent.com/aws/secrets-store-csi-driver-provider-aws/main/deployment/aws-provider-installer.yaml
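
     To sanity-check the installation, you can list the driver and provider pods and confirm the service account exists (a quick sketch; exact pod names can differ between driver versions):

     kubectl get pods --all-namespaces | grep -i secrets-store
     kubectl get serviceaccount kubeflow-secrets-manager-sa -n kubeflow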
    
  5. Update the KFP configurations

    1. [RDS] Configure the RDS params file with the RDS endpoint URL and the metadata DB name.

      For example, if your RDS endpoint URL is rm12abc4krxxxxx.xxxxxxxxxxxx.us-west-2.rds.amazonaws.com and your metadata DB name is metadata_db, your params.env file should look like:

       dbHost=rm12abc4krxxxxx.xxxxxxxxxxxx.us-west-2.rds.amazonaws.com
       mlmdDb=metadata_db
      
    2. [S3] Configure the S3 params file with the S3 bucket name and S3 bucket region.

      For example, if your S3 bucket name is kf-aws-demo-bucket and your S3 bucket region is us-west-2, your params.env file should look like:

       bucketName=kf-aws-demo-bucket
       minioServiceHost=s3.amazonaws.com
       minioServiceRegion=us-west-2
      

3.0 Build Manifests and Install Kubeflow

Once you have the resources ready, you can continue on to deploying the Kubeflow manifests using one of the single-line commands below.

Choose one of the deployment options from below:

  • Deploying the configuration for both RDS and S3
  • Deploying the configuration for RDS only
  • Deploying the configuration for S3 only

[RDS and S3] Deploy both RDS and S3

while ! kustomize build docs/deployment/rds-s3 | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done

[RDS] Deploy RDS only

while ! kustomize build docs/deployment/rds-s3/rds-only | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done

[S3] Deploy S3 only

while ! kustomize build docs/deployment/rds-s3/s3-only | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
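
Whichever option you choose, the apply loop may retry a few times while resources such as CRDs are being registered. Once it succeeds, you can optionally watch the pods come up before logging in (a quick check, assuming kubectl is pointed at your cluster):

kubectl get pods -n kubeflow
kubectl get pods -n istio-system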

Once everything is installed successfully, you can access the Kubeflow Central Dashboard by logging in to your cluster.

Congratulations! You can now start experimenting and running your end-to-end ML workflows with Kubeflow.

4.0 Verify the installation

4.1 Verify RDS

  1. Connect to the RDS instance from a pod within the cluster
kubectl run -it --rm --image=mysql:5.7 --restart=Never mysql-client -- mysql -h <YOUR RDS ENDPOINT> -u <YOUR LOGIN> -p<YOUR PASSWORD>

Note that you can find your credentials in AWS Secrets Manager or by using the AWS CLI.

For instance, to retrieve the value of a secret named rds-secret, run:

aws secretsmanager get-secret-value \
    --region $CLUSTER_REGION \
    --secret-id rds-secret \
    --query 'SecretString' \
    --output text
  2. Once connected, verify that the databases kubeflow and mlpipeline exist.
mysql> show databases;

+--------------------+
| Database           |
+--------------------+
| information_schema |
| kubeflow           |
| mlpipeline         |
| mysql              |
| performance_schema |
+--------------------+
  3. Verify the database mlpipeline has the following tables:
mysql> use mlpipeline; show tables;

+----------------------+
| Tables_in_mlpipeline |
+----------------------+
| db_statuses          |
| default_experiments  |
| experiments          |
| jobs                 |
| pipeline_versions    |
| pipelines            |
| resource_references  |
| run_details          |
| run_metrics          |
+----------------------+
  4. Access the Kubeflow Central Dashboard by logging in to your cluster and navigate to Katib (under Experiments (AutoML)).

  5. Create an experiment using the following yaml file.

  6. Once the experiment is complete, verify that the following table exists:

mysql> use kubeflow; show tables;

+----------------------+
| Tables_in_kubeflow   |
+----------------------+
| observation_logs     |
+----------------------+
  7. Query observation_logs to verify that it is being populated.
mysql> select * from observation_logs;

4.2 Verify S3

  1. Access the Kubeflow Central Dashboard by logging in to your cluster and navigate to Kubeflow Pipelines (under Pipelines).

  2. Create an experiment named test and create a run using the sample pipeline [Demo] XGBoost - Iterative model training.

  3. Once the run is complete, go to the AWS S3 console and open the bucket you specified for the Kubeflow installation.

  4. Verify that the bucket is not empty and was populated by the outputs of the experiment (for example, with the CLI sketch below).
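
You can also check from the command line; a minimal sketch, assuming the AWS CLI is configured and $S3_BUCKET holds your bucket name:

# List objects written by the pipeline run (the output should not be empty)
aws s3 ls s3://$S3_BUCKET --recursive | head -n 20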

5.0 Uninstall Kubeflow

Run the following command to uninstall:

kustomize build docs/deployment/rds-s3 | kubectl delete -f -

Additionally, the following cleanup steps may be required:

kubectl delete mutatingwebhookconfigurations.admissionregistration.k8s.io webhook.eventing.knative.dev webhook.istio.networking.internal.knative.dev webhook.serving.knative.dev

kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io config.webhook.eventing.knative.dev config.webhook.istio.networking.internal.knative.dev config.webhook.serving.knative.dev

kubectl delete endpoints -n default mxnet-operator pytorch-operator tf-operator

To uninstall the AWS resources created by the automated setup, run the cleanup script:

  1. Navigate to the tests/e2e directory
cd tests/e2e
  2. Install the script dependencies pip install -r requirements.txt
  3. Make sure you have the configuration file created by the setup script at tests/e2e/utils/rds-s3/metadata.yaml
PYTHONPATH=.. python utils/rds-s3/auto-rds-s3-cleanup.py