Commit

first commit

emattia committed Jul 28, 2023
0 parents commit 6bfee4f
Showing 10 changed files with 330 additions and 0 deletions.
32 changes: 32 additions & 0 deletions .github/workflows/assess_new_production_model.yml
@@ -0,0 +1,32 @@
name: Deploy new production model
on:
  push:
    branches: ['main']
jobs:
  deploy:
    runs-on: ubuntu-latest
    name: Evaluate model and deploy to production if successful
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v2
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          role-to-assume: <YOUR SERVICE PRINCIPAL ROLE ARN>
          aws-region: us-west-2
      - run: aws sts get-caller-identity
      - name: Set up Python 3.x
        uses: actions/setup-python@v1
        with:
          python-version: '3.10'
      - name: Install Outerbounds
        run: |
          python3 -m pip install --user outerbounds
      - name: Test flow
        env:
          METAFLOW_HOME: /tmp/.metaflowconfig
        run: |
          <YOUR OB CONFIGURE COMMAND FOR SERVICE PRINCIPALS>
          python evaluate_new_model_flow.py run --with card
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
__pycache__
.metaflow
110 changes: 110 additions & 0 deletions README.md
@@ -0,0 +1,110 @@
# GitHub Actions on Outerbounds Platform Demo
A basic repo structure to run CI/CD jobs on the Outerbounds platform.

## Related resources
[Github Actions x Outerbounds Service Principals](https://docs.google.com/document/d/1If-Nh4EY4cs5wDihWhnDglE-NKqu8Gv0-ZwXcw4cons/edit)
- This document describes how to set up a GitHub CI job that can run flows using Outerbounds Service Principals.

## Workflows

<img src="./static/lifecycle.png" style="display: block; float: left; max-width: 20%; height: auto; margin: auto; float: none!important;">

[<img src="./static/thumbnail.png" style="display: block; float: left; max-width: 20%; height: auto; margin: auto; float: none!important;">](https://www.youtube.com/watch?v=XnW5MXzMEW8)

### Engineer UX

#### Initial Setup
This is something the person who handles cloud engineering/security will do once.

##### 1. Find your AWS account ID as instructed [here](https://docs.google.com/document/d/1O0ap2_hnz8VHQqIhiCDUruNCFNKxhiwt9JTWePlYAnc/edit#heading=h.n2f7xpi062t8).
##### 2. Follow the [Allow Github Actions Permissions to Assume your IAM Role](https://docs.google.com/document/d/1O0ap2_hnz8VHQqIhiCDUruNCFNKxhiwt9JTWePlYAnc/edit#heading=h.5cp00dpcus00) section of the [Service Principals guide](https://docs.google.com/document/d/1O0ap2_hnz8VHQqIhiCDUruNCFNKxhiwt9JTWePlYAnc/edit).
##### 3. [Create a new Permission Policy for Service Principals](https://docs.google.com/document/d/1O0ap2_hnz8VHQqIhiCDUruNCFNKxhiwt9JTWePlYAnc/edit#heading=h.p55n5nuncamf).
##### 4. [Create a Service Principal in Outerbounds UI](https://docs.google.com/document/d/1O0ap2_hnz8VHQqIhiCDUruNCFNKxhiwt9JTWePlYAnc/edit#heading=h.tdalusawlhk1).

#### Modifying or adding a new CI/CD task in your action
This pattern may require the end user who writes the code that runs in the tasks of a FlowSpec to coordinate with the person who manages cloud engineering/security. It is where the permission to run the action with the Outerbounds platform service principal connects to the logic that:
- runs the GitHub Actions CI job that executes the FlowSpec,
- deploys a new FlowSpec to a production branch or an experimental branch for A/B & multi-armed bandit scenarios. You might use the Metaflow [client API](https://docs.metaflow.org/api/client) to determine when a run has met some criteria, as sketched below.
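
For the last point, here is a minimal sketch (not part of this repository) of gating on run results with the Metaflow client API. It mirrors the tagging logic in `evaluate_new_model_flow.py` and assumes the `accuracy` threshold from `constants.py`.

```
# A minimal sketch of gating on run results with the Metaflow client API.
# Assumes a flow named EvaluateNewModel that stores an `eval_metrics` artifact.
from metaflow import Flow

run = Flow("EvaluateNewModel").latest_successful_run
if run is not None and run.data.eval_metrics["accuracy"] >= 90:  # threshold mirrors constants.py
    # Tag the run so downstream automation can treat it as deployable.
    run.add_tag("deployment_candidate")
else:
    print("Latest run does not meet the deployment criteria.")
```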

#### The workflow
##### 1. Identify the GitHub organization containing the repository where you want to add a new GitHub action. For example, this repository would be the “outerbounds” organization and “github-actions-on-obp-demo” repository.
##### 2. Go to steps 5 and 6 of the [Create and Configure your IAM Role](https://docs.google.com/document/d/1If-Nh4EY4cs5wDihWhnDglE-NKqu8Gv0-ZwXcw4cons/edit) section, and follow the instructions to add the action you want to the trust policy of your service principal's IAM role. For example, here we define an action that runs when new code is pushed directly or merged to the main branch of the repository.

<img src="./static/trust-policy-git-action.png" style="display: block; float: left; max-width: 20%; height: auto; margin: auto; float: none!important;">

##### 3. [Define the GitHub action](https://docs.google.com/document/d/1O0ap2_hnz8VHQqIhiCDUruNCFNKxhiwt9JTWePlYAnc/edit#heading=h.shunrk8q1a9d) in your GitHub repository with your FlowSpec code and its dependencies.

There are two key pieces to look at in the example at `.github/workflows/assess_new_production_model.yml` to get this to work for your service principal. Descriptions are annotated inside `<>` in the following snippet. You will find the ARN in your AWS account in the IAM Role for the service principal, and you will find your Outerbounds configure command in the Outerbounds platform UI where you have connected the service principal to your account as a machine identity.
```
name: Deploy new production model
...
jobs:
  deploy:
    ...
    steps:
      - uses: actions/checkout@v2
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          role-to-assume: <YOUR SERVICE PRINCIPAL ROLE ARN>
          ...
      - name: Test flow
        ...
        run: |
          <YOUR OB CONFIGURE COMMAND FOR SERVICE PRINCIPALS>
          python evaluate_new_model_flow.py run --with card
```


### Data Scientist UX
Our goal is to update the model used in the `Predict` workflow defined in `predict_flow.py`. As a starting point for the CI/CD lifecycle, consider how a data scientist iterates locally or on a cloud workstation.

This repository demonstrates how the data scientist can:
- take the result of such experimentation,
- create a GitHub branch,
- let an automatic CI/CD process built with GitHub Actions validate the model's quality (using Outerbounds platform resources),
- and only if the new model code meets certain user-defined criteria, automatically deploy the newly trained model to be used in the production workflow that makes predictions accessed by other production applications.


### Deploy the `Predict` workflow to production
A data scientist or ML engineer will do this rarely, and typically less often than the model selection/architecture in `my_data_science_module.py` changes.
It only needs to be done when the code in the `predict_flow.py` file changes.
```
python predict_flow.py --production argo-workflows create
```

#### Manually trigger the production workflow
This is a way to manually trigger a refresh of the production run that populates the model prediction cache accessed by other production applications.
```
python predict_flow.py --production argo-workflows trigger
```

### Development phase: Local iteration on `EvaluateNewModel`
Local/workstation testing:
```
python evaluate_new_model_flow.py run
```
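
After a local run completes, you can inspect its artifacts with the Metaflow [client API](https://docs.metaflow.org/api/client) before opening a pull request. A minimal sketch, assuming the flow has been run at least once locally:
```
# Inspect the latest local run's evaluation metrics before pushing the code.
from metaflow import Flow

run = Flow("EvaluateNewModel").latest_run
print(run.id, run.successful, run.data.eval_metrics)
```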

### Moving to production phase: a template for a CI/CD process using GitHub Actions
When a data scientist is satisfied with what they see in local runs, they can use standard Git commands, just like a regular software development workflow:
```
git switch -c 'my-new-model-branch'
git add .
git commit -m 'A model I think is ready for production'
git push --set-upstream origin my-new-model-branch
```

After the code is pushed to the remote `my-new-model-branch` branch, the data scientist or an engineering colleague can open a pull request against the main branch. When this pull request is merged to the `main` branch of the repository, the GitHub action defined in `.github/workflows/assess_new_production_model.yml` is triggered. To explore more complex patterns like this that you can implement with GitHub Actions, see step 5 of the [Create and Configure your IAM Role](https://docs.google.com/document/d/1If-Nh4EY4cs5wDihWhnDglE-NKqu8Gv0-ZwXcw4cons/edit) section and the many types of [events you can use to trigger a GitHub Action](https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows).

The GitHub Action in this template will do the following:
1. Run the `EvaluateNewModel` workflow defined in `evaluate_new_model_flow.py`.
2. If the `EvaluateNewModel` workflow produces a model that meets some user-defined criteria (e.g., beyond some performance metric threshold), then tag the Metaflow run in which the model was trained as a `deployment_candidate`.
3. If the upstream `EvaluateNewModel` run is tagged as a `deployment_candidate` and the model meets any other criteria you add to this template, then the production `Predict` workflow defined in `predict_flow.py` will use the new version of the model on an ongoing basis (a condensed sketch of this model lookup follows below).
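
For reference, the production flow resolves which model to use with logic along these lines; this is a condensed sketch of the `fetch_default_run_id` helper in `predict_flow.py`, using the CI/CD namespace defined in `constants.py`.
```
# Condensed sketch of how predict_flow.py selects the model to serve:
# scan the CI/CD namespace for the newest successful run tagged as a deployment candidate.
from metaflow import Flow, namespace

namespace("user:my-sp-1")  # CICD_NAMESPACE from constants.py
for run in Flow("EvaluateNewModel"):
    if "deployment_candidate" in run.tags and run.successful and run.data.model is not None:
        model = run.data.model  # this model backs the production predictions
        break
```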
12 changes: 12 additions & 0 deletions constants.py
@@ -0,0 +1,12 @@
### model training constants
EVALUATE_DEPLOYMENT_CANDIDATES_COMMAND = ["python", "evaluate_deployment_candidates.py"]

# This is the threshold that determines whether a model is a candidate for deployment.
# In practice, you might define this by comparing the result against a baseline model's performance.
PERFORMANCE_THRESHOLDS = {
    'accuracy': 90
}

### prediction constants
UPSTREAM_FLOW_NAME = "EvaluateNewModel"
CICD_NAMESPACE = "user:my-sp-1"
54 changes: 54 additions & 0 deletions evaluate_new_model_flow.py
@@ -0,0 +1,54 @@
from metaflow import FlowSpec, step, Parameter, current, Flow, catch, retry
from constants import (
    PERFORMANCE_THRESHOLDS,
    EVALUATE_DEPLOYMENT_CANDIDATES_COMMAND,
)


class EvaluateNewModel(FlowSpec):

    """
    A workflow to train a model and evaluate its performance.
    A data scientist may wish to run this locally after making edits to my_data_science_module.py.
    This will run in the CI/CD process via GitHub Actions on the Outerbounds platform.
    """

    data_param = Parameter("data_input", help="Input to the model.", default=0)

    @catch(var="model_evaluation_error")
    @retry(times=3)
    @step
    def start(self):
        "Train and evaluate a model defined in my_data_science_module.py."

        # Import my organization's custom modules.
        from my_data_science_module import MyDataLoader, MyModel

        # Load some data.
        self.train_data = MyDataLoader().load(input=self.data_param)
        # In this toy example, the "data loader" returns the same value it is given (a no-op).
        # In practice this may return a tabular dataframe or a DataLoader object for images or text.

        # Simulate scores measured on your model's performance.
        self.model = MyModel()  # When this flow passes your CI/CD criteria, this artifact will be used in production to produce predictions.
        self.eval_metrics = self.model.score(data=self.train_data)
        # In this toy example, the "model evaluation" just returns a fixed accuracy score.

        self.next(self.end)

    @step
    def end(self):
        # A simple example of how to use Metaflow's tags to mark a run as a candidate for deployment.
        # In practice, you might want to add additional conditions to determine the model's suitability for production.
        # For example, you may want to run a test suite over the APIs called in upstream steps, such as MyDataLoader().load().
        if self.eval_metrics['accuracy'] >= PERFORMANCE_THRESHOLDS['accuracy']:
            run = Flow(current.flow_name)[current.run_id]
            run.add_tag("deployment_candidate")
        else:
            print(
                f"Run {current.run_id} did not meet production performance threshold."
            )


if __name__ == "__main__":
    EvaluateNewModel()
51 changes: 51 additions & 0 deletions my_data_science_module.py
@@ -0,0 +1,51 @@
class MyDataLoader:
    def __init__(self):
        pass

    def load(self, input):
        """
        A toy function that returns an integer.
        This function mimics loading data from your warehouse / lake.
        In this case we return a single number to reduce complexity.
        In practice this might be a dataframe, PyTorch DataLoader, etc.
        """
        my_dataset_or_dataloader = input
        return my_dataset_or_dataloader


class MyModel:
    def __init__(self):
        pass

    def predict(self, data):
        """
        A toy function that returns the input plus two.
        This function mimics a prediction from a model.
        """
        return data + 2  # a very silly "model"

    def score(self, data):
        """
        A toy function that returns a fixed accuracy score.
        This function mimics an evaluation of a model's performance.
        """
        return {'accuracy': 100.}


class MyPredictionStore:
    def __init__(self):
        self.store_url = "https://my-prediction-store.com"

    def cache_new_preds(self, preds=None):
        """
        Logic to push a model's predictions to a cache accessible by other production apps.
        This definition is just a placeholder.
        For a realistic example of doing this using DynamoDB, see: https://outerbounds.com/docs/recsys-tutorial-S2E4/.
        There are many patterns for storing predictions.
        """
        assert (
            preds is not None
        ), "Not a valid set of predictions... Not overwriting the current prediction cache."
        # You probably want to insert other logic here, such as ensuring the predictions are properly formed,
        # and that the cache contents you are about to replace are versioned/backed up somewhere in case you need to roll back.
        print(f"Pushing predictions to {self.store_url}")
69 changes: 69 additions & 0 deletions predict_flow.py
@@ -0,0 +1,69 @@
from metaflow import FlowSpec, step, Flow, Parameter, schedule, project, namespace
from constants import UPSTREAM_FLOW_NAME, CICD_NAMESPACE


def fetch_default_run_id():
    """
    Return the run id of the latest successful upstream flow's deployment_candidate.
    In practice, you will want far more rigorous conditions.
    For example, you might want to smoke test the model rather than just assert that it is not None.
    """
    namespace(CICD_NAMESPACE)
    for run in Flow(UPSTREAM_FLOW_NAME):
        if (
            "deployment_candidate" in run.tags
            and run.successful
            and run.data.model is not None
        ):
            return run.id


@project(name="batch_prediction_cicd_on_obp")
@schedule(daily=True)
class Predict(FlowSpec):
    "A FlowSpec to run predictions on the Outerbounds platform using a model vetted in a CI/CD process."

    data_param = Parameter("data_param", help="Input to the model.", default=0)
    upstream_run_id = Parameter(
        "upstream-run-id",
        help="The run ID of the upstream flow.",
        default=fetch_default_run_id(),
    )

    @step
    def start(self):
        # Validate that the upstream_run_id returned by fetch_default_run_id is usable.
        if self.upstream_run_id is None:
            raise ValueError(
                "Please provide the run ID of the upstream flow as a parameter in: python predict_flow.py run --upstream-run-id <ID>"
            )
        print("Using upstream run with ID: ", self.upstream_run_id)

        # Import my organization's custom modules.
        from my_data_science_module import MyDataLoader, MyPredictionStore

        # Load some data.
        self.train_data = MyDataLoader().load(input=self.data_param)
        # In this toy example, the "data loader" simply returns self.data_param unchanged (a no-op) to keep it simple.
        # In practice this might return a tabular dataframe or a DataLoader object for images or text.

        # Load the model from the upstream run.
        upstream_run = Flow(UPSTREAM_FLOW_NAME)[self.upstream_run_id]
        model = upstream_run.data.model
        # Notice we don't need to store the model as an artifact, since we can use self.upstream_run_id to fetch it.

        # Make predictions and cache them in a prediction store accessible to your production apps.
        self.predictions = model.predict(data=self.train_data)

        production_store_handler = MyPredictionStore()
        production_store_handler.cache_new_preds(preds=self.predictions)

        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    Predict()
Binary file added static/lifecycle.png
Binary file added static/thumbnail.png
Binary file added static/trust-policy-git-action.png
