Implement Automated Weekly Smoke Tests #4112

Open
andylizf opened this issue Oct 17, 2024 · 2 comments · May be fixed by #4113

Comments

@andylizf
Contributor

andylizf commented Oct 17, 2024

Implement Automated Weekly Smoke Tests

Problem

The smoke tests for SkyPilot (implemented in test_smoke.py) are currently run manually. Automating them to run on a weekly schedule would improve this process.

Proposed Solution

Implement an automated weekly smoke test run using a suitable CI/CD platform or automation tool, leveraging the existing test_smoke.py script.

Implementation Details

  1. Automation Setup:

    • Create a new workflow or job for weekly smoke tests
    • Schedule the workflow to run weekly
    • Use the existing test_smoke.py script in the automation
  2. Environment Setup:

    • Add necessary cloud credentials securely to the CI/CD platform
    • Ensure the test runner has the required dependencies installed
  3. Test Execution:

    • Run pytest tests/test_smoke.py --terminate-on-failure
    • Handle different test groups (AWS, GCP, Azure, etc.) as defined in the script
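
As a rough illustration of the test execution step above, here is a minimal Python sketch that builds one pytest invocation per cloud group. The `--terminate-on-failure` flag comes from this issue; the per-cloud selector flags (`--aws`, `--gcp`, `--azure`) and the helper names are assumptions made for illustration, so check the options actually defined for test_smoke.py before relying on them.

```python
import shlex

# Cloud groups named in this issue. The per-cloud flags built from these
# names (e.g. --aws) are an ASSUMPTION about how test_smoke.py selects
# groups; verify them against the options the test suite really defines.
CLOUD_GROUPS = ["aws", "gcp", "azure"]


def build_smoke_command(cloud: str) -> list[str]:
    """Return the pytest argv for one cloud group (hypothetical helper)."""
    return [
        "pytest",
        "tests/test_smoke.py",
        "--terminate-on-failure",  # flag taken from the issue text
        f"--{cloud}",              # assumed per-cloud selector flag
    ]


def build_all_commands(clouds=CLOUD_GROUPS) -> dict[str, list[str]]:
    """Map each cloud group to its pytest argv."""
    return {cloud: build_smoke_command(cloud) for cloud in clouds}


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe
    # to execute without cloud credentials.
    for cloud, argv in build_all_commands().items():
        print(f"[{cloud}] {shlex.join(argv)}")
```

Keeping one command per cloud group would let a scheduled job run the groups independently, so a failure in one cloud does not hide results from the others.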

Challenges to Address

  1. Credit Control:

    • Implement a mechanism to limit cloud credits used during automated tests
    • Consider setting up budget alerts or usage limits
  2. Test Stability and Retries:

    • Implement retry logic for transient failures
    • Define criteria for test failures vs. retries
    • Set a maximum number of retry attempts
  3. Multi-Cloud Testing:

    • Ensure tests cover all supported cloud providers as defined in test_smoke.py
    • Handle potential differences in setup or execution across clouds
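
For the retry item above, one possible shape is sketched below in Python: a failure is retried only if its output looks transient, up to a fixed number of attempts. The marker strings, the default attempt count, and the backoff interval are all illustrative assumptions, not values taken from SkyPilot or any cloud provider.

```python
import subprocess
import time

# Substrings treated as signs of a transient failure. These are
# ILLUSTRATIVE guesses, not messages taken from SkyPilot or a cloud API.
TRANSIENT_MARKERS = ("timed out", "rate limit", "temporarily unavailable")


def is_transient(output: str) -> bool:
    """Return True if the captured output contains a transient marker."""
    lowered = output.lower()
    return any(marker in lowered for marker in TRANSIENT_MARKERS)


def run_with_retries(argv, max_attempts=3, backoff_seconds=60.0,
                     run=subprocess.run, sleep=time.sleep):
    """Run argv, retrying only failures that look transient.

    Returns (passed, attempts_used). `run` and `sleep` are injectable so
    the logic can be exercised without launching real tests.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(argv, capture_output=True, text=True)
        if result.returncode == 0:
            return True, attempt
        output = (result.stdout or "") + (result.stderr or "")
        # A non-transient failure, or the last allowed attempt, is final.
        if not is_transient(output) or attempt == max_attempts:
            return False, attempt
        sleep(backoff_seconds * attempt)  # linear backoff between attempts
    return False, max_attempts
```

Matching on output text is a crude way to separate real failures from retryable ones; the actual criteria would need to be agreed on as part of this issue.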

Next Steps

  • Select and set up an appropriate automation tool or CI/CD platform
  • Implement the automated workflow
  • Implement credit control and retry mechanisms
  • Test the workflow across all supported cloud providers
  • Document the process
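
To make the credit control step more concrete, here is a small Python sketch of an estimate-based budget guard that refuses to start a test group once a weekly limit would be exceeded. The class names, the dollar limit, and the hourly rates are hypothetical; real spend would still need to be capped with provider-side budget alerts or usage limits, as noted under Challenges.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when a test group would push estimated spend over the limit."""


@dataclass
class BudgetGuard:
    """Tracks ESTIMATED spend for one run (hypothetical helper)."""
    limit_usd: float
    spent_usd: float = 0.0
    log: list = field(default_factory=list)

    def charge(self, label: str, hours: float, usd_per_hour: float) -> float:
        """Reserve the estimated cost of one test group.

        Returns the remaining budget, or raises BudgetExceeded without
        recording anything if the group does not fit.
        """
        cost = hours * usd_per_hour
        if self.spent_usd + cost > self.limit_usd:
            raise BudgetExceeded(
                f"{label}: estimated ${cost:.2f} exceeds remaining "
                f"${self.limit_usd - self.spent_usd:.2f}")
        self.spent_usd += cost
        self.log.append((label, cost))
        return self.limit_usd - self.spent_usd


if __name__ == "__main__":
    guard = BudgetGuard(limit_usd=10.0)       # made-up weekly limit
    print(guard.charge("aws", 2.0, 3.0))      # made-up duration and rate
```

Calling `charge` before each group starts means the run stops before launching resources it cannot afford, rather than after the money is spent.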

Feedback on implementation details and challenge mitigation strategies is welcome, particularly from those familiar with test_smoke.py and our current testing processes.

andylizf added a commit to andylizf/skypilot that referenced this issue Oct 17, 2024
Fixes skypilot-org#4112

Implement an automated weekly smoke test run using GitHub Actions, leveraging the existing `tests/test_smoke.py` script.

* Add a new GitHub Actions workflow file `.github/workflows/weekly-smoke-tests.yml`.
* Schedule the workflow to run weekly using cron syntax.
* Set up the environment and install dependencies.
* Run `pytest tests/test_smoke.py --terminate-on-failure`.
* Handle different test groups (AWS, GCP, Azure, Lambda, Kubernetes) as defined in the script.
* Add necessary cloud credentials as GitHub secrets.
* Upload test results as artifacts.
@asaiacai
Contributor

I have some Buildkite stuff I've already set up for GKE that would probably adapt well to running SkyPilot smoke tests. Happy to work with the team on sharing it, since I'm depending pretty heavily on SkyPilot + k8s right now.

@romilbhardwaj
Collaborator

Another cost optimization: we can run a k8s cluster as part of our CI and move many of the cloud-agnostic tests (e.g., those that test core SkyPilot functionality) to a Kubernetes cluster provisioned in GitHub Actions for the duration of the test.

See: https://github.com/marketplace/actions/kind-kubernetes-in-docker-action
Step-by-step blog: https://dev.to/kitarp29/running-kubernetes-on-github-actions-f2c

Need to evaluate cost-benefit (test migration effort vs reduced cloud cost) before we implement the above.
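
One simple way to frame that cost-benefit check is a break-even estimate: how many weekly runs it takes for the cloud savings to repay the migration effort. The Python sketch below uses a hypothetical function name and made-up inputs purely to show the arithmetic; none of the numbers come from this thread.

```python
import math


def break_even_weeks(migration_hours: float,
                     hourly_rate_usd: float,
                     weekly_cloud_savings_usd: float) -> float:
    """Weeks of weekly runs needed to recover the one-time migration cost.

    All three inputs are estimates the team would have to supply.
    """
    if weekly_cloud_savings_usd <= 0:
        return math.inf  # no savings means the migration never pays off
    migration_cost_usd = migration_hours * hourly_rate_usd
    return migration_cost_usd / weekly_cloud_savings_usd


if __name__ == "__main__":
    # Made-up example: 40 hours of work at $50/hour, saving $100 per week.
    print(break_even_weeks(40, 50, 100))
```

If the break-even point lands well beyond the expected lifetime of the current test suite, the migration is probably not worth doing on cost grounds alone.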

andylizf changed the title from "Implement Automated Weekly Smoke Tests with GitHub Actions" to "Implement Automated Weekly Smoke Tests" on Oct 18, 2024