stackrox: set e2e-benchmarking EXTRA_FLAGS to include --metrics-profile acs metrics url #57412
/pj-rehearse periodic-ci-openshift-qe-ocp-qe-perfscale-ci-main-aws-4.16-nightly-x86-control-plane-24nodes-acs
@davdhacs: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
@@ -55,7 +55,10 @@ tests:
env:
BASE_DOMAIN: perfscale.rox.systems
COMPUTE_NODE_REPLICAS: "24"
E2E_REPOSITORY: https://github.com/davdhacs/e2e-benchmarking
Just cut a new kube-burner-ocp version, v1.4.0, which includes the patch for metrics-profiles. You can override the kube-burner version for this step with the env var KUBE_BURNER_VERSION: https://github.com/cloud-bulldozer/e2e-benchmarking/blob/master/workloads/kube-burner-ocp-wrapper/run.sh#L11C23-L11C42
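For reference, a rough sketch of what that override might look like in this step's env block. KUBE_BURNER_VERSION, 1.4.0, `--metrics-profile`, `metrics-aggregated.yml`, and the metrics-acs.yml URL all appear in this PR; their placement next to the existing keys and the exact EXTRA_FLAGS string are assumptions, not the actual diff.

```yaml
env:
  BASE_DOMAIN: perfscale.rox.systems
  COMPUTE_NODE_REPLICAS: "24"
  # Assumed placement: pin the kube-burner-ocp release that carries the metrics-profile patch.
  KUBE_BURNER_VERSION: "1.4.0"
  # Assumed value: the built-in aggregated profile plus the ACS profile loaded by URL.
  EXTRA_FLAGS: "--metrics-profile metrics-aggregated.yml,https://raw.githubusercontent.com/stackrox/stackrox/refs/heads/master/tests/performance/scale/config/metrics-acs.yml"
```

With the version set this way, the change would no longer need to point E2E_REPOSITORY at a fork to pick up the patched binary, assuming the upstream wrapper honors the variable as described above.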
Thank you! I'll switch this PR to use that.
Test run used the metrics-aggregated.yml,*-acs.yaml:
[REHEARSALNOTIFIER]
A total of 214 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.
Is this ready to merge, @davdhacs? If so, please add pj-rehearsal-ack
/cc @jtaleric
lgtm - only concern I would have is the number of metrics some of these queries return 😨
@mtodor should we change these metrics to reduce the volume before this runs (often)?
/pj-rehearsal ack
/unhold
/pj-rehearse ack
@davdhacs: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
In the metrics file https://raw.githubusercontent.com/stackrox/stackrox/refs/heads/master/tests/performance/scale/config/metrics-acs.yml, you're capturing a lot of raw Prometheus time series. You should consider adding some aggregation expressions (sum, rate, histogram_quantile, etc.) to reduce the number of documents. Indexing that many documents can lead to performance issues in the Elasticsearch database.
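As an illustration of the kind of aggregation meant here, a minimal sketch in the kube-burner metrics-profile format (a list of `query` / `metricName` entries). The series names `example_acs_requests_total` and `example_acs_request_duration_seconds_bucket` are hypothetical placeholders, not metrics taken from metrics-acs.yml; the 2m rate window is also just an assumption.

```yaml
# Hypothetical series names: substitute the ones ACS actually exports.

# One document per namespace per sample instead of one per raw series.
- query: sum(rate(example_acs_requests_total[2m])) by (namespace)
  metricName: exampleAcsRequestRate

# Collapse histogram buckets into a single p99 latency series.
- query: histogram_quantile(0.99, sum(rate(example_acs_request_duration_seconds_bucket[2m])) by (le))
  metricName: exampleAcsRequestDurationP99
```

Grouping with `by (...)` on only the labels that matter is what keeps the number of indexed documents down; each extra label kept multiplies the series count.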
@mtodor If you're okay with this, please add a
/lgtm
@rsevilla87 could you /lgtm this PR for us? (We're not in the owners files for these.)
@mtodor and I discussed the volume of these metrics and decided to start with this, then iterate on aggregating and reducing the volume as we start using the data (and since we're in a separate Elasticsearch, we will not be a bad neighbor even if this is too much data right now).
ack!
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: davdhacs, jtaleric, mtodor
@davdhacs: all tests passed!
Test of kube-burner/kube-burner-ocp#111
New test-run with KUBE_BURNER_VERSION=1.4.0
The e2e-benchmarking log shows that the $cmd used to call kube-burner-ocp includes the EXTRA_FLAGS string with the new kube-burner-ocp arg:
The kube-burner log then shows that the metrics file was loaded from the URL and that the metrics targets inside that file are used: