Manifests scanning performance improvements #68
Merged
This PR fixes https://github.com/kubeshop/monokle-saas/issues/2258.
TL;DR: There are not many changes here, since the main issue was solved by a recent PR. See Analysis below.
Changes
Ignore `node_modules` dirs by default. Since adding proper support for `.gitignore`s is not trivial, I think this is a good enough solution for now (especially since it wasn't the cause here, or at least not the main one). For gitignores, I extracted a separate issue; see the reasoning in the issue itself: Support `.gitignore` when scanning workspace for manifests #71.

Fixes
`Activating extensions` status in VSC for so long), especially since reading YAMLs from the workspace (which is part of validation) takes around 10 to 20 seconds (see below).

Analysis
A bit on files finding
What is used now for scanning for manifests is the native `workspace.findFiles`. It takes into account files ignored in VSC via `files.exclude` but does not look at `.gitignore`.

Now, measuring times for the `monokle-saas` repo, it seems that it is not finding files but parsing them that takes the most time:

Interestingly, we got 50 resources from 837 files, which means most files are not K8s resources (I quickly checked, and just by ignoring `node_modules` we get down to 137 files and less than 5 seconds).

Still, the initial issue is not about time but about resources. Finding YAMLs is the only place where we do any file finding with globbing, so it is a solid candidate for triggering resource-intensive `rg` calls (and the first thing to check is whether excluding large dirs, like `node_modules`, makes it less CPU demanding).

On `rg`
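To make the `node_modules` exclusion from the file-finding notes above more concrete, here is a minimal sketch of the filtering idea on plain string paths. The helpers `isInExcludedDir` and `filterManifestCandidates` and the sample paths are hypothetical and are not code from this PR; in the extension the same effect would presumably be expressed through the exclude glob passed to `workspace.findFiles`.

```typescript
// Hypothetical helper: true when any directory segment of the path is excluded.
function isInExcludedDir(path: string, excludedDirs: string[]): boolean {
  const segments = path.split(/[\\/]/);
  // The last segment is the file name itself, so only directories are checked.
  return segments.slice(0, -1).some((segment) => excludedDirs.includes(segment));
}

// Hypothetical helper: keep YAML files that do not live under an excluded directory.
function filterManifestCandidates(
  paths: string[],
  excludedDirs: string[] = ["node_modules"]
): string[] {
  return paths.filter(
    (p) => /\.ya?ml$/i.test(p) && !isInExcludedDir(p, excludedDirs)
  );
}

// Made-up sample paths, only for illustration.
const samplePaths: string[] = [
  "k8s/deployment.yaml",
  "node_modules/some-pkg/.travis.yml",
  "packages/api/node_modules/dep/config.yaml",
  "charts/app/values.yml",
  "README.md",
];

const candidates = filterManifestCandidates(samplePaths);
// candidates: ["k8s/deployment.yaml", "charts/app/values.yml"]
```

Nested `node_modules` dirs are filtered too, which matters in monorepos where each package has its own dependencies.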
I tested how `rg` behaves when switching branches in the `monokle-saas` repo (from some old feature branch to main). What I noticed is that it mostly happens when there are lots of changes to pull, when switching to a new branch for the first time. So the procedure for me was:
`main` branch (with latest changes) in local test repo.
`Monokle: Validate`.
.0.6.5
First, I tested with the current `0.6.5` version. The results are similar to those mentioned in the initial issue: it goes a bit wild (output from `atop` with a 1s interval, each screenshot is the next snapshot):

With file watcher changes
There were significant changes to how files are processed in #62, so I also tested with those changes (not released yet). In particular, we got rid of the inefficient file-finding logic; notice lines 122 and 142 below:
vscode-monokle/src/utils/workspace.ts
Lines 121 to 142 in 63a0984
☝️ So for every workspace there was a watcher using globbing (L122). One thing that could happen is that, when you switch branches, it was triggered for multiple files, and for each file it would get resource ids (L142).
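The cost of that pattern can be illustrated with a toy model. This is only a sketch: it assumes, as a simplification, that handling one watcher event ends up scanning the whole workspace, and every name in it (`scanForYaml`, `costWithRescanPerEvent`, `costWithSingleScan`) is hypothetical. The single-scan variant is just one possible alternative for comparison, not necessarily what #62 does. The 837 files match the count measured for `monokle-saas`; the 100 changed files are an assumed figure for a branch switch.

```typescript
type ScanStats = { scans: number; filesVisited: number; matched: number };

// Stand-in for a glob-based lookup: it has to walk every workspace file.
function scanForYaml(workspaceFiles: string[], stats: ScanStats): string[] {
  stats.scans += 1;
  stats.filesVisited += workspaceFiles.length;
  return workspaceFiles.filter((f) => /\.ya?ml$/.test(f));
}

// Each watcher event triggers its own full scan.
function costWithRescanPerEvent(workspaceFiles: string[], changed: string[]): ScanStats {
  const stats: ScanStats = { scans: 0, filesVisited: 0, matched: 0 };
  for (const file of changed) {
    const yamls = scanForYaml(workspaceFiles, stats);
    if (yamls.includes(file)) stats.matched += 1;
  }
  return stats;
}

// One scan per batch of events, reused for every changed file.
function costWithSingleScan(workspaceFiles: string[], changed: string[]): ScanStats {
  const stats: ScanStats = { scans: 0, filesVisited: 0, matched: 0 };
  const yamls = new Set(scanForYaml(workspaceFiles, stats));
  for (const file of changed) {
    if (yamls.has(file)) stats.matched += 1;
  }
  return stats;
}

// 837 files as measured for monokle-saas; 100 changed files is an assumption.
const workspaceFiles = Array.from({ length: 837 }, (_, i) => `dir/file-${i}.yaml`);
const changedFiles = workspaceFiles.slice(0, 100);

const rescanStats = costWithRescanPerEvent(workspaceFiles, changedFiles);
const singleStats = costWithSingleScan(workspaceFiles, changedFiles);
// rescanStats: 100 scans, 83700 files visited
// singleStats: 1 scan, 837 files visited
```

Both variants match the same 100 files; only the amount of scanning differs, growing with (changed files × workspace size) in the first case.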
vscode-monokle/src/utils/workspace.ts
Lines 216 to 222 in 63a0984
☝️ Now, getting a resource id uses `findYamlFiles` for each file.
vscode-monokle/src/utils/workspace.ts
Lines 171 to 184 in 63a0984
☝️ And `findYamlFiles` scans the entire workspace again 😓 🙈 Sounds like a good party...

Anyway, as mentioned, this logic was reworked entirely. The results with the new logic:
First run:
Second run:
This looks really good. Unfortunately, I found a related regression too: #70. Even though `Monokle: Validate` is run as part of the test procedure, with the regression I also checked for `rg` processes during Monokle extension initialization, to see how the first repo scan behaves:

Still no `rg` party, which is good 👍

Checklist