[BUG] Extractor is trying to read whole APIM instance when specific config provided #610

Closed
nepsmaddy opened this issue Jul 31, 2024 · 10 comments

Comments

@nepsmaddy
Contributor

Release version

v6.0.1-rc1

Describe the bug

When specific resources are listed in the extractor configuration, the extractor should fetch only those resources and then finish.

This worked fine in the older v4.7.x releases, but the new release reads through all of the metadata, so the extractor takes a long time to run instead of seconds.

For example, my extractor configuration looks like this:

apiNames:
  - apim-health
backendNames: [ignore]
namedValueNames:
  - environment
  - region
productNames: [ignore]
tagNames:
  - finance-nocharge
diagnosticNames: [ignore]
loggerNames: [ignore]
policyFragmentNames: [ignore]
subscriptionNames: [ignore]

Expected behavior

Per the configuration, the extractor should scan only the specified API, named values, and tags. It should not process anything else before producing the artifacts.

Note: This is exactly how v4.7.x behaves, but not the current release, which scans through everything and takes a long time to do so.
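
The expected behavior could be sketched as a plain name filter. This is only a minimal Python illustration, not the extractor's actual code (the extractor is a .NET tool). The function `should_extract` and the dictionary layout are hypothetical, and the sketch assumes that a section missing from the configuration means "extract everything of that type", while `[ignore]` is simply a list containing a name that matches no real resource.

```python
# Hypothetical in-memory form of part of the configuration shown in this issue.
CONFIG: dict[str, list[str]] = {
    "apiNames": ["apim-health"],
    "backendNames": ["ignore"],
    "namedValueNames": ["environment", "region"],
    "tagNames": ["finance-nocharge"],
}


def should_extract(section: str, name: str, config: dict[str, list[str]]) -> bool:
    """Return True if the named resource is listed in its config section.

    Assumption: a section that is absent from the config means
    "extract every resource of that type".
    """
    allowed = config.get(section)
    if allowed is None:
        return True
    return name in allowed
```

With this filter, `should_extract("namedValueNames", "ami-optout-secret", CONFIG)` is false, which corresponds to the "is not in configuration and will be skipped" warnings quoted in this issue.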

Actual behavior

The extractor scans the resources listed in config.yaml as well as all other metadata in the APIM instance that is not required, logging a warning for each skipped resource. See the log snippet below.

warn: extractor.ShouldExtractFactory[0]
NamedValueName allegroinvoice-mugf-sappipo-prx-password is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName allegroinvoice-mugf-sappipo-prx-username is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-ccevents-password is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-ccevents-subscription-key is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-ccevents-username is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-optout-client-id is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-optout-password is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-optout-secret is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-optout-username is not in configuration and will be skipped.

Reproduction Steps

  1. Update configuration.extractor.yaml with a specific API and its related metadata.
  2. In the configuration, set some sections to [ignore], as in the example above.
  3. Run the extractor and observe the logs.

  Thank you for opening this issue! Please be patient while we look into it and get back to you, as this is an open source project. In the meantime, make sure you take a look at the [closed issues](https://github.com/Azure/apiops/issues?q=is%3Aissue+is%3Aclosed) in case your question has already been answered. Don't forget to provide any additional information if needed (e.g. scrubbed logs, detailed feature requests, etc.).
  Whenever it's feasible, please don't hesitate to send a Pull Request (PR) our way. We'd greatly appreciate it, and we'll gladly assess and incorporate your changes.

@guythetechie
Contributor

@nepsmaddy - just to be clear: is the extractor creating artifacts for resources that should be skipped? Or are you just noticing references to other resource names in the logs?

@nepsmaddy
Contributor Author

@guythetechie, yes, the logs are being created as well. In the first place, the extractor should not scan anything except what the configuration specifies. I also noticed this was working as expected in v4.7.0, but in the latest release it generates all of these logs as well.

@DSpirit
Contributor

DSpirit commented Aug 1, 2024

+1. In our environment this is quite time-consuming (300+ APIs). I don't know why the extractor needs to loop through all resources when a specific resource set is defined in the extractor config. Even worse, the extractor doesn't handle subscription read limits properly, so long-running operations often result in a SubscriptionRequestsThrottled error, e.g.:

System.Net.Http.HttpRequestException: HTTP request to URI https://management.azure.com/subscriptions/***/resourceGroups/***/providers/Microsoft.ApiManagement/service/***/apiVersionSets/***?api-version=2023-09-01-preview failed with status code 429. Content is '{"error":{"code":"SubscriptionRequestsThrottled","message":"Number of 'read' requests for subscription actor '***:***' exceeded. Please try again after '1' seconds after additional tokens are available. Refer to https://aka.ms/arm-throttling for additional information."}}'.

This occasionally causes the entire extraction to fail.
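
To illustrate the kind of 429 handling being asked for, here is a minimal Python sketch of retrying a throttled read. It is not the extractor's code and not the change from any PR. The helper `get_with_retry` and its `send` callable are hypothetical: `send` stands in for whatever HTTP client is used and is assumed to return a `(status, headers, body)` tuple. The sketch waits for the number of seconds given in a `Retry-After` header when one is present, and otherwise falls back to exponential backoff.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (status code, headers, body text).
Response = Tuple[int, Mapping[str, str], str]


def get_with_retry(
    send: Callable[[str], Response],
    url: str,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(url), retrying while the server answers 429."""
    for attempt in range(max_attempts):
        status, headers, body = send(url)
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # out of attempts, do not sleep again
        # Assumes the header key is spelled exactly "Retry-After";
        # a real client should look it up case-insensitively.
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** attempt  # exponential backoff
        sleep(delay)
    raise RuntimeError(f"still throttled after {max_attempts} attempts: {url}")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. A production version would likely also add jitter and a cap on the total wait.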

@guythetechie guythetechie moved this to 🏗 In progress in APIOPS Roadmap Aug 1, 2024
@guythetechie
Contributor

@nepsmaddy - we've always retrieved all APIs, then filtered by API name in configuration. v4.7.0 behaves the same way. There are two major differences:

  • We now log which resources were skipped to make it clear that some have not been extracted.
  • We also retrieve the specification contents before filtering the API. This is probably what's causing performance issues now; prior to v6, performance did not seem to be a problem.
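
The second point can be sketched as follows: if the name filter runs before the expensive per-API call, the cost scales with the number of configured APIs instead of the size of the instance. This is a hypothetical Python illustration, not the fix that was later pushed. `extract_specs` and `fetch_spec` are made-up names, with `fetch_spec` standing in for the request that downloads an API's specification.

```python
from typing import Callable, Iterable


def extract_specs(
    api_names: Iterable[str],
    wanted: Iterable[str],
    fetch_spec: Callable[[str], str],
) -> dict[str, str]:
    """Filter by name first, so fetch_spec runs only for configured APIs."""
    wanted_set = set(wanted)
    return {
        name: fetch_spec(name)  # expensive call, made only after filtering
        for name in api_names
        if name in wanted_set
    }
```

Listing the API names is still one (paged) call against the whole instance, but the specification download then happens only for the APIs named in the configuration.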

@DSpirit - could you define "quite time consuming"? How long does it take to run on 300+ APIs? Will add to our backlog for fixing, but prioritization will depend on how bad it is.

@DSpirit
Contributor

DSpirit commented Aug 1, 2024

After merging my change from #612, my pipeline retries became unnecessary, so extraction dropped from 25 minutes to 3-6 minutes for a single API. Extracting all assets takes about 8-11 minutes. This is acceptable; it only became really annoying because of the missing 429 handling. Sure, this could be improved for single-API extraction, but for now it's completely fine, since the APIM Resource Kit hasn't been any better :)

Thanks for the quick feedback today, really appreciate it!

@einzweirad

@guythetechie Nevertheless, this behavior in v6 is a huge problem for larger APIM instances (around 850 APIs).

In our case, the extractor runs for about 8 minutes to loop through the 1700 NamedValues alone. All the other parts (Tags, Products, Subscriptions, and so on) add up to a total of more than 50 minutes to extract a single API.

The same extraction for a single API, with everything else set to [ignore], finished in under 1 minute in v5.1.4.

@guythetechie
Contributor

Thanks for the feedback, all. Will prioritize addressing this.

@nepsmaddy
Contributor Author

Same for me: in v6, my extractor takes almost 50 minutes to complete a run.

@guythetechie guythetechie moved this from 🏗 In progress to ✅ Done in APIOPS Roadmap Aug 2, 2024
@guythetechie
Contributor

Fix pushed to main branch, should be deployed in our next release.
