[BUG] Extractor is trying to read whole APIM instance when specific config provided #610

nepsmaddy · 2024-07-31T15:23:54Z

Release version

v6.0.1-rc1

Describe the bug

When specific resources are listed in the extractor configuration, the extractor should fetch only those resources and then finish.

This worked fine in the old release v4.7.x, but the new release tries to read through all the metadata, so the extractor takes a long time to run instead of seconds.

As an example, consider the extractor config below.

apiNames:
  - apim-health
backendNames: [ignore]
namedValueNames:
  - environment
  - region
productNames: [ignore]
tagNames:
  - finance-nocharge
diagnosticNames: [ignore]
loggerNames: [ignore]
policyFragmentNames: [ignore]
subscriptionNames: [ignore]

Expected behavior

Per the config, the extractor should scan only the specified API, named values, and tags. It should not touch anything else, and should then produce the artifacts.

Note: This is exactly the behavior in 4.7.x, but not in the current release, which scans through everything and takes a long time to do so.
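To make the expected filtering concrete, here is a minimal Python sketch of a name-based predicate. It is not the extractor's actual code: the `CONFIG` dictionary and the `should_extract` function are hypothetical, and the sketch assumes that a section missing from the config means "extract everything", while a listed section (including the `[ignore]` placeholder) restricts extraction to the listed names.

```python
from typing import Mapping, Sequence

# Hypothetical in-memory form of part of the extractor configuration above.
# In YAML, `[ignore]` is simply a list containing the single name "ignore".
CONFIG: Mapping[str, Sequence[str]] = {
    "apiNames": ["apim-health"],
    "backendNames": ["ignore"],
    "namedValueNames": ["environment", "region"],
    "productNames": ["ignore"],
    "tagNames": ["finance-nocharge"],
}


def should_extract(config: Mapping[str, Sequence[str]], section: str, name: str) -> bool:
    """Return True when `name` is wanted for `section`.

    Assumption: a section that is absent from the config places no
    restriction, so every resource of that type is extracted.
    """
    allowed = config.get(section)
    if allowed is None:
        return True
    return name in allowed
```

With this config, `should_extract(CONFIG, "namedValueNames", "region")` is true, while any backend name is rejected because no backend is literally called `ignore`.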

Actual behavior

The extractor scans the resources listed in config.yaml as well as all other metadata of the APIM instance, which is not required; each such resource is logged as a warning and skipped. See the log snippet below.

warn: extractor.ShouldExtractFactory[0]
NamedValueName allegroinvoice-mugf-sappipo-prx-password is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName allegroinvoice-mugf-sappipo-prx-username is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-ccevents-password is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-ccevents-subscription-key is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-ccevents-username is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-optout-client-id is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-optout-password is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-optout-secret is not in configuration and will be skipped.
warn: extractor.ShouldExtractFactory[0]
NamedValueName ami-optout-username is not in configuration and will be skipped.
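For anyone who wants to quantify how much is being scanned, the warnings above can be tallied per resource type. The following Python sketch is only an illustration: `count_skipped` is a hypothetical helper, and the pattern assumes the message format stays exactly as shown in this snippet.

```python
import re
from collections import Counter

# Matches lines such as:
#   "NamedValueName ami-optout-secret is not in configuration and will be skipped."
# and captures the resource type ("NamedValue") in front of "Name".
SKIP_PATTERN = re.compile(
    r"^\s*(\w+)Name (\S+) is not in configuration and will be skipped\."
)


def count_skipped(log_text: str) -> Counter:
    """Count skipped-resource warnings per resource type."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = SKIP_PATTERN.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the saved extractor log gives one count per resource type, which is a rough measure of how many unneeded resources were visited.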

Reproduction Steps

Update configuration.extractor.yaml with a specific API and its related metadata.
In the configuration, ignore some resource types, as in the example above.
Run the extractor and observe the logs.

github-actions · 2024-07-31T15:24:06Z

  Thank you for opening this issue! Please be patient while we will look into it and get back to you as this is an open source project. In the meantime make sure you take a look at the [closed issues](https://github.com/Azure/apiops/issues?q=is%3Aissue+is%3Aclosed) in case your question has already been answered. Don't forget to provide any additional information if needed (e.g. scrubbed logs, detailed feature requests,etc.).
  Whenever it's feasible, please don't hesitate to send a Pull Request (PR) our way. We'd greatly appreciate it, and we'll gladly assess and incorporate your changes.

guythetechie · 2024-07-31T20:26:39Z

@nepsmaddy - just to be clear: is the extractor creating artifacts for resources that should be skipped? Or are you just noticing references to other resource names in the logs?

nepsmaddy · 2024-07-31T21:14:28Z

@guythetechie, yes, the log entries are being created as well. In the first place, the extractor should not scan anything outside the provided configuration. I also noticed this was working as expected in v4.7.0, but the latest version generates all these logs as well.

DSpirit · 2024-08-01T07:33:48Z

+1. In our environment this is quite time-consuming (300+ APIs). I don't know why the extractor needs to loop through all resources when a specific resource set is defined in the extractor config. Even worse, the extractor has no proper handling of subscription read limits, so long-running operations often result in a SubscriptionRequestsThrottled error, e.g.:

System.Net.Http.HttpRequestException: HTTP request to URI https://management.azure.com/subscriptions/***/resourceGroups/***/providers/Microsoft.ApiManagement/service/***/apiVersionSets/***?api-version=2023-09-01-preview failed with status code 429. Content is '{"error":{"code":"SubscriptionRequestsThrottled","message":"Number of 'read' requests for subscription actor '***:***' exceeded. Please try again after '1' seconds after additional tokens are available. Refer to https://aka.ms/arm-throttling for additional information."}}'.

This occasionally causes the entire extraction to fail.
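As an illustration of the kind of 429 handling meant here, the Python sketch below retries a throttled request, waiting for the number of seconds given in the `Retry-After` response header when one is present and falling back to exponential backoff otherwise. It is a sketch under assumptions, not the extractor's implementation: `send_with_retry`, the `Response` tuple, and the injected `send` callable are all hypothetical, and header lookup assumes the exact key `Retry-After`.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it stops returning 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        response = send()
        status, headers, _body = response
        if status != 429 or attempt >= max_attempts:
            return response
        # Prefer the server-provided wait time; otherwise back off exponentially.
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = default_delay * 2 ** (attempt - 1)
        sleep(delay)
        attempt += 1
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting; the last response is returned unchanged when every attempt is throttled, so the caller still sees the 429.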

guythetechie · 2024-08-01T12:49:28Z

@nepsmaddy - we've always retrieved all APIs, then filtered by API name in configuration. v4.7.0 behaves the same way. There are two major differences:

We now log which resources were skipped to make it clear that some have not been extracted.
We also retrieve the specification contents before filtering the API. This is probably what's causing performance issues now; prior to v6, performance did not seem to be a problem.
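The cost difference between the two orderings can be sketched in a few lines of Python. This is only an illustration of filtering before versus after an expensive per-resource fetch; it is not the code from this project, and `fetch_details`, `extract_filter_last`, and `extract_filter_first` are hypothetical names.

```python
from typing import Callable, Dict, Iterable, Set


def extract_filter_last(
    names: Iterable[str], fetch_details: Callable[[str], str], wanted: Set[str]
) -> Dict[str, str]:
    """Fetch details for every resource, then drop the unwanted ones."""
    fetched = {name: fetch_details(name) for name in names}
    return {name: details for name, details in fetched.items() if name in wanted}


def extract_filter_first(
    names: Iterable[str], fetch_details: Callable[[str], str], wanted: Set[str]
) -> Dict[str, str]:
    """Drop unwanted names first, so only wanted resources are fetched."""
    return {name: fetch_details(name) for name in names if name in wanted}


if __name__ == "__main__":
    calls = []

    def fetch_details(name: str) -> str:
        # Stand-in for an expensive request, e.g. downloading a specification.
        calls.append(name)
        return f"spec for {name}"

    names = [f"api-{i}" for i in range(300)] + ["apim-health"]
    extract_filter_last(names, fetch_details, {"apim-health"})
    print(len(calls))  # one fetch per listed resource
    calls.clear()
    extract_filter_first(names, fetch_details, {"apim-health"})
    print(len(calls))  # one fetch per wanted resource
```

Both functions return the same result; the only difference is how many times the expensive fetch runs, which grows with the size of the instance in the first ordering and with the size of the config in the second.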

@DSpirit - could you define "quite time consuming"? How long does it take to run on 300+ APIs? Will add to our backlog for fixing, but prioritization will depend on how bad it is.

DSpirit · 2024-08-01T15:15:44Z

After merging my change from #612, my pipeline retries became unnecessary, so extraction dropped from 25 minutes to 3-6 minutes for a single API. Extracting all assets takes about 8-11 minutes. This is acceptable; it was only the missing 429 handling that made it really annoying. Sure, this could be improved for single-API extraction, but for now it's completely fine, since the APIM Resource Kit hasn't been any better :)

Thanks for the quick feedback today, really appreciate it!

einzweirad · 2024-08-01T16:27:35Z

@guythetechie Nevertheless, this behavior in v6 is a huge problem for larger APIM instances (around 850 APIs).

In our case, the extractor runs for about 8 minutes to loop through the 1700 NamedValues alone. All the other parts (Tags, Products, Subscriptions and so on) add up to a total time of more than 50 minutes to extract a single API.

The same extraction for a single API, with everything else set to [ignore], finished in under 1 minute in v5.1.4.

guythetechie · 2024-08-01T20:41:18Z

Thanks for the feedback, all. Will prioritize addressing this.

nepsmaddy · 2024-08-02T09:16:21Z

Same for me in v6: my extractor takes almost 50 minutes to complete a run.

guythetechie · 2024-08-02T17:28:31Z

Fix pushed to main branch, should be deployed in our next release.

guythetechie added this to APIOPS Roadmap Aug 1, 2024

guythetechie moved this to 🏗 In progress in APIOPS Roadmap Aug 1, 2024

guythetechie pushed a commit that referenced this issue Aug 2, 2024

Optimize resource name filtering in extractor. Addresses #610.

f1c1231

guythetechie moved this from 🏗 In progress to ✅ Done in APIOPS Roadmap Aug 2, 2024

guythetechie closed this as completed Aug 2, 2024
