[BUG] Extractor pipeline extracting additional data which are not specified in configuration.extractor.yaml #421

DibyaRanjan1 · 2023-11-16T15:00:37Z

Release version

latest

Describe the bug

Below is the content of my configuration.extractor.yaml. Many other resource types are also extracted, which makes the pipeline slow and is unnecessary. I was expecting only the API, named values, and product to be extracted.

apiNames:
  - apiminitiativeprivate

namedValueNames:
  - apiminitiativeprivate-config

productNames:
  - apiminitiativeprivate

backendNames: []

diagnosticNames: []

loggerNames: []

subscriptionNames: []

tagNames: []

policyFragmentNames: []

Expected behavior

I was expecting only api, namedvalues and product to be extracted.

Actual behavior

Many resources such as backends, diagnostics, gateways, loggers, policy fragments, subscriptions, and tags are downloaded even though they are not listed in configuration.extractor.yaml. We have a lot of data in our APIM instance, so this slows the pipeline down, and the data is downloaded on every pipeline run.

Reproduction Steps

1. Created a feature branch using git checkout -b feature/api-code-changes2
2. Modified configuration.extractor.yaml as follows:

apiNames:
  - apiminitiativeprivate

namedValueNames:
  - apiminitiativeprivate-config

productNames:
  - apiminitiativeprivate

backendNames: []

diagnosticNames: []

loggerNames: []

subscriptionNames: []

tagNames: []

policyFragmentNames: []

3. Ran the extractor pipeline using feature/api-code-changes2.
4. The pipeline downloads a lot of data that is not needed. The API, named values, and product are extracted correctly, as expected.

github-actions · 2023-11-16T15:00:51Z

Thank you for opening this issue! Please be patient while we look into it and get back to you, as this is an open source project. In the meantime, make sure you take a look at the [closed issues](https://github.com/Azure/apiops/issues?q=is%3Aissue+is%3Aclosed) in case your question has already been answered. Don't forget to provide any additional information if needed (e.g. scrubbed logs, detailed feature requests, etc.).
Whenever it's feasible, please don't hesitate to send a Pull Request (PR) our way. We'd greatly appreciate it, and we'll gladly assess and incorporate your changes.

guythetechie · 2023-11-16T18:12:01Z

@DibyaRanjan1 - We don't support filtering every type. Looking at the code, I see support for these resource types:

        public IEnumerable<string>? ApiNamesToExport { get; init; }
        public IEnumerable<string>? LoggerNamesToExport { get; init; }
        public IEnumerable<string>? DiagnosticNamesToExport { get; init; }
        public IEnumerable<string>? NamedValueNamesToExport { get; init; }
        public IEnumerable<string>? ProductNamesToExport { get; init; }
        public IEnumerable<string>? BackendNamesToExport { get; init; }
        public IEnumerable<string>? TagNamesToExport { get; init; }
        public IEnumerable<string>? SubscriptionNamesToExport { get; init; }
        public IEnumerable<string>? PolicyFragmentNamesToExport { get; init; }

If the extractor downloads more resources than you need, you can always delete those resources from the branch created by the pipeline before merging it into your main branch.

waelkdouh · 2023-11-16T18:20:12Z

Here are the supporting docs.

waelkdouh · 2023-11-16T19:20:25Z

@DibyaRanjan1 you are correct. The empty array doesn't work. Rather than specifying an empty array, you can put a bogus name, like backendNames: [ignore]. It's just the way YAML configurations work with .NET configuration: passing a field with an empty array is essentially the same as not passing that field at all, so it gets ignored. We went ahead and updated the docs to reflect this.
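
To illustrate the behavior described above, here is a minimal sketch in Python. It is not the extractor's actual code (which is .NET); `should_extract` is a hypothetical helper, and it assumes that a missing filter means "extract everything" and that an empty list ends up looking the same as a missing one.

```python
from typing import Optional, Sequence


def should_extract(name: str, names_to_export: Optional[Sequence[str]]) -> bool:
    """Hypothetical filter: decide whether a resource should be extracted."""
    # Assumption: an empty list behaves like a missing key, so a filter
    # that is None or [] applies no filtering at all.
    if not names_to_export:
        return True
    # A non-empty list restricts extraction to the listed names.
    return name in names_to_export


# An empty list does not filter anything out...
print(should_extract("my-backend", []))
# ...while a placeholder name that matches no real resource does.
print(should_extract("my-backend", ["ignore"]))
```

Under these assumptions, the placeholder only works as long as no real resource is actually named ignore.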

DibyaRanjan1 · 2023-11-17T06:35:51Z

@waelkdouh Thank you. It worked. I can still see all the version sets. I request you to support version sets in the extractor configuration.

waelkdouh · 2023-11-17T12:24:10Z

Since it's not a top priority for us right now, it would be great if you could submit a pull request, and we will be more than happy to merge it.

ghost · 2024-02-04T16:05:45Z

I am facing the same issue as DibyaRanjan1. I only want to extract apiNames, namedValueNames and productNames. What is the proper syntax to ignore the others please?

I have tried:
backendNames:[ignore]
(in the docs there is no space after the colon, but this is underlined in red in Visual Studio)

backendNames: [ignore]
(what @waelkdouh suggested above, with a space after the colon, but it didn't work for me)

backendNames:
  - ignore
(this worked for me, but not always. It only works when I update the configuration.extractor file before running the extractor. If I don't update the file before running the extractor, it extracts everything)

guythetechie · 2024-02-05T21:36:17Z

@eyvictorye - your configuration.extractor.yaml should look like this:

backendNames:
  - ignore

loggerNames:
  - ignore

...etc

As noted above, we only support ignoring certain resources.
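
If you generate the configuration programmatically, the workaround can be applied to the parsed settings before they are written back to configuration.extractor.yaml. The following is only a sketch: the key list is copied from the configuration shown in this issue, and `FILTERABLE_KEYS`, `PLACEHOLDER`, and `apply_placeholders` are hypothetical names that are not part of apiops.

```python
from typing import Dict, List, Optional

# Keys copied from the configuration.extractor.yaml shown in this issue.
FILTERABLE_KEYS = [
    "apiNames",
    "backendNames",
    "diagnosticNames",
    "loggerNames",
    "namedValueNames",
    "policyFragmentNames",
    "productNames",
    "subscriptionNames",
    "tagNames",
]

# Assumption: no real resource in the APIM instance has this name.
PLACEHOLDER = "ignore"


def apply_placeholders(config: Dict[str, Optional[List[str]]]) -> Dict[str, List[str]]:
    """Return a copy of config where empty or missing filters hold a placeholder."""
    # Copy every entry so the caller's dictionary is left untouched.
    result = {key: list(value or []) for key, value in config.items()}
    for key in FILTERABLE_KEYS:
        if not result.get(key):
            result[key] = [PLACEHOLDER]
    return result


settings = apply_placeholders({
    "apiNames": ["apiminitiativeprivate"],
    "backendNames": [],
})
print(settings["backendNames"])
```

Note that this sketch also fills in keys that are absent from the input, so any resource type you do want extracted in full has to be listed by name.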

ghost · 2024-02-06T15:25:47Z

@guythetechie yes, my configuration.extractor.yaml file looks just like that, and I am only ignoring resources that can be ignored according to the list above. It is just that sometimes it extracts only the resources I specify, and sometimes it extracts everything, even though I checked the option to use my configuration file for the extractor pipeline run.

waelkdouh added the question (Further information is requested) label Nov 16, 2023

waelkdouh closed this as completed Nov 16, 2023
