[BUG] Extractor pipeline extracting additional data which are not specified in configuration.extractor.yaml #421

Closed
DibyaRanjan1 opened this issue Nov 16, 2023 · 9 comments
Labels
question Further information is requested

Comments

@DibyaRanjan1

Release version

latest

Describe the bug

[BUG] Extractor pipeline extracting additional data which are not specified in configuration.extractor.yaml
Below is the content of my configuration.extractor.yaml. I can see that many other resources are extracted, which is unnecessary and makes the pipeline slow. I was expecting only the API, named values, and product to be extracted.

apiNames:
  - apiminitiativeprivate

namedValueNames:
  - apiminitiativeprivate-config

productNames:
  - apiminitiativeprivate

backendNames: []

diagnosticNames: []

loggerNames: []

subscriptionNames: []

tagNames: []

policyFragmentNames: []

Expected behavior

I was expecting only api, namedvalues and product to be extracted.

Actual behavior

Many resources, such as backends, diagnostics, gateways, loggers, policy fragments, subscriptions, and tags, are downloaded even though they are not mentioned in configuration.extractor.yaml. We have a lot of data in our APIM instance, so this makes the pipeline slow, and the data is downloaded again on every pipeline run.

Reproduction Steps

  1. Created a feature branch using git checkout -b feature/api-code-changes2
  2. Modified configuration.extractor.yaml as shown below:

     apiNames:
       - apiminitiativeprivate

     namedValueNames:
       - apiminitiativeprivate-config

     productNames:
       - apiminitiativeprivate

     backendNames: []

     diagnosticNames: []

     loggerNames: []

     subscriptionNames: []

     tagNames: []

     policyFragmentNames: []

  3. Ran the extractor pipeline using feature/api-code-changes2.
  4. The pipeline downloads a lot of data that is not needed. As expected, the API, named values, and product are extracted correctly.


  Thank you for opening this issue! Please be patient while we look into it and get back to you, as this is an open source project. In the meantime, make sure you take a look at the [closed issues](https://github.com/Azure/apiops/issues?q=is%3Aissue+is%3Aclosed) in case your question has already been answered. Don't forget to provide any additional information if needed (e.g. scrubbed logs, detailed feature requests, etc.).
  Whenever it's feasible, please don't hesitate to send a Pull Request (PR) our way. We'd greatly appreciate it, and we'll gladly assess and incorporate your changes.

@waelkdouh waelkdouh added the question Further information is requested label Nov 16, 2023
@guythetechie
Contributor

@DibyaRanjan1 - We don't support filtering every type. Looking at the code, I see support for these resource types:

        public IEnumerable<string>? ApiNamesToExport { get; init; }
        public IEnumerable<string>? LoggerNamesToExport { get; init; }
        public IEnumerable<string>? DiagnosticNamesToExport { get; init; }
        public IEnumerable<string>? NamedValueNamesToExport { get; init; }
        public IEnumerable<string>? ProductNamesToExport { get; init; }
        public IEnumerable<string>? BackendNamesToExport { get; init; }
        public IEnumerable<string>? TagNamesToExport { get; init; }
        public IEnumerable<string>? SubscriptionNamesToExport { get; init; }
        public IEnumerable<string>? PolicyFragmentNamesToExport { get; init; }

If the extractor downloads more resources than you need, you can always delete those resources from the branch created by the pipeline before merging it into your main branch.

@waelkdouh
Contributor

Here are the supporting docs.

@waelkdouh
Contributor

@DibyaRanjan1 you are correct. The empty array doesn't work. Rather than specifying an empty array, you can put in a bogus name, like backendNames: [ignore]. It's just the way YAML configurations work with .NET configuration. Passing a field with an empty array is essentially the same as not passing that field at all; it gets ignored. We went ahead and updated the docs to reflect this.
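
For illustration, here is how the configuration from the original report might look with this workaround applied. This is only a sketch, not an official sample: `ignore` is a placeholder and is assumed not to match the name of any real resource in the APIM instance.

```yaml
# configuration.extractor.yaml (sketch)
# Resources to extract, taken from the original report.
apiNames:
  - apiminitiativeprivate

namedValueNames:
  - apiminitiativeprivate-config

productNames:
  - apiminitiativeprivate

# "ignore" is a placeholder name (assumption: no real resource is called this),
# so each filter below matches nothing and that resource type is skipped.
# An empty array ([]) would be treated as if the key were absent.
backendNames:
  - ignore

diagnosticNames:
  - ignore

loggerNames:
  - ignore

subscriptionNames:
  - ignore

tagNames:
  - ignore

policyFragmentNames:
  - ignore
```

Resource types that have no filter key (for example gateways, which are not in the list of supported filters above) would still be extracted in full.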

@DibyaRanjan1
Author

@waelkdouh Thank you, it worked. However, I can still see all the version sets. Could you please add support for version sets in the extractor configuration?

@waelkdouh
Contributor

Since it's not a top priority for us right now, it would be great if you could submit a pull request, and we will be more than happy to merge it.

@ghost

ghost commented Feb 4, 2024

I am facing the same issue as DibyaRanjan1. I only want to extract apiNames, namedValueNames and productNames. What is the proper syntax to ignore the others please?

I have tried:

backendNames:[ignore]
(in the doc there is no space after the colon, but this is underlined in red in Visual Studio)

backendNames: [ignore]
(what @waelkdouh suggested above, with a space after the colon, but it didn't work for me)

backendNames:
  - ignore
(this worked for me, but not always. It only works when I update the configuration.extractor file before running the extractor. If I don't update the file before running the extractor, it extracts everything)

@guythetechie
Copy link
Contributor

@eyvictorye - your configuration.extractor.yaml should look like this:

backendNames:
  - ignore

loggerNames:
  - ignore

...etc

As noted above, we only support ignoring certain resources.

@ghost

ghost commented Feb 6, 2024

@guythetechie yes, my configuration.extractor.yaml file looks just like that, and I am only ignoring resources that can be ignored according to the list above. It is just that sometimes it extracts only the resources I specify, and sometimes it extracts everything, even though I checked the option to use my configuration file for the extractor pipeline run.
