Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perform rank-aware padding before taxonomy collapse #138

Closed

Conversation

FranckLejzerowicz
Copy link

This pull request fixes #137, by allowing rank-aware padding before taxonomy collapse, in order to avoid features duplication, when ranks are homogeneous.

@FranckLejzerowicz
Copy link
Author

I lack knowledge on how to run the tests using qiime taxa collapse directly, I have tested the "content" of the functions on a python interpreter, hence, I'm not sure about whether the unittesting will succeed in loading the test of PR'ed file test_util.py, which does the following:

import unittest

import pandas as pd
from q2_taxa._util import _get_padding_ranks


class PaddingTests(unittest.TestCase):


    def test_padding(self):

        # ideal case, no conflict at all ranks -> ranks inferred
        taxonomy = pd.Series([
            'k__a; p__b; o__c',
            'k__a; p__b; o__c',
            'k__a'
        ], index=['feat1', 'feat2', 'feat3'])
        observed = _get_padding_ranks(taxonomy)
        expected = ['k__', 'p__', 'o__']
        self.assertEqual(observed, expected)

Help!

@Oddant1
Copy link
Member

Oddant1 commented Oct 7, 2022

Hello @FranckLejzerowicz, sorry this PR fell by the wayside. We're starting to triage old PRs now. I fixed the merge conflict on this PR and updated the copyright year to make it continue passing lint. It looks like this PR was failing on your last commit, but that was long enough ago that we can't see what was making it fail. It is still failing now for presumably the same reasons, but I'm not sure because I can't check and make sure 😂. If you want to continue working on this, feel free. We can also try to take this over the line ourselves if you want.

@nbokulich
Copy link
Member

I personally do not think that this is a beneficial change, see my comments in #137 , and would lead to unintentionally misleading use by users. I would suggest creating a separate action in a different plugin. Instead of adding complexity to collapse, it might also be possible to instead use existing actions in RESCRIPt, e.g., to use edit-taxonomy to drop the empty ranks prior to collapsing, or if the missing ranks are wanted to use the backfilling functions to add the missing ranks to a taxonomy prior to collapsing.

@ebolyen
Copy link
Member

ebolyen commented Oct 28, 2022

Thanks for the initiative @FranckLejzerowicz, and thanks for the insight @nbokulich, I'm going to close this as a wont-fix.

@ebolyen ebolyen closed this Oct 28, 2022
@ebolyen ebolyen added the stat:wont-fix This is outside the scope/abilities/concerns of the project. label Oct 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stat:wont-fix This is outside the scope/abilities/concerns of the project.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Collapse will create different features for the same level, because of non rank-aware padding
4 participants