Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #174: TreeSE support non-taxonomic ranks #201

Merged
merged 5 commits into from
Sep 16, 2023

Conversation

Daenarys8
Copy link
Contributor

#174
@antagomir with some suggestions, I can update the docs as well.

Copy link
Contributor

@antagomir antagomir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's more directly utilize the already existing and well tested functionality from mia instead of reimplementing checks here, if it can be avoided.

R/ancombc_prep.R Outdated
@@ -30,7 +30,7 @@ tse_construct = function(data, assay_name, tax_level, phyloseq) {

# Check if agglomeration should be performed
if (is.null(tax_level)) {
tax_levels = mia::taxonomyRanks(tse)
tax_levels = union(mia::taxonomyRanks(tse), rownames(SummarizedExperiment::rowData(tse)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I think this is mistaken. The rownames refer to individual taxa (rownames of rowData), not the high-level categories of those taxa (colnames of rowData). Moreover, we already have functionality to agglomerate TreeSE objects based on any rowData field, through the function mia::mergeFeatures() and for taxonomic ranks specifically via mia::mergeFeaturesByRank(). The internal function mia::.merge_features was intended to check whether the given field is in taxonomyRanks or not, and then choose the merging function based on that. That was the entire motivation for the recent PR work in mia. Therefore there is no need to take union of different field names here but instead we should use the background work that was already laid out in mia to automatically handle these things, if possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that mia::mergeFeatures() automatically checks whether the field is in taxonomy ranks or not and then calls the internal function based on that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK. mia::mergeFeaturesByRank was not exported. There is a PR for that now. Similarly mia::.merge_features is not exported. mia::mergeFeatures() deprecated mia::mergeRows.
The union to give tax_levels is to display in the message the possible tax_level params when it is NULL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .merge_features is internal to mia and should not be imported. But I think it can also be called if necessary, like with mia:::.merge_features (/w three double colons). To check.

Let's first merge the relevant PRs in mia and solve this based on those.

We can discuss in Slack or in mia issues whether we should have exported function that combines these two, or do we want to always separate them and create different combinations for different situations and packages like here now?

R/ancombc_prep.R Outdated Show resolved Hide resolved
R/ancombc_prep.R Outdated Show resolved Hide resolved
R/ancombc_prep.R Outdated
Comment on lines 79 to 87
# Check if tax_level parameter belongs to taxonomyRanks
if (is.character(tax_level) && length(tax_level) == 1 && tax_level %in% mia::taxonomyRanks(tse)) {
#Merge using agglomerateByRank
tse_alt <- mia::agglomerateByRank(tse, tax_level)
} else {
# Merge using mergeRows
tse_alt <- mia::mergeRows(tse, tax_level)
}
tse_alt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we export mia::.merge_features?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No but I think you can it with triple semicolon. Check with TB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notably mia:::.merge_features is available from mia 1.9.9

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As pointed out by @Daenarys8 that the mia:::.merge_features function becomes available starting from mia version 1.9.9, while the currently released version of mia on Bioconductor remains at 1.8.0. Consequently, the ".merge_features" function is currently not accessible.

> mia:::.merge_features
Error: object '.merge_features' not found

I'm wondering the best approach to import this function. One option is to consider copying the function directly from this source into the ANCOMBC utils functions. I will update the package accordingly in the future once this function becomes available in the release version of mia. Any ideas? @Daenarys8 @antagomir

Thank you,
Huang

@antagomir
Copy link
Contributor

Updating the docs would be very useful. Which docs exactly? manpages, vignettes, in ANCOMBC?

Signed-off-by: Daenarys8 <[email protected]>
Signed-off-by: Daenarys8 <[email protected]>
Signed-off-by: Daenarys8 <[email protected]>
Copy link
Contributor

@antagomir antagomir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also check that the roxygen documentation is up-to-date, and the roxygen examples field and vignette include an example that is up-to-date.

R/ancombc_prep.R Outdated
@@ -30,7 +30,7 @@ tse_construct = function(data, assay_name, tax_level, phyloseq) {

# Check if agglomeration should be performed
if (is.null(tax_level)) {
tax_levels = mia::taxonomyRanks(tse)
tax_levels = union(mia::taxonomyRanks(tse), colnames(SummarizedExperiment::rowData(tse)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? the rowData may contain also other features than taxonomic ranks, do we need to define them as tax_levels? What is the intended purpose of this line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily

R/ancombc_prep.R Outdated
@@ -57,7 +57,7 @@ tse_construct = function(data, assay_name, tax_level, phyloseq) {
}

if (is.null(tax_level)) {
tax_levels = mia::taxonomyRanks(tse)
tax_levels = union(mia::taxonomyRanks(tse), colnames(SummarizedExperiment::rowData(tse)))
txt = sprintf(paste0("`tax_level` is not speficified \n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Owner

@FrederickHuangLin FrederickHuangLin Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Daenarys8 @antagomir Thank you both for addressing the taxonomy rank issue, I really appreciate it. I seem to have lost track of the conversation. Could you please confirm whether the commit is now ready for merging? I tend to publish a new version of ANCOMBC in a week. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit is ready.

@antagomir
Copy link
Contributor

In fact I was wondering the same if @Daenarys8 can comment, thanks!

Signed-off-by: Daenarys8 <[email protected]>
@antagomir
Copy link
Contributor

It is ready as @Daenarys8 says but I would still have some requests if time allows. Otherwise, we can add these in the next round of updates.

  1. The Bioc framework has switched from argument assay_name to assay.type and it would be important to make this update also in ANCOMBC for consistency
  2. The tax_level is used in phyloseq ( I think) but in TreeSE world we use rank. Could we add an alias for the argument, so that one could also use "rank" as in TreeSE?
  3. It seems possible to use the parameter passing (...) in function calls more than now, this would help to avoid specifying the arguments many times (see mia for examples). This change might take a bit more thinking and is easier to postpone if things otherwise work as expected.

SOme more testing about interoperability will be useful but also easier after merging this.

@FrederickHuangLin
Copy link
Owner

Thank you, @antagomir, for the heads-up!

I'm planning to publish a new version of ANCOMBC early next week, most likely on Monday or Tuesday. It might be more convenient to incorporate the updates you mentioned in the next iteration.

Once again, I appreciate your contribution!

Best regards,
Huang

@FrederickHuangLin FrederickHuangLin merged commit 9e617f3 into FrederickHuangLin:bugfix Sep 16, 2023
@antagomir antagomir mentioned this pull request Sep 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants