Add wrapper for agglomerateByRank/mergeRows #389
Signed-off-by: Daenarys8 <[email protected]>
Neat!
Add a couple of checks: .merge_features vs agglomerateByRank, and .merge_features vs mergeRows.
A couple of small things, otherwise looks good!
very nice
What is the status of this?
Complete for now.
Great! We discussed function naming elsewhere. I would suggest updating as follows:
The downside of the term "merge" is that it is also used with another meaning: combining full SE objects, data.frames, matrices, etc. (rather than features within an object, as here). Another option would be agglomerateFeatures and agglomerateFeaturesByRank, but that is slower to write; or combineFeature / combineFeaturesByRank (but "combine" is also used in the same way as "merge", between distinct DataFrames). Shall we deal with these issues here in this closely related PR, or should we open a new separate issue and PR?
I also think We discussed also about enabling There must be some reason why we didn't implement this before? Do you remember, @antagomir? I think this PR can be merged, and these new modifications can be done in a different PR. @Daenarys8, can you implement these?
I agree. I don't think there is a specific reason; the rowData variables could be called by name as well. If @Daenarys8 can open a new issue and PR about that, it would be great. You can close this issue when you have checked that it is ready.
Sure, will give it a shot.
Can someone close this PR when it is confirmed to be ready?
Was this ever merged? If I read the above correctly, it was just closed without merging? I guess the idea was to merge.
Hmm, ok, now I noticed that the renaming scheme discussed and agreed above has not made it into this PR yet. Can we add it (@Daenarys8)? More specifically, the idea was to rename "Rows" to "Features" and "Cols" to "Samples" and, in addition, to harmonize terminology (use "merge" instead of "agglomerate"). However, one last thing to discuss first: the meaning of "to agglomerate" is a somewhat better fit for our case (in terms of language and meaning), in particular when the phylogenetic tree is involved in the process. The phyloseq equivalent to TreeSE We could use "glom" instead of "merge". That would be as fast to write, and the meaning would be more specific ("merge" is easier to confuse with merging of matrices). However, "glom" might be a somewhat odd term to introduce right now, and it is possible to re-evaluate that later as well. Hence, in summary, I suggest to rename as:
@Daenarys8, could you add that to this PR as discussed above, OR open a new issue proposing this change, then merge and close the current PR? Also, please confirm that the other points discussed and agreed above have now been addressed.
Ok, it is clearer to do the other discussed changes in a separate PR. I am merging and closing this one.
Ok, now that this is merged: the original motivation was ANCOMBC issue #174, where we would like to let users specify a taxonomic rank or another rowData variable to merge rows before running ANCOMBC: FrederickHuangLin/ANCOMBC#174. Ideally this new wrapper will help there; it would group by taxonomic rank if this is available in rowData, and otherwise use the more general grouping.
There are two different methods that do similar things (merging/grouping rows/features).
This wrapper method combines these two methods.
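To make the dispatch idea concrete, here is a minimal, language-neutral sketch in Python. It is not the mia implementation: the names `merge_features`, `group_rows`, `counts`, and `row_data` are hypothetical, the data are made up, and details that the real methods may handle (missing taxonomy values, trees, SummarizedExperiment objects) are ignored. It only illustrates "group by a rowData column if a name is given, otherwise use a general grouping vector".

```python
def group_rows(counts, labels):
    """General grouping: sum the rows of `counts` that share a label."""
    if len(labels) != len(counts):
        raise ValueError("need exactly one label per row")
    merged = {}
    for row, label in zip(counts, labels):
        if label in merged:
            # Add this row element-wise to the running sum for its group.
            merged[label] = [a + b for a, b in zip(merged[label], row)]
        else:
            merged[label] = list(row)
    return merged


def merge_features(counts, row_data, f):
    """Hypothetical wrapper: `f` is either the name of a column in
    `row_data` (e.g. a taxonomic rank) or a sequence of group labels."""
    if isinstance(f, str):
        if f not in row_data:
            raise KeyError(f"'{f}' is not a column of row_data")
        labels = row_data[f]   # rank-style grouping via a named column
    else:
        labels = list(f)       # general grouping via explicit labels
    return group_rows(counts, labels)


# Made-up example: three features (rows) measured in two samples (columns).
counts = [[1, 2], [3, 4], [5, 6]]
row_data = {"Genus": ["A", "A", "B"]}

by_rank = merge_features(counts, row_data, "Genus")
by_labels = merge_features(counts, row_data, ["x", "y", "x"])
```

In this sketch both paths share one grouping routine, so the wrapper only decides where the labels come from; whether the actual wrapper is structured this way is an assumption.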