extending order source beyond exact matches #450

Open
lydiayliu opened this issue May 11, 2022 · 4 comments
Labels
effort: 8 Needs 8 points of effect to fix enhancement New feature or request

Comments

@lydiayliu
Collaborator

Here is the result of summarizeFasta when I specify --order-source Mutation,Fusion and put the Fusion GVF before the Mutation GVF in the list of GVF inputs:

sources                    n_total  n_0_misc  n_1_misc  n_2_misc
Mutation                   516      106       199       211
Fusion                     0        0         0         0
Noncoding                  3218171  1005170   1259917   953084
Fusion-Mutation            0        0         0         0
Mutation-Noncoding         260      69        103       88
Fusion-Noncoding           15616    5194      6132      4290
Fusion-Mutation-Noncoding  0        0         0         0

I think things are totally fine as is, and this file is not really intended for human consumption. Still, it would make intuitive sense that if --order-source Mutation,Fusion is specified, then among the sources that are not specified in order source, instead of going with the order of the GVFs, the order in order-source would be consulted and Mutation-Noncoding would go before Fusion-Mutation. So, like this:

sources                    n_total  n_0_misc  n_1_misc  n_2_misc
Mutation                   516      106       199       211
Fusion                     0        0         0         0
Noncoding                  3218171  1005170   1259917   953084
Mutation-Fusion            0        0         0         0
Mutation-Noncoding         260      69        103       88
Fusion-Noncoding           15616    5194      6132      4290
Mutation-Fusion-Noncoding  0        0         0         0

I know we may have discussed this before and that I said going by the order of the GVFs would be fine. But I realized that when I generate the list of GVF files for input, I simply use something like ls or for *.gvf, so the GVFs are always in alphanumeric order, and changing the order of the input GVF files is more annoying than I realized.

This might also apply to splitFasta?
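
The row order in the desired table above can be reproduced with a short sketch. This is only an illustration under stated assumptions, not the tool's actual implementation: the function name ordered_rows and its two parameters are hypothetical, and it assumes that sources named in order-source come first, with any remaining sources appended in GVF input order.

```python
from itertools import combinations


def ordered_rows(order_source, gvf_sources):
    """Return summary row labels: singles first, then pairs, then triples, ...

    order_source: individual sources given by the user, in priority order.
    gvf_sources:  sources in the order their GVF files were passed in.
    Both names are hypothetical and used only for this illustration.
    """
    # Sources the user ranked explicitly come first; the rest keep GVF order.
    base = list(order_source) + [s for s in gvf_sources if s not in order_source]
    rows = []
    for size in range(1, len(base) + 1):
        # combinations() preserves the order of `base` inside each group.
        rows.extend("-".join(group) for group in combinations(base, size))
    return rows


# GVFs arrive alphabetically (as with ls), but Mutation is ranked first.
rows = ordered_rows(["Mutation", "Fusion"], ["Fusion", "Mutation", "Noncoding"])
print("\n".join(rows))
```

With these inputs the labels come out as Mutation, Fusion, Noncoding, Mutation-Fusion, Mutation-Noncoding, Fusion-Noncoding, Mutation-Fusion-Noncoding, independent of the order in which the GVF files were listed.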

@lydiayliu lydiayliu added enhancement New feature or request effort: 8 Needs 8 points of effect to fix labels May 11, 2022
@lydiayliu
Collaborator Author

lydiayliu commented May 12, 2022

Also, this other thing

When I do
--order-source Mutation,Fusion,Mutation-Fusion,Mutation-Noncoding,Fusion-Noncoding

I get

sources                    n_total  n_0_misc  n_1_misc  n_2_misc
Mutation                   516      106       199       211
Fusion                     0        0         0         0
Noncoding                  3218171  1005170   1259917   953084
Fusion-Mutation            0        0         0         0
Mutation-Noncoding         260      69        103       88
Fusion-Noncoding           15616    5194      6132      4290
Fusion-Mutation-Noncoding  0        0         0         0

The order of the file is fine, but you can see that Mutation-Fusion appears as Fusion-Mutation in the summary output for some reason. It really doesn't make a difference, but I'd prefer it to be Mutation-Fusion as specified.

@zhuchcn
Member

zhuchcn commented May 12, 2022

Related to #428. Sources and source combinations are now just sorted alphabetically. The reason why Mutation-Fusion becomes Fusion-Mutation is simply that Fusion comes before Mutation alphabetically. What you requested is possible: we would just need to make the source an object that carries an order, but the effort/benefit ratio seems to be pretty high. We can still get the information with just a little bit tweak on the source order, right?
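
To illustrate the difference between the current alphabetical sort and a source object that carries an order, here is a minimal sketch. The Source class, the make_source helper, and the ranks mapping are all hypothetical names invented for this example and are not taken from the codebase; unranked sources are assumed to sort after ranked ones.

```python
from functools import total_ordering


@total_ordering
class Source:
    """Hypothetical source label that compares by user rank, then by name."""

    def __init__(self, name, rank):
        self.name = name
        self.rank = rank

    def _key(self):
        return (self.rank, self.name)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()

    def __hash__(self):
        return hash(self._key())

    def __repr__(self):
        return self.name


# Assumed user input: --order-source Mutation,Fusion
order_source = ["Mutation", "Fusion"]
ranks = {name: i for i, name in enumerate(order_source)}


def make_source(name):
    # Sources missing from order_source share the lowest priority.
    return Source(name, ranks.get(name, len(ranks)))


members = ["Fusion", "Mutation"]
alphabetical = "-".join(sorted(members))
by_rank = "-".join(s.name for s in sorted(make_source(m) for m in members))

print(alphabetical)  # Fusion-Mutation
print(by_rank)       # Mutation-Fusion
```

Plain string sorting yields Fusion-Mutation because "F" sorts before "M", while sorting the rank-carrying objects yields Mutation-Fusion.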

@zhuchcn
Member

zhuchcn commented May 12, 2022

instead of going with the order of the GVFs, the order in order-source would be consulted and Mutation-Noncoding would go before Fusion-Mutation

And here, are you saying Mutation-Noncoding should go before Fusion-Mutation? I thought we want Noncoding to be always the least. Or are you just talking about the order of individual source in source groups to have Mutation always go before Fusion like 'Mutation-Fusion-Noncoding'?

@zhuchcn zhuchcn mentioned this issue May 12, 2022
@lydiayliu
Collaborator Author

lydiayliu commented May 12, 2022

Or are you just talking about the order of individual source in source groups to have Mutation always go before Fusion like 'Mutation-Fusion-Noncoding'?

Yes! Since I put Mutation before Fusion as individual sources, I want the source groups to have the same order. Is that possible? In other words, instead of sorting alphabetically (which puts Fusion before Mutation), sort by the new order IF the individual sources have been specified in order_source (which puts Mutation before Fusion).

Also, if I specified Mutation-Fusion in order_source, I expect the output to be Mutation-Fusion, not Fusion-Mutation.
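
The two rules requested above could be sketched as follows. This is a hedged illustration rather than a proposed patch: group_label is a hypothetical helper, order_source is assumed to be the already-split list of values from the command line, and members not named in order_source are assumed to fall back to alphabetical order after the named ones.

```python
def group_label(members, order_source):
    """Build the display label for a group of sources (hypothetical helper).

    1. If the exact combination was written out in order_source,
       reuse the user's spelling (e.g. "Mutation-Fusion").
    2. Otherwise order members by their position among the individual
       sources in order_source; unlisted members go last, alphabetically.
    """
    explicit = {frozenset(e.split("-")): e for e in order_source if "-" in e}
    key = frozenset(members)
    if key in explicit:
        return explicit[key]

    singles = [e for e in order_source if "-" not in e]
    rank = {name: i for i, name in enumerate(singles)}
    return "-".join(sorted(members, key=lambda m: (rank.get(m, len(rank)), m)))


order = ["Mutation", "Fusion"]
print(group_label(["Fusion", "Mutation"], order))               # Mutation-Fusion
print(group_label(["Fusion", "Mutation", "Noncoding"], order))  # Mutation-Fusion-Noncoding
print(group_label(["Mutation", "Fusion"], ["Fusion-Mutation"]))  # Fusion-Mutation
```

Using a frozenset as the lookup key means the explicit spelling is found regardless of the order in which the members were collected internally.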

We can still get the information with just a little bit tweak on the source order, right?

Yeah, that's why I put the enhancement label on it. It is in no way urgent. I don't feel like it should be part of Release 0.4.2.

I thought we want Noncoding to be always the least.

Well that should depend on order_source!

@lydiayliu lydiayliu added this to Potential To Do in Release 1.0.0 Jul 9, 2022