Linked scatter plots through vega? #76

Open
mortonjt opened this issue Aug 22, 2019 · 4 comments

Comments

@mortonjt
Collaborator

mortonjt commented Aug 22, 2019

Relevant to #75

One question is, how can one choose the right microbes? Right now the only way to do this is to select microbes by pointing and clicking through Emperor.

A better way is to first select microbes by the number of samples in which they are the most abundant (i.e. their maximal samples), and then see how their balances are correlated.

For instance, we could have something like this (where microbes are points)

image

And selecting 2 points will inform something like this

image

It'll make this process a little more streamlined (it's quite manual in notebooks atm). @fedarko, would this be something appropriate for qurro? If so, I may be able to contribute something.
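For reference, here is a minimal sketch of the two quantities mentioned above: the number of maximal samples per microbe, and the per-sample balance (log-ratio) of two selected microbes. The toy table and the function names `maximal_sample_counts` and `log_ratio` are hypothetical and only illustrate the idea; this is not the notebook code.

```python
import math

# Hypothetical toy count table: counts[microbe][sample]
counts = {
    "microbe_A": {"s1": 40, "s2": 5, "s3": 12},
    "microbe_B": {"s1": 10, "s2": 30, "s3": 20},
    "microbe_C": {"s1": 2, "s2": 8, "s3": 1},
}


def maximal_sample_counts(counts):
    """For each microbe, count the samples in which it is the most abundant.

    Ties go to whichever microbe comes first in the table (an arbitrary choice).
    """
    tally = {microbe: 0 for microbe in counts}
    samples = next(iter(counts.values())).keys()
    for sample in samples:
        top = max(counts, key=lambda m: counts[m][sample])
        tally[top] += 1
    return tally


def log_ratio(counts, numerator, denominator):
    """Per-sample log-ratio of two microbes.

    Samples where either count is zero are skipped, since the log-ratio
    is undefined there.
    """
    ratios = {}
    for sample, num in counts[numerator].items():
        den = counts[denominator][sample]
        if num > 0 and den > 0:
            ratios[sample] = math.log(num / den)
    return ratios


print(maximal_sample_counts(counts))
print(log_ratio(counts, "microbe_A", "microbe_B"))
```

The first result would drive the x-axis of the microbe plot, and the second is the kind of value that would be plotted per sample once two points are selected.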

@nbokulich
Contributor

@mortonjt @fedarko FYI, @thermokarst and I have been talking about adding a vega scatterplots visualizer to q2-metadata, and it is on our radar for 2019.

This visualizer would generate interactive scatterplots on any continuous metadata columns. So either of the scatterplots you show could be accomplished using this — though the linked scatterplots would not.

If either of you want to contribute to q2-metadata to speed up development, we would be glad to have the help! I just want to make sure we are not duplicating effort here.

@mortonjt
Collaborator Author

@nbokulich, that may actually be a good home for the linked scatterplots.
One possibility is to have the main Python functions for that live in q2-metadata, and have more specific commands (i.e. the scenario above) inherit that functionality.

Note that here, the linked scatterplots would be tightly coupled, as in qurro, where clicking on points in the first scatterplot specifies the contents of the second scatterplot.

It may also be worth looking at how much overlap there would be with qurro, since that can already have linked scatter plots through altair -> vega-lite.
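To make the "linked" part concrete, here is a rough sketch of a Vega-Lite v5 spec built as a plain Python dict, where clicking points in the left plot filters the right plot. All field names (`microbe`, `n_maximal_samples`, `rank`, `microbe_abundance`, `metabolite_abundance`) and the helper `build_linked_spec` are hypothetical. It is also a simplification: it only filters rows by the selected microbes, whereas the idea above would recompute log-ratios from the selection, which needs extra client-side code like qurro has.

```python
import json


def build_linked_spec(microbe_rows, sample_rows):
    """Return a Vega-Lite v5 spec (as a dict) with two linked scatterplots."""
    left = {
        "data": {"values": microbe_rows},
        # Point selection keyed on the (hypothetical) "microbe" field.
        "params": [
            {"name": "picked", "select": {"type": "point", "fields": ["microbe"]}}
        ],
        "mark": "point",
        "encoding": {
            "x": {"field": "n_maximal_samples", "type": "quantitative"},
            "y": {"field": "rank", "type": "quantitative"},
            # Fade out microbes that are not part of the selection.
            "opacity": {
                "condition": {"param": "picked", "value": 1.0},
                "value": 0.3,
            },
        },
    }
    right = {
        "data": {"values": sample_rows},
        # Only show rows whose "microbe" matches the selection;
        # with "empty": False nothing is shown until something is picked.
        "transform": [{"filter": {"param": "picked", "empty": False}}],
        "mark": "point",
        "encoding": {
            "x": {"field": "microbe_abundance", "type": "quantitative"},
            "y": {"field": "metabolite_abundance", "type": "quantitative"},
        },
    }
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "hconcat": [left, right],
    }


spec = build_linked_spec(
    [{"microbe": "microbe_A", "n_maximal_samples": 1, "rank": 0.8}],
    [{"microbe": "microbe_A", "microbe_abundance": 40, "metabolite_abundance": 7}],
)
print(json.dumps(spec, indent=2))
```

The printed JSON can be pasted into the Vega-Lite online editor to try it out; altair would generate a similar spec from its own selection API.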

@fedarko

fedarko commented Aug 22, 2019

re: integrating this stuff into qurro

@mortonjt This does seem like it overlaps a decent amount with Qurro, but integrating it into that might take a lot of work. As I see it the main differences between the proposed way of doing things and Qurro's codebase now are:

  • We'd need to store two separate count datasets (microbes and metabolites) in the client-side JS code
    • For large datasets this could pose a problem. Qurro stores count data in a sparse fashion, but:
      1. it's still fairly space-heavy, and we could improve this
      2. since metabolite count data is pretty dense (in my limited experience) having both count datasets available without using a better representation would probably drag down performance in many cases
  • the "sample plot" (showing selected log-ratios) would have two log-ratios to keep track of (one between the selected microbes and one between the selected microbes' "compounds").
    • I'm not sure how you're determining this now, but there would need to be a clear way to indicate which metabolites are "compounds" of which microbe(s).
  • the "rank plot" would need to be adjusted to use a different y-axis and dots instead of bars (this is probably the easiest part of this to implement)
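To illustrate the storage concern in the first bullet, here is a small sketch of why a sparse layout helps for microbe counts but much less for dense metabolite counts. The nested-dict layout and the helpers `to_sparse` and `stored_entries` are assumptions for illustration only, not Qurro's actual representation.

```python
def to_sparse(dense):
    """Drop zero counts, keeping {feature: {sample: count}} for nonzero entries."""
    return {
        feature: {sample: count for sample, count in row.items() if count != 0}
        for feature, row in dense.items()
    }


def stored_entries(table):
    """Number of (feature, sample) entries that actually need to be stored."""
    return sum(len(row) for row in table.values())


# Hypothetical toy data: microbe counts are mostly zeros, metabolite counts are not.
microbes = {
    "microbe_A": {"s1": 40, "s2": 0, "s3": 0, "s4": 0},
    "microbe_B": {"s1": 0, "s2": 30, "s3": 0, "s4": 0},
}
metabolites = {
    "metabolite_X": {"s1": 3, "s2": 9, "s3": 4, "s4": 1},
    "metabolite_Y": {"s1": 7, "s2": 2, "s3": 0, "s4": 5},
}

for name, table in [("microbes", microbes), ("metabolites", metabolites)]:
    dense_n = stored_entries(table)
    sparse_n = stored_entries(to_sparse(table))
    print(f"{name}: {dense_n} dense entries -> {sparse_n} sparse entries")
```

In this toy case the microbe table shrinks from 8 entries to 2, while the metabolite table only shrinks from 8 to 7, so dropping zeros alone buys very little for dense data.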

TLDR: this should be doable, but it might take some pretty heavy effort (and I can't really commit that sort of time right now). If you're interested in pushing on these changes that'd be great, though!

re: the metadata visualizer

@nbokulich @thermokarst that's exciting! :) As of writing Qurro's "sample plot" fixes the y-axis to just show whatever the currently selected log-ratio is, so a visualizer that can configure both the x- and y-axis fields would be cool.

However, I think it should be pretty simple to make the y-axis field in Qurro's sample plot also configurable. Here's a demo of Qurro, for reference (the sample plot is the one on the right -- you need to select some sort of log-ratio to populate it).

Making the sample plot y-axis configurable like the x-axis already is (and thereby removing Qurro's reliance on selecting a log-ratio) sounds like it would address the goals of the metadata visualizer, without much added work. (And the added benefit is that this allows for either categorical or quantitative metadata fields to be shown in the scatterplot, as well as for boxplots if the y-axis is set to something quantitative and the x-axis is set to something categorical.)

That being said I can see how there'd be merit in a simpler visualization within q2-metadata that only relies on metadata, instead of being a bit heavier like Qurro (which as discussed above basically requires you chuck in a BIOM table to the browser). I don't really have a ton of bandwidth right now, but I'd be happy to talk about this to prevent duplication of efforts—maybe we can reuse some of Qurro's code in a q2-metadata visualizer, or add another subcommand to Qurro (e.g. qiime qurro metadata-plot) that just creates a basic lightweight metadata scatter-/box-plot?

Thanks all!

@mortonjt
Collaborator Author

@fedarko regarding the paired-omics plots, the workaround for the space limitation would be rigorous filtering. For instance, we could focus only on microbes that are the most abundant in at least 1 sample, plus their top 50 metabolites. That will typically reduce the data to ~50 microbes and 200 metabolites, which should be more manageable in JS. Furthermore, the ranks can inform which metabolites best match which microbes.
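A rough sketch of that filtering step, under a couple of assumptions that are mine rather than settled: `ranks[microbe][metabolite]` holds a score where higher means a stronger association, and "top 50 metabolites" is read as the top `top_k` per retained microbe (the union is kept). The function name `filter_paired` and the toy data are hypothetical.

```python
def filter_paired(microbe_counts, ranks, top_k=50):
    """Keep microbes that are the most abundant in at least one sample,
    plus the union of each kept microbe's top_k metabolites by rank score.
    """
    samples = next(iter(microbe_counts.values())).keys()
    # Microbes that "win" at least one sample (ties go to the first microbe).
    keep_microbes = {
        max(microbe_counts, key=lambda m: microbe_counts[m][sample])
        for sample in samples
    }
    keep_metabolites = set()
    for microbe in keep_microbes:
        scores = ranks[microbe]
        # Assumption: a higher score means a stronger microbe-metabolite match.
        top = sorted(scores, key=scores.get, reverse=True)[:top_k]
        keep_metabolites.update(top)
    return sorted(keep_microbes), sorted(keep_metabolites)


# Hypothetical toy inputs.
microbe_counts = {
    "microbe_A": {"s1": 40, "s2": 5, "s3": 12},
    "microbe_B": {"s1": 10, "s2": 30, "s3": 20},
    "microbe_C": {"s1": 2, "s2": 8, "s3": 1},
}
ranks = {
    "microbe_A": {"met_1": 2.0, "met_2": 0.5, "met_3": -1.0},
    "microbe_B": {"met_1": -0.3, "met_2": 1.2, "met_3": 0.9},
    "microbe_C": {"met_1": 0.1, "met_2": 0.2, "met_3": 3.0},
}

microbes, metabolites = filter_paired(microbe_counts, ranks, top_k=1)
print(microbes, metabolites)
```

Here `microbe_C` is never the most abundant microbe in any sample, so it is dropped along with its top metabolite; only the two tables restricted to the returned features would need to be shipped to the browser.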

This would be great to have, but not urgent - #75 should be enough in the meantime.


3 participants