Use snakemake for pub-ready figure generation #630
Comments
Since I did a bit of proof-of-concept work on this, I should link it here for anybody who might work on this in the future: https://github.com/jashapiro/OpenPBTA-analysis/blob/jashapiro/snakemake-results/Snakefile One thing to note in that implementation is that it uses scripts as inputs in an attempt to capture both changes in data files AND changes in analysis/figure generation code. But it might not catch all changes, especially in scripts called by the defined scripts.
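To illustrate the "scripts as inputs" idea in the comment above, here is a minimal Python sketch of the kind of timestamp comparison a workflow manager can use to decide whether an output needs rebuilding. It is not taken from the linked Snakefile; the function name `is_stale` and the file names in the usage note are hypothetical.

```python
import os
from pathlib import Path


def is_stale(output, inputs):
    """Return True if `output` is missing or older than any path in `inputs`.

    `inputs` can mix data files and the scripts that produce the output,
    so editing a script marks the output as stale too.
    A missing input raises FileNotFoundError.
    """
    output = Path(output)
    if not output.exists():
        return True
    out_mtime = output.stat().st_mtime_ns
    # Stale if at least one input was modified after the output was written.
    return any(Path(p).stat().st_mtime_ns > out_mtime for p in inputs)
```

For example, `is_stale("fig1.png", ["data.tsv", "fig1.R"])` would report the figure as stale after `fig1.R` is edited. The same limitation applies as noted above: a helper script sourced from inside `fig1.R` is only tracked if it is also listed in `inputs`.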
@jaclyn-taroni are we still interested in doing this? If so, it would be good to do in conjunction with/instead of #1261
Are we interested? Sure, but I think we'd need to estimate the amount of effort before I comment on whether or not we should do it.
My gut tells me the effort would be a little too high, but this depends on how much is in what @jashapiro had previously written up. The link seems to be long gone, and I can't find the branch here. @jashapiro, do you still have this locally?
In some previous cleaning, I must have removed that branch on GitHub, but I did have it locally, and it is now back up. Note that that file was set up to do all of the analysis: not just the figures, but also the analysis that generated the inputs to those figures. Doing just the figures should be a bit easier. Whether it is worth it really depends on how much effort it is to go through all the figure scripts and figure out what all the inputs and outputs are, since (at least in the past) the scripts declare their input files internally, not with arguments. We probably don't want to be in a situation where an input file changes and does not result in the figure being correctly regenerated. (We can get around that with
Thanks for branch + context, @jashapiro!! This point is key -
The figure generation script calls scripts of both kinds: some declare their inputs internally, and some take them as arguments. We'd want this to be more consistent for a robust workflow, and it doesn't make sense to me to modify analysis module files specifically to work with a manuscript figure-generating workflow. This is especially true because many of the scripts called don't actually generate figures, but instead prepare the data that figures are generated from. My sense now is that snakemake is not the move at this point, and we should stick with the existing (and soon-to-be reorganized! #1261) bash script.
Originally suggested by @jashapiro.
With #613, we're using a large bash script to regenerate figures. Some kind of workflow management system would be better.
snakemake is already on the project Docker container. The CCDL does not have bandwidth to implement this as part of our initial effort to get publication-ready figures together (#571) but wanted to document this potential improvement.