Skip to content
This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Remove R script/notebook package installs from remaining modules (Part 4 of 4) #704

Merged

Conversation

cansavvy
Copy link
Collaborator

@cansavvy cansavvy commented May 21, 2020

Purpose/implementation Section

What was your approach?

I'm removing R package installations in the scripts in accordance with #690

Here I've addressed what should be the last of the changes to wrap up and closes #690.

  • Got rid of maftools install in 01-plot-oncoprint.R and reran run-oncoprint.sh
  • Get rid of package installations in run-dimension-reducion.R and reran ./dimension-reduction-plots.sh
  • Got rid of install in util/survival_models.R and reran survival-analysis_template.Rmd

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

  • t-SNE numbers look different. Are we able to set the seed for that?

  • Everything else look okay? Results that should be there are there?

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes, but the results shouldn't really be different.

Reproducibility Checklist

No changes are needed here.

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

@cansavvy cansavvy marked this pull request as ready for review May 22, 2020 13:42
@cansavvy cansavvy requested a review from cbethell May 22, 2020 15:12
@jaclyn-taroni
Copy link
Member

t-SNE numbers look different. Are we able to set the seed for that?

PCA numbers also look slightly different. I am rerunning:

bash analyses/transcriptomic-dimension-reduction/dimension-reduction-plots.sh

In the Docker container on the branch for #675 right now and I've gotten through some of the steps that are showing changes here and nothing is showing up for me as altered in the scores TSV files.

Hardcoded seed in shell script 01-dimension-reduction.sh:

Here's that seed getting passed to run-dimension-reduction.R:

Here's were we run t-SNE via a custom function in scripts/run-dimension-reduction.R:

Here's that seed getting set in the custom function:

Wondering if something else is going on.

@jaclyn-taroni
Copy link
Member

Ran the same thing on my master (which is even with AlexsLemonade master) and the only changes I get are to the PDF dates.

@cansavvy
Copy link
Collaborator Author

Ran the same thing on my master (which is even with AlexsLemonade master) and the only changes I get are to the PDF dates.

@jaclyn-taroni I've checked out AlexsLemonade master and reran on the docker image and am still getting changed data. So I think we've narrowed down that it is due to changes in the docker image/container. Can you tell me more about what docker container/image you used? Specifically what version of Rtsne is it using? I'm gonna double check that I'm using the most updated Docker image as well.

@jaclyn-taroni
Copy link
Member

The versions for Rtsne are unchanged as far as I know (the last release was end of 2018, I believe). I have a version of the Docker image that is post-refactor/change of the Bioconductor install process.

@cbethell
Copy link
Contributor

cbethell commented Jun 8, 2020

Upon digging into this PR, I was unable to pinpoint a specific issue and/or solution but I did determine the following:



  • The seed function must be working as intended because my PCA and tSNE results stay the same (although they are different from the results in this PR and in master)
  • The input files have not changed since the dimension reduction script was last ran and merged into master
  • All other changes in this PR seem to be fine, the only issues appear to be with the PCA and tSNE result files
  • The issue appears to be with the versions of the Rtsne and prcomp functions/packages included in Docker — with no evidence of them being updated (from CRAN)
  • My results, however, were all only slightly off from the results in master (and from the results in this PR)

In other words, the mystery, unfortunately, still continues...

@jaclyn-taroni
Copy link
Member

Can someone file an issue to track the t-SNE results not being reproducible (including how we might test if it's the underlying prcomp result changing)? And then we can revisit the rest of the changes here.

@cbethell
Copy link
Contributor

Can someone file an issue to track the t-SNE results not being reproducible (including how we might test if it's the underlying prcomp result changing)? And then we can revisit the rest of the changes here.

Filed issue #716 to track this.

Copy link
Contributor

@cbethell cbethell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cansavvy, everything in this PR looks LGTM (once merge conflicts are fixed with what is being tracked in issue #716 in mind).

I did notice that the analyses/survival-analysis/results/cox_regression_tmb.tsv output file also changed in this PR, with the conf.high variable within containing a change of 1.026 -> 7.275 (the most significant change I noticed).

It appears that the input file pbta-snv-consensus-mutation-tmb-coding.tsv was last updated in the v14 data release, while what's in master was last ran 6 months ago so this makes sense.

That being said, I am approving these changes (once the results also make sense to you, someone who would be more familiar with the survival-analysis module than I am) 👍 .

@cansavvy
Copy link
Collaborator Author

I did notice that the analyses/survival-analysis/results/cox_regression_tmb.tsv output file also changed in this PR, with the conf.high variable within containing a change of 1.026 -> 7.275 (the most significant change I noticed).

Yes! Because of changes to tmb calculations (#564) I would expect the regression to have different results. I'll get those conflicts resolved.

@jaclyn-taroni jaclyn-taroni merged commit 6c87783 into AlexsLemonade:master Jun 14, 2020
@cansavvy cansavvy deleted the cansavvy/remove-package-installs-4 branch August 20, 2020 14:12
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove any R package installation from notebooks or scripts, especially with BiocManager
3 participants