Some comparisons of the LFC results between GI Mapping and gimap #49
@cansavvy the R-CMD-check for windows-latest was failing because it didn't have the |
It's in the gitignore file because the |
joindf <- dplyr::full_join(old_lfc_results, gimap_dataset$normalized_log_fc,
                           by = c("pgRNA_id" = "pg_ids", "rep"),
                           suffix = c("_old", "_new")) %>%
  select(lfc_adj, CRISPR_score, rep) %>%
Just to be clear, lfc_adj is the column in the old data that is actually the CRISPR score? So we are comparing CRISPR scores to CRISPR scores?
No, this is the part I'm not sure about. Looking at the GI_Mapping code, the CRISPR_score column is a rename of the lfc_adj3 column. I use the CRISPR_score column from the GI_Mapping results and compare that to the lfc_adj column from the gimap results. If CRISPR_score from GI_Mapping doesn't correspond to lfc_adj from gimap, which one does? I did some code comparing this afternoon and I think it'll be the lfc_adj2 column from GI_Mapping, as it has a similar calculation to lfc_adj in gimap.
Ah, in this case what you want from our data is the CRISPR score, which you can get from gimap_dataset$crispr_score if calc_crispr() has been run.
That should be a more apples-to-apples comparison.
It would still be worth comparing the precursor lfc's to our lfc's as well, but starting with the CRISPR score is a good start.
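For illustration, a minimal sketch of that apples-to-apples check in base R. The two data frames and their values are made up, and the column layout of the new results (pg_ids, rep, crispr_score) is an assumption, not the confirmed structure of gimap_dataset$crispr_score. merge() with all = TRUE stands in for the dplyr::full_join() used in the Rmd.

```r
# Hypothetical stand-ins for the old GI_Mapping results and the new gimap results.
old_results <- data.frame(
  pgRNA_id = rep(c("pg1", "pg2", "pg3"), times = 2),
  rep = rep(c("RepA", "RepB"), each = 3),
  CRISPR_score = c(-1.0, 0.2, 0.5, -0.9, 0.1, 0.6)
)
new_results <- data.frame(
  pg_ids = rep(c("pg1", "pg2", "pg3"), times = 2),
  rep = rep(c("RepA", "RepB"), each = 3),
  crispr_score = c(-1.0, 0.2, 0.5, -0.9, 0.1, 0.7)
)

# Full join on construct ID and replicate (all = TRUE keeps unmatched rows as NA).
joined <- merge(old_results, new_results,
                by.x = c("pgRNA_id", "rep"),
                by.y = c("pg_ids", "rep"),
                all = TRUE)

# Largest absolute disagreement between old and new scores.
max_abs_diff <- max(abs(joined$crispr_score - joined$CRISPR_score), na.rm = TRUE)

# Correlation between old and new scores within each replicate.
cor_by_rep <- sapply(split(joined, joined$rep), function(d) {
  cor(d$CRISPR_score, d$crispr_score, use = "complete.obs")
})

print(max_abs_diff)
print(cor_by_rep)
```

If the two pipelines agree, the maximum difference should sit near zero and the per-replicate correlations near 1. A systematic offset would instead suggest the wrong pair of columns is being compared.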
I think I'm comparing the same precursor lfc's now as of commit e2b0ece
So should I remove the |
Yeah, I think I included it in there because it wasn't too big, but I also like the idea of being consistent. Up to you.
I'm working on comparing the old code and the new code side-by-side for each related chunk.... When I ran this
I got this warning message:
|
This is such a great PR to see. I was very worried about these calculations coming out right so glad you are working to verify them!
I'll do a tad more review myself but in the meantime feel free to use these comments. My comments here are really only the most minor of comments.
R/04-normalize.R
Outdated
# TODO: This section needs a careful review to make sure it is relfective of the previous code's calculations
values_to = "lfc_adj1") %>%
group_by(rep) %>%
#dplyr::left_join(gimap_dataset$annotation, by = c("pg_ids" = "pgRNA_id")) %>% #annotation is already joined with late_vs_plasmid_df
Feel free to delete things that are unnecessary instead of commenting out.
R/04-normalize.R
Outdated
@@ -120,28 +120,28 @@ gimap_normalize <- function(.data = NULL,
dplyr::filter(norm_ctrl_flag == "negative_control") %>%
dplyr::select(pg_ids, dplyr::starts_with("late"))

# TODO: There's one main median that's found by taking the median of the medians?
neg_control_median <- median(apply(neg_control_median_df[, -1], 1, median))
# TODO: They find a median for each rep, so apply across columns
Ah thanks for clarifying this. I couldn't quite determine what kind of median was being calculated.
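To make the two readings of "median" concrete, here is a small base R sketch on made-up numbers. The data frame mirrors the shape implied by the diff (a pg_ids column followed by late replicate columns), but the values and the replicate column names are hypothetical.

```r
# Hypothetical negative-control values: rows are constructs, columns are late replicates.
neg_control_median_df <- data.frame(
  pg_ids = c("ctrl_1", "ctrl_2", "ctrl_3"),
  late_RepA = c(0.1, 0.4, 0.2),
  late_RepB = c(0.3, 0.9, 0.5),
  late_RepC = c(0.2, 0.6, 1.0)
)

# Reading 1: median across replicates for each construct (MARGIN = 1, rows),
# then a single median of those per-construct medians.
median_of_medians <- median(apply(neg_control_median_df[, -1], 1, median))

# Reading 2: one median per replicate (MARGIN = 2, columns),
# which gives a separate negative-control median for each rep.
per_rep_medians <- apply(neg_control_median_df[, -1], 2, median)

print(median_of_medians)
print(per_rep_medians)
```

The first reading returns one number for the whole dataset, while the second returns one number per replicate, so any later subtraction has to be done within each rep group.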
R/04-normalize.R
Outdated
names_from = c(timepoints, replicates))

# Extract only the late columns we'll keep these replicates for calculations later
late_df <- dplyr::select(lfc_df, dplyr::starts_with("late"))
#late_df <- dplyr::select(lfc_df, dplyr::starts_with("late"))
Feel free to delete if it is unnecessary!
R/04-normalize.R
Outdated
# First and second adjustments to LFC
lfc_df <- lfc_df %>%
lfc_df_withAdjs <- late_vs_plasmid_df %>%
Such a minor, minor comment I'm going to make here, but can we do lfc_df_adj so it sticks with the style?
Yeah I can change that!
R/04-normalize.R
Outdated
# Save this at the construct level
gimap_dataset$normalized_log_fc <- lfc_df
gimap_dataset$normalized_log_fc <- lfc_df_withAdjs
Samesies for here lfc_df_adj
@cansavvy Is this good to merge now? The R-CMD-check failure appears to be a 403 error in the vignette, in the chunk that runs a string of steps. Probably the annotate step.
The vignette will continue to fail until I update the Docker image. The Windows failure is a known issue: #48. So yes, we can merge!
This is an Rmd and its rendered html to compare the LFC results from GI Mapping and gimap. This assumes that the CRISPR_score and lfc_adj columns correspond to one another.