Avoid skipping candidate RBS positions in rbs_score
#102
Hi, one final PR 😃
In the test sequence I used for #100 I noticed the following bug: the RBS spacer for one of the predicted genes changed when the contig was reverse-complemented.

The gene with the `GGA/GAG/AGG` RBS motif has a spacer detected as `3-4bp` when on the forward strand, and `5-10bp` when on the reverse strand. The contig in question starts with a sequence that has a match both in the `3-4bp` range (`AGG`) and in the `5-10bp` range (`GGA`); since the `5-10bp` spacer has a higher score, it is the one that should be selected. This affects the gene score, so it could cause some predictions to change.

The problem came from the loops in `rbs_score`, which skip positions before index `0`. However, when there may be a partial match (as is the case here, with a `GGA` motif right on the contig edge), these positions should not be skipped; the decision to ignore some positions should be made by the `shine_dalgarno_exact` and `shine_dalgarno_mm` functions directly.

After applying the patch, the predictions are consistent regardless of the orientation of the contig: the RBS spacers, and hence the gene scores, match.
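
For reviewers who want to see the edge case in isolation, here is a toy Python sketch of the behaviour described above. It is not the actual implementation: `best_partial_match`, `spacer_rank`, `best_spacer`, the example sequence and the ranking are all hypothetical simplifications, and the only thing the ranking encodes is that a `5-10bp` spacer outscores a `3-4bp` one. The point is just that a motif window starting before index `0` can still overlap a valid partial match at the contig edge.

```python
SD = "AGGAGG"  # 6-base Shine-Dalgarno motif used for window alignment


def best_partial_match(seq, pos, start):
    """Toy matcher (hypothetical): align SD at `pos` and return
    (length, spacer) for the longest exact run of >= 3 bases.
    Columns outside the contig or past `start` simply do not match."""
    best = (0, None)
    run = 0
    for i in range(len(SD)):
        j = pos + i
        if 0 <= j < start and seq[j] == SD[i]:
            run += 1
            if run >= 3 and run > best[0]:
                # spacer = bases between the end of the match and the start codon
                best = (run, start - j - 1)
        else:
            run = 0
    return best


def spacer_rank(spacer):
    """Hypothetical ranking: only encodes that 5-10bp beats 3-4bp."""
    if 5 <= spacer <= 10:
        return 2
    if 3 <= spacer <= 4:
        return 1
    return 0


def best_spacer(seq, start, skip_negative):
    """Scan candidate window positions upstream of `start`.
    `skip_negative=True` mimics a loop that skips positions before index 0."""
    best_rank, best = 0, None
    for pos in range(start - 20, start - 5):
        if skip_negative and pos < 0:
            continue  # the window is never examined, even if it overlaps the contig
        length, spacer = best_partial_match(seq, pos, start)
        if length and spacer_rank(spacer) > best_rank:
            best_rank, best = spacer_rank(spacer), spacer
    return best


# Made-up contig: GGA on the edge (7bp spacer), AGG closer in (3bp spacer), ATG at 10.
SEQ = "GGACAGGTTTATG"
START = 10

print(best_spacer(SEQ, START, skip_negative=True))   # 3: only AGG is seen
print(best_spacer(SEQ, START, skip_negative=False))  # 7: GGA on the edge is found
```

In this sketch the `GGA` at the contig edge only aligns with the motif when the window starts at position `-1`, so skipping negative positions in the outer loop hides it, while leaving the bounds check to the matcher recovers it.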