Counts seems incorrect #2

Open
enricorox opened this issue Dec 29, 2022 · 1 comment

@enricorox

Hi!
I'm a computer engineering student working on my master's thesis, which is basically about improving UST (see here if interested).

I wrote a simple C++ program that extracts the canonical kmers from the simplitigs and assigns each one, in sequence, its count from the UST output files.
Then I sorted the kmer list and compared it to the one computed by Jellyfish 2.

The kmers are the same, but the counts differ. Can you confirm this?
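
For reference, the extraction step can be sketched roughly as follows. This is not the attached kmers-extractor, only a minimal illustration. It assumes that "canonical" means the lexicographically smaller of a kmer and its reverse-complement (as with Jellyfish's -C option) and that the i-th count belongs to the i-th kmer of a simplitig. All function names are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Reverse-complement of a DNA string over the alphabet {A, C, G, T}.
std::string reverse_complement(const std::string& s) {
    std::string rc(s.rbegin(), s.rend());
    for (char& c : rc) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            case 'T': c = 'A'; break;
            default: break;  // leave any other symbol (e.g. N) untouched
        }
    }
    return rc;
}

// Canonical form (assumption): the lexicographically smaller of the kmer
// and its reverse-complement.
std::string canonical(const std::string& kmer) {
    const std::string rc = reverse_complement(kmer);
    return std::min(kmer, rc);
}

// Pair the i-th kmer of a simplitig with the i-th count (assumed layout).
std::vector<std::pair<std::string, unsigned>> kmers_with_counts(
        const std::string& simplitig,
        const std::vector<unsigned>& counts,
        std::size_t k) {
    assert(k > 0 && simplitig.size() >= k);
    assert(counts.size() == simplitig.size() - k + 1);
    std::vector<std::pair<std::string, unsigned>> out;
    for (std::size_t i = 0; i + k <= simplitig.size(); ++i) {
        out.emplace_back(canonical(simplitig.substr(i, k)), counts[i]);
    }
    return out;
}
```

For example, `kmers_with_counts("AAACC", {2, 3, 4}, 3)` pairs AAA with 2, AAC with 3 and ACC with 4.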

How to reproduce

Extract kmers and counts from ust output files:

  • g++ kmers-extractor.cpp -o kmers-extractor
  • ./kmers-extractor <kmer-size> <ust-fasta> <ust-counts>
  • sort ust-kmers.txt -o ust-kmers-sorted.txt

Extract kmers and counts from the starting sequence (not the bcalm one):

  • jellyfish-linux count -m <kmer-size> -C -s 100M -L 2 <starting-fasta>
  • jellyfish-linux dump -c mer_counts.jf > kmers.txt
  • sort kmers.txt -o kmers-sorted.txt

Compare the two files:

  • cmp kmers-sorted.txt ust-kmers-sorted.txt

kmers-extractor is attached.

Note that kmers with abundance 1 are ignored.

@enricorox

To keep things simple, here is an easy example.

  • Let's take the first 31-mer of the first simplitig: CCCTGACAAAAAGGGCCCCAAGCTTCCAATA
  • Take the first count of the counts file: 3.
  • Find it, or its reverse-complement TATTGGAAGCTTGGGGCCCTTTTTGTCAGGG, in the unitigs file: it's in unitig 0.
  • Its count is the last element of that unitig's counts vector: 2.

I think it's because you don't reverse the unitig counts vector when you reverse-complement the unitig.
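
If that is the cause, the fix would look something like the sketch below. This is not UST's actual code: the `Unitig` struct and the function names are hypothetical, and it assumes that `counts[i]` belongs to the kmer starting at position i of the sequence. Under that assumption, the kmer at position i of the forward strand ends up at position (n - k) - i after reverse-complementing, so the counts vector has to be reversed together with the sequence.

```cpp
#include <string>
#include <vector>

// Hypothetical representation of a unitig with per-kmer counts.
struct Unitig {
    std::string seq;
    std::vector<unsigned> counts;  // counts[i] belongs to the kmer starting at seq[i]
};

// Complement of a single base; other symbols are returned unchanged.
char complement_base(char c) {
    switch (c) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return c;
    }
}

// Reverse-complement the sequence AND reverse the counts, so that
// counts[i] still refers to the kmer starting at position i.
Unitig reverse_complement_unitig(const Unitig& u) {
    Unitig r;
    r.seq.assign(u.seq.rbegin(), u.seq.rend());
    for (char& c : r.seq) {
        c = complement_base(c);
    }
    r.counts.assign(u.counts.rbegin(), u.counts.rend());
    return r;
}
```

For example, with k = 3 the unitig AAACC with counts 2, 3, 4 becomes GGTTT with counts 4, 3, 2: GGT is the reverse-complement of ACC (count 4), GTT of AAC (count 3) and TTT of AAA (count 2).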
