Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for DNA methylation signal #8

Open
xies4 opened this issue Dec 15, 2023 · 6 comments
Open

Support for DNA methylation signal #8

xies4 opened this issue Dec 15, 2023 · 6 comments

Comments

@xies4
Copy link

xies4 commented Dec 15, 2023

Thank you for developing this amazing tool. I am wondering if there is a way to simulate reads with DNA methylation signals. If not, do you have any plan to include the function in the near future? Thanks again.

@hasindu2008
Copy link
Owner

Hello,

This is something that is straightforward to implement given we already have methylation current level tables in f5c, yet, haven't found time to really implement it. Given that you asked now, I can get into implementing this sometime in the coming months.

@xies4
Copy link
Author

xies4 commented Dec 15, 2023

Great! Looking forward to it! Thanks.

@HLHsieh
Copy link

HLHsieh commented Jan 29, 2024

Hi, I am looking forward to it as well!

@hasindu2008
Copy link
Owner

Implementing support for this new RNA004 kit to squigulator. Will do the methylation as soon as that is finished.

@hasindu2008
Copy link
Owner

@HLHsieh and @xies4
Thanks for your patience. A very preliminary methylation simulation feature has been now implemented under the "meth" branch at https://github.com/hasindu2008/squigulator/tree/meth. Note that I have yet tested only using a tiny corona-virus genome and there could be bugs/issues. I am planning to do some testing in the next few weeks, however, if you are interested you can give it a try.

You need to create a 3-column tsv file to have the chr, 0-based position and methylation frequency. For example:

MN908947.3	99	1.0
MN908947.3	1184	0.0
MN908947.3	1456	0.5
MN908947.3	20031 	0.7

Note that they must be tab separated. This example file is in test/methfreq.tsv

If you have a f5c/nanopolish methylation frequency TSV, you can use the following command to generate the necessary file for squigulator:

tail -n+2 methfreq.tsv | cut -f 1,2,7 > sqmeth.tsv

Then you can provide this file to squigulator like below:

./squigulator -x dna-r9-min -n 4000 -o a.blow5 test/nCoV-2019.reference.fasta --meth-freq test/methfreq.tsv

I tested this generated BLOW5 by basecalling (using buttery-eel) and running f5c:

eel -x cuda:all -i a.blow5 -o a.fastq --config dna_r9.4.1_450bps_sup.cfg
minimap2 -ax map-ont test/nCoV-2019.reference.fasta a.fastq  --secondary=no | samtools sort - -o a.bam && samtools index a.bam
f5c index --slow5 a.blow5 a.fastq
f5c call-methylation --slow5 a.blow5 -r a.fastq -b a.bam -g test/nCoV-2019.reference.fasta -o a.tsv
f5c meth-freq -i a.tsv -s -o afreq.tsv

The frequencies in afreq.tsv quite matched the ones in test/methfreq.tsv.

@xies4
Copy link
Author

xies4 commented Mar 1, 2024

Thank you @hasindu2008

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants