mspms slow compared with ms #1652

Open
grahamgower opened this issue Apr 15, 2021 · 7 comments
@grahamgower
Member

grahamgower commented Apr 15, 2021

Here's a small three population example without recombination.

# ms.sh
$MS 2 1 \
	-t 1.0 \
	-I 3 1 0 1 \
	-em 0.0 3 2 1.0 \
	-ej 100.0 2 1 \
	-eM 100.0 0.0
$ time MS=ms sh ms.sh > /dev/null

real    0m0,020s
user    0m0,013s
sys     0m0,007s

mspms

$ mspms -V
mspms 1.0.0
$ time MS=mspms sh ms.sh > /dev/null

real    0m0,267s
user    0m0,344s
sys     0m0,806s

I thought maybe this was just because of the value of theta, so I bumped it up to 10000.0, and now the results are even worse.
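For anyone repeating the comparison, theta can be made a parameter rather than edited by hand. This is only a sketch: `run_sim` and the `THETA` variable are hypothetical names that do not appear in the original `ms.sh`, and the remaining arguments are copied from the script above.

```shell
# Sketch: same command line as ms.sh, with theta taken from the environment.
# run_sim and THETA are illustrative names, not part of the original script.
run_sim() {
    # MS selects the simulator binary (ms or mspms); THETA defaults to 1.0
    "${MS:-ms}" 2 1 \
        -t "${THETA:-1.0}" \
        -I 3 1 0 1 \
        -em 0.0 3 2 1.0 \
        -ej 100.0 2 1 \
        -eM 100.0 0.0
}

# Example invocation (assumes mspms is on PATH):
#   MS=mspms THETA=10000.0 run_sim | grep segsites
```

Setting `MS=echo` prints the argument list instead of running a simulator, which is a cheap way to check what would be executed.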

$ time MS=ms sh ms.sh |grep segsites | head
segsites: 2022897

real    0m0,800s
user    0m1,059s
sys     0m0,016s

$ time MS=mspms sh ms.sh |grep segsites | head
segsites: 2006494

real    0m28,929s
user    0m29,017s
sys     0m1,093s
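
To put numbers on the gap, the wall-clock ratios can be computed from the `real` lines above. This is a rough sketch based on the single runs reported here, not a benchmark; `to_seconds` and `slowdown` are hypothetical helper names.

```shell
# Convert a `time` value such as "0m28,929s" to seconds.
# Accepts either a comma or a dot as the decimal separator.
to_seconds() {
    echo "$1" | tr ',' '.' | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the ratio slow/fast, rounded to the nearest integer.
slowdown() {
    LC_ALL=C awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.0f\n", slow / fast }'
}

slowdown 0m0,267s 0m0,020s     # theta = 1.0:     mspms vs ms, prints 13
slowdown 0m28,929s 0m0,800s    # theta = 10000.0: mspms vs ms, prints 36
```

So mspms takes roughly 13 times as long as ms in the first case and roughly 36 times as long in the second.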
@jeromekelleher
Member

Interesting, thanks @grahamgower. The first case is probably just overhead because msprime is doing more complicated things, but the second is a genuine perf regression all right. Looks like the time is all being spent in mutation generation.

Sigh.

@jeromekelleher added this to the 1.0.1 milestone Apr 15, 2021
@grahamgower
Member Author

> Looks like the time is all being spent in mutation generation.

Or perhaps writing the ms-format output?

@jeromekelleher
Member

> Or perhaps writing the ms-format output?

No, I just had a quick look at perf top and a lot of time was being spent in mutgen. No doubt there'll be a good chunk spent in the ms format output too.

@petrelharp
Contributor

Darn. What do you run to see the perf, btw?

@jeromekelleher
Member

Lots of time spent in the AVL trees, IIRC. I didn't look too hard though, just confirming that it wasn't Python overhead or something.

@petrelharp
Contributor

I meant - what tool do you use to look at this?

@jeromekelleher
Member

Ah, sorry. Linux perf: https://perf.wiki.kernel.org/index.php/Main_Page

It's awesome. perf top in particular is super useful for getting a sense of what's important during the different stages of a computation.
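
For reference, one possible way to capture a profile of the command from the top of this issue is sketched below. It assumes Linux perf is installed and that `ms.sh` is in the current directory; `profile_ms` is a hypothetical helper name, and this is not necessarily how the profile mentioned above was taken.

```shell
# Sketch: record a call-graph profile of ms.sh run with a given simulator.
# profile_ms is an illustrative name; perf.data is perf's default output file.
profile_ms() {
    sim=$1                  # simulator binary, e.g. mspms
    out=${2:-perf.data}     # where to write the recorded samples
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found" >&2
        return 127
    fi
    # -g records call graphs; everything after -- is the command to profile
    MS=$sim perf record -g -o "$out" -- sh ms.sh > /dev/null
}

# Example (not run here):
#   profile_ms mspms && perf report -i perf.data
```

`perf top` gives a live view of the same information while the simulation is running. Depending on the system's `kernel.perf_event_paranoid` setting, either command may need elevated privileges.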
