Performance is Slow with Docker + Wine; Migrate to wsl2 for converter #20
This is because of stat(), which is often the cause of the I/O wait I was showing you a while back. stat() pulls in all the metadata, which you often don't need, like mtime, ctime, atime, etc. In fact, we turn off atime on the fileserver since it's always different and can quickly cause an I/O bottleneck: it never stays in cache (it's always new) and it's never really used.

$ man 2 stat # shows you the structure of all information in an inode. All of this is collected with any stat call.

Note that directories are just files with tables of what's in them. The tables contain the list of filename/inode pairs — an inode with a table of inodes, including a parent inode of "..". This is why a "mv" in UNIX is virtually instantaneous: you're only updating the tables in two inodes. No data actually moves.

Take-home lesson: with any code, avoid the stat unless you need it. If you only need filenames, all you're doing is reading a table in a series of inodes. |
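[Editorial sketch, not part of the thread: in Python, `os.stat` surfaces the same `struct stat` fields described in `man 2 stat`, while `os.listdir` only reads the directory's name table. The filename below is a placeholder.]

```python
# os.stat pulls the full inode metadata for a single file...
import os

info = os.stat("some_file.tif")  # placeholder filename
print(info.st_ino)    # inode number
print(info.st_size)   # size in bytes
print(info.st_mtime)  # last modification time
print(info.st_atime)  # last access time
print(info.st_ctime)  # last inode change time

# ...whereas listing a directory only reads its table of names,
# with no per-file stat() required.
names = os.listdir(".")
print(len(names))
```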
Thank you for this tip, Bryan! I had no idea that's how these things work and that it was pulling all that information whenever you called it. In this case we don't even need the filenames! We literally just need to know how many files exist! |
No problem. Here's a real-world test I just did to demonstrate: create 1000 files, and iterating just the filenames is 68.6 times faster than iterating both filenames and stats for each of the 1000 files. Let us know if you're seeing performance issues in general; we may have some insight into why and can help out. |
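[A rough reconstruction of the kind of test Bryan describes — the assumed setup is 1000 empty files in a temp directory; the original script and its timings weren't preserved in the thread.]

```python
# Compare iterating filenames only vs. filenames + a stat() per file.
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as d:
    for i in range(1000):
        open(os.path.join(d, f"file_{i:04d}"), "w").close()

    t0 = time.perf_counter()
    names = os.listdir(d)  # reads only the directory's name table
    t1 = time.perf_counter()
    stats = [os.stat(os.path.join(d, n)) for n in names]  # one stat() each
    t2 = time.perf_counter()

    print(f"names only:   {t1 - t0:.6f}s for {len(names)} files")
    print(f"names + stat: {t2 - t1:.6f}s")
```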
Do you need a file count? Or just whether any file(s) exist or not? |
We just need the file count in the directory. The converter is killed only when the specified number of images has been collected or when a very long timeout period is hit. As far as I'm aware, the converter doesn't output anything when it finishes successfully or crashes, so there's no output to watch for. What happens instead is that the file system is polled every 10 seconds to see if the number of found images matches the number of expected images from the recording. See here for the conditional it's looking for. |
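[A minimal sketch of the polling loop described above; the directory, expected count, and timeout are hypothetical parameters, not the project's actual code.]

```python
import os
import time

def wait_for_tiffs(out_dir: str, expected: int, timeout_s: float) -> bool:
    """Return True once `expected` tiffs exist in `out_dir`, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        # os.scandir yields entries from the directory table; counting
        # names requires no stat() per file.
        count = sum(
            1 for e in os.scandir(out_dir) if e.name.endswith((".tif", ".tiff"))
        )
        if count >= expected:
            return True
        time.sleep(10)  # poll every 10 seconds, as described above
    return False
```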
Here's some very rough, not-standard benchmarking with different uses of the ripper. There's a summary at the bottom.

Note: All "Total to tiff time" values include the time it takes the ripper to convert the voltage recording binary into a csv file, plus the transfer from server to compute node when applicable. Directory locations are noted, as well as where things are being written. This assumes that the raw binaries/metadata files have already been written to the server for the REMOTE conversions; for LOCAL directories, the binaries are stored on the local machine's SSD.

For the Avg Tifs/sec metric, the number is likely a fair bit higher than what's written for the LOCAL ON NATIVE WINDOWS measures because, again, the conversion to csv happens first and is included in the total time. It would be somewhat higher for the REMOTE TO ones. Memory usage and disk write speeds during conversion have yet to be profiled. I'm also not sure how to do that without just staring at the screen (which is what I'm currently doing).

LOCAL ON NATIVE WINDOWS
Using: C:\Users\jdelahanty\Desktop\20211105_CSE020_plane1_-587.325_raw-013
Writing to: C:\Users\jdelahanty\Desktop\local_out
Ripper: 5.5.64.500
AMD Ryzen 9
Copying FROM server TO local machine special-k.snl.ad.salk.edu: 22 min (the transfer topped out close to 100MB/s most of the time)
Total Tiffs: 45510
Total Conversion Time: ~ 6 min (!)
Avg Tifs/sec: ~ 126/s (!)
Total to tiff time: ~ 28 min

Writing to disk this way shows a pretty sustained tiff creation rate of 323MB/s during processing, and memory usage remains at just under 0.5GB throughout the whole conversion.

If there's a way to speed up transfers to the windows machine upstairs, that would be very cool. A better solution would be a dedicated windows machine linked over some fast ethernet to move things to local SSDs, but as far as I'm aware, one is not available in the cluster at the moment. The other option would be to have a script invoke subprocesses on that windows machine so conversions can happen in parallel. I don't know how to tell a specific Windows machine to do things from a script, though...

REMOTE TO NATIVE WINDOWS
Using: X:\specialk\jeremy_testing\ripping_tests\20211105_CSE020_plane1_-587.325_raw-013
Ripper: ibid
AMD Ryzen 9
Total Tiffs: 45510
Total Conversion Time: ~ 19 min
Avg Tifs/sec: ~ 40
Total to tiff time: ~ 19 min

Reading over the 1Gbps line slows conversions down quite a bit, though the total isn't much shorter than the Windows local total because of how long the transfer to Austin's windows machine upstairs took. This run uses approximately the same amount of memory (just under 0.5GB) consistently, but writes to disk at a consistently slow rate of about 22MB/s! It pulls data from the network at nearly 0.5Gbps, so half the bandwidth of the available 1Gbps line...

REMOTE TO COMPUTE NODE SCRATCH, DOCKER WINE
Using: X:\specialk\jeremy_testing\ripping_tests\20211105_CSE020_plane1_-587.325_raw-013
Ripper: ibid
Total Tiffs: 45510
Total Conversion Time: ~ 28 min
Avg Tifs/sec: ~ 27
Total to tiff time: ~ 28 min

This takes slightly longer, maybe because it's reading data into the machine over the LAN before processing it, but it's much more likely related to Docker + Wine doing things slowly, as Annie has mentioned. Given the slow tiff write speeds over the network shown in the REMOTE TO NATIVE WINDOWS section, the network is also a candidate for the slowness, but since using local scratch space for reading/writing data is similarly slow overall (below), Docker + Wine seem to be the more likely culprits...

LOCAL COMPUTE NODE SCRATCH TO LOCAL SCRATCH, DOCKER WINE
Location on cheetos.snl.salk.edu
cp /snlkt/data/specialk/jeremy_testing/ripping_tests/20211105_CSE020_plane1_-587.325_raw-013 /scratch/snlkt2p_format_testing/
took 1m17.496s
Using: /scratch/snlkt2p_format_testing/20211105_CSE020_plane1_-587.325_raw-013
Ripper: ibid
Total Tiffs: 45510
Total Conversion Time: ~ 23 min
Avg Tifs/sec: ~ 33

As expected, moving data to/from nodes and storage in the datacenter is very fast. However, there was only a slight speedup from having the data available on the machine's SSD (disappointingly).

Watching resource usage throughout this conversion, the disk read/write speeds stand out as very slow. Once tiffs start being generated, writes begin at an okay rate, but over time this slowly decreases further and further until it looks like it hits a floor of about 10000 kB_wrtn/s. Telling whether this is a hard-drive-specific thing or something going on in the code is hard, since we have no idea what the converter is actually doing... The only thing that remains to be tested is the subject of this issue, really. I feel like it's most likely a Wine thing, as Annie said, but the conversion starts out doing at least okay. Why would it slow down so badly the longer it's been running? And why doesn't it ever reach the fast 300+MB/s tiff generation seen on a windows machine? I'm still not sure how to properly monitor file IO/writing on linux beyond watching it live.

Summary: As we've known, performing the ripping on a windows machine with the data stored locally on the computer's SSD performs much, much faster. I'm not aware of a dedicated windows machine on which we can run this procedure through shell scripting, as Annie has suggested, which would be really cool. Sending the raw binaries over the network and then performing the conversion incurs a performance penalty, which is compounded by the slowdowns in the docker container (probably because Wine isn't translating the conversions to unix calls very well in this case, or for the other reasons mentioned above/performance penalties due to virtualization through docker). Making copies of files from the server to local scratch spaces in the data center is crazy fast (just over 1 min for 75GB of stuff!), especially compared to Austin's special-k computer (as expected, since it's on a slower connection), but the unnecessary copying of data, even temporarily, isn't ideal and I would imagine it's something we'd like to avoid.

There are definitely better ways of doing this benchmarking, and there are logs available from the Docker implementation about tif generation, but I'm not sure how to properly log things like memory usage, disk write speeds, and cpu usage to files yet (see the sketch below for one option). |
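[On the open question of logging memory, disk, and cpu usage to files: an editorial sketch, not something from the thread, using the psutil library (`pip install psutil`). The output path and sample interval are arbitrary choices; run it alongside the ripper and stop it with Ctrl-C.]

```python
# Sample system-wide CPU, memory, and disk-write throughput to a CSV.
import csv
import time

import psutil

LOG_PATH = "ripper_metrics.csv"  # hypothetical output file
INTERVAL = 10                    # seconds between samples

def main() -> None:
    last_write_bytes = psutil.disk_io_counters().write_bytes
    with open(LOG_PATH, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time", "cpu_percent", "mem_used_gb", "disk_write_MBps"])
        while True:
            time.sleep(INTERVAL)
            now_bytes = psutil.disk_io_counters().write_bytes
            write_mbps = (now_bytes - last_write_bytes) / INTERVAL / 1e6
            last_write_bytes = now_bytes
            writer.writerow([
                time.strftime("%H:%M:%S"),
                psutil.cpu_percent(),                    # CPU % since last call
                psutil.virtual_memory().used / 1e9,      # used memory, GB
                round(write_mbps, 2),                    # avg write rate, MB/s
            ])
            f.flush()  # keep the log current if the run is killed

if __name__ == "__main__":
    main()
```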
FYI - you can boot a linux kernel in windows: use WSL v2. This is pure linux, as opposed to a compatibility layer like Wine. https://docs.microsoft.com/en-us/windows/wsl/about

I've been using wsl for some time now when I have to use windows. It adds the "UNIX" component to windows, similar to how OSX is built around a UNIX BSD kernel, so you get that for free. In all cases I have a true bash shell, exec'ing code/apps from there.

Once I figure out what the "converter" is, I can profile a breakdown. i.e. I can test data through a raw tcp/ip socket directly between cpus (a rough sketch of that kind of test follows this comment), then add in I/O to/from different forms of storage, etc. Break up the problem. You have a bottleneck somewhere; we just need to find it. We have visibility into what's happening on the storage side as well, and can add visibility into the systems too.

Could you gather a bundle of code that is the "converter" and put it on the fileserver where I can access it? Or maybe it's already a git repo? :-) I'm guessing it is, but some insight will save me some digging around. I know you gave me the example dir location of a "pile of tiffs" needing processing.

The push should be towards linux (and related tools) for pipeline processing. That's where all the power is located (99+%): compute, storage, and network.

Thanks, Bryan
|
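[A rough sketch of the kind of raw TCP throughput test Bryan mentions above — an editorial illustration, not his tooling. The port, payload size, and total transfer are arbitrary; run `server` on one host and `client <host>` on the other.]

```python
import socket
import sys
import time

PORT = 5001
CHUNK = 1 << 20  # 1 MiB per send
TOTAL = 1 << 30  # send 1 GiB total

def server() -> None:
    with socket.create_server(("", PORT)) as srv:
        conn, addr = srv.accept()
        with conn:
            received = 0
            t0 = time.perf_counter()
            while True:
                data = conn.recv(CHUNK)
                if not data:  # client closed the connection
                    break
                received += len(data)
            dt = time.perf_counter() - t0
            print(f"{received / dt / 1e6:.1f} MB/s from {addr}")

def client(host: str) -> None:
    payload = b"\x00" * CHUNK
    with socket.create_connection((host, PORT)) as conn:
        sent = 0
        while sent < TOTAL:
            conn.sendall(payload)
            sent += len(payload)

if __name__ == "__main__":
    # usage: python tcp_test.py server | python tcp_test.py client <host>
    if sys.argv[1] == "server":
        server()
    else:
        client(sys.argv[2])
```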
Sent along an email a little bit ago that has that kind of information! Can repost here if we'd like.
I remember now that you mentioned this! I completely forgot about it. Do we have any machines that are running wsl v2 that we can try running the converter on? |
Hey,

Yeah, I wanted to elaborate a bit in this thread, thinking others might be watching it — and if not, so that the information pertaining to this code base is located here. I think I'll close out that ticket, referencing this thread, and document the progress here where it's related.

You can install wsl2 on any machine. What's the name of your win machine? I'll install it for you.

Thanks, Bryan
|
The machine that I'm working on at my desk is called busdriver (so busdriver.snl.ad.salk.edu). On the cluster I've been using Cheetos for the docker version.

So, to be clear here: the converter (you can find the executables here) was written for Windows. If I'm understanding your suggestion correctly, the idea would be to have a Windows machine with wsl2 installed in the cluster that can run this converter. That machine could then be told to execute the converter from a different linux machine running a shell script or something (a hypothetical sketch of that follows below). If we ran it this way, we would avoid the use of Docker/Wine altogether and get the super quick performance of a native windows machine with the flexibility of Linux. |
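[A hypothetical sketch of "telling a Windows machine to do stuff from a script": if the Windows OpenSSH server were enabled on busdriver, a Linux-side script could launch the converter over ssh. The ripper path and argument below are placeholders, not the real executable's interface.]

```python
import subprocess

HOST = "jdelahanty@busdriver.snl.ad.salk.edu"           # assumed SSH-reachable
RIPPER = r"C:\path\to\Image-Block-Ripping-Utility.exe"  # placeholder path
DATA_DIR = r"D:\raw\20211105_CSE020_plane1"             # placeholder path

# Launch the converter remotely and collect its exit status/output.
result = subprocess.run(
    ["ssh", HOST, RIPPER, DATA_DIR],
    capture_output=True,
    text=True,
)
print(result.returncode, result.stdout, result.stderr)
```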
One thing that was discussed at our 2P meeting yesterday was to just have a Windows machine next to the Bruker computer that does nothing but ripping, which will be nice and fast, and then runs the conversion to H5 before the data is sent to the server. Writing the tiffs to H5 using this takes 2-3 minutes, but there are other implementations we can use too (a rough sketch follows below). |
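[For context, a minimal sketch of packing ripped tiffs into a single HDF5 file with tifffile and h5py; the directory, dataset name, and chunking below are assumptions, not the implementation referenced above.]

```python
import glob

import h5py
import tifffile

tiff_paths = sorted(glob.glob("local_out/*.tif"))  # assumed output directory
first = tifffile.imread(tiff_paths[0])             # infer frame shape/dtype

with h5py.File("recording.h5", "w") as f:
    dset = f.create_dataset(
        "frames",
        shape=(len(tiff_paths), *first.shape),
        dtype=first.dtype,
        chunks=(1, *first.shape),  # one frame per chunk for easy frame reads
        compression="gzip",
    )
    for i, path in enumerate(tiff_paths):
        dset[i] = tifffile.imread(path)
```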
According to a chat I had in the zarr developer gitter (need to find that conversation in particular), things might slow down due to the use of glob to determine how many tiffs are present in the output directory. They suggested using os.scandir instead. Another friend I was seeking advice from gave a similar piece of advice for finding the source of slowdowns. It's worth a shot to try this out and, even if it doesn't work, it's probably better to use os.scandir anyway for this use case (see the sketch below).
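[A small sketch of the suggested swap; the output directory is hypothetical.]

```python
import glob
import os

out_dir = "/scratch/snlkt2p_format_testing/output"  # hypothetical

# glob builds a full list of matching paths and does pattern matching.
n_glob = len(glob.glob(os.path.join(out_dir, "*.tif")))

# os.scandir streams directory entries; counting names avoids building
# path strings and (on Linux) avoids a stat() per entry.
n_scandir = sum(1 for e in os.scandir(out_dir) if e.name.endswith(".tif"))

assert n_glob == n_scandir
```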