Increasing memory usage #22
By the way, here's the current output from
Thanks for the detailed report! I too noticed the ever-increasing memory usage, which definitely looks like a memory leak. Could you try recompiling with a newer version of Go?
Thanks for the quick reply :-) This was freshly compiled using Go 1.10.3, which I believe is the latest. However, there's one important mistake I made: I had tested this other memory increase with restic, not zbackup (I bailed on zbackup early because it seemed too slow). So it's entirely possible it's due to restic's chunker (and it was restic that died after 6 TiB, not zbackup).

Later I'll try to run it with perf and pprof and see if I can figure out where the leak is coming from. I'm a Go newbie though, so it might be hard ;-)
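For anyone wanting to reproduce that kind of profiling run: a minimal way to get a heap profile out of a Go binary is to dump it with runtime/pprof and open it with `go tool pprof`. The snippet below is only an illustration (the helper name and file path are made up), not something that exists in scat today.

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

// writeHeapProfile dumps the current heap allocations to a file that can
// later be inspected with `go tool pprof heap.pprof`. The function name
// and file path are purely illustrative.
func writeHeapProfile(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	return pprof.WriteHeapProfile(f)
}

func main() {
	// ... run the backup pipeline here, then dump the profile ...
	if err := writeHeapProfile("heap.pprof"); err != nil {
		log.Fatal(err)
	}
}
```

For a long-running process, importing net/http/pprof (plus a small HTTP listener) exposes /debug/pprof/heap for live inspection instead of writing a file at the end.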
OK, so I ran some initial tests with pprof. I started with a simple proc of

I discovered it seems to be leaky by design: in procs/index.go:62, it assigns the chunk hash to an in-memory map, and it later uses that map to see whether the chunk was already processed. I originally imagined it wouldn't do that, since it could check the same thing by seeing whether the appropriate filename already exists in the output directory. So I don't see a simple way of fixing it, short of changing how it works and possibly making it slower in the process (although the filesystem checks could perhaps be cached by the OS).

I then ran it again with the original proc and some real data from

My plan is to rewrite it so that it uses an on-disk database of chunks. I'll need this for other features as well, such as being able to restore only particular files rather than an entire backup, or being able to keep track of tape/disk changes (i.e. backing up a huge filesystem to many smaller BluRays, tapes, or USB HDDs, only a few of which are connected at a given time). This should also help with #23, as it'll then be easier to rename the output chunks (and group them into bigger ones, to also hide the individual chunks' sizes).
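To make the on-disk idea concrete, here is a rough sketch of what such a chunk index could look like, using bbolt as an example embedded key-value store. All names here (DiskIndex, Seen, Add) are hypothetical and not part of scat's current code; it's just one way the in-memory map in procs/index could be swapped for something persistent.

```go
package index

import (
	bolt "go.etcd.io/bbolt"
)

var bucket = []byte("chunks")

// DiskIndex is a hypothetical replacement for the in-memory map in
// procs/index.go: chunk hashes are persisted in a small key-value store
// on disk instead of being held in RAM for the lifetime of the process.
type DiskIndex struct {
	db *bolt.DB
}

// Open creates or opens the index database at the given path.
func Open(path string) (*DiskIndex, error) {
	db, err := bolt.Open(path, 0600, nil)
	if err != nil {
		return nil, err
	}
	if err := db.Update(func(tx *bolt.Tx) error {
		_, err := tx.CreateBucketIfNotExists(bucket)
		return err
	}); err != nil {
		db.Close()
		return nil, err
	}
	return &DiskIndex{db: db}, nil
}

// Seen reports whether a chunk hash has already been processed.
func (i *DiskIndex) Seen(hash []byte) (bool, error) {
	var seen bool
	err := i.db.View(func(tx *bolt.Tx) error {
		seen = tx.Bucket(bucket).Get(hash) != nil
		return nil
	})
	return seen, err
}

// Add records a chunk hash as processed.
func (i *DiskIndex) Add(hash []byte) error {
	return i.db.Update(func(tx *bolt.Tx) error {
		return tx.Bucket(bucket).Put(hash, []byte{1})
	})
}

// Close releases the underlying database file.
func (i *DiskIndex) Close() error { return i.db.Close() }
```

With something like this, memory stays flat regardless of how many chunks have been seen, at the cost of one read transaction (and one write for new chunks) per chunk; whether that overhead is acceptable compared to the current map lookup is exactly the trade-off discussed above.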
Hi @goblin - glad you're still active on this project, and thanks for having investigated the leak. I must admit though, I'm not using scat at the moment and I've forgotten most of the internals, nor do I have the incentive to look at them in detail. However, from what I understand, your idea of rewriting procs/index to use an on-disk database seems sensible. Index history would have to be stored within that database instead of git (since it wouldn't be a simple text file anymore), but other than that, why not. Good luck! I'd be curious to see if this fixes the leak. Hopefully it will 🍀

May I add, I still do believe in the idea behind the project and still need such a tool. I've since fallen back to cleartext syncing to Google Drive 😫 to at least have some kind of backup, despite the privacy issues and risk of loss. It's just that some open issues were preventing me from using scat as I initially envisioned it, and I didn't have the guts to address them head on. For the past few years I've had it brewing in mind to either give it another go in the current code base, or rewrite the whole thing in Ruby. Yes, single-threaded, slow Matz Ruby - so enjoyable to code in that everything feels possible: easy to experiment with, tinker with, tear apart and rewrite, or even... make performant, paradoxically. Should that last point prove infeasible, there's Crystal, hehe.
I tried Ruby a few years ago, and I'm much more fond of learning Go at the moment ;-) Especially given that you've done so much work on it in Go.
Which issues, specifically?
Hi, thanks for this great backup solution! :-)
I'm wondering why it's consuming so much RAM during backup. I'm using this proc:
I'm currently at about 10 TiB of data from tar, and scat is now using 57 GiB of RAM. This amount is always increasing; at about 5 TiB it was around 30 GiB. I've noticed zbackup behaved similarly (but it died with a stacktrace after about 6 TiB of input).
What's this RAM needed for?
The index produced on stdout is currently 2.5 GiB in size, so even if scat is storing all the checksums of all the chunks produced so far, it's still using over 40x as much memory as it should :-S But storing all the chunk checksums in RAM shouldn't be necessary, because the filesystem could be queried to see if they exist already...
Thanks :-)
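The "query the filesystem" idea from the report could look roughly like the sketch below. The flat, hex-named chunk layout is an assumption made purely for illustration and may not match how scat actually names its output files.

```go
package main

import (
	"encoding/hex"
	"os"
	"path/filepath"
)

// chunkExists checks whether a chunk with the given hash has already been
// written, by looking for its file in the output directory instead of
// consulting an in-memory map. The hex-named flat layout is an assumption
// for this sketch, not necessarily scat's real on-disk layout.
func chunkExists(outDir string, hash []byte) (bool, error) {
	name := hex.EncodeToString(hash)
	_, err := os.Stat(filepath.Join(outDir, name))
	if err == nil {
		return true, nil
	}
	if os.IsNotExist(err) {
		return false, nil
	}
	return false, err
}
```

Repeated os.Stat calls on the same directory are typically served from the OS's dentry cache, which is the caching effect mentioned elsewhere in the thread.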