feat(repository): Metadata compression config support for indirect content #557

Merged

Collaborator

@PrasadG193 PrasadG193 commented Jul 4, 2024

Overview:

This PR is based on #556

This PR adds support for setting the compression algorithm for indirect-content metadata (content whose ID has the x prefix).

Test plan

  1. Initialize the repo and check the default policy settings. Validate that the default metadata compression is zstd-fastest
$ kopia policy show --global
.
.
Compression disabled.

Metadata compression:
  Compressor:                   zstd-fastest   (defined for this target)
.
.
  1. Create a 4G file of random data and put it in the repo dir. Snapshot the repo dir and observe the stats. Validate that content with the x prefix is compressed with zstd-fastest
$ kopia content stats         
Count: 1221
Total Bytes: 5.1 GB
Total Packed: 5.1 GB (compression 0.0%)
By Method:
  (uncompressed)         count: 1185 size: 5.1 GB
  zstd-fastest           count: 36 size: 180.8 KB packed: 70.8 KB compression: 60.8%
Average: 4.2 MB
Histogram:

        0 between 0 B and 10 B (total 0 B)
        1 between 10 B and 100 B (total 28 B)
       75 between 100 B and 1 KB (total 42.1 KB)
      160 between 1 KB and 10 KB (total 578.1 KB)
       33 between 10 KB and 100 KB (total 662 KB)
        1 between 100 KB and 1 MB (total 365.3 KB)
      951 between 1 MB and 10 MB (total 5.1 GB)
        0 between 10 MB and 100 MB (total 0 B)



$ kopia content list --compression  | grep ^x  
x7cbc5392993071faffbce186055f5621 length 64553 packed 26315 zstd-fastest 59.2% 
x82fbbf71d9d7ee788fa996f089390d4c length 27562 packed 11230 zstd-fastest 59.3% 
xbfae0e6595e5b7e9d6e9582ab5c4ae2e length 26814 packed 10923 zstd-fastest 59.3% 
xeb2db39b3473266527f2e1cc49bdb35b length 9854 packed 4057 zstd-fastest 58.8% 
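
The validation in this step can also be scripted rather than read by eye. The following is a minimal sketch, not part of this PR: `check_x_compressor` is a hypothetical helper name, and the field positions (ID, `length`, N, `packed`, M, compressor, ratio) are assumed from the `kopia content list --compression` output shown above. It only makes sense on a repository where every x-prefixed content was written under the same policy, as is the case at this point in the test plan.

```shell
# Hypothetical helper: reads `kopia content list --compression` output on stdin
# and exits non-zero if any x-prefixed content was not packed with the expected
# compressor, or if no x-prefixed content was seen at all.
check_x_compressor() {
  awk -v want="$1" '
    /^x/ {
      total++
      # Field 6 is the compressor name ("-" when the content is uncompressed).
      if ($6 != want) {
        bad++
        print "unexpected compressor for " $1 ": " $6
      }
    }
    END {
      printf "%d indirect contents checked, %d mismatched\n", total, bad
      exit (bad > 0 || total == 0)
    }
  '
}

# Usage (requires a connected repository):
#   kopia content list --compression | check_x_compressor zstd-fastest
```

Treating an empty match set as a failure is a deliberate choice here, so that a snapshot that produced no indirect content does not pass silently.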
 
  1. Set the metadata compression of the global policy to s2-default
$ kopia policy set --global --metadata-compression=s2-default

$ kopia policy show --global 
.
.
Compression disabled.

Metadata compression:
  Compressor:                     s2-default   (defined for this target)
.
.
  1. Create a new 4G file at a different path. Snapshot it and view the content stats. Validate that the new metadata is compressed with s2-default
$ kopia content stats             
Count: 2067
Total Bytes: 9.1 GB
Total Packed: 9.1 GB (compression 0.0%)
By Method:
  (uncompressed)         count: 2022 size: 9.1 GB
  s2-default             count: 9 size: 117.9 KB packed: 87 KB compression: 26.2%
  zstd-fastest           count: 36 size: 180.8 KB packed: 70.8 KB compression: 60.8%
Average: 4.4 MB
Histogram:

        0 between 0 B and 10 B (total 0 B)
        1 between 10 B and 100 B (total 28 B)
       96 between 100 B and 1 KB (total 54.7 KB)
      218 between 1 KB and 10 KB (total 815.3 KB)
       43 between 10 KB and 100 KB (total 896 KB)
        1 between 100 KB and 1 MB (total 365.3 KB)
     1708 between 1 MB and 10 MB (total 9.1 GB)
        0 between 10 MB and 100 MB (total 0 B)
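
The per-method compression percentages in the stats above are consistent with "space saved", that is, 1 - packed/size. That formula is an assumption inferred from the numbers shown, not taken from the kopia source. A small sketch (`savings_pct` is a hypothetical helper name) reproduces both rows:

```shell
# savings_pct SIZE PACKED: prints the space saved as a percentage with one
# decimal place, assuming the figure is computed as (1 - packed/size) * 100.
savings_pct() {
  awk -v size="$1" -v packed="$2" \
    'BEGIN { printf "%.1f%%\n", (1 - packed / size) * 100 }'
}

savings_pct 180.8 70.8   # zstd-fastest row: prints 60.8%
savings_pct 117.9 87     # s2-default row:   prints 26.2%
```

Both sizes in each call must use the same unit (KB here), since only their ratio matters.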

  1. Disable metadata compression for the tests dir
$ kopia policy set ./tests --metadata-compression=none

$ kopia policy show tests
.
.
Compression disabled.

Metadata compression disabled.
.
.
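
The `kopia policy show` checks in the steps above can be reduced to a single value for use in scripts. This is a rough sketch under stated assumptions: `metadata_compressor` is a hypothetical helper name, and the parsing relies only on the two output shapes shown in this test plan (a `Metadata compression:` heading followed by a `Compressor:` line, or a single `Metadata compression disabled.` line). Other output shapes are not handled.

```shell
# Hypothetical helper: reads `kopia policy show` output on stdin and prints the
# effective metadata compressor, or "none" when metadata compression is disabled.
metadata_compressor() {
  awk '
    /^Metadata compression disabled/ { print "none"; exit }
    /^Metadata compression:/         { in_section = 1; next }
    # The first Compressor: line after the heading names the algorithm.
    in_section && $1 == "Compressor:" { print $2; exit }
  '
}

# Usage (requires a connected repository):
#   kopia policy show --global | metadata_compressor
#   kopia policy show tests    | metadata_compressor
```

The section flag keeps the helper from picking up a `Compressor:` line that belongs to the data compression section rather than the metadata one.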
  1. Create a new 4G file in the tests dir. Snapshot the tests dir and inspect the content. The new metadata should show up as uncompressed
$ kopia content stats          
Count: 2991
Total Bytes: 13.1 GB
Total Packed: 13.1 GB (compression 0.0%)
By Method:
  (uncompressed)         count: 2946 size: 13.1 GB
  zstd-fastest           count: 36 size: 180.8 KB packed: 70.8 KB compression: 60.8%
  s2-default             count: 9 size: 117.9 KB packed: 87 KB compression: 26.2%
Average: 4.4 MB
Histogram:

        0 between 0 B and 10 B (total 0 B)
        3 between 10 B and 100 B (total 208 B)
      140 between 100 B and 1 KB (total 78.3 KB)
      314 between 1 KB and 10 KB (total 1.2 MB)
       61 between 10 KB and 100 KB (total 1.3 MB)
        2 between 100 KB and 1 MB (total 757.8 KB)
     2471 between 1 MB and 10 MB (total 13.1 GB)
        0 between 10 MB and 100 MB (total 0 B)



$ kopia content list --compression  | grep ^x
x107d0a9778be51de0c02987415dc27ef length 51761 packed 51789 - 
x157b090d41b93301c773b879d7d23591 length 24094 packed 18876 s2-default 21.7% 
x171df3825e7748bafcc3c1ae35d82db6 length 27622 packed 27650 - 
x43867f5baa1397e8f88a241e2d425b5f length 51303 packed 39869 s2-default 22.3% 
x7cbc5392993071faffbce186055f5621 length 64553 packed 26315 zstd-fastest 59.2% 
x82fbbf71d9d7ee788fa996f089390d4c length 27562 packed 11230 zstd-fastest 59.3% 
x9827efd6eb27269da98632bcd8dc5ef4 length 27027 packed 21051 s2-default 22.1% 
xbfae0e6595e5b7e9d6e9582ab5c4ae2e length 26814 packed 10923 zstd-fastest 59.3% 
xc93297f25576049e0815902768c1472d length 23957 packed 23985 - 
xeb2db39b3473266527f2e1cc49bdb35b length 9854 packed 4057 zstd-fastest 58.8% 
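
Because the final listing mixes contents written under three different policies, a per-compressor tally is easier to check than the raw lines. The sketch below is illustrative only: `summarize_x_contents` is a hypothetical helper name, and the field positions are assumed from the listing above. On that listing it should report 3 uncompressed, 3 s2-default, and 4 zstd-fastest entries; it also makes visible that each uncompressed entry has a packed size 28 bytes larger than its length.

```shell
# Hypothetical helper: reads `kopia content list --compression` output on stdin
# and prints, per compressor, the number of x-prefixed contents and the summed
# length and packed sizes in bytes.
summarize_x_contents() {
  awk '
    /^x/ {
      # Field 6 is "-" for uncompressed contents.
      method = ($6 == "-") ? "(uncompressed)" : $6
      count[method]++
      length_sum[method] += $3
      packed_sum[method] += $5
    }
    END {
      for (m in count)
        printf "%s count=%d length=%d packed=%d\n", m, count[m], length_sum[m], packed_sum[m]
    }
  ' | LC_ALL=C sort   # awk iteration order is unspecified, so sort for stable output
}

# Usage (requires a connected repository):
#   kopia content list --compression | summarize_x_contents
```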

@PrasadG193 PrasadG193 changed the title Metadata compression config support for indirect content feat(repository): Metadata compression config support for indirect content Jul 4, 2024
@PrasadG193 PrasadG193 force-pushed the md-compression-setting-k-content branch from bb7e7bd to b29e3a5 Compare July 23, 2024 05:29
@PrasadG193 PrasadG193 force-pushed the md-compression-setting-x-content branch from 01b1766 to 1eb6816 Compare July 24, 2024 06:40
@PrasadG193 PrasadG193 marked this pull request as ready for review July 24, 2024 06:40
@e-sumin e-sumin left a comment

Generally looks good to me; at first look I don't see any showstoppers. But I'd wait for a second pair of eyes to have greater confidence.
Also, I see that tests are failing. I'm not sure why (I have not checked; the tests might need to be updated).

@Shrekster Shrekster left a comment

LGTM, thanks for moving the compression parameter to the constructor.

@PrasadG193 PrasadG193 merged commit b03d9e7 into md-compression-setting-k-content Aug 8, 2024
5 of 6 checks passed
@PrasadG193 PrasadG193 deleted the md-compression-setting-x-content branch August 8, 2024 04:45