refactor: set bootstrap data locally instead of downloading #153
Conversation
Discussed with @StoneDot: dy should work as a single binary, so using something like https://doc.rust-lang.org/std/macro.include_str.html or https://doc.rust-lang.org/std/macro.include_bytes.html is better.
~/desktop/dynein
$ cp ./target/release/dy ~/desktop/
~/desktop
$ ./dy bootstrap
Bootstrapping - dynein will create 4 sample tables defined here:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AppendixSampleTables.html
'ProductCatalog' - simple primary key table
Id (N)
'Forum' - simple primary key table
Name (S)
'Thread' - composite primary key table
ForumName (S)
Subject (S)
'Reply' - composite primary key table, with GSI named 'PostedBy-Message-Index'
Id (S)
ReplyDateTime (S)
[skip] Table 'ProductCatalog' already exists, skipping to create new one.
[skip] Table 'Forum' already exists, skipping to create new one.
[skip] Table 'Thread' already exists, skipping to create new one.
[skip] Table 'Reply' already exists, skipping to create new one.
Still CREATING following tables: []
All tables are in ACTIVE.
Tables are ready and retrieved sample data locally. Now start writing data into sample tables...
Error: LoadData(Os { code: 2, kind: NotFound, message: "No such file or directory" })
Thank you for creating this pull request. I think this PR is great because it opens a path to using the bootstrap command in an isolated network.
I think the code quality is good. However, I have a concern regarding the size of the binary, which is why I share the results of my testing at the bottom of this review.
It seems that the binary size is almost doubled. That is an acceptable level; however, it would be nice if we could shrink it. Could you try putting compressed files in the resources folder and decompressing their contents at runtime? I think LZMA/LZMA2 is a good option for compression because it is well known for its compression ratio, and there is a pure Rust library, lzma_rs. Also, we should measure the effectiveness. An article I found indicates that brotli may have good characteristics.
Additionally, could you change the type of the pull request from `cleanup` to `refactor`?
❯ git show --no-patch
commit 96fb355bf8a08555215b54970c864344b24ffdab (HEAD -> main, upstream/main)
Author: Hirotaka Tagawa <[email protected]>
Date: Thu Aug 17 22:08:18 2023 +0100
fix: prevent infinite loop when DynamoDbClient connection fails
❯ cargo build -j 8 --release
**snip**
❯ ls -lh ./target/release/dy
Permissions Size User Date Modified Name
.rwxr-xr-x 12M hiroag 24 8 18:42 ./target/release/dy
❯ gh pr checkout 153
**snip**
❯ git show --no-patch
commit 2e706dc36f531bfb67f3a2e36e63fb88c006062e (HEAD -> bootstrap-local)
Author: Hirotaka Tagawa <[email protected]>
Date: Fri Aug 18 17:31:23 2023 +0100
fix: use include_str macro
❯ cargo build -j 8 --release
**snip**
❯ ls -lh ./target/release/dy
Permissions Size User Date Modified Name
.rwxr-xr-x 21M hiroag 24 8 18:45 ./target/release/dy
It seems that bzip2 achieves the best compression ratio. brotli and lzma also achieve good compression rates. Based on the maturity of the library and the compression rate, I think rust-brotli is the best choice. Please feel free to comment with your opinion.
❯ lzma -k -e moviedata.json
❯ bzip2 -9 -k moviedata.json
❯ brotli -k moviedata.json
❯ gzip -9 -k moviedata.json
❯ zstd -19 -k moviedata.json
❯ lz4 -9 -k moviedata.json
❯ ls -l | grep moviedata
.rw-r--r-- 3.7M hiroag 25 8 10:43 moviedata.json
.rw-r--r-- 480k hiroag 25 8 10:43 moviedata.json.br
.rw-r--r-- 423k hiroag 25 8 10:43 moviedata.json.bz2
.rw-r--r-- 731k hiroag 25 8 10:43 moviedata.json.gz
.rw-r--r-- 823k hiroag 25 8 10:43 moviedata.json.lz4
.rw-r--r-- 486k hiroag 25 8 10:43 moviedata.json.lzma
.rw-r--r-- 504k hiroag 25 8 10:43 moviedata.json.zst
// Step 3. decompress data
let compressed_data = include_bytes!("./resources/bootstrap/moviedata.json.br");
let cursor = Cursor::new(compressed_data);
let mut decompressor = Decompressor::new(cursor, 4096);
As in this example, I set the buffer size to 4096.
I confirmed that the difference in the size of the binary is negligible.
❯ git show --no-patch
commit 1264580374ba26c60d79068eff33569838d9b655 (HEAD -> bootstrap-local)
Author: Hirotaka Tagawa <[email protected]>
Date: Tue Aug 29 22:43:44 2023 +0100
refactor: decompress by brotli
❯ ls -lh ./target/release/dy
Permissions Size User Date Modified Name
.rwxr-xr-x 12M hiroag 8 9 13:47 ./target/release/dy
@wafuwafu13 Thank you for implementing compression. It looks good to me. Could you squash the commits into a single one?
(force-pushed from 1264580 to ffad22b)
It looks good to me. Thank you!
Issue #, if available:
#149, #143
Description of changes:
#149 (comment)
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.