Add Zstd compression support to S3 plugin #439
base: master
Thanks for this enhancement.
0d0bf95 to 6af3b5d
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](actions/checkout@v3...v4) --- updated-dependencies: - dependency-name: actions/checkout dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: yongwoo.kim <[email protected]>
Signed-off-by: yongwoo.kim <[email protected]>
Signed-off-by: ddukbg <[email protected]>
@ddukbg Thanks! I will review this soon. FYI: We are just now getting Zstd support on Fluentd's side.
@daipom hello :)
ruby/setup-ruby action has installed proper
So, I think you can replace
Then, I think the tests might be successful.
Before: - name: Install dependencies run: gem install bundler rake After: - name: Install dependencies run: gem install rake Signed-off-by: ddukbg <[email protected]>
@Watson1978 Thank you for your valuable feedback. I have modified the
Thanks for this enhancement!
Could you please check my following comments?
Signed-off-by: ddukbg <[email protected]>
…mments Moved ZstdCompressor tests from test_in_s3.rb to test_out_s3.rb as they relate to the out_s3 plugin. Signed-off-by: ddukbg <[email protected]>
…omments Added tests for ZstdCompressor to test_out_s3.rb following the maintainer's suggestions. Signed-off-by: ddukbg <[email protected]>
Add ZstdCompressor to S3 Plugin and Fix Tests According to Maintainer's Feedback
@daipom Thank you for your valuable feedback :) Changes Made:
Fluentd Log
testcode
testcode(rspec)
Notes:
Please review the changes, and let me know if any further adjustments are needed.
Thanks! Sorry for my late response.
Basically, it looks good to me.
Could you please fix the following?
Remove redundant spaces to improve code readability and consistency
Co-authored-by: Daijiro Fukuda <[email protected]>
Signed-off-by: ddukbg <[email protected]>
refactor: Simplify data compression logic
Remove duplicate file reading and streamline compression process
Co-authored-by: Daijiro Fukuda <[email protected]>
Signed-off-by: ddukbg <[email protected]>
@daipom
Could you please review the changes when you have a chance?
4d2dbff to b1efca1
Signed-off-by: ddukbg <[email protected]>
I'm sorry to bother you, but I think it would be better to implement the compressor in a separate file.
Could you please confirm the following comments?
It would be better to put the implementation of ZstdCompressor in a separate file so that require 'zstd-ruby' is only executed when necessary.
How about adding such a file instead of fixing out_s3.rb?
s3_compressor_zstd.rb
require 'zstd-ruby'

module Fluent::Plugin
  class S3Output
    class ZstdCompressor < Compressor
      S3Output.register_compressor("zstd", self)

      config_section :compress, param_name: :compress_config, init: true, multi: false do
        desc "Compression level for zstd (1-22)"
        config_param :level, :integer, default: 3
      end

      def ext
        'zst'.freeze
      end

      def content_type
        'application/x-zst'.freeze
      end

      def compress(chunk, tmp)
        data = chunk.read.gsub(/\r\n/, "\n").force_encoding('UTF-8')
        compressed_data = Zstd.compress(data, level: @compress_config.level)
        tmp.write(compressed_data)
      rescue => e
        log.warn "zstd compression failed: #{e.message}"
        raise e
      end
    end
  end
end
The points are
- require 'zstd-ruby' will be executed only when necessary.
  - This file is automatically loaded when store_as is set to zstd. (We don't need to change out_s3.rb.)
- We can change level as follows:
    <match test>
      @type s3
      ...
      <compress>
        level 1
      </compress>
    </match>
  - Note: We can omit the compress section to use the default value 3.
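The "loaded only when necessary" point can be illustrated with a minimal, self-contained sketch. It is not Fluentd's registry code: LazyRegistry, DEMO_REGISTRY, the demo_compressor_ file prefix, and FakeCompressor are all hypothetical names made up for this illustration, and the throwaway file stands in for a compressor file such as the one proposed above. The idea shown is only that a lookup by name can require the matching file on first use, so the file's own require lines never run for users who pick a different store_as.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical stand-in for a plugin registry (NOT Fluentd's actual class).
class LazyRegistry
  def initialize(prefix)
    @prefix = prefix
    @map = {}
  end

  # Called by each compressor file at load time to register itself.
  def register(name, klass)
    @map[name.to_s] = klass
  end

  # Returns the registered class, requiring "<prefix><name>" on first use.
  def lookup(name)
    key = name.to_s
    require "#{@prefix}#{key}" unless @map.key?(key)
    @map.fetch(key)
  end
end

DEMO_REGISTRY = LazyRegistry.new('demo_compressor_')

# Write a throwaway "compressor file" so the sketch runs on its own.
DEMO_DIR = Dir.mktmpdir
at_exit { FileUtils.remove_entry(DEMO_DIR) }
File.write(File.join(DEMO_DIR, 'demo_compressor_fake.rb'), <<~RUBY)
  class FakeCompressor
    def ext
      'fake'
    end
  end
  DEMO_REGISTRY.register('fake', FakeCompressor)
RUBY
$LOAD_PATH.unshift(DEMO_DIR)

loaded_before = Object.const_defined?(:FakeCompressor) # file not loaded yet
compressor = DEMO_REGISTRY.lookup('fake').new          # first lookup triggers the require
loaded_after = Object.const_defined?(:FakeCompressor)

puts "before=#{loaded_before} after=#{loaded_after} ext=#{compressor.ext}"
```

Under this pattern, a missing optional dependency only surfaces as a load error for configurations that actually select that compressor.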
def compress(chunk, tmp)
  begin
    data = chunk.read.gsub(/\r\n/, "\n").force_encoding('UTF-8')
data = chunk.read.gsub(/\r\n/, "\n").force_encoding('UTF-8')
Could you please tell me why this is necessary?
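For context on the question, here is a small sketch of what that expression does to a payload. It does not answer why the line was added, since the thread does not say. The helper name normalize_like_diff is hypothetical and exists only for this illustration; the sample strings are made up.

```ruby
# Hypothetical helper wrapping the exact expression quoted from the diff.
def normalize_like_diff(data)
  data.gsub(/\r\n/, "\n").force_encoding('UTF-8')
end

# Sample payload with CRLF line endings, held as raw bytes.
raw = "line1\r\nline2\r\n".b
normalized = normalize_like_diff(raw)

puts raw.bytesize         # 14: the original bytes are untouched
puts normalized.bytesize  # 12: each CRLF was rewritten to LF
puts normalized.encoding  # UTF-8

# force_encoding only relabels the bytes; it neither validates nor transcodes.
binary = "\xFF\xFE\r\n".b
puts normalize_like_diff(binary).valid_encoding?  # false
```

In other words, the uploaded object would no longer be byte-identical to the chunk contents, which matters if the records contain CRLF sequences or non-UTF-8 data.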
def test_configure_with_mime_type_zstd
  conf = CONFIG.clone
  conf << "\nstore_as zstd\n"
  d = create_driver(conf)
  assert_equal 'zst', d.instance.instance_variable_get(:@compressor).ext
  assert_equal 'application/x-zst', d.instance.instance_variable_get(:@compressor).content_type
end
We can add a test for the level option like this.
data('level default' => nil,
     'level 1' => 1)
def test_configure_with_mime_type_zstd(level)
  conf = CONFIG.clone
  conf << "\nstore_as zstd\n"
  conf << "\n<compress>\nlevel #{level}\n</compress>\n" if level
  d = create_driver(conf)
  assert_equal 'zst', d.instance.instance_variable_get(:@compressor).ext
  assert_equal 'application/x-zst', d.instance.instance_variable_get(:@compressor).content_type
  assert_equal (level || 3), d.instance.instance_variable_get(:@compressor).instance_variable_get(:@compress_config).level
end
Summary
This PR adds support for Zstd compression in the Fluentd S3 plugin.
Changes
- zstd-ruby library.
- ZstdCompressor class to handle log compression before uploading to S3.
- store_as zstd.
- Zstd module is properly loaded to avoid uninitialized constant errors.
Testing
Test Code
Result
store_as (Zstd) Test
Test Data
echo '{"message": "'$(head -c 1000000 </dev/zero | tr '\0' 'A')'"}' | fluent-cat test.tag
fluentd log
2024-10-18 17:49:37 +0900 [info]: #0 fluent/log.rb:362:info: [Aws::S3::Client 200 0.162773 0 retries] head_object(bucket:"fluent-test-yw",key:"logs/20241018_0.zst")
S3 Data
Why this feature?
Zstd typically provides a better compression ratio and faster compression and decompression than gzip, making it a valuable option for users who want efficient log storage on S3.