Description
I'm encountering an issue where compressing a PDF file using ZlibEncoder results in a different digest value compared to using ZstdEncoder. Interestingly, ZstdEncoder behaves as expected, producing the correct hash.
The goal of my code is to compress the file and, at the same time, compute the SHA-256 digest of the whole stream. I did this by wrapping the ZlibEncoder in a HashWriter, which uses ring::digest to compute the digest of the whole stream.
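For reference, here is a minimal sketch of what a hashing wrapper of this kind could look like. It is not the actual HashWriter from my project: to stay std-only it uses a tiny FNV-1a hash as a stand-in for ring's SHA-256 context, and ShortWriter and hash_through are hypothetical helpers added only for illustration. The point it tries to show is that the wrapper should hash only the bytes the inner writer reports as accepted, since a Write implementation is allowed to accept fewer bytes than it was given.

```rust
use std::io::{self, Write};

// FNV-1a constants; this hash is only a std-only stand-in for SHA-256.
pub const FNV_OFFSET: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x100000001b3;

/// Feed `bytes` into a running FNV-1a state and return the new state.
pub fn fnv1a(mut state: u64, bytes: &[u8]) -> u64 {
    for b in bytes {
        state ^= u64::from(*b);
        state = state.wrapping_mul(FNV_PRIME);
    }
    state
}

/// Hypothetical wrapper: forwards writes to `inner` and hashes what was accepted.
pub struct HashWriter<W: Write> {
    inner: W,
    state: u64,
}

impl<W: Write> HashWriter<W> {
    pub fn new(inner: W) -> Self {
        Self { inner, state: FNV_OFFSET }
    }

    /// Consume the wrapper and return the hash of all accepted bytes.
    pub fn finish(self) -> u64 {
        self.state
    }
}

impl<W: Write> Write for HashWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        // Hash only buf[..n]; the caller will retry the rest of buf later.
        self.state = fnv1a(self.state, &buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Hypothetical writer that accepts at most `limit` bytes per call.
pub struct ShortWriter {
    pub out: Vec<u8>,
    pub limit: usize,
}

impl Write for ShortWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(self.limit);
        self.out.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Write all of `data` through a HashWriter and return the resulting hash.
pub fn hash_through<W: Write>(inner: W, data: &[u8]) -> io::Result<u64> {
    let mut hw = HashWriter::new(inner);
    hw.write_all(data)?;
    Ok(hw.finish())
}
```

With a wrapper written this way, the hash should not depend on how many bytes the inner writer takes per call. If a wrapper hashes the whole buf instead of buf[..n], or if the copy loop ignores the returned count, the digest could differ between encoders that accept input in different amounts. Whether that is what happens in my case is only a guess.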
Code snippet
let (bytes_read, hash_hex, compressed) =
    match (compression, rmaker.maybe_content_format()) {
        (Compression::Zlib(level), Ok(MaybeContentFormat::MaybeLargeText)) => {
            dbg!("zlib");
            let mut writer =
                ZlibEncoder::new(&mut cwp, flate2::Compression::new(*level));
            let mut hwriter = HashWriter::new(&mut writer, &digest::SHA256);
            let bytes_copied = copy_by_chunk(&mut stream, &mut hwriter, chunk_size)?;
            let hash = hwriter.finish();
            let hash_hex = hex::encode(hash);
            dbg!(bytes_copied);
            (bytes_copied, hash_hex, true)
        }
        (Compression::Zstd(lv), Ok(MaybeContentFormat::MaybeLargeText)) => {
            dbg!("zstd");
            let mut writer = ZstdEncoder::new(&mut cwp, *lv)?;
            let mut hwriter = HashWriter::new(&mut writer, &digest::SHA256);
            let bytes_copied = copy_by_chunk(&mut stream, &mut hwriter, chunk_size)?;
            let hash = hwriter.finish();
            let hash_hex = hex::encode(hash);
            (bytes_copied, hash_hex, true)
        }
        _ => {
            let mut hwriter = HashWriter::new(&mut cwp, &digest::SHA256);
            let bytes_copied = copy_by_chunk(&mut stream, &mut hwriter, chunk_size)?;
            let hash = hwriter.ctx.finish();
            let hash_hex = hex::encode(hash);
            (bytes_copied, hash_hex, false)
        }
    };

Could you help identify why ZlibEncoder is producing a different hash_hex compared to ZstdEncoder when compressing PDF files?
Let me know if I need to provide more details, such as the code for HashWriter and copy_by_chunk. I didn't paste them here to keep the issue from getting too long.
My guess is that the problem might be that I didn't call .finish() on the compression writer. But since I am new to Rust, I don't know how to test that. I put writer.finish() before computing the hash, but the borrow checker complains that writer is still borrowed.
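On the borrow-checker point, here is a minimal std-only sketch of the ordering I believe would be needed: consume the wrapper first, which releases its mutable borrow of the encoder, and only then call finish() on the encoder. BufferingEncoder and CountingWriter are hypothetical stand-ins (for an encoder such as ZlibEncoder, whose finish consumes the encoder, and for HashWriter), and encode_and_count is an illustrative helper, not code from my project.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for an encoder: holds input until finish() is called.
pub struct BufferingEncoder<W: Write> {
    inner: W,
    pending: Vec<u8>,
}

impl<W: Write> BufferingEncoder<W> {
    pub fn new(inner: W) -> Self {
        Self { inner, pending: Vec::new() }
    }

    /// Consume the encoder, emit everything still pending, return the sink.
    pub fn finish(mut self) -> io::Result<W> {
        self.inner.write_all(&self.pending)?;
        Ok(self.inner)
    }
}

impl<W: Write> Write for BufferingEncoder<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.pending.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical stand-in for HashWriter: counts the bytes passed through.
pub struct CountingWriter<W: Write> {
    inner: W,
    count: u64,
}

impl<W: Write> CountingWriter<W> {
    pub fn new(inner: W) -> Self {
        Self { inner, count: 0 }
    }

    /// Consume the wrapper; this ends its borrow of the inner writer.
    pub fn finish(self) -> u64 {
        self.count
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        self.count += n as u64;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

pub fn encode_and_count(data: &[u8], sink: &mut Vec<u8>) -> io::Result<u64> {
    let mut encoder = BufferingEncoder::new(sink);
    let count = {
        // The wrapper mutably borrows the encoder only inside this block.
        let mut counting = CountingWriter::new(&mut encoder);
        counting.write_all(data)?;
        counting.finish()
    };
    // The borrow has ended, so the encoder can now be consumed.
    encoder.finish()?;
    Ok(count)
}
```

Calling encoder.finish() while the wrapper is still alive is what triggers the "borrowed" error, because the wrapper holds &mut encoder. Finishing the wrapper first, as in the inner block, avoids that.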
Environment
flate2 = "1.0.31"
EDIT: I tried with GzEncoder and it gives the same unexpected hash_hex as ZlibEncoder.