feat: Interpret data descriptors when reading zip file from (read, nonseek) stream #197

Open
Wants to merge 24 commits into base: master
Changes from 20 commits
Commits (24)
93a7f69
Added the method fn read_zipfile_from_seekablestream<S: Read + Seek>(…
0xCCF4 Jun 21, 2024
d146ff3
Merge branch 'refs/heads/master' into feature-read-from-seekable-stream
0xCCF4 Jun 21, 2024
7f0d07b
Merged master to feature branch
0xCCF4 Jun 21, 2024
59f1327
Moved streamed zip read tests to custom test file
0xCCF4 Jun 21, 2024
3bef659
Added security risk documentation and untrusted value struct to encap…
0xCCF4 Jun 22, 2024
f6b5da9
Library does not require Take<Read> anymore but instead accepts a tem…
0xCCF4 Jun 23, 2024
e36ad61
Merge branch 'refs/heads/master' into feature-read-from-seekable-stream
0xCCF4 Jun 23, 2024
47f718b
Completed merge master -> feature branch
0xCCF4 Jun 23, 2024
bf7a030
CRC32 checksum is now late propagated
0xCCF4 Jun 24, 2024
3265477
ZipStream API supports archives with data descriptor
0xCCF4 Jun 24, 2024
1d3afa9
Run cargo fmt --all
0xCCF4 Jun 24, 2024
ad3dbc0
chore: Feature-gate ReadAndSupplyExpectedCRC32 implementations
Pr0methean Jul 6, 2024
fc83a70
Merge branch 'master' into feature-read-from-seekable-stream
Pr0methean Jul 6, 2024
6cc20f5
refactor: applied simple review suggestions and cargo fmt & clippy
0xCCF4 Jul 8, 2024
c61a683
Merge remote-tracking branch 'origin/feature-read-from-seekable-strea…
0xCCF4 Jul 8, 2024
f0cb9f2
fix: completed merge of xz decoder into feature branch
0xCCF4 Jul 15, 2024
1ecad02
Merge branch 'refs/heads/master' into feature-read-from-seekable-stream
0xCCF4 Jul 15, 2024
549d1da
perf: use vecdequeue for look ahead buffer
0xCCF4 Jul 15, 2024
7cd7937
Merge branch 'master' into feature-read-from-seekable-stream
Pr0methean Jul 15, 2024
a530f52
Merge branch 'master' into feature-read-from-seekable-stream
Pr0methean Jul 19, 2024
6022136
fix: FixedSizeBlock must extend Pod
Pr0methean Jul 19, 2024
a9b7bc6
Merge branch 'master' into feature-read-from-seekable-stream
Pr0methean Aug 3, 2024
ecf5588
Merge branch 'master' into feature-read-from-seekable-stream
Pr0methean Aug 13, 2024
2ee9c18
Merge branch 'master' into feature-read-from-seekable-stream
Pr0methean Nov 19, 2024
5 changes: 1 addition & 4 deletions benches/read_metadata.rs
@@ -108,10 +108,7 @@ fn parse_stream_archive(bench: &mut Bencher) {

     bench.iter(|| {
         let mut f = fs::File::open(&out).unwrap();
-        while zip::read::read_zipfile_from_stream(&mut f)
-            .unwrap()
-            .is_some()
-        {}
+        while zip::read::read_zipfile_from_stream(&mut f).unwrap().is_ok() {}
     });
     bench.bytes = bytes.len() as u64;
 }
10 changes: 9 additions & 1 deletion examples/stdin_info.rs
@@ -10,7 +10,15 @@ fn real_main() -> i32 {
     let mut buf = [0u8; 16];

     loop {
-        match zip::read::read_zipfile_from_stream(&mut stdin_handle) {
+        let file = match zip::read::read_zipfile_from_stream(&mut stdin_handle) {
+            Err(e) => {
+                println!("Error encountered while reading zip: {e:?}");
+                return 1;
+            }
+            Ok(value) => value,
+        };
+
+        match file.unwrap_or_error("data descriptors not supported while reading stdin") {
             Ok(Some(mut file)) => {
                 println!(
                     "{}: {} bytes ({} bytes packed)",
4 changes: 4 additions & 0 deletions src/aes.rs
@@ -219,6 +219,10 @@ impl<R: Read> AesReaderValid<R> {
     pub fn into_inner(self) -> R {
         self.reader
     }
+
+    pub fn get_ref(&self) -> &R {
+        &self.reader
+    }
 }

 pub struct AesWriter<W> {
155 changes: 139 additions & 16 deletions src/crc32.rs
@@ -1,34 +1,41 @@
//! Helper module to compute a CRC32 checksum

use bzip2::read::BzDecoder;

Check failure on line 3 in src/crc32.rs: failed to resolve: use of undeclared crate or module `bzip2`
GitHub Actions jobs: style_and_docs (--no-default-features); Build and test --no-default-features: ubuntu-latest, msrv; Build and test --no-default-features: macOS-latest, msrv
use std::io;
use std::io::prelude::*;
use std::io::BufReader;

Check failure on line 6 in src/crc32.rs: unused import: `std::io::BufReader`
GitHub Actions jobs: style_and_docs (--no-default-features); Build and test --no-default-features: ubuntu-latest, msrv; Build and test --no-default-features: macOS-latest, msrv

use crate::read::lzma::LzmaDecoder;

Check failure on line 8 in src/crc32.rs: unresolved import `crate::read::lzma`
GitHub Actions jobs: style_and_docs (--no-default-features); Build and test --no-default-features: ubuntu-latest, msrv; Build and test --no-default-features: macOS-latest, msrv
use crate::read::xz::XzDecoder;

Check failure on line 9 in src/crc32.rs: unresolved import `crate::read::xz`
GitHub Actions jobs: style_and_docs (--no-default-features); Build and test --no-default-features: ubuntu-latest, msrv; Build and test --no-default-features: macOS-latest, msrv
use crate::read::CryptoReader;
use crc32fast::Hasher;
use deflate64::Deflate64Decoder;

Check failure on line 12 in src/crc32.rs: unresolved import `deflate64`
GitHub Actions jobs: style_and_docs (--no-default-features); Build and test --no-default-features: ubuntu-latest, msrv; Build and test --no-default-features: macOS-latest, msrv
use flate2::read::DeflateDecoder;

Check failure on line 13 in src/crc32.rs: failed to resolve: use of undeclared crate or module `flate2`
GitHub Actions jobs: style_and_docs (--no-default-features); Build and test --no-default-features: ubuntu-latest, msrv; Build and test --no-default-features: macOS-latest, msrv

/// Reader that validates the CRC32 when it reaches the EOF.
pub struct Crc32Reader<R> {
pub struct Crc32Reader<R: ReadAndSupplyExpectedCRC32> {
inner: R,
hasher: Hasher,
check: u32,
/// Signals if `inner` stores aes encrypted data.
/// AE-2 encrypted data doesn't use crc and sets the value to 0.
enabled: bool,
}

impl<R> Crc32Reader<R> {
impl<R: ReadAndSupplyExpectedCRC32> Crc32Reader<R> {
/// Get a new Crc32Reader which checks the inner reader against checksum.
/// The check is disabled if `ae2_encrypted == true`.
pub(crate) fn new(inner: R, checksum: u32, ae2_encrypted: bool) -> Crc32Reader<R> {
pub(crate) fn new(inner: R, ae2_encrypted: bool) -> Crc32Reader<R> {
Crc32Reader {
inner,
hasher: Hasher::new(),
ae2_encrypted,

Check failure on line 31 in src/crc32.rs: struct `crc32::Crc32Reader<R>` has no field named `ae2_encrypted`
GitHub Actions job: style_and_docs
check: checksum,

Check failure on line 32 in src/crc32.rs: cannot find value `checksum` in this scope
GitHub Actions jobs: style_and_docs (--no-default-features); style_and_docs; Build and test --no-default-features: ubuntu-latest, msrv; Build and test --no-default-features: macOS-latest, msrv
Check failure on line 32 in src/crc32.rs: struct `crc32::Crc32Reader<R>` has no field named `check`
GitHub Actions job: style_and_docs
enabled: !ae2_encrypted,
}
}

fn check_matches(&self) -> bool {
self.check == self.hasher.clone().finalize()
fn check_matches(&self) -> std::io::Result<bool> {
Ok(self.inner.get_crc32()? == self.hasher.clone().finalize())
}

pub fn into_inner(self) -> R {
@@ -41,16 +48,14 @@
io::Error::new(io::ErrorKind::InvalidData, "Invalid checksum")
}

impl<R: Read> Read for Crc32Reader<R> {
impl<R: ReadAndSupplyExpectedCRC32> Read for Crc32Reader<R> {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
let count = self.inner.read(buf)?;

if self.enabled {
if count == 0 && !buf.is_empty() && !self.check_matches() {
return Err(invalid_checksum());
let count = match self.inner.read(buf) {
Ok(0) if !buf.is_empty() && !self.check_matches()? && !self.ae2_encrypted => {

Check failure on line 54 in src/crc32.rs: no field `ae2_encrypted` on type `&mut crc32::Crc32Reader<R>`
GitHub Actions job: style_and_docs
return Err(io::Error::new(io::ErrorKind::Other, "Invalid checksum"))
}
self.hasher.update(&buf[..count]);

Check failure on line 57 in src/crc32.rs: expected one of `!`, `(`, `...`, `..=`, `..`, `::`, `=>`, `if`, `{`, or `|`, found `.`
GitHub Actions jobs: style_and_docs (--no-default-features); style_and_docs; Build and test --no-default-features: ubuntu-latest, msrv; Build and test --no-default-features: macOS-latest, msrv
}

Check failure on line 58 in src/crc32.rs: expected `;`, found `Ok`
GitHub Actions jobs: style_and_docs (--no-default-features); style_and_docs; Build and test --no-default-features: ubuntu-latest, msrv; Build and test --no-default-features: macOS-latest, msrv
Ok(count)
}

@@ -60,7 +65,7 @@

if self.enabled {
self.hasher.update(&buf[start..]);
if !self.check_matches() {

Check failure on line 68 in src/crc32.rs: cannot apply unary operator `!` to type `std::result::Result<bool, std::io::Error>`
GitHub Actions job: style_and_docs
return Err(invalid_checksum());
}
}
@@ -74,7 +79,7 @@

if self.enabled {
self.hasher.update(&buf.as_bytes()[start..]);
if !self.check_matches() {

Check failure on line 82 in src/crc32.rs: cannot apply unary operator `!` to type `std::result::Result<bool, std::io::Error>`
GitHub Actions job: style_and_docs
return Err(invalid_checksum());
}
}
@@ -83,6 +88,124 @@
}
}

/// A reader trait that provides a method to get the expected crc of the data read.
/// In the normal case, the expected crc is known before the zip entry is read.
/// In streaming mode with data descriptors, the crc will be available after the data is read.
/// Still in both cases the crc is available after the data is read and can be checked.
pub trait ReadAndSupplyExpectedCRC32: Read {
fn get_crc32(&self) -> std::io::Result<u32>;
}

pub struct InitiallyKnownCRC32<R: Read> {
reader: R,
crc: u32,
}

impl<R: Read> InitiallyKnownCRC32<R> {
pub fn new(reader: R, crc: u32) -> InitiallyKnownCRC32<R> {
InitiallyKnownCRC32 { reader, crc }
}

#[allow(dead_code)]
pub fn into_inner(self) -> R {
self.reader
}

#[allow(dead_code)]
pub fn get_ref(&self) -> &R {
&self.reader
}
}

impl<R: Read> Read for InitiallyKnownCRC32<R> {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
self.reader.read(buf)
}
}

impl<R: Read> ReadAndSupplyExpectedCRC32 for InitiallyKnownCRC32<R> {
fn get_crc32(&self) -> std::io::Result<u32> {
Ok(self.crc)
}
}

impl<'a, T: ReadAndSupplyExpectedCRC32 + 'a> ReadAndSupplyExpectedCRC32 for CryptoReader<'a, T> {
fn get_crc32(&self) -> io::Result<u32> {
self.get_ref().get_crc32()
}
}

#[cfg(feature = "_deflate-any")]
impl<T: ReadAndSupplyExpectedCRC32> ReadAndSupplyExpectedCRC32 for DeflateDecoder<T> {
fn get_crc32(&self) -> io::Result<u32> {
self.get_ref().get_crc32()
}
}

#[cfg(feature = "deflate64")]
impl<T: ReadAndSupplyExpectedCRC32> ReadAndSupplyExpectedCRC32 for Deflate64Decoder<BufReader<T>> {
fn get_crc32(&self) -> io::Result<u32> {
self.get_ref().get_ref().get_crc32()
}
}

#[cfg(feature = "bzip2")]
impl<T: ReadAndSupplyExpectedCRC32> ReadAndSupplyExpectedCRC32 for BzDecoder<T> {
fn get_crc32(&self) -> io::Result<u32> {
self.get_ref().get_crc32()
}
}

#[cfg(feature = "zstd")]
impl<'a, T: ReadAndSupplyExpectedCRC32 + BufRead> ReadAndSupplyExpectedCRC32
for zstd::Decoder<'a, T>
{
fn get_crc32(&self) -> io::Result<u32> {
self.get_ref().get_crc32()
}
}

#[cfg(feature = "zstd")]
impl<'a, T: ReadAndSupplyExpectedCRC32> ReadAndSupplyExpectedCRC32
for zstd::Decoder<'a, BufReader<T>>
{
fn get_crc32(&self) -> io::Result<u32> {
self.get_ref().get_ref().get_crc32()
}
}

#[cfg(feature = "lzma")]
impl<T: ReadAndSupplyExpectedCRC32> ReadAndSupplyExpectedCRC32 for LzmaDecoder<T> {
fn get_crc32(&self) -> io::Result<u32> {
self.get_ref().get_crc32()
}
}

#[cfg(feature = "xz")]
impl<T: ReadAndSupplyExpectedCRC32> ReadAndSupplyExpectedCRC32 for XzDecoder<T> {
fn get_crc32(&self) -> io::Result<u32> {
self.as_ref().get_crc32()
}
}

impl<'a> ReadAndSupplyExpectedCRC32 for Box<(dyn ReadAndSupplyExpectedCRC32 + 'a)> {
fn get_crc32(&self) -> io::Result<u32> {
self.as_ref().get_crc32()
}
}

impl<'a, T: ReadAndSupplyExpectedCRC32 + 'a> ReadAndSupplyExpectedCRC32 for Box<T> {
fn get_crc32(&self) -> io::Result<u32> {
self.as_ref().get_crc32()
}
}

impl<T: ReadAndSupplyExpectedCRC32> ReadAndSupplyExpectedCRC32 for std::io::Take<&mut T> {
fn get_crc32(&self) -> io::Result<u32> {
self.get_ref().get_crc32()
}
}

#[cfg(test)]
mod test {
use super::*;
@@ -92,10 +215,10 @@
let data: &[u8] = b"";
let mut buf = [0; 1];

let mut reader = Crc32Reader::new(data, 0, false);
let mut reader = Crc32Reader::new(InitiallyKnownCRC32::new(data, 0), false);
assert_eq!(reader.read(&mut buf).unwrap(), 0);

let mut reader = Crc32Reader::new(data, 1, false);
let mut reader = Crc32Reader::new(InitiallyKnownCRC32::new(data, 1), false);
assert!(reader
.read(&mut buf)
.unwrap_err()
@@ -108,7 +231,7 @@
let data: &[u8] = b"1234";
let mut buf = [0; 1];

let mut reader = Crc32Reader::new(data, 0x9be3e0a3, false);
let mut reader = Crc32Reader::new(InitiallyKnownCRC32::new(data, 0x9be3e0a3), false);
assert_eq!(reader.read(&mut buf).unwrap(), 1);
assert_eq!(reader.read(&mut buf).unwrap(), 1);
assert_eq!(reader.read(&mut buf).unwrap(), 1);
@@ -123,7 +246,7 @@
let data: &[u8] = b"1234";
let mut buf = [0; 5];

let mut reader = Crc32Reader::new(data, 0x9be3e0a3, false);
let mut reader = Crc32Reader::new(InitiallyKnownCRC32::new(data, 0x9be3e0a3), false);
assert_eq!(reader.read(&mut buf[..0]).unwrap(), 0);
assert_eq!(reader.read(&mut buf).unwrap(), 4);
}
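
Note on the `ReadAndSupplyExpectedCRC32` doc comment in src/crc32.rs: the idea that the expected CRC may only become known after the data has been read can be illustrated with a small standalone sketch. This is not the PR's implementation. The names `SupplyExpectedCrc32`, `KnownCrc`, `LateCrc`, `CheckingReader` and `crc32_update` are hypothetical, a bitwise CRC-32 stands in for `crc32fast` so the block has no dependencies, and the "payload followed by a 4-byte little-endian CRC" layout is a simplified assumption rather than a real ZIP data descriptor.

```rust
use std::io::{self, Read};

/// Bitwise CRC-32 (reflected polynomial 0xEDB88320), continuing from `state`.
/// Start with `!0` and invert the final state to get the checksum.
fn crc32_update(mut state: u32, data: &[u8]) -> u32 {
    for &byte in data {
        state ^= u32::from(byte);
        for _ in 0..8 {
            let mask = (state & 1).wrapping_neg(); // all ones if the low bit is set
            state = (state >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    state
}

/// Hypothetical stand-in for the PR's `ReadAndSupplyExpectedCRC32`.
trait SupplyExpectedCrc32: Read {
    fn expected_crc32(&self) -> io::Result<u32>;
}

/// Case 1: the CRC is known before any data is read.
struct KnownCrc<R> {
    reader: R,
    crc: u32,
}

impl<R: Read> Read for KnownCrc<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.reader.read(buf)
    }
}

impl<R: Read> SupplyExpectedCrc32 for KnownCrc<R> {
    fn expected_crc32(&self) -> io::Result<u32> {
        Ok(self.crc)
    }
}

/// Case 2: the CRC follows the payload (assumed layout, see lead-in).
struct LateCrc<'a> {
    payload: &'a [u8],
    trailer: [u8; 4],
    crc: Option<u32>,
}

impl Read for LateCrc<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.payload.read(buf)?;
        if self.payload.is_empty() {
            // Payload exhausted: only now does the expected CRC become available.
            self.crc = Some(u32::from_le_bytes(self.trailer));
        }
        Ok(n)
    }
}

impl SupplyExpectedCrc32 for LateCrc<'_> {
    fn expected_crc32(&self) -> io::Result<u32> {
        self.crc
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "CRC not yet available"))
    }
}

/// Hashes everything it reads and asks the inner reader for the expected
/// CRC only once the inner reader reports end of data.
struct CheckingReader<R> {
    inner: R,
    state: u32,
}

impl<R: SupplyExpectedCrc32> CheckingReader<R> {
    fn new(inner: R) -> Self {
        CheckingReader { inner, state: !0 }
    }
}

impl<R: SupplyExpectedCrc32> Read for CheckingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        if n == 0 && !buf.is_empty() && self.inner.expected_crc32()? != !self.state {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "Invalid checksum"));
        }
        self.state = crc32_update(self.state, &buf[..n]);
        Ok(n)
    }
}
```

Because the checker queries the expected value only at end of data, the same `CheckingReader` works for both sources: wrapping `KnownCrc { reader: &b"1234"[..], crc: 0x9be3e0a3 }` or a `LateCrc` whose trailer holds the same value reads through cleanly, while a mismatching trailer surfaces as an `InvalidData` error on the final read.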