# Hadoop SequenceFile library for Rust
## Usage

```toml
# Cargo.toml
[dependencies]
sequencefile = "0.2.0"
```
**Prototype status!**

Unfortunately, that means the API will change. If you depend on this crate, please pin an exact version for now.
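For example, Cargo's `=` requirement pins an exact release:

```toml
[dependencies]
sequencefile = "=0.2.0"
```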
## Status

Currently supports reading your garden-variety sequence file. Handles uncompressed sequence files as well as block/record-compressed files (deflate, gzip, and bzip2 only). LZO and Snappy are not (yet) handled.
There's a lot more to do:

- [x] Varint decoding (see the sketch after this list)
  - Block sizes are written with varints
- [x] Block decompression
- [x] Gzip support
- [x] Bzip2 support
- [x] Sequencefile metadata
- [x] Better error handling
- [x] Tests
- [ ] Better error handling2
- [ ] More tests
- [ ] Better documentation
- [ ] Snappy support
- [ ] CRC file support
- [ ] 'Writables', e.g. generic deserialization for common Hadoop writable types
- [ ] Writer
- [ ] Gracefully handle version 4 sequencefiles
- [ ] Zero-copy implementation
- [ ] LZO support
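Hadoop stores block sizes and record lengths with its own variable-length integer encoding (the on-disk format of Hadoop's `WritableUtils.readVLong`). As background for the varint item above, here is a minimal, self-contained decoding sketch of that format; `read_vlong` is a hypothetical helper, not this crate's API:

```rust
use std::io::{self, Read};

/// Decodes a Hadoop-style variable-length i64 (the on-disk format of
/// Hadoop's `WritableUtils.readVLong`).
fn read_vlong<R: Read>(r: &mut R) -> io::Result<i64> {
    let mut byte = [0u8; 1];
    r.read_exact(&mut byte)?;
    let first = byte[0] as i8;

    // Values in [-112, 127] are stored directly in a single byte.
    if first >= -112 {
        return Ok(i64::from(first));
    }

    // Otherwise the first byte encodes the sign and the number of
    // big-endian payload bytes that follow.
    let negative = first < -120;
    let len = if negative { -120 - first } else { -112 - first };

    let mut value: i64 = 0;
    for _ in 0..len {
        r.read_exact(&mut byte)?;
        value = (value << 8) | i64::from(byte[0]);
    }
    // Negative values are stored as the bitwise complement.
    Ok(if negative { !value } else { value })
}

fn main() {
    // 300 encodes as a length byte (-114: two payload bytes follow) plus 0x01, 0x2C.
    let encoded = [(-114i8) as u8, 0x01, 0x2C];
    assert_eq!(read_vlong(&mut &encoded[..]).unwrap(), 300);
}
```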
## Benchmarks

There are only two benchmarks so far. Both read sequence files (1,000 entries each) generated in Java with no compression, and both use Text as the key class. The first uses i64 as the value class; the second uses a more complex structure. Earlier investigations (with deflate on an early-2012 MBP) showed 98.4% of CPU time was spent in miniz, producing ~125MB/s of decompressed data.
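Assuming the benchmarks are wired up as Cargo bench targets, they can be run with Cargo's built-in runner:

```sh
cargo bench
```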
## Example

```rust
use std::fs::File;

use sequencefile::{Text, Writable};

// A custom value type; implementing `Writable` lets the reader decode it.
struct ValueClass {
    // some fields
}

impl Writable for ValueClass {
    fn read(buf: &mut impl std::io::Read) -> sequencefile::Result<Self>
    where
        Self: Sized,
    {
        // implement read function
        todo!()
    }
}

fn main() {
    let file = File::open("/path/to/seqfile").expect("cannot open file");
    let seqfile = sequencefile::Reader::<File, Text, ValueClass>::new(file)
        .expect("cannot open reader");

    // `flatten()` yields only successfully decoded (key, value) pairs.
    for kv in seqfile.flatten() {
        println!("{:?} - {:?}", kv.0, kv.1);
    }
}
```
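Since the reader is an iterator over `Result`s, `flatten()` silently skips records that fail to decode. A sketch that surfaces those errors instead (assuming the crate's error type implements `Display`):

```rust
for record in seqfile {
    match record {
        Ok((key, value)) => println!("{:?} - {:?}", key, value),
        Err(e) => eprintln!("failed to decode record: {}", e),
    }
}
```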
## License

sequencefile-rs is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.

See LICENSE-APACHE and LICENSE-MIT for details.