tpchgen-rs

Blazing fast TPC-H benchmark data generator in pure Rust!

Usage

tpchgen-cli is a dbgen-compatible CLI tool that generates tables from the TPC-H benchmark dataset.

tpchgen is the library that implements the TPC-H data generation logic. Use it to embed data generation natively in Rust programs.

CLI Usage

We tried to keep the tpchgen-cli experience as close to dbgen as possible so that it can serve as a drop-in replacement.

$ tpchgen-cli -h
TPC-H Data Generator

Usage: tpchgen-cli [OPTIONS] --output-dir <OUTPUT_DIR>

Options:
  -s, --scale-factor <SCALE_FACTOR>  Scale factor to address defaults to 1 [default: 1]
  -o, --output-dir <OUTPUT_DIR>      Output directory for generated files
  -t, --tables <TABLES>              Which tables to generate (default: all) [possible values: nation, region, part, supplier, part-supp, customer, orders, line-item]
  -p, --parts <PARTS>                Number of parts to generate (for parallel generation) [default: 1]
      --part <PART>                  Which part to generate (1-based, only relevant if parts > 1) [default: 1]
  -h, --help                         Print help

For example, to generate a dataset with a scale factor of 1 (about 1 GB):

$ tpchgen-cli -s 1 --output-dir=/tmp/tpch
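The --parts and --part options listed in the help output can be combined to split generation into chunks. The sketch below is a dry run: it only prints one tpchgen-cli command per part and executes nothing. The helper name part_commands, the scale factor of 10, and the per-part output directories are illustrative assumptions, not part of the tool; in particular, writing each part to its own directory is a precaution, since how tpchgen-cli names per-part output files is not described here.

```shell
#!/bin/sh
# Hypothetical helper: print one tpchgen-cli command per part (dry run only).
# Usage: part_commands SCALE_FACTOR PARTS OUT_DIR
part_commands() {
  scale=$1
  parts=$2
  out=$3
  part=1
  while [ "$part" -le "$parts" ]; do
    # Assumption: a separate directory per part, so parts cannot overwrite each other.
    echo "tpchgen-cli -s $scale -o $out/part-$part -t line-item -p $parts --part $part"
    part=$((part + 1))
  done
}

# Print the four commands that would generate line-item in 4 parts.
part_commands 10 4 /tmp/tpch
```

Piping the output to sh would run the printed commands one after another. Running them concurrently (for example as background jobs followed by wait) is left to the reader.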

Contributing

Pull requests are welcome. For major changes, please open an issue first for discussion. See our contributors guide for more details.

Architecture

Please see the architecture guide for details on how the code is structured.

License

The project is licensed under the Apache 2.0 license.

References

  • The TPC-H specification: see the specification page.
  • The original dbgen implementation: you must submit an official request on the TPC's official website to access the dbgen software.