Apache Iceberg C++ library
You do not need Java to use Apache Iceberg™.
This library is an alternative to iceberg-cpp. It is part of our Greenplum extension, which reads Iceberg data from S3-compatible storage using the HMS and Nessie catalogs. The extension is not open source yet, but we are considering releasing it.
Source: https://iceberg.apache.org/status/
| Data type | Iceberg version | Cxx | Java |
|---|---|---|---|
| boolean | 2 | + | + |
| int | 2 | + | + |
| float | 2 | + | + |
| double | 2 | + | + |
| decimal | 2 | + | + |
| date | 2 | + | + |
| time | 2 | + | + |
| timestamp | 2 | + | + |
| timestamptz | 2 | + | + |
| timestamp_ns | 3 | + | + |
| timestamptz_ns | 3 | + | + |
| string | 2 | + | + |
| uuid | 2 | + | + |
| fixed | 2 | + | + |
| binary | 2 | + | + |
| variant | 3 | - | + |
| list | 2 | + | + |
| map | 2 | - | + |
| struct | 2 | - | + |
| unknown | 3 | + | ? |
Datetime restrictions are defined in the Iceberg spec.
The underlying type of date is int32; for time and the timestamp types it is int64.
timestamp and timestamptz store microseconds since 1970-01-01 00:00:00.000000.
timestamp_ns and timestamptz_ns store nanoseconds since 1970-01-01 00:00:00.000000000.
The tz suffix means the value is adjusted to UTC.
| File format | Cxx | Java |
|---|---|---|
| Parquet | + | + |
| ORC | - | + |
| Puffin | + | + |
| Avro | - | + |
| File IO | Cxx | Java |
|---|---|---|
| Local Filesystem | + | + |
| Hadoop Filesystem | - | + |
| S3 Compatible | + | + |
| GCS Compatible | - | + |
| ADLS Compatible | - | + |
Not implemented
Not implemented
| Operation | Iceberg version | Cxx | Java |
|---|---|---|---|
| Plan with data file | 1,2 | + | + |
| Plan with position deletes | 2 | + | + |
| Plan with equality deletes | 2 | + | + |
| Plan with puffin statistics | 1,2 | - | + |
| Read data file | 1,2 | + | + |
| Read with position deletes | 2 | + | + |
| Read with equality deletes | 2 | + | + |
| Operation | Iceberg version | Cxx | Java |
|---|---|---|---|
| Append data | 1,2 | + | + |
| Write position deletes | 2 | - | + |
| Write equality deletes | 2 | - | + |
| Write deletion vectors | 3 | + | + |
| Table Operation | Rest | Glue | HMS |
|---|---|---|---|
| listTable | - | - | - |
| createTable | - | - | - |
| dropTable | - | - | - |
| loadTable | +- | - | +- |
| updateTable | - | - | - |
| renameTable | - | - | - |
| tableExists | +- | - | +- |
| createView | - | - | - |
| dropView | - | - | - |
| listView | - | - | - |
| viewExists | - | - | - |
| replaceView | - | - | - |
| renameView | - | - | - |
| listNamespaces | - | - | - |
| createNamespace | - | - | - |
| dropNamespace | - | - | - |
| namespaceExists | - | - | - |
| updateNamespaceProperties | - | - | - |
| loadNamespaceMetadata | - | - | - |
- C++20 compliant compiler
- CMake 3.20 or higher
- OpenSSL
You have to download Apache Arrow dependencies first.
mkdir _deps && cd _deps
git clone --single-branch -b maint-15.0.2 https://github.com/apache/arrow.git
cd arrow && git apply ../../vendor/arrow/fix_c-ares_url.patch && cd ..
./arrow/cpp/thirdparty/download_dependencies.sh ./arrow-thirdparty
# return to the repository root before creating the build directory
cd ..
mkdir _build
cd _build
ln -s ../_deps/arrow-thirdparty arrow-thirdparty
cmake -GNinja ../
ninja
cd tests/
../iceberg/iceberg-cpp-test
../iceberg/common/fs/iceberg_common_fs_test
./iceberg_local_test