
libduckdb-sys: expose DEP_DUCKDB_INCLUDE for downstream C/C++ shims#753

Open
frhack wants to merge 1 commit into duckdb:main from frhack:expose-include-dir

Conversation


@frhack frhack commented May 1, 2026

What

Add links = "duckdb" to libduckdb-sys's manifest and emit cargo:include=<path> from each backend so downstream crates that compile their own C/C++ code can read the resolved DuckDB include directory from DEP_DUCKDB_INCLUDE in their own build scripts.
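To make the mechanism concrete, here is a minimal sketch of the emitting side — what a bundled backend's build script would print. This is illustrative only: the helper name `bundled_include_dir` is ours, not from the PR, and the actual path construction lives in the backend-specific build files.

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper: where the bundled backend's headers end up under
// OUT_DIR. The real emission sites are listed in the Changes section below.
fn bundled_include_dir(out_dir: &str) -> PathBuf {
    Path::new(out_dir).join("duckdb").join("src").join("include")
}

fn main() {
    // In the real build script, OUT_DIR is always set by Cargo.
    let out_dir = std::env::var("OUT_DIR").unwrap_or_else(|_| ".".into());
    // Because Cargo.toml declares `links = "duckdb"`, Cargo re-exports this
    // line to downstream build scripts as DEP_DUCKDB_INCLUDE.
    println!("cargo:include={}", bundled_include_dir(&out_dir).display());
}
```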

Why

Crates that ship native extension code (e.g. an OptimizerExtension shim that does #include "duckdb.hpp") need access to libduckdb-sys's DuckDB headers at their own build time.

Today they have to either:

  • Glob target/<profile>/build/libduckdb-sys-<hash>/out/duckdb/src/include from their build.rs, or
  • Vendor a copy of the DuckDB headers into their own crate.

Both are fragile: the glob breaks across cargo versions and is awkward in cross-builds; vendoring drifts from the bundled DuckDB ABI on every libduckdb-sys upgrade.

The standard Cargo mechanism for this is the links directive plus cargo:KEY=VALUE metadata. With this PR, downstream code can simply:

    // downstream build.rs
    fn main() {
        let inc = std::env::var("DEP_DUCKDB_INCLUDE").unwrap();
        cc::Build::new()
            .cpp(true)
            .file("cpp/extension_shim.cpp")
            .include(&inc)
            .compile("my_extension_shim");
    }

This unblocks downstream crates from registering things like duckdb::OptimizerExtension from their own native code without resorting to the C API (which the issue at duckdb/duckdb#19093 is still working out).

Changes

  • crates/libduckdb-sys/Cargo.toml: add links = "duckdb" so build-script metadata flows to downstream DEP_DUCKDB_* env vars.
  • crates/libduckdb-sys/build_bundled_cc.rs: emit cargo:include=<OUT_DIR>/duckdb/src/include.
  • crates/libduckdb-sys/build_bundled_cmake.rs: emit cargo:include=<duckdb-sources>/src/include.
  • crates/libduckdb-sys/build.rs (the build_linked path): emit cargo:include=... derived from the resolved HeaderLocation (DUCKDB_INCLUDE_DIR, DUCKDB_LIB_DIR, vcpkg, pkg-config, or download dir).

Wrapper-only mode (system DuckDB found via pkg-config without an explicit include path) intentionally skips the emission — there's no single directory to publish; the system header is already on the default search path, and downstream shims fall through to the same default include resolution.
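A downstream build script that wants to tolerate wrapper-only mode can simply treat DEP_DUCKDB_INCLUDE as optional. A sketch (the helper name `duckdb_include_args` is ours, not part of the PR):

```rust
use std::env;

// Collect the extra include flags for the C++ compiler. In wrapper-only
// mode libduckdb-sys emits no cargo:include, so DEP_DUCKDB_INCLUDE is
// unset and we return nothing, falling back to the compiler's default
// header search path.
fn duckdb_include_args() -> Vec<String> {
    env::var("DEP_DUCKDB_INCLUDE")
        .ok()
        .map(|dir| format!("-I{dir}"))
        .into_iter()
        .collect()
}

fn main() {
    for arg in duckdb_include_args() {
        println!("passing {arg} to the C++ compiler");
    }
}
```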

Behaviour change

None for existing Rust consumers — the only public-surface change is a new links directive, which doesn't affect compilation of crates that don't run their own build.rs against duckdb headers.

Validation

The companion downstream project that motivated this PR (dbfy — embedded SQL federation engine, github.com/frhack/dbfy) ships an OptimizerExtension shim in bundled mode (--features duckdb) that needs these headers. Its build.rs currently uses the target-tree glob hack as a documented workaround; once this PR is in a release, the shim can switch to the clean DEP_DUCKDB_INCLUDE path.

Happy to add a small test or example exercising the metadata if useful.

Crates that ship their own C/C++ extension code (e.g. an
OptimizerExtension shim that #include "duckdb.hpp") need the path to
libduckdb-sys's resolved DuckDB headers at their own build time. Today
they have to glob the target tree for libduckdb-sys-<hash>/out/duckdb
or vendor a copy of the headers; both are fragile.

This change adds `links = "duckdb"` to libduckdb-sys's Cargo.toml so
its build script's metadata is exported as `DEP_DUCKDB_*` environment
variables to downstream build scripts, then emits `cargo:include=...`
from each backend that resolves a real header directory:

  - build_bundled_cc.rs   → <OUT_DIR>/duckdb/src/include
  - build_bundled_cmake.rs → <duckdb-sources>/src/include
  - build_linked::main    → DUCKDB_INCLUDE_DIR / DUCKDB_LIB_DIR /
                             vcpkg / pkg-config / download dir

Wrapper-only mode (system DuckDB found via pkg-config without an
explicit include path) intentionally skips the emission: there's no
single directory to publish, the system header is on the default
search path already, and downstream shims can fall back to the same
default include resolution.

Downstream crates can now use:

    fn main() {
        let inc = std::env::var("DEP_DUCKDB_INCLUDE").unwrap();
        cc::Build::new()
            .cpp(true)
            .file("cpp/extension_shim.cpp")
            .include(&inc)
            .compile("my_extension_shim");
    }

No behaviour change for downstream Rust users who don't compile native
code; this only adds opt-in metadata for native build hooks.
frhack added a commit to typeeffect/dbfy that referenced this pull request May 1, 2026
After upstream PR duckdb/duckdb-rs#753 lands
in a released libduckdb-sys, our build.rs can drop the target-tree
glob and read DEP_DUCKDB_INCLUDE directly. Until then we try
DEP_DUCKDB_INCLUDE first (no-op against today's upstream) and fall
back to the glob, so the same code path works whether the user is
on stock libduckdb-sys or a future patched one.
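The target-tree glob fallback mentioned above can be sketched with just the standard library. Everything here is illustrative — the function name, the hard-coded "target"/"debug" arguments, and the duckdb.hpp sanity check are our assumptions about what such a fallback would look like, not dbfy's actual code:

```rust
use std::fs;
use std::path::PathBuf;

// Fallback for when DEP_DUCKDB_INCLUDE is unset: scan the target tree for
// libduckdb-sys's OUT_DIR headers. Real code would derive `target_dir`
// from CARGO_TARGET_DIR rather than hard-coding it.
fn find_bundled_headers(target_dir: &str, profile: &str) -> Option<PathBuf> {
    let build_dir = PathBuf::from(target_dir).join(profile).join("build");
    for entry in fs::read_dir(&build_dir).ok()? {
        let Ok(entry) = entry else { continue };
        if entry.file_name().to_string_lossy().starts_with("libduckdb-sys-") {
            let candidate = entry.path().join("out/duckdb/src/include");
            // Sanity-check that this OUT_DIR actually holds the headers.
            if candidate.join("duckdb.hpp").exists() {
                return Some(candidate);
            }
        }
    }
    None
}

fn main() {
    // Prefer the clean env var; fall back to the glob-style scan.
    let inc = std::env::var("DEP_DUCKDB_INCLUDE")
        .map(PathBuf::from)
        .ok()
        .or_else(|| find_bundled_headers("target", "debug"));
    println!("resolved include dir: {inc:?}");
}
```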
Member

@mlafeldt mlafeldt left a comment


Thanks!

Happy to merge this once you address the mentioned path bug and format your code with cargo fmt.

println!("cargo:rerun-if-env-changed=MACOSX_DEPLOYMENT_TARGET");

write_bindings(&source_dir.join("src/include"), out_path);
let include_path = source_dir.join("src/include");

This is currently relative but should be absolute so downstream build scripts can actually use it.

Suggested change
let include_path = source_dir.join("src/include");
let include_path = source_dir.join("src/include").canonicalize().unwrap();
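A small demonstration of why the reviewer asks for canonicalize(). The directory name "demo-duckdb-sources" is made up for the example; the point is that joining onto a relative source_dir yields a relative path, which is only meaningful from libduckdb-sys's own build directory, not from a downstream crate's:

```rust
use std::fs;
use std::path::Path;

fn main() {
    // A relative source_dir, as in the cmake backend the review comments on.
    let source_dir = Path::new("demo-duckdb-sources");
    let include_path = source_dir.join("src/include");
    // Without canonicalize(), this path is cwd-dependent.
    assert!(include_path.is_relative());

    // canonicalize() requires the directory to exist and returns an
    // absolute path that downstream build scripts can use from anywhere.
    fs::create_dir_all(&include_path).unwrap();
    let absolute = include_path.canonicalize().unwrap();
    assert!(absolute.is_absolute());
    println!("cargo:include={}", absolute.display());
    fs::remove_dir_all(source_dir).unwrap();
}
```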

@mlafeldt mlafeldt added the build label May 4, 2026