feat(higgs-audio): crate skeleton + config/weights + backbone forward + parity gate by ywh555hhh · Pull Request #409 · openinfer-project/openinfer

ywh555hhh · 2026-06-16T11:25:56Z

Closes #408 (sub-issue of #395).

Summary

New openinfer-higgs-audio crate for the Higgs Audio model backbone (text-only path).
The backbone is 36-layer Qwen3-isomorphic, using Higgs checkpoint body.* weight
naming and nested text_config.

What is included

395.1 — Crate skeleton + feature wiring

Feature-gated higgs-audio: the default build is pure logic (Mac-buildable); enabling the feature pulls in the GPU stack
Workspace registration in root Cargo.toml
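
The feature split above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual lib.rs: the inline modules stand in for the real config.rs, weights.rs, and backbone.rs files, and the constants and the gpu_enabled() helper are hypothetical names introduced only for this example.

```rust
// Hypothetical crate-root layout; inline modules stand in for real source files.

/// Pure-logic module: always compiled, no GPU dependencies.
pub mod config {
    /// Layer count quoted in the PR description.
    pub const NUM_LAYERS: usize = 36;
}

/// Pure-logic module: always compiled, no GPU dependencies.
pub mod weights {
    /// Checkpoint prefix for per-layer tensors (from the PR description).
    pub const LAYER_PREFIX: &str = "body.layers.";
}

/// GPU forward path: only compiled when the `higgs-audio` feature is enabled.
#[cfg(feature = "higgs-audio")]
pub mod backbone {
    /// Placeholder for the real GPU-backed forward pass (assumption).
    pub fn available() -> bool {
        true
    }
}

/// Hypothetical helper: reports whether the GPU stack was compiled in.
pub fn gpu_enabled() -> bool {
    cfg!(feature = "higgs-audio")
}
```

With this shape, a default build never references the optional GPU crates, which is what would keep the pure-logic modules buildable on a Mac.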

395.2 — Config parsing

HiggsConfig + TextConfig with nested rope_parameters.rope_theta = 1_000_000
11 architecture fact assertions (hidden_size=2560, 36 layers, GQA 32/8, head_dim=128, etc.)
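
As an illustration of the architecture fact checks, here is a small std-only sketch. The field names follow common Hugging Face conventions and are assumptions, as are expected(), gqa_group_size(), fused_qkv_width(), and check_facts(); the real HiggsConfig presumably deserializes config.json rather than hard-coding values.

```rust
/// Hypothetical mirror of the text backbone's architecture fields.
#[derive(Debug, Clone, PartialEq)]
pub struct TextConfig {
    pub hidden_size: usize,
    pub num_hidden_layers: usize,
    pub num_attention_heads: usize,
    pub num_key_value_heads: usize,
    pub head_dim: usize,
    pub rope_theta: f64,
}

impl TextConfig {
    /// Values quoted in the PR description.
    pub fn expected() -> Self {
        Self {
            hidden_size: 2560,
            num_hidden_layers: 36,
            num_attention_heads: 32,
            num_key_value_heads: 8,
            head_dim: 128,
            rope_theta: 1_000_000.0,
        }
    }

    /// Query heads served by each KV head under GQA, if the split is even.
    pub fn gqa_group_size(&self) -> Option<usize> {
        if self.num_key_value_heads == 0
            || self.num_attention_heads % self.num_key_value_heads != 0
        {
            return None;
        }
        Some(self.num_attention_heads / self.num_key_value_heads)
    }

    /// Output width of a fused QKV projection: Q heads plus K and V heads.
    pub fn fused_qkv_width(&self) -> usize {
        (self.num_attention_heads + 2 * self.num_key_value_heads) * self.head_dim
    }

    /// Collects mismatches against the expected facts instead of panicking.
    pub fn check_facts(&self) -> Vec<String> {
        let want = Self::expected();
        let mut errors = Vec::new();
        let mut check = |name: &str, got: usize, exp: usize| {
            if got != exp {
                errors.push(format!("{name}: got {got}, expected {exp}"));
            }
        };
        check("hidden_size", self.hidden_size, want.hidden_size);
        check("num_hidden_layers", self.num_hidden_layers, want.num_hidden_layers);
        check("num_attention_heads", self.num_attention_heads, want.num_attention_heads);
        check("num_key_value_heads", self.num_key_value_heads, want.num_key_value_heads);
        check("head_dim", self.head_dim, want.head_dim);
        if self.rope_theta != want.rope_theta {
            errors.push(format!("rope_theta: got {}, expected {}", self.rope_theta, want.rope_theta));
        }
        errors
    }
}
```

With the quoted values, each KV head serves 4 query heads and the fused QKV projection is (32 + 2 × 8) × 128 = 6144 wide.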

395.3 — Weight name mapping

map_backbone(): body.layers.{i}.{rest} → BackboneSlot::Layer
All audio/codec tensors correctly return None
Tests: 398 backbone tensors (36×11 + embed + norm), all 36 layers with 11 components
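
The mapping could look something like the sketch below. The shape of BackboneSlot (in particular the index and rest fields), the exact tensor-name strings, and the bounds check are assumptions pieced together from the description; the per-layer suffix in the usage note is a hypothetical example.

```rust
/// Assumed shape of the slot enum; the real crate may carry different payloads.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BackboneSlot {
    /// A per-layer tensor: layer index plus the remaining dotted name.
    Layer { index: usize, rest: String },
    FinalNorm,
    EmbedTokens,
    LmHead,
}

/// Layer count quoted in the PR description.
pub const NUM_LAYERS: usize = 36;

/// Maps a checkpoint tensor name to a backbone slot.
/// Returns None for anything that is not part of the text backbone
/// (audio, codec, or modality tensors).
pub fn map_backbone(name: &str) -> Option<BackboneSlot> {
    match name {
        "body.norm.weight" => return Some(BackboneSlot::FinalNorm),
        "tied.embedding.text_embedding" => return Some(BackboneSlot::EmbedTokens),
        "tied.head.text_head" => return Some(BackboneSlot::LmHead),
        _ => {}
    }

    // Per-layer tensors: body.layers.{i}.{rest}
    let tail = name.strip_prefix("body.layers.")?;
    let (index, rest) = tail.split_once('.')?;
    let index: usize = index.parse().ok()?;
    if index >= NUM_LAYERS || rest.is_empty() {
        return None;
    }
    Some(BackboneSlot::Layer {
        index,
        rest: rest.to_string(),
    })
}
```

For example, a name such as body.layers.3.self_attn.q_proj.weight would map to Layer { index: 3, rest: "self_attn.q_proj.weight" }, while an audio tensor name falls through to None.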

395.4 — Backbone forward

HiggsBackbone::from_safetensors() — loads weights via body.* naming
forward() — 36-layer Qwen3 text prefill (RMSNorm → fused QKV → FlashInfer paged attention → SwiGLU MLP)
Mirrors openinfer-qwen3-4b; only the weight-name prefix and config source differ
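
For readers unfamiliar with two of the ops named in the forward path, here is a plain CPU reference for RMSNorm and the SwiGLU activation. It is only a sketch of the math in f32; the function names are hypothetical and it says nothing about how the GPU kernels in openinfer-kernels are implemented.

```rust
/// RMSNorm over one hidden vector: x / sqrt(mean(x^2) + eps) * weight.
pub fn rms_norm(x: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    assert_eq!(x.len(), weight.len());
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    x.iter().zip(weight).map(|(v, w)| v * inv_rms * w).collect()
}

/// SiLU activation: x * sigmoid(x).
pub fn silu(x: f32) -> f32 {
    x / (1.0 + (-x).exp())
}

/// SwiGLU gating: silu(gate) * up, elementwise.
/// In a full MLP, `gate` and `up` are two projections of the normed hidden
/// state and the result feeds the down projection.
pub fn swiglu(gate: &[f32], up: &[f32]) -> Vec<f32> {
    assert_eq!(gate.len(), up.len());
    gate.iter().zip(up).map(|(g, u)| silu(*g) * u).collect()
}
```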

395.5 — Backbone parity gate

tests/backbone_parity.rs (required-features = ["higgs-audio"])
Top-64 logprob comparison: regret, mean delta (≤ 0.06 nat), and p99 delta (≤ 0.20 nat)
Clean skip when model or golden file is absent
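
A rough sketch of how such a gate might compute its statistics for one position is shown below. Several details are assumptions rather than facts from the PR: the top-k set is taken from the golden logprobs, deltas are absolute differences, p99 uses the nearest-rank method, and regret is defined as the golden logprob lost by choosing the candidate's argmax instead of the golden argmax. All names are hypothetical.

```rust
/// Numerically stable log-softmax over one row of logits.
pub fn log_softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let log_sum = logits.iter().map(|l| (l - max).exp()).sum::<f32>().ln();
    logits.iter().map(|l| l - max - log_sum).collect()
}

/// Indices of the k largest values, in descending order of value.
fn top_k_indices(values: &[f32], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..values.len()).collect();
    idx.sort_by(|&a, &b| values[b].total_cmp(&values[a]));
    idx.truncate(k);
    idx
}

/// Summary statistics over the golden top-k token set, in nats.
#[derive(Debug, Clone, Copy)]
pub struct ParityStats {
    pub mean_delta: f32,
    pub p99_delta: f32,
    pub regret: f32,
}

impl ParityStats {
    /// Tolerances quoted in the PR description (mean and p99 only).
    pub fn within_tolerance(&self) -> bool {
        self.mean_delta <= 0.06 && self.p99_delta <= 0.20
    }
}

pub fn compare_top_k(golden_logits: &[f32], actual_logits: &[f32], k: usize) -> ParityStats {
    assert_eq!(golden_logits.len(), actual_logits.len());
    assert!(k > 0 && !golden_logits.is_empty());

    let golden = log_softmax(golden_logits);
    let actual = log_softmax(actual_logits);
    let top = top_k_indices(&golden, k);

    // Absolute logprob deltas on the golden top-k tokens, sorted ascending.
    let mut deltas: Vec<f32> = top.iter().map(|&i| (golden[i] - actual[i]).abs()).collect();
    deltas.sort_by(f32::total_cmp);

    let n = deltas.len();
    let mean_delta = deltas.iter().sum::<f32>() / n as f32;
    // Nearest-rank percentile (assumption).
    let rank = ((0.99 * n as f32).ceil() as usize).clamp(1, n);
    let p99_delta = deltas[rank - 1];

    // Assumed regret definition: golden logprob lost by picking the
    // candidate's argmax instead of the golden argmax.
    let actual_best = top_k_indices(&actual, 1)[0];
    let regret = golden[top[0]] - golden[actual_best];

    ParityStats { mean_delta, p99_delta, regret }
}
```

A test would call compare_top_k with k = 64 for each position and fail when within_tolerance() returns false; no regret threshold is stated in the PR, so none is enforced here.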

Guardrails

Only touches openinfer-higgs-audio/ + 2 lines in root Cargo.toml
Zero shared-crate modifications (openinfer-core, openinfer-kernels, etc.)
No mod.rs, no unwrap() on external input, no weight transposition

Mac verification

cargo fmt --all --check     ✅
cargo metadata --no-deps    ✅
cargo build -p openinfer-higgs-audio   ✅
cargo test --lib (3/3)      ✅

- openinfer-higgs-audio crate with default (no GPU) and higgs-audio features
- GPU deps (openinfer-core, openinfer-kernels, openinfer-kv-cache, cudarc) are optional, gated behind the higgs-audio feature
- Pure-logic modules (config, weights) compile on Mac without GPU
- backbone.rs placeholder for GPU forward (395.4)

Refs: openinfer-project#395

Add openinfer-higgs-audio to members[] and [workspace.dependencies]. Refs: openinfer-project#395

- HiggsConfig with TextConfig, AudioEncoderConfig, and top-level fields
- Nested rope_parameters.rope_theta resolution (1_000_000)
- from_path() loads config.json from a model directory
- Unit test asserts all known facts against real checkpoint config

Refs: openinfer-project#395

- map_backbone() maps body.* → BackboneSlot::Layer, body.norm → FinalNorm, tied.embedding.text_embedding → EmbedTokens, tied.head.text_head → LmHead
- Non-backbone tensors (audio/codec/modality) return None
- Unit tests: 398 backbone tensors (36×11 + embed + norm), all audio/codec skipped, all 36 layers have all 11 components

Refs: openinfer-project#395

- HiggsBackbone::from_safetensors() loads weights using body.* naming (body.layers.{i}.{rest}, body.norm.weight, tied.embedding.text_embedding)
- Uses text_config from HiggsConfig as architecture source
- forward() runs full 36-layer Qwen3 prefill with paged KV attention
- Mirrors openinfer-qwen3-4b kernel ops: RMSNorm, fused QKV GEMM, FlashInfer paged attention, SwiGLU MLP, residual adds
- last_token_logits() and compute_all_position_logits() for parity testing
- No LoRA, no TP, no CUDA graph, no decode; single-sequence prefill only

Refs: openinfer-project#395

- tests/backbone_parity.rs compares Higgs backbone logits against pre-computed golden (backbone_golden.safetensors)
- Top-64 logprobs comparison: regret, mean delta, p99 delta
- Tolerances: mean ≤ 0.06 nat, p99 ≤ 0.20 nat (same pattern as qwen3)
- Requires higgs-audio feature (GPU only)
- Skipped cleanly when model or golden file absent

Refs: openinfer-project#395

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 21abb93f25

chatgpt-codex-connector · 2026-06-16T11:33:40Z

+    fn test_model_path() -> PathBuf {
+        let env_path = std::env::var("OPENINFER_TEST_MODEL_PATH")
+            .map(PathBuf::from)
+            .unwrap_or_else(|_| PathBuf::from("docs/private/higgs-audio-v3-tts-4b"));


Skip private-model unit tests on clean checkout

When OPENINFER_TEST_MODEL_PATH is unset on a clean checkout, docs/private/ is gitignored, but the new crate is a workspace member so cargo test --release --workspace --lib runs this unit test and immediately panics trying to read docs/private/higgs-audio-v3-tts-4b/config.json (the same fallback in weights.rs later panics on the safetensors index). Please make these fixture-dependent tests skip unless an explicit Higgs checkpoint or checked-in fixture exists.
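
One way to get the requested skip behavior is to have the helper return an Option and let each fixture-dependent test bail out early. The sketch below is an assumption about how that could look, not the fix that was applied; resolve_model_dir() is a hypothetical helper, and checking for config.json is just one possible presence test.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Hypothetical helper: picks the env-provided directory or the fallback,
/// and returns it only if it actually contains a config.json.
pub fn resolve_model_dir(env_value: Option<String>, fallback: &Path) -> Option<PathBuf> {
    let dir = env_value
        .map(PathBuf::from)
        .unwrap_or_else(|| fallback.to_path_buf());
    if dir.join("config.json").is_file() {
        Some(dir)
    } else {
        None
    }
}

/// Returns the checkpoint directory, or None when no fixture is available.
pub fn test_model_path() -> Option<PathBuf> {
    resolve_model_dir(
        env::var("OPENINFER_TEST_MODEL_PATH").ok(),
        Path::new("docs/private/higgs-audio-v3-tts-4b"),
    )
}

// Inside a test body, the early return would look like:
//
//     let Some(dir) = test_model_path() else {
//         eprintln!("skipping: no Higgs checkpoint found");
//         return;
//     };
```

Because the test returns instead of panicking, a clean checkout without docs/private/ would report the test as passed rather than failed.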

chatgpt-codex-connector · 2026-06-16T11:33:40Z

+        );
+
+        // 5+6. Residual add + MLP RMSNorm (fused)
+        openinfer_kernels::ops::fused_add_rms_norm_batch_into(


Use the round fused add/RMSNorm variant

For the Higgs GPU path that mirrors Qwen3, using FlashInfer's non-round fused add/norm changes the residual value used for the RMS reduction: the shared kernel notes it keeps the pre-BF16-round add in memory, while the Qwen3 prefill/unified paths call fused_add_rms_norm_round_batch_into to match hidden = bf16(hidden + residual). On real Higgs/Qwen-style bf16 weights this introduces per-layer logits drift across all 36 layers and can fail the golden parity gate; call the round variant here.
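
The numerical distinction the review describes can be illustrated on the CPU. This sketch emulates BF16 rounding in f32 and compares feeding the RMS reduction with the raw sum against feeding it with bf16(hidden + residual). It is an illustration of the idea only: add_rms_norm(), its round_sum flag, and AddNormOutput are hypothetical and do not reflect the actual kernel signatures in openinfer-kernels.

```rust
/// Rounds an f32 to the nearest bf16 value (ties to even), returned as f32.
pub fn round_to_bf16(x: f32) -> f32 {
    if x.is_nan() {
        return x;
    }
    let bits = x.to_bits();
    // Add just under half of the dropped range, plus one more if the kept
    // LSB is odd, then truncate the low 16 bits.
    let bias = 0x7FFF + ((bits >> 16) & 1);
    f32::from_bits(bits.wrapping_add(bias) & 0xFFFF_0000)
}

pub struct AddNormOutput {
    /// The updated residual stream (hidden + residual).
    pub residual: Vec<f32>,
    /// RMSNorm of the residual, kept in f32 here for clarity.
    pub normed: Vec<f32>,
}

/// Residual add followed by RMSNorm.
/// With `round_sum = true`, the sum is rounded to bf16 before the RMS
/// reduction, modelling hidden = bf16(hidden + residual).
pub fn add_rms_norm(
    hidden: &[f32],
    residual: &[f32],
    weight: &[f32],
    eps: f32,
    round_sum: bool,
) -> AddNormOutput {
    assert_eq!(hidden.len(), residual.len());
    assert_eq!(hidden.len(), weight.len());

    let sum: Vec<f32> = hidden
        .iter()
        .zip(residual)
        .map(|(h, r)| {
            let s = h + r;
            if round_sum { round_to_bf16(s) } else { s }
        })
        .collect();

    let mean_sq = sum.iter().map(|v| v * v).sum::<f32>() / sum.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    let normed: Vec<f32> = sum.iter().zip(weight).map(|(v, w)| v * inv_rms * w).collect();

    AddNormOutput { residual: sum, normed }
}
```

With hidden = [1.0, 1.0] and residual = [2⁻⁹, 0.0], both inputs are exactly representable in BF16, yet the rounded sum collapses back to 1.0 while the unrounded sum does not, so the two variants normalize different vectors. Repeated over 36 layers, small differences of this kind are the drift the review warns about.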

ywh555hhh added 6 commits June 16, 2026 17:44

chore(workspace): register openinfer-higgs-audio in workspace

32dad9d

ywh555hhh mentioned this pull request Jun 16, 2026

higgs-audio: crate skeleton + config/weights + backbone forward + parity gate #408

Open

3 tasks

chatgpt-codex-connector Bot reviewed Jun 16, 2026

ywh555hhh marked this pull request as draft June 16, 2026 12:28

ywh555hhh wants to merge 6 commits into openinfer-project:main from ywh555hhh:feat/higgs-audio-v1
