[codex] Gate local thinking controls and SQLCipher import#954

Merged
tpae merged 5 commits into
osaurus-ai:mainfrom
mimeding:codex/baseline-capability-fixes
Apr 29, 2026

Conversation

@mimeding
Contributor

@mimeding mimeding commented Apr 26, 2026

Summary

This keeps the fallback local thinking toggle hidden unless template inspection confirms both an enable_thinking switch and reasoning markers that Osaurus can parse. It also cleans up public compatibility wording so that it describes generic protocol-compatible clients, hardens two environment-sensitive tests, fixes the vendored SQLCipher Swift import on newer macOS SDKs, and aligns CI with the release workflow's pinned Xcode 26.4.1 toolchain.

Business Rationale

Users should not see reasoning controls that the selected local model cannot actually honor. Hiding unsupported toggles reduces confusing model behavior and avoids implying that Osaurus can control reasoning for templates that only expose a kwarg without recognizable reasoning output. The SQLCipher import guard and CI toolchain alignment keep the current developer and GitHub test paths able to compile Osaurus after the storage-encryption baseline, which is necessary before broader LLM capability work can be reviewed safely. Generic compatibility wording keeps public docs focused on Osaurus-owned protocol support.

Coding Rationale

LocalReasoningCapability already centralizes template inspection, and ModelOptions already gates fallback profile selection. Adding isToggleableThinking preserves the existing JANG/Gemma/Qwen detection while making the UI profile depend on the combined capability instead of a single kwarg. The SQLCipher change hides only the loadable-extension C API from Swift's Clang import, which Osaurus does not use, while preserving the core SQLite/SQLCipher API and adding vendor guard tests so future SQLCipher bumps keep the import boundary intact. The CI change keeps the test workflow on the same Xcode 26.4.1 pin already used by the release workflow and avoids an unrelated Xcode 26.3 EventSource transitive module build failure.
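The combined gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `ReasoningCapabilitySketch` type, its two stored properties, the `inspect(template:)` helper, and the assumption that reasoning markers look like `<think>` ... `</think>` are all hypothetical. Only the `isToggleableThinking` name and the `enable_thinking` kwarg come from the description.

```swift
import Foundation

// Hypothetical sketch: type shape and field names are illustrative only,
// not the actual LocalReasoningCapability definition in Osaurus.
struct ReasoningCapabilitySketch {
    /// True when the chat template references an `enable_thinking` kwarg.
    let hasEnableThinkingSwitch: Bool
    /// True when the template contains reasoning markers the app could parse.
    let hasParseableReasoningMarkers: Bool

    /// The UI toggle is only offered when both conditions hold.
    var isToggleableThinking: Bool {
        hasEnableThinkingSwitch && hasParseableReasoningMarkers
    }

    /// Naive substring-based inspection, assuming `<think>`/`</think>` markers.
    static func inspect(template: String) -> ReasoningCapabilitySketch {
        ReasoningCapabilitySketch(
            hasEnableThinkingSwitch: template.contains("enable_thinking"),
            hasParseableReasoningMarkers:
                template.contains("<think>") && template.contains("</think>")
        )
    }
}
```

Under these assumptions, a template that mentions only the kwarg would not qualify for the toggle, while one that also carries recognizable markers would. The real detection for JANG/Gemma/Qwen templates is likely more involved than a substring check.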

Expected LLM Performance Impact

Fewer local-model dispatches with misleading thinking controls and fewer degraded first responses from models whose templates mention enable_thinking but do not produce parseable reasoning markers. This should improve model-option accuracy, reduce unsupported reasoning toggles, and make future capability changes easier to validate consistently. The compile and CI fixes do not change model behavior directly, but they keep the capability baseline buildable on current Xcode so follow-up LLM harness improvements can land without test-environment drift.

Changes

  • Behavior change
  • UI change
  • Refactor / chore
  • Tests
  • Docs

Test Plan

  • xcrun swift-format lint --strict Packages/OsaurusCore/Models/Configuration/ModelOptions.swift Packages/OsaurusCore/Services/LocalReasoningCapability.swift Packages/OsaurusCore/Tests/Service/LocalReasoningCapabilityTests.swift Packages/OsaurusCore/Tests/Model/ModelProfileRegistryTests.swift Packages/OsaurusCore/Tests/Sandbox/SandboxManagerCleanupTests.swift Packages/OsaurusCore/Models/API/AnthropicAPI.swift Packages/OsaurusCore/Models/Chat/ChatConfiguration.swift Packages/OsaurusCLI/Sources/OsaurusCLICore/Commands/Bundle/MCPBundleManifest.swift Packages/OsaurusCore/Package.swift Packages/OsaurusCore/Tests/Storage/SQLCipherVendorGuardTests.swift
  • swift test --package-path Packages/OsaurusCore --filter LocalReasoningCapabilityTests
  • swift test --package-path Packages/OsaurusCore --filter ModelProfileRegistryTests
  • swift test --package-path Packages/OsaurusCore --filter SandboxManagerCleanupTests
  • swift test --package-path Packages/OsaurusCore --filter SQLCipherVendorGuardTests
  • swift build --package-path Packages/OsaurusCore
  • swift test --package-path Packages/OsaurusCore
  • xcodebuild build-for-testing -workspace osaurus.xcworkspace -scheme OsaurusCoreTests -disableAutomaticPackageResolution -derivedDataPath build/CIProbeDD -resultBundlePath build/CIProbe2.xcresult -skipPackagePluginValidation -skipMacroValidation -enableCodeCoverage NO COMPILER_INDEX_STORE_ENABLE=NO SWIFT_COMPILATION_MODE=incremental -quiet
  • git diff --check
  • Public/private hygiene scan for private reference terms and client-specific attribution

Checklist

  • I have read CONTRIBUTING.md
  • I added/updated tests where reasonable
  • I updated docs/README as needed
  • I verified build on macOS with Xcode 16.4+

@mimeding mimeding force-pushed the codex/baseline-capability-fixes branch from e404bbf to 30a67dc Compare April 27, 2026 12:32
@mimeding mimeding changed the title [codex] Gate auto thinking toggle by runtime markers [codex] Gate local thinking controls and SQLCipher import Apr 27, 2026
@mimeding mimeding marked this pull request as ready for review April 27, 2026 12:33
@tpae
Contributor

tpae commented Apr 27, 2026

Hi @mimeding, I added the SQLCipher import-related fix in this PR as well: #956

I believe there might be some collisions with this PR: #953

@mimeding
Contributor Author

Thanks. I checked both overlaps. I agree #956 should own the SQLCipher import guard if it lands first; this PR currently carries an equivalent guard only to keep the capability baseline buildable against current main. After #956 merges, I’ll rebase this branch and drop the duplicate SQLCipher hunks if they are already in main. I also see #953 is already green and should probably merge before the broader capability snapshot work in #955, since #955 touches nearby runtime/model-service paths.

@tpae
Contributor

tpae commented Apr 27, 2026

I believe #953 is not ready yet; I will leave it up to @jjang-ai.

@mimeding
Contributor Author

Got it, thanks. I’ll leave #953 out of the immediate sequencing and keep #955 as draft until that branch is ready or main otherwise settles around the model-service changes. For #954/#955, I’ll focus on the #956 overlap first: once the flaky-test and SQLCipher guard changes land, I’ll rebase and remove the duplicate SQLCipher hunks from these branches.

@mimeding mimeding force-pushed the codex/baseline-capability-fixes branch 3 times, most recently from 096e9cc to 58a2c19 Compare April 28, 2026 21:54
@mimeding mimeding force-pushed the codex/baseline-capability-fixes branch from 58a2c19 to 2d47d12 Compare April 28, 2026 22:07
@mimeding
Contributor Author

@tpae update after rebasing on current main with #956/#966 included: I dropped what main already resolved through the rebase, but this branch still has a narrow SQLCipher Swift-import guard because it remains a real diff against current main. The key addition is defining OSAURUS_OMIT_FTS5_HEADERS at the umbrella header boundary; target cSettings alone are not enough when Swift imports the Clang module. Local targeted validation is green now: LocalReasoningCapabilityTests, LLMCapabilitySnapshotTests, ChatEngineTests, ModelProfileRegistryTests, and SQLCipherVendorGuardTests all passed after a clean DerivedData rebuild.
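A vendor guard for the umbrella-header point above might look something like the sketch below. It is an assumption-laden illustration rather than the contents of SQLCipherVendorGuardTests: the helper name `definesMacroBeforeFirstInclude` and the sample header text are hypothetical, and the check only looks at whether the macro is `#define`d before the first `#include` or `#import`.

```swift
import Foundation

/// Hypothetical helper: returns true when `macro` is #defined on a line that
/// appears before the first #include/#import directive in `header`.
func definesMacroBeforeFirstInclude(_ macro: String, in header: String) -> Bool {
    for rawLine in header.split(separator: "\n") {
        // Tokenize on spaces and tabs so leading indentation is ignored.
        let tokens = rawLine.split(whereSeparator: { $0 == " " || $0 == "\t" })
        guard let directive = tokens.first else { continue }
        if directive == "#define", tokens.count >= 2, tokens[1] == macro {
            return true
        }
        if directive == "#include" || directive == "#import" {
            // An include was reached before the macro was defined.
            return false
        }
    }
    return false
}

// Illustrative header text only; the real umbrella header is not shown here.
let sampleHeader = """
#ifndef SAMPLE_UMBRELLA_H
#define OSAURUS_OMIT_FTS5_HEADERS 1
#include "sqlite3.h"
#endif
"""
let guardHolds = definesMacroBeforeFirstInclude("OSAURUS_OMIT_FTS5_HEADERS", in: sampleHeader)
```

A check of this kind would fail if a future SQLCipher bump moved the define below the include, which is the regression the guard tests are described as preventing.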

@tpae tpae merged commit 718ecbd into osaurus-ai:main Apr 29, 2026
5 checks passed