
Add managed local llama.cpp embeddings #191

Open
mika76 wants to merge 8 commits into yoanbernabeu:main from mika76:codex/llamacpp-local-embedder

Conversation

mika76 commented Mar 18, 2026

Summary

  • add a managed local llama.cpp embedding provider with globally managed runtime and model assets
  • add curated managed-model install/list/use/remove flows, plus shell completions for model IDs
  • gate the managed runtime to supported platforms, and reuse an already-running runtime via health checks instead of PID probing
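The health-check reuse mentioned above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: it assumes the managed runtime is llama.cpp's HTTP server, which exposes a `GET /health` endpoint that returns 200 once the model is loaded. The function name, base URL, and timeout are all illustrative.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// healthy reports whether a llama.cpp server is already answering on
// baseURL. llama.cpp's built-in server exposes GET /health; a 200
// response means the runtime is up and loaded, so it can be reused
// instead of spawning a fresh process (or probing a stale PID file).
func healthy(baseURL string) bool {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(baseURL + "/health")
	if err != nil {
		return false // not listening, or not reachable in time
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	// The address is illustrative; the managed provider would track
	// the actual port its runtime was started on.
	if healthy("http://127.0.0.1:8080") {
		fmt.Println("reusing existing runtime")
	} else {
		fmt.Println("starting managed runtime")
	}
}
```

The advantage over PID probing is that a positive health check proves the process is not only alive but actually serving requests; a PID can be recycled by an unrelated process, or belong to a server that is still loading the model.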

Notes

  • the managed llama.cpp runtime is pinned to upstream build b3426
  • manual end-to-end testing was done only on macOS
  • on unsupported platforms, the managed llama.cpp functionality is either hidden or returns a clear unsupported-platform error rather than pretending to work

Validation

  • go test ./config ./embedder ./internal/managedassets ./search ./indexer ./cli
  • manual smoke test on macOS: init, model install, model use, watch, search
