Higress plugins for GPUStack

Project Overview

This repository contains custom Higress Proxy-Wasm plugins (extensions) designed for GPUStack, focusing on AI API traffic processing, observability, and enhanced gateway features. Each extension is implemented as a standalone module and can be deployed independently in a Higress-compatible environment.

Available Extensions

gpustack-token-usage
- Collects and injects token usage statistics into AI API streaming responses (SSE), including time to first token, average token latency, and tokens per second. Supports real client IP injection and path-based filtering. See extensions/gpustack-token-usage/README.md for details.

Usage

1. Build the desired extension(s) to generate the wasm file(s).
2. Upload the wasm file(s) to the Higress console and configure parameters as needed.
3. Apply the extension(s) to the target routes; the configuration takes effect immediately.

Notes

- All extensions are designed for Proxy-Wasm compatible gateways.
- See each extension's README for specific configuration and usage instructions.
