perf(openai): bound large-group scheduler candidate scans #1263

Open
CPU-JIA wants to merge 1 commit into Wei-Shaw:main from CPU-JIA:cpujia/perf-openai-bounded-scheduler

CPU-JIA commented Mar 24, 2026

Summary

  • add windowed schedulable-account reads for scheduler snapshot cache and DB fallback
  • bound OpenAI large-group candidate scanning with rotating page sampling instead of full-bucket materialization
  • add scheduling config knobs for candidate page size / scan limit and document them
  • include the Windows logger handle cleanup so full backend tests pass locally
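
A windowed read like the one in the first bullet comes down to turning an offset and a limit into an index range. The sketch below is illustrative only: `windowToZRange` is a hypothetical helper, not a function from this PR. It relies on the fact that Redis `ZRANGE` takes zero-based, inclusive start and stop indexes.

```go
package main

import "fmt"

// windowToZRange is a hypothetical helper (not taken from this PR).
// It converts a zero-based offset and a positive limit into the inclusive
// start/stop indexes used by Redis ZRANGE. ok is false for an empty window.
func windowToZRange(offset, limit int) (start, stop int64, ok bool) {
	if offset < 0 || limit <= 0 {
		return 0, 0, false
	}
	// ZRANGE stop is inclusive, so a window of `limit` items ends at
	// offset+limit-1.
	return int64(offset), int64(offset + limit - 1), true
}

func main() {
	// Third page of 100 accounts: indexes 200 through 299.
	start, stop, ok := windowToZRange(200, 100)
	fmt.Println(start, stop, ok)
}
```

Only the IDs in that range then need the follow-up MGET and JSON decode, so per-request cost tracks the window size rather than the group size.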

Why

Large OpenAI groups can currently trigger repeated full-bucket ZRange + MGET + JSON unmarshal reads, followed by full in-memory candidate scans. With 10k+ accounts, this inflates CPU usage, heap size, and GC pressure.

This change keeps the existing selection semantics, but makes the hot path operate on a bounded candidate window:

  • snapshot reads can fetch only one ordered window
  • DB fallback can fetch only one ordered window
  • OpenAI load-aware selection rotates its starting page within a bounded ordered scan range
  • each request only scores a small candidate pool instead of materializing the whole group
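
The rotating, bounded scan described in the list above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the PR's code: `Account`, `windowSelector`, `pageSize`, `scanLimit`, and the lowest-load scoring rule are all hypothetical stand-ins for the real types, config knobs, and load-aware scoring.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Account is a hypothetical stand-in for a schedulable account.
type Account struct {
	ID   int
	Load int // lower is better in this sketch
}

// windowSelector scores one bounded page per request and rotates the page.
type windowSelector struct {
	pageSize  int           // accounts scored per request (assumed knob)
	scanLimit int           // size of the ordered prefix eligible for scanning (assumed knob)
	cursor    atomic.Uint64 // incremented once per request to rotate the starting page
}

// nextWindow returns the half-open [start, end) bounds for this request.
func (s *windowSelector) nextWindow(total int) (int, int) {
	limit := total
	if s.scanLimit > 0 && s.scanLimit < limit {
		limit = s.scanLimit
	}
	if limit == 0 || s.pageSize <= 0 {
		return 0, 0
	}
	pages := (limit + s.pageSize - 1) / s.pageSize // ceiling division
	page := int((s.cursor.Add(1) - 1) % uint64(pages))
	start := page * s.pageSize
	end := start + s.pageSize
	if end > limit {
		end = limit // the last page may be short
	}
	return start, end
}

// pick scores only the accounts inside the current window.
func (s *windowSelector) pick(ordered []Account) (Account, bool) {
	start, end := s.nextWindow(len(ordered))
	if start == end {
		return Account{}, false
	}
	best := ordered[start]
	for _, a := range ordered[start+1 : end] {
		if a.Load < best.Load {
			best = a
		}
	}
	return best, true
}

func main() {
	accounts := make([]Account, 10)
	for i := range accounts {
		accounts[i] = Account{ID: i, Load: (i * 7) % 5}
	}
	s := &windowSelector{pageSize: 3, scanLimit: 8}
	for i := 0; i < 4; i++ {
		a, _ := s.pick(accounts)
		fmt.Println(a.ID, a.Load)
	}
}
```

With `pageSize: 3` and `scanLimit: 8`, successive requests scan indexes [0,3), [3,6), [6,8), and then wrap back to [0,3). No request touches more than three accounts, and accounts beyond the scan limit are never scored.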

Validation

  • go test -p 1 ./...
  • go run github.com/golangci/golangci-lint/v2/cmd/golangci-lint@v2.9.0 run ./...
