forked from ggml-org/llama.cpp
Open
Description
Hey! I've noticed something a bit weird with the context size. It seems to be running 128 tokens larger than what I'm setting, even when I use the --noshift and --nofastforward flags. This isn't happening with llama.cpp, so I thought I'd bring it up.

--contextsize 8192 --noshift --nofastforward
llama_context: n_ctx_per_seq (8320) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
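For what it's worth, the log is consistent with the observation: the reported n_ctx_per_seq is exactly 128 tokens above the requested value. A minimal sketch of the arithmetic (the variable names here are just for illustration, not from the codebase):

```python
requested = 8192   # value passed via --contextsize
reported = 8320    # n_ctx_per_seq printed by llama_context
extra = reported - requested
print(extra)  # 128 tokens allocated beyond the requested size
```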

The GUI just shows the value I set. Any chance I can actually run the model with just the context size I'm specifying? I'm guessing it might be something to do with context shifting?
Thanks!