Skip to content

Commit

Permalink
docs: add initial ollama notes
Browse files Browse the repository at this point in the history
  • Loading branch information
danbev committed Nov 20, 2024
1 parent 0478f68 commit 100da3e
Showing 1 changed file with 70 additions and 0 deletions.
70 changes: 70 additions & 0 deletions notes/ollama.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
## Ollama

### Models
Ollama stores models in an OCI registry which by default are stored in
~/.ollama/models.

### Using a local model
I wanted to run a gguf model that I have locally and this can be done by running
the following commands:

Make sure the ollama REST server is running
```console
$ cd ~/work/ai/ollama
$ ./ollama serve
```

Create a configuration file that describes the model:
```console
FROM /home/danbev/.ollama/models/llama-2-7b-hf-chat-q4.gguf

LICENSE apache-2.0

# Adjust the parameters based on your needs
PARAMETER stop "Human:"
PARAMETER stop "Assistant:"
PARAMETER temperature 0.7
PARAMETER top_k 40
PARAMETER top_p 0.5
```

Create the model in using the above configuration file (saved as ModelFile):
```console
$ ./ollama create mycustomllama -f Modelfile
```
Run the model:
```console
$ ./ollama run mycustomllama
```

### Inspecting a model
If a gguf model is used we can find the blob for the model by either inspecting
the metadata file:
```console
$ cat ~/.ollama/models/manifests/registry.ollama.ai/library/mycustomllama/latest | jq '.layers[] | select(.mediaType == "application/vnd.ollama.image.model") | .digest'
"sha256:42f685711f23fc73c4558da0d8df22fd18ecabbee9c6c8f8277204902ace10d3"
```
And we can use the digest (without the sha256: prefix) to find the blob:
```console
$ ./inspect-model.sh ~/.ollama/models/blobs/sha256-42f685711f23fc73c4558da0d8df22fd18ecabbee9c6c8f8277204902ace10d3
INFO:gguf-dump:* Loading: /home/danbev/.ollama/models/blobs/sha256-42f685711f23fc73c4558da0d8df22fd18ecabbee9c6c8f8277204902ace10d3
* File is LITTLE endian, script is running on a LITTLE endian host.
* Dumping 35 key/value pair(s)
1: UINT32 | 1 | GGUF.version = 3
2: UINT64 | 1 | GGUF.tensor_count = 291
3: UINT64 | 1 | GGUF.kv_count = 32
4: STRING | 1 | general.architecture = 'llama'
```

The blob is also shows in the servers console:
```console
llama_model_loader: loaded meta data with 32 key-value pairs and 291 tensors from /home/danbev/.ollama/models/blobs/sha256-42f685711f23fc73c4558da0d8df22fd18ecabbee9c6c8f8277204902ace10d3 (version GGUF V3 (latest))
```

### Building from source
```console
$ cd ~/work/ai/ollama
$ make -j 8
$ go build .
```

0 comments on commit 100da3e

Please sign in to comment.