grycap · RK181 · May 15, 2026 · May 15, 2026 · May 19, 2026 · May 19, 2026
diff --git a/crates/qwen2-5-05b-instruct/README.md b/crates/qwen2-5-05b-instruct/README.md
@@ -0,0 +1,101 @@
+# KServe LLM: Qwen2.5-0.5B-Instruct (vLLM CPU)
+
+This example deploys an LLM service on OSCAR using KServe,
+vLLM on CPU, and an OCI modelcar image that contains the
+`Qwen/Qwen2.5-0.5B-Instruct` model.
+
+## Example files
+
+| File | Description |
+|---|---|
+| `fdl.yml` | OSCAR service definition with a KServe `llm_inference` block. |
+| `docker/Dockerfile.vllm` | vLLM CPU runtime wrapper with user `uid=1010` for KServe modelcar compatibility. |
+| `docker/Dockerfile.model` | Modelcar image that downloads the model from Hugging Face. |
+
+## Requirements
+
+- OSCAR cluster with KServe enabled.
+- `oscar-cli` configured against your cluster.
+
+## 1. Deploy the service
+
+```bash
+oscar-cli apply fdl.yml
+```
+
+Verify that the service was created:
+
+```bash
+oscar-cli service list
+```
+
+The service name in this example is `qwen2-5-05b-instruct`.
+
+## 2. Test the OpenAI-compatible endpoint
+
+Once the service is ready, the model is exposed at `https://<YOUR_CLUSTER>/system/services/<SERVICE_NAME>/models`. You can test it in either of the following ways:
+
+### Direct request with `curl`
+
+1. Open a terminal and try:
+
+    ```bash
+    curl -X POST "https://<YOUR_CLUSTER>/system/services/qwen2-5-05b-instruct/models/v1/chat/completions" \
+        -H "Content-Type: application/json" \
+        -H "Authorization: Bearer <TOKEN>" \
+        --data '{
+            "model": "qwen2-5-05b-instruct",
+            "messages": [
+                {
+                    "role": "user",
+                    "content": "Write a short explanation about KServe"
+                }
+            ]
+        }'
+    ```
+    > Replace `<TOKEN>` with your service token or your personal OIDC token.
+
+    > Note: If there is only one model, it will have the same name as the OSCAR service.
+
+### Through Open WebUI
+
+1. Install [Docker](https://www.docker.com)
+2. Run Open WebUI:
+    ```bash
+    docker run -d -p 3000:8080 -e WEBUI_AUTH=False -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main
+    ```
+3. Go to [http://localhost:3000/](http://localhost:3000/)
+4. Add a connection to the service:  
+    `Top right corner → Admin Panel → Settings → Connections → OpenAI API`
+5. Try it
+
+## Build the images
+
+### vLLM CPU runtime
+
+```bash
+docker buildx build --platform linux/amd64,linux/arm64 -t ghcr.io/grycap/kserve-vllm-openai-cpu:v0.22.1 -f Dockerfile.vllm . --push
+```
+
+### OCI modelcar (Qwen2.5 model)
+
+```bash
+docker buildx build --platform linux/amd64,linux/arm64 -t ghcr.io/grycap/kserve-qwen2-5-05b-instruct:latest -f Dockerfile.model . --push
+```
+
+If you use a local registry (for example `localhost:5001`), update the tags in
+the commands above and in `fdl.yml` (`runtime_image` and `storage_uri`).
+
+## Notes
+
+- The first startup can take several minutes (model download and pod rollout).
+- The current example defines modest resources (`cpu: 2`, `memory: 6Gi`); adjust them for your cluster.
+- `fdl.yml` uses `--dtype=auto` and `--enforce-eager` for more stable CPU execution.
+
+## Additional Resources
+
+- [vLLM Documentation](https://docs.vllm.ai/en/latest/)
+- [OSCAR Documentation](https://docs.oscar.grycap.net/)
+- [KServe](https://kserve.github.io/website/)
+- [API](https://docs.oscar.grycap.net/latest/api/)
+- [OSCAR CLI](https://github.com/grycap/oscar-cli)
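The `curl` request from the README can also be issued from Python. The following is a minimal sketch using only the standard library; the function names `build_chat_request` and `ask` are illustrative and not part of OSCAR or KServe, and the response layout (`choices[0].message.content`) is assumed to follow the usual OpenAI chat-completions shape that vLLM serves.

```python
import json
import urllib.request


def build_chat_request(cluster: str, service: str, token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request for an OSCAR KServe service."""
    url = f"https://{cluster}/system/services/{service}/models/v1/chat/completions"
    body = {
        # With a single model, the model name matches the OSCAR service name.
        "model": service,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def ask(cluster: str, service: str, token: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the assistant's reply text (assumed OpenAI-style response)."""
    req = build_chat_request(cluster, service, token, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

For example, `ask("<YOUR_CLUSTER>", "qwen2-5-05b-instruct", "<TOKEN>", "Write a short explanation about KServe")` would send the same request as the `curl` command. The generous default timeout is a guess to accommodate slow CPU inference.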
diff --git a/crates/qwen2-5-05b-instruct/docker/Dockerfile.model b/crates/qwen2-5-05b-instruct/docker/Dockerfile.model
@@ -0,0 +1,21 @@
+FROM --platform=$BUILDPLATFORM alpine:3.20 AS builder
+ARG TARGETPLATFORM
+ARG BUILDPLATFORM
+RUN apk add --no-cache ca-certificates git git-lfs
+RUN git lfs install --system
+RUN git clone --depth 1 https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct /models
+RUN cd /models && git lfs pull && rm -rf .git
+
+FROM --platform=$BUILDPLATFORM busybox
+ARG TARGETPLATFORM
+ARG BUILDPLATFORM
+# Create a non-root user and group, and set permissions for the /models directory.
+# Necessary to avoid permission issues when KServe tries to access the model files.
+# The default KServe modelcar uid is 1010.
+RUN addgroup -g 1010 storage \
+	&& adduser -D -u 1010 -G storage storage \
+	&& mkdir -p /models \
+	&& chown -R 1010:1010 /models \
+	&& chmod 755 /models
+COPY --from=builder --chown=1010:1010 /models/ /models/
+USER 1010:1010
diff --git a/crates/qwen2-5-05b-instruct/docker/Dockerfile.vllm b/crates/qwen2-5-05b-instruct/docker/Dockerfile.vllm
@@ -0,0 +1,5 @@
+FROM --platform=$BUILDPLATFORM vllm/vllm-openai-cpu:v0.22.1
+ARG BUILDPLATFORM
+ARG TARGETPLATFORM
+USER root
+RUN useradd -u 1010 -m storage
diff --git a/crates/qwen2-5-05b-instruct/fdl.yml b/crates/qwen2-5-05b-instruct/fdl.yml
@@ -0,0 +1,18 @@
+functions:
+  oscar:
+  - oscar-kserve-cluster:
+      name: qwen2-5-05b-instruct
+      image: ubuntu
+      kserve:
+        type: llm_inference
+        llm_inference:
+          runtime_image: ghcr.io/grycap/kserve-vllm-openai-cpu:v0.22.1
+        storage_uri: "oci://ghcr.io/grycap/kserve-qwen2-5-05b-instruct:latest"
+        cpu: '2.0'
+        memory: 6Gi
+        args:
+          - --dtype=auto
+          - --enforce-eager
+        env:
+          VLLM_CPU_KVCACHE_SPACE: "1"
+      log_level: CRITICAL
diff --git a/crates/qwen2-5-05b-instruct/icon.png b/crates/qwen2-5-05b-instruct/icon.png
diff --git a/crates/qwen2-5-05b-instruct/ro-crate-metadata.json b/crates/qwen2-5-05b-instruct/ro-crate-metadata.json
@@ -0,0 +1,113 @@
+{
+  "@context": [
+    "https://w3id.org/ro/crate/1.2/context"
+  ],
+  "@graph": [
+    {
+      "@type": "CreativeWork",
+      "@id": "ro-crate-metadata.json",
+      "conformsTo": {
+        "@id": "https://w3id.org/ro/crate/1.2"
+      },
+      "about": {
+        "@id": "./"
+      }
+    },
+    {
+      "@id": "./",
+      "@type": [
+        "Dataset",
+        "Service",
+        "SoftwareApplication"
+      ],
+      "datePublished": "2025-11-26",
+      "url": "https://github.com/grycap/oscar-hub/tree/main/crates/qwen2-5-05b-instruct",
+      "name": "OSCAR vLLM Qwen2-5-05b-Instruct",
+      "description": "OSCAR service that deploys a vLLM-based Qwen model for efficient large language model inference.",
+      "license": {
+        "@id": "https://www.apache.org/licenses/LICENSE-2.0"
+      },
+      "version": "0.1.0",
+      "applicationCategory": "OSCAR-KServe Service",
+      "memoryRequirements": "6 GiB",
+      "processorRequirements": [
+        "2 vCPU"
+      ],
+      "serviceType": "exposed",
+      "isBasedOn": [
+        {
+          "@id": "https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct"
+        }
+      ],
+      "author": {
+        "@id": "#author"
+      },
+      "hasPart": [
+        {
+          "@id": "fdl.yml"
+        },
+        {
+          "@id": "script.sh"
+        },
+        {
+          "@id": "icon.png"
+        }
+      ]
+    },
+    {
+      "@id": "fdl.yml",
+      "@type": [
+        "File",
+        "SoftwareSourceCode"
+      ],
+      "name": "Service Definition (FDL)",
+      "encodingFormat": "text/yaml"
+    },
+    {
+      "@id": "script.sh",
+      "@type": [
+        "File",
+        "SoftwareSourceCode"
+      ],
+      "name": "Service Execution Script",
+      "encodingFormat": "text/x-shellscript"
+    },
+    {
+      "@id": "icon.png",
+      "@type": [
+        "File",
+        "ImageObject"
+      ],
+      "name": "Service Icon",
+      "encodingFormat": "image/png"
+    },
+    {
+      "@id": "#author",
+      "@type": "Person",
+      "affiliation": {
+        "@id": "UPV"
+      },
+      "name": "Robert Kazaryan"
+    },
+    {
+      "@id": "UPV",
+      "@type": "Organization",
+      "name": "Universitat Politècnica de València",
+      "url": "https://www.upv.es"
+    },
+    {
+      "@id": "https://www.apache.org/licenses/LICENSE-2.0",
+      "@type": "CreativeWork",
+      "name": "Apache License 2.0",
+      "identifier": "SPDX:Apache-2.0"
+    },
+    {
+      "@id": "https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct",
+      "@type": "SoftwareApplication",
+      "name": "Qwen2.5-0.5B-Instruct",
+      "description": "Qwen 2.5 0.5B parameter model fine-tuned for instruction following.",
+      "url": "https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct",
+      "version": "latest"
+    }
+  ]
+}
diff --git a/crates/qwen2-5-05b-instruct/script.sh b/crates/qwen2-5-05b-instruct/script.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+echo "Running Qwen2-5-05b-instruct script..."
diff --git a/crates/rabbitmq-broker/AMQP Client/queue-publisher-amqp.py b/crates/rabbitmq-broker/AMQP Client/queue-publisher-amqp.py
@@ -0,0 +1,42 @@
+import pika
+import time
+
+# Credentials and topic: the service name is the username, the service token is the password
+SERVICE_NAME= 'service-name'
+SERVICE_TOKEN= 'token-service'
+TOPIC='oscar.service-name'
+
+delay=5 # Delay time between messages
+
+credentials = pika.PlainCredentials(SERVICE_NAME,SERVICE_TOKEN)
+
+#connection = pika.BlockingConnection(pika.ConnectionParameters('localhost', credentials=credentials))
+
+# REPLACE the long URL with the mapped host and port
+# If you're in the same environment as Rabbit: 'localhost'
+# If it's remote: the domain without 'https://' or routes
+host_cluster = 'cluster.im.grycap.net' 
+amqp_port = 30300  # Make sure this is the AMQP NodePort, not the HTTPS one
+
+connection = pika.BlockingConnection(
+    pika.ConnectionParameters(
+        host=host_cluster, 
+        port=amqp_port, 
+        credentials=credentials
+    )
+)
+
+channel = connection.channel()
+number_message = 8
+# Publish number_message messages in a row to test the accumulator
+for i in range(1, number_message + 1):
+    message = f"Message - {i}"
+    channel.basic_publish(
+        exchange='amq.topic',
+        routing_key=TOPIC, # topic
+        body=message
+    )
+    print(f" [!] Sent: {message}")
+    time.sleep(delay)
+
+connection.close()
diff --git a/crates/rabbitmq-broker/AMQP Client/queue-publisher-http.py b/crates/rabbitmq-broker/AMQP Client/queue-publisher-http.py
@@ -0,0 +1,62 @@
+import requests
+import json
+import time
+
+def send_burst_of_messages(number):
+    # --- Configuration ---
+    USER = "service-name"
+    PASS = "service-token"
+    URL = "http://cluster.im.grycap.net:30100/api/exchanges/%2f/amq.topic/publish"
+
+    print(f"🚀 Starting to send {number} messages...")
+
+    for i in range(1, number + 1):
+        # Variable message body
+        message = {
+            "id": i,
+            "timestamp": time.time(),
+            "content": f"Burst message number {i}",
+            "source": "python-script"
+        }
+
+        payload_api = {
+            "properties": {
+                "content_type": "application/json",
+                "delivery_mode": 2
+            },
+            "routing_key": f"oscar.{USER}",
+            "payload": json.dumps(message),
+            "payload_encoding": "string"
+        }
+
+        try:
+            response = requests.post(
+                URL, 
+                auth=(USER, PASS), 
+                data=json.dumps(payload_api),
+                allow_redirects=False,
+                headers={"Content-Type": "application/json"}
+            )
+
+            if response.status_code == 200:
+                result = response.json()
+                if result.get("routed"):
+                    print(f"✅ [{i}/{number}] Message successfully routed.")
+                else:
+                    print(f"⚠️ [{i}/{number}] Sent but NOT routed. Check routing_key.")
+            else:
+                print(f"❌ Error in message {i}: {response.status_code} - {response.text}")
+
+        except Exception as e:
+            print(f"❌ Connection error in message {i}: {e}")
+            break 
+
+        # Optional: a short pause of 3s between messages for improved flow
+        time.sleep(3)
+
+    print("🏁 Process completed.")
+
+if __name__ == "__main__":
+    # Change this value as needed
+    AMOUNT_TO_SEND = 10 
+    send_burst_of_messages(AMOUNT_TO_SEND)
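The URL and request body used by the HTTP publisher above can be factored into two small helpers. This is only a sketch under assumptions: `publish_url` and `publish_body` are hypothetical names, not part of the PR, and the host and port are whatever your cluster exposes for the RabbitMQ management API. Note that `quote` emits `%2F` in upper case, which is equivalent to the `%2f` in the script.

```python
import json
from urllib.parse import quote


def publish_url(host: str, port: int, vhost: str = "/", exchange: str = "amq.topic") -> str:
    """Return the management-API publish URL for an exchange."""
    # The vhost is a path segment, so it must be percent-encoded ("/" becomes "%2F").
    return f"http://{host}:{port}/api/exchanges/{quote(vhost, safe='')}/{exchange}/publish"


def publish_body(service_name: str, message: dict) -> dict:
    """Return the JSON body for one message routed to the service's topic."""
    return {
        "properties": {"content_type": "application/json", "delivery_mode": 2},
        # The routing key follows the "oscar.<service-name>" pattern used in the scripts.
        "routing_key": f"oscar.{service_name}",
        "payload": json.dumps(message),
        "payload_encoding": "string",
    }
```

Keeping these pure makes the routing key and vhost encoding easy to check without a running broker.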