Commit 20bbf36

Add external indexers for RAG (#4457)
2 parents 0015a4e + 1be1469 commit 20bbf36

19 files changed: +1024 -2 lines changed

cozy.example.yaml

+13
@@ -146,6 +146,8 @@ jobs:
 # - "service": launching services
 # - "migrations": transforming a VFS with Swift to layout v3
 # - "notes-save": saving notes to the VFS
+# - "rag-index": sending data to the RAG server to be indexed
+# - "rag-query": sending a query to the RAG server
 # - "push": sending push notifications
 # - "sms": sending SMS notifications
 # - "sendmail": sending mails
@@ -200,6 +202,17 @@ konnectors:
 # cmd: ./scripts/konnector-rkt-run.sh # run connectors with rkt
 # cmd: ./scripts/konnector-nsjail-node8-run.sh # run connectors with nsjail
 
+# rag lists the URLs of the RAG server(s) for AI.
+rag:
+  # A cozy will use the RAG server for its context or, if the context is not
+  # declared, the default one.
+  default:
+    url: http://localhost:8000
+    api_key: $3cr3t
+  beta:
+    url: http://localhost:8001
+    api_key: $3cr3t
+
 # mail service parameters for sending email via SMTP
 mail:
 # mail noreply address - flags: --mail-noreply-address

docs/README.md

+1
@@ -72,6 +72,7 @@ designing new services.
 
 ### List of services
 
+- `/ai` - [AI](ai.md)
 - `/auth` - [Authentication & OAuth](auth.md)
 - [Delegated authentication](delegated-auth.md)
 - `/apps` - [Applications Management](apps.md)

docs/ai.md

+121
@@ -0,0 +1,121 @@
+[Table of contents](README.md#table-of-contents)
+
+# AI for personal data
+
+## Introduction
+
+AI can be used for interacting with the personal data of a Cozy. This is
+currently an experimental feature. Retrieval-Augmented Generation (RAG) is
+a classical pattern in the AI world. Here, it is specific to each Cozy.
+
+![Architecture with a RAG server](diagrams/ai.svg)
+
+## Indexation
+
+First of all, the RAG server must be installed with its dependencies. They do
+not have to be installed on the same servers as the cozy-stack. The URL of the
+RAG server must also be set in the cozy-stack configuration file (in `rag`).
+
+For the moment, the feature is experimental, and a trigger must be created
+manually on the Cozy:
+
+```sh
+$ COZY=cozy.localhost:8080
+$ TOKEN=$(cozy-stack instances token-cli $COZY io.cozy.triggers)
+$ curl "http://${COZY}/jobs/triggers" -H "Authorization: Bearer $TOKEN" -d '{ "data": { "attributes": { "type": "@event", "arguments": "io.cozy.files", "debounce": "1m", "worker": "rag-index", "message": {"doctype": "io.cozy.files"} } } }'
+```
+
+It can also be a good idea to start a first indexation with:
+
+```sh
+$ cozy-stack triggers launch --domain $COZY $TRIGGER_ID
+```
+
+In practice, when files are uploaded/modified/deleted, the trigger will create
+a job for the index worker (with debounce). The index worker will look at the
+changes feed, and will call the RAG for each entry in it.
+
+## Chat
+
+When a user starts a chat, their prompts are sent to the RAG, which can use the
+vector database to find relevant documents (technically, only some parts of
+the documents, called chunks). Those documents are added to the prompt, so
+that the LLM can use them as context when answering.
+
+### POST /ai/chat/conversations/:id
+
+This route can be used to ask the AI for a chat completion. The id in the path
+must be the identifier of a chat conversation. The client can generate a random
+identifier for a new chat conversation.
+
+The stack will respond after pushing a job for this task, but without the
+AI's answer. To receive it, the client must use the real-time websocket and
+subscribe to `io.cozy.ai.chat.events`.
+
+#### Request
+
+```http
+POST /ai/chat/conversations/e21dce8058b9013d800a18c04daba326 HTTP/1.1
+Content-Type: application/json
+```
+
+```json
+{
+  "q": "Why the sky is blue?"
+}
+```
+
+#### Response
+
+```http
+HTTP/1.1 202 Accepted
+Content-Type: application/vnd.api+json
+```
+
+```json
+{
+  "data": {
+    "type": "io.cozy.ai.chat.conversations",
+    "id": "e21dce8058b9013d800a18c04daba326",
+    "rev": "1-23456",
+    "attributes": {
+      "messages": [
+        {
+          "id": "eb17c3205bf1013ddea018c04daba326",
+          "role": "user",
+          "content": "Why the sky is blue?",
+          "createdAt": "2024-09-24T13:24:07.576Z"
+        }
+      ]
+    }
+  },
+  "cozyMetadata": {
+    "createdAt": "2024-09-24T13:24:07.576Z",
+    "createdOn": "http://cozy.localhost:8080/",
+    "doctypeVersion": "1",
+    "metadataVersion": 1,
+    "updatedAt": "2024-09-24T13:24:07.576Z"
+  }
+}
+```
+
+### Real-time via websockets
+
+```
+client > {"method": "AUTH", "payload": "token"}
+client > {"method": "SUBSCRIBE",
+          "payload": {"type": "io.cozy.ai.chat.events"}}
+server > {"event": "CREATED",
+          "payload": {"id": "eb17c3205bf1013ddea018c04daba326",
+                      "type": "io.cozy.ai.chat.events",
+                      "doc": {"object": "delta", "content": "The ", "position": 0}}}
+server > {"event": "CREATED",
+          "payload": {"id": "eb17c3205bf1013ddea018c04daba326",
+                      "type": "io.cozy.ai.chat.events",
+                      "doc": {"object": "delta", "content": "sky ", "position": 1}}}
+[...]
+server > {"event": "CREATED",
+          "payload": {"id": "eb17c3205bf1013ddea018c04daba326",
+                      "type": "io.cozy.ai.chat.events",
+                      "doc": {"object": "done"}}}
+```
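
Reviewer note: the websocket transcript in `docs/ai.md` suggests that a client rebuilds the answer from a stream of `delta` documents followed by a `done` document. Below is a minimal Go sketch of that client-side step. It assumes that `position` is the zero-based order of each delta and that the `doc` objects have already been extracted from the websocket frames; the `chatEvent` type and the `assemble` function are hypothetical names, not part of cozy-stack.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// chatEvent holds the fields of a "doc" object that are visible in the
// transcript. The type itself is hypothetical.
type chatEvent struct {
	Object   string `json:"object"`
	Content  string `json:"content"`
	Position int    `json:"position"`
}

// assemble decodes the raw "doc" objects, orders the deltas by position
// (assumed to be their zero-based rank) and concatenates their contents.
// The boolean reports whether a "done" document was seen.
func assemble(docs []string) (string, bool, error) {
	var deltas []chatEvent
	done := false
	for _, raw := range docs {
		var ev chatEvent
		if err := json.Unmarshal([]byte(raw), &ev); err != nil {
			return "", false, err
		}
		switch ev.Object {
		case "delta":
			deltas = append(deltas, ev)
		case "done":
			done = true
		}
	}
	sort.SliceStable(deltas, func(a, b int) bool {
		return deltas[a].Position < deltas[b].Position
	})
	var sb strings.Builder
	for _, d := range deltas {
		sb.WriteString(d.Content)
	}
	return sb.String(), done, nil
}

func main() {
	// The documents are deliberately out of order to show the sorting.
	docs := []string{
		`{"object": "delta", "content": "sky ", "position": 1}`,
		`{"object": "delta", "content": "The ", "position": 0}`,
		`{"object": "done"}`,
	}
	text, done, err := assemble(docs)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q done=%v\n", text, done) // prints "The sky " done=true
}
```

Sorting by `position` rather than trusting arrival order is a design choice of this sketch; if the server guarantees ordering, simple appending would be enough.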

docs/diagrams/ai.d2

+17
@@ -0,0 +1,17 @@
+# https://d2lang.com/
+
+stack: {label: "Cozy-Stack"}
+rag: {label: "RAG"}
+llm: {label: "LLM"; shape: diamond}
+embed: {label: "Embeddings model"; shape: diamond}
+vector: {label: "Vector DB"; shape: cylinder}
+couchdb: {label: "CouchDB"; shape: cylinder}
+swift: {label: "Swift"; shape: cylinder}
+
+stack -> rag
+stack -> couchdb
+stack -> swift
+
+rag -> embed
+rag -> vector
+rag -> llm

docs/diagrams/ai.svg

+101

docs/files.md

+75 -2
@@ -958,8 +958,8 @@ Get an image that shows the first page of a PDF (at most 1080x1920).
 Get a thumbnail of a file (for an image & pdf only). `:format` can be `tiny` (96x96)
 `small` (640x480), `medium` (1280x720), or `large` (1920x1080).
 
-This API does not require authentication because the secret acts as a token.
-This secret is valid for 10 minutes, after which the link will return an error.
+This API does not require authentication because the secret acts as a token.
+This secret is valid for 10 minutes, after which the link will return an error.
 To retrieve a new functional link, you must query the files API again to obtain
 a new secret.
 
@@ -1400,6 +1400,79 @@ Content-Type: application/vnd.api+json
 
 The same status codes can be encountered as the `PATCH /files/:file-id` route.
 
+### POST /files/:id/description
+
+This endpoint fills the `metadata.description` field of a file with a
+description generated by the AI from the content of this file.
+
+#### Request
+
+```http
+POST /files/9152d568-7e7c-11e6-a377-37cbfb190b4b/description HTTP/1.1
+```
+
+#### Response
+
+```http
+HTTP/1.1 200 OK
+Content-Type: application/vnd.api+json
+```
+
+```json
+{
+  "data": {
+    "type": "io.cozy.files",
+    "id": "9152d568-7e7c-11e6-a377-37cbfb190b4b",
+    "meta": {
+      "rev": "2-20900ae0"
+    },
+    "attributes": {
+      "type": "file",
+      "name": "hi.txt",
+      "trashed": false,
+      "md5sum": "ODZmYjI2OWQxOTBkMmM4NQo=",
+      "created_at": "2016-09-19T12:38:04Z",
+      "updated_at": "2016-09-19T12:38:04Z",
+      "tags": ["poem"],
+      "size": 12,
+      "executable": false,
+      "class": "document",
+      "mime": "text/plain",
+      "metadata": {
+        "description": "Explores love's complexities through vivid imagery and heartfelt emotions"
+      },
+      "cozyMetadata": {
+        "doctypeVersion": "1",
+        "metadataVersion": 1,
+        "createdAt": "2016-09-20T18:32:49Z",
+        "createdByApp": "drive",
+        "createdOn": "https://cozy.example.com/",
+        "updatedAt": "2016-09-22T13:32:51Z",
+        "uploadedAt": "2016-09-21T04:27:50Z",
+        "uploadedOn": "https://cozy.example.com/",
+        "uploadedBy": {
+          "slug": "drive"
+        }
+      }
+    },
+    "relationships": {
+      "parent": {
+        "links": {
+          "related": "/files/f2f36fec-8018-11e6-abd8-8b3814d9a465"
+        },
+        "data": {
+          "type": "io.cozy.files",
+          "id": "f2f36fec-8018-11e6-abd8-8b3814d9a465"
+        }
+      }
+    },
+    "links": {
+      "self": "/files/9152d568-7e7c-11e6-a377-37cbfb190b4b"
+    }
+  }
+}
+```
+
 ### POST /files/archive
 
 Create an archive. The body of the request lists the files and directories that

docs/toc.yml

+1
@@ -23,6 +23,7 @@
   - "Sharing design": ./sharing-design.md
   - "Workflow of the konnectors": ./konnectors-workflow.md
 - List of services:
+  - "/ai - AI": ./ai.md
   - "/auth - Authentication & OAuth": ./auth.md
   - " /oidc - Delegated authentication": ./delegated-auth.md
   - "/apps - Applications Management": ./apps.md

docs/workers.md

+6
@@ -412,3 +412,9 @@ It can be launched from command-line with:
 ```sh
 $ cozy-stack jobs run migrations --domain example.mycozy.cloud --json '{"type": "to-swift-v3"}'
 ```
+
+## index
+
+This worker is used for sending data to a RAG. It looks at the changes feed for
+the given doctype and sends the changes to an external indexer that will generate
+embeddings for the data and put them in a vector database.
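
Reviewer note: the worker description in `docs/workers.md` can be pictured as a loop over the changes feed that forwards each entry to the external indexer. The Go sketch below is only an illustration of that flow, not the worker added by this commit. The `change` type, the `indexer` interface with its `Upsert` and `Delete` methods, and the idea of returning the last processed sequence are all assumptions made for the example.

```go
package main

import "fmt"

// change is a hypothetical, simplified entry of a changes feed.
type change struct {
	ID      string // identifier of the document that changed
	Seq     string // sequence marker of this entry in the feed
	Deleted bool   // true when the document was removed
}

// indexer is a hypothetical abstraction of the external RAG indexer.
type indexer interface {
	Upsert(id string) error
	Delete(id string) error
}

// processChanges forwards each change to the indexer, in feed order.
// It stops at the first error and returns the sequence of the last entry
// that was handled, so that a later run could resume from there.
func processChanges(changes []change, idx indexer) (string, error) {
	lastSeq := ""
	for _, c := range changes {
		var err error
		if c.Deleted {
			err = idx.Delete(c.ID)
		} else {
			err = idx.Upsert(c.ID)
		}
		if err != nil {
			return lastSeq, err
		}
		lastSeq = c.Seq
	}
	return lastSeq, nil
}

// recorder is a fake indexer that only records the calls it receives.
type recorder struct{ calls []string }

func (r *recorder) Upsert(id string) error {
	r.calls = append(r.calls, "upsert:"+id)
	return nil
}

func (r *recorder) Delete(id string) error {
	r.calls = append(r.calls, "delete:"+id)
	return nil
}

func main() {
	rec := &recorder{}
	feed := []change{
		{ID: "file-a", Seq: "1"},
		{ID: "file-b", Seq: "2", Deleted: true},
	}
	last, err := processChanges(feed, rec)
	fmt.Println(rec.calls, last, err)
}
```

Keeping the indexer behind an interface makes the loop easy to exercise with a fake, as `recorder` does here.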

model/instance/instance.go

+11
@@ -439,6 +439,17 @@ func (i *Instance) Registries() []*url.URL {
 	return context
 }
 
+// RAGServer returns the RAG server for the instance (AI features).
+func (i *Instance) RAGServer() config.RAGServer {
+	contexts := config.GetConfig().RAGServers
+	if i.ContextName != "" {
+		if server, ok := contexts[i.ContextName]; ok {
+			return server
+		}
+	}
+	return contexts[config.DefaultInstanceContext]
+}
+
 // HasForcedOIDC returns true only if the instance is in a context where the
 // config says that the stack shouldn't allow to authenticate with the
 // password.
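
Reviewer note: the context fallback in `RAGServer` can be tried in isolation with the standalone Go sketch below. It reuses the two servers from `cozy.example.yaml`. The `RAGServer` struct fields, the `pickRAGServer` helper and the `"default"` key are assumptions, since the diff shows neither the definition of `config.RAGServer` nor the value of `config.DefaultInstanceContext`.

```go
package main

import "fmt"

// RAGServer mirrors the two settings of the example configuration.
// The field names are assumed; the diff does not show config.RAGServer.
type RAGServer struct {
	URL    string
	APIKey string
}

// defaultContext is assumed to match the "default" key of the example file.
const defaultContext = "default"

// pickRAGServer returns the server declared for contextName, or the default
// one when the context is empty or has no entry of its own.
func pickRAGServer(servers map[string]RAGServer, contextName string) RAGServer {
	if contextName != "" {
		if server, ok := servers[contextName]; ok {
			return server
		}
	}
	return servers[defaultContext]
}

func main() {
	servers := map[string]RAGServer{
		"default": {URL: "http://localhost:8000", APIKey: "$3cr3t"},
		"beta":    {URL: "http://localhost:8001", APIKey: "$3cr3t"},
	}
	fmt.Println(pickRAGServer(servers, "beta").URL)    // http://localhost:8001
	fmt.Println(pickRAGServer(servers, "unknown").URL) // http://localhost:8000
	fmt.Println(pickRAGServer(servers, "").URL)        // http://localhost:8000
}
```

When no `default` entry is configured either, the map lookup yields a zero value with an empty URL, so callers would probably want to check for that case before contacting the server.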
