<h1 align="center">qdurllm</h1>

<h2 align="center">Search your favorite websites and chat with them, on your desktop🌐</h2>

# Docs in active development!👷♀️

They will soon be available at https://astrabert.github.io/qdurllm/

In the meantime, refer to the **Quickstart guide** in this README!

## Quickstart

### 1. Prerequisites

- [`conda`](https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html) package manager
- [`docker`](https://www.docker.com/) and [`docker compose`](https://docs.docker.com/compose/)

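You can quickly verify that both prerequisites are installed and on your `PATH` (a minimal sanity check; any recent versions should do):

```bash
conda --version
docker --version
docker compose version
```
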
### 2. Installation

> [!IMPORTANT]
> _These instructions apply only to the pre-release of `v1.0.0`, i.e. `v1.0.0-rc.0`._

1. Clone the `january-2025` branch of this GitHub repo:

```bash
git clone -b january-2025 --single-branch https://github.com/AstraBert/qdurllm.git
cd qdurllm/
```

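Optionally, confirm that you are on the expected branch:

```bash
git branch --show-current
# expected output: january-2025
```
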
2. Create the `conda` environment:

```bash
conda env create -f environment.yml
```

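If the command succeeds, the new environment should show up in your environment list (a quick check; it assumes the environment defined in `environment.yml` is named `qdurllm`, matching the activation step below):

```bash
conda env list | grep qdurllm
```
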
3. Pull `qdrant` from Docker Hub:

```bash
docker pull qdrant/qdrant
```

### 3. Launching

1. Launch the `qdrant` vector database service with `docker compose` (from within the `qdurllm` folder):

```bash
docker compose up
```

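Note that `docker compose up` stays attached to your terminal, so run the next steps in a second terminal (or add the `-d` flag to run detached). To check that Qdrant is up, you can query its REST endpoint (this assumes the compose file maps Qdrant's default port `6333` to the host):

```bash
curl http://localhost:6333
# should return a small JSON payload with the Qdrant version
```
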
2. Activate the `qdurllm` conda environment you just created:

```bash
conda activate qdurllm
```

3. Go inside the `app` directory and launch the Gradio application:

```bash
cd app/
python3 app.py
```

You should see the app running at `http://localhost:7860` once all the models have been downloaded from the Hugging Face Hub.

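To check readiness from the command line instead of a browser (optional; assumes the default Gradio port shown above):

```bash
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:7860
# prints 200 once the interface is ready
```
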
## Relies on

- [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct), with Apache 2.0 license
- [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base), with Apache 2.0 license
- [prithivida/Splade_PP_en_v1](https://huggingface.co/prithivida/Splade_PP_en_v1), with Apache 2.0 license

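The app downloads these models from the Hugging Face Hub on first launch. If you want to warm your local Hugging Face cache ahead of time, one option (a sketch, assuming the `huggingface_hub` CLI is installed in the active environment) is:

```bash
pip install -U "huggingface_hub[cli]"
huggingface-cli download Qwen/Qwen2.5-1.5B-Instruct
huggingface-cli download nomic-ai/modernbert-embed-base
huggingface-cli download prithivida/Splade_PP_en_v1
```
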
## Give feedback!

Comment on the [**discussion thread created for this release**](https://github.com/AstraBert/qdurllm/discussions) with your feedback, or open an [**issue**](https://github.com/AstraBert/qdurllm/issues) :)