
Commit 26a5acd

Merge pull request #9 from AstraBert/january-2025
v1.0.0
2 parents 6d6951f + d95999e commit 26a5acd

28 files changed: +804 additions, -11443 deletions

CONTRIBUTING.md

Lines changed: 46 additions & 0 deletions
New file (46 lines):

```markdown
# Contributing to `qdurllm`

Do you want to contribute to this project? Make sure to read these guidelines first :)

## Issue

**When to do it**:

- You found bugs but you don't know how to solve them, or don't have the time or will to fix them
- You want new features but you don't know how to implement them, or don't have the time or will to implement them

> ⚠️ _Always check open and closed issues before you submit yours to avoid duplicates_

**How to do it**:

- Open an issue
- Give the issue a meaningful title (short but effective problem description)
- Describe the problem following the issue template

## Traditional contribution

**When to do it**:

- You found bugs and corrected them
- You optimized/improved the code
- You added new features that you think could be useful to others

**How to do it**:

1. Fork this repository
2. Commit your changes
3. Submit a pull request (make sure to provide a thorough description of the changes)

## Showcase your PrAIvateSearch

**When to do it**:

- You modified the base application with new features, but you can't or don't want to merge them into the original PrAIvateSearch

**How to do it**:

- Go to the [_GitHub Discussions > Show and tell_](https://github.com/AstraBert/PrAIvateSearch/discussions/categories/show-and-tell) page
- Open a new discussion there, describing your PrAIvateSearch application

### Thanks for contributing!
```

LICENSE

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2024 Astra Clelia Bertelli
+Copyright (c) 2025 Clelia (Astra) Bertelli
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
```

README.md

Lines changed: 37 additions & 97 deletions
````diff
@@ -1,134 +1,74 @@
 <h1 align="center">qdurllm</h1>
 <h2 align="center">Search your favorite websites and chat with them, on your desktop🌐</h2>
 
+# Docs in active development!👷‍♀️
 
-<div align="center">
-<img src="https://img.shields.io/github/languages/top/AstraBert/qdurllm" alt="GitHub top language">
-<img src="https://img.shields.io/github/commit-activity/t/AstraBert/qdurllm" alt="GitHub commit activity">
-<img src="https://img.shields.io/badge/Status-stable-green" alt="Static Badge">
-<img src="https://img.shields.io/badge/Release-v0.0.0-purple" alt="Static Badge">
-<img src="https://img.shields.io/docker/image-size/astrabert/local-search-application
-" alt="Docker image size">
-<img src="https://img.shields.io/badge/Supported_platforms-Windows/macOS/Linux-brown" alt="Static Badge">
-<div>
-<img src="./imgs/qdurllm.png" alt="Flowchart" align="center">
-<p><i>Flowchart for qdurllm</i></p>
-</div>
-</div>
+They will soon be available at: https://astrabert.github.io/qdurllm/
 
-**qdurllm** (**Qd**rant **URL**s and **L**arge **L**anguage **M**odels) is a local search engine that lets you select and upload URL content to a vector database: after that, you can search, retrieve and chat with this content.
+In the meantime, refer to the **Quickstart guide** in this README!
 
-This is provisioned through a multi-container Docker application, leveraging Qdrant, Langchain, llama.cpp, quantized Gemma and Gradio.
+## Quickstart
 
-## Demo!
+### 1. Prerequisites
 
-Head over to the [demo space on HuggingFace](https://huggingface.co/spaces/as-cle-bert/qdurllm-demo)🦀
+- [`conda`](https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html) package manager
+- [`docker`](https://www.docker.com/) and [`docker compose`](https://docs.docker.com/compose/).
 
-## Requirements
+### 2. Installation
 
-The only requirement is to have `docker` and `docker-compose`.
+> [!IMPORTANT]
+> _This is only for the pre-release of `v1.0.0`, i.e. `v1.0.0-rc.0`_
 
-If you don't have them, make sure to install them [here](https://docs.docker.com/get-docker/).
-
-## Installation
-
-You can install the application by cloning the GitHub repository
+1. Clone the `january-2025` branch of this GitHub repo:
 
 ```bash
-git clone https://github.com/AstraBert/qdurllm.git
-cd qdurllm
+git clone -b january-2025 --single-branch https://github.com/AstraBert/qdurllm.git
+cd qdurllm/
 ```
 
-Or you can simply paste the following text into a `compose.yaml` file:
-
-```yaml
-networks:
-  mynet:
-    driver: bridge
-services:
-  local-search-application:
-    image: astrabert/local-search-application
-    networks:
-      - mynet
-    ports:
-      - "7860:7860"
-  qdrant:
-    image: qdrant/qdrant
-    ports:
-      - "6333:6333"
-    volumes:
-      - "./qdrant_storage:/qdrant/storage"
-    networks:
-      - mynet
-  llama_server:
-    image: astrabert/llama.cpp-gemma
-    ports:
-      - "8000:8000"
-    networks:
-      - mynet
-```
+2. Create the `conda` environment:
 
-Placing the file in whatever directory you want in your file system.
+```bash
+conda env create -f environment.yml
+```
 
-Prior to running the application, you can optionally pull all the needed images from Docker hub:
+3. Pull `qdrant` from Docker Hub:
 
 ```bash
 docker pull qdrant/qdrant
-docker pull astrabert/llama.cpp-gemma
-docker pull astrabert/local-search-application
 ```
 
-## How does it work?
-
-When launched (see [Usage](#usage)), the application runs three containers:
-
-- `qdrant` (port 6333): serves as vector database provider for semantic search-based retrieval
-- `llama.cpp-gemma` (port 8000): an implementation of a [quantized Gemma model](https://huggingface.co/lmstudio-ai/gemma-2b-it-GGUF) provided by LMStudio and Google, served with the `llama.cpp` server. This handles text generation, enriching the user's search experience.
-- `local-search-application` (port 7860): a Gradio tabbed interface with:
-    + The possibility to upload one or multiple contents by specifying the URL (thanks to Langchain)
-    + The possibility to chat with the uploaded URLs thanks to `llama.cpp-gemma`
-    + The possibility to perform a direct search that leverages double-layered retrieval with `all-MiniLM-L6-v2` (that identifies the 10 best matches) and `sentence-t5-base` (that re-encodes the 10 best matches and extracts the best hit from them) - this is the same RAG implementation used in combination with `llama.cpp-gemma`. Wanna see how double-layered RAG performs compared to single-layered RAG? Head over [here](./double-layered-rag-benchmarks/)!
-
-> _The overall computational burden is light enough to make the application run not only GPUless, but also with low RAM availability (>=8GB, although it can take up to 10 mins for Gemma to respond on 8GB RAM)._
-
-## Usage
-
-### Run it
+### 3. Launching
 
-You can make the application work with the following - really simple - command, which has to be run within the same directory where you stored your `compose.yaml` file:
+1. Launch the `qdrant` vector database services with `docker compose` (from within the `qdurllm` folder):
 
 ```bash
-docker compose up -d
+docker compose up
 ```
 
-If you've already pulled all the images, you'll find the application running at `http://localhost:7860` or `http://0.0.0.0:7860` in less than a minute.
+2. Activate the `qdurllm` conda environment you just created:
 
-If you have not pulled the images, you'll have to wait for their installation to complete before actually using the application.
-
-### Use it
-
-Once the app is loaded, you'll find a first tab in which you can write the URLs whose content you want to interact with:
-
-![upload_URLs](./imgs/tutorial1.png)
-
-Now that your URLs are uploaded, you can either chat with their content through `llama.cpp-gemma`:
-
-![chat_with_URLs](./imgs/tutorial2.png)
+```bash
+conda activate qdurllm
+```
 
-> _Note that you can also set parameters like maximum output tokens, temperature, repetition penalty and generation seed_
+3. Go inside the `app` directory and launch the Gradio application:
 
-Or you can use double-layered-retrieval semantic search to query your URL content(s) directly:
+```bash
+cd app/
+python3 app.py
+```
 
-![direct_search](./imgs/tutorial3.png)
+You should see the app running on `http://localhost:7860` once all the models are downloaded from HuggingFace Hub.
 
-## License and rights of usage
+## Relies on
 
-The software is (and will always be) open-source, provided under [MIT license](./LICENSE).
+- [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct), with Apache 2.0 license
+- [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base), with Apache 2.0 license
+- [prithivida/Splade_PP_en_v1](https://huggingface.co/prithivida/Splade_PP_en_v1), with Apache 2.0 license
 
-Anyone can use, modify and redistribute any portion of it, as long as the author, [Astra Clelia Bertelli](https://astrabert.vercel.app) is cited.
 
-## Contributions and funding
+## Give feedback!
 
-Contributions are always more than welcome! Feel free to flag issues, open PRs or [contact the author](mailto:[email protected]) to suggest any changes, request features or improve the code.
+Comment on the [**discussion thread created for this release**](https://github.com/AstraBert/qdurllm/discussions) with your feedback or create [**issues**](https://github.com/AstraBert/qdurllm/issues) :)
 
-If you found the application useful, please consider [funding it](https://github.com/sponsors/AstraBert) in order to allow improvements!
````
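
Condensed, the new quickstart amounts to the following shell session (a sketch assuming `environment.yml` and the compose file ship on the `january-2025` branch, as the instructions above imply):

```bash
# clone only the january-2025 branch
git clone -b january-2025 --single-branch https://github.com/AstraBert/qdurllm.git
cd qdurllm/

# create and activate the conda environment
conda env create -f environment.yml
conda activate qdurllm

# terminal 1: start the Qdrant services
docker compose up

# terminal 2: launch the Gradio app, then open http://localhost:7860
cd app/
python3 app.py
```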

_config.yml

Lines changed: 0 additions & 1 deletion
This file was deleted.
Three binary files changed (1.74 KB, 7.53 KB, 2.05 KB); binary files are not shown.

app/app.py

Lines changed: 68 additions & 0 deletions
New file (68 lines):

```python
from rag import client, SemanticCache, NeuralSearcher, dense_encoder, sparse_encoder
from texInference import pipe
from loadUrls import urlload, to_db
import gradio as gr
import time


# Hybrid (dense + sparse) searcher over the "memory" collection, plus a semantic cache for repeated questions
searcher = NeuralSearcher("memory", client, dense_encoder, sparse_encoder)
semantic_cache = SemanticCache(client, dense_encoder, "semantic_cache")


def upload2qdrant(url):
    """Load one or more comma-separated URLs and push their chunks to Qdrant."""
    global client
    documents = urlload(url)
    if isinstance(documents, list):
        try:
            to_db(documents)
            return "URLs successfully uploaded to Qdrant collection!"
        except Exception as e:
            return f"An error occurred: {e}"
    else:
        # urlload returned an error string instead of a list of documents
        return documents


demo0 = gr.Interface(fn=upload2qdrant, title="Upload URL content to Qdrant", inputs=gr.Textbox(label="URL(s)", info="Add one URL or more (if more, you should provide them comma-separated, like this: URL1,URL2,...,URLn)"), outputs=gr.Textbox(label="Logs"))


def reply(message, history, ntokens, temp, rep_pen, topp, systemins):
    # parameter order must match the `additional_inputs` list passed to gr.ChatInterface below,
    # since Gradio passes additional inputs positionally after (message, history)
    sr = semantic_cache.search_cache(message)
    if sr:
        # a semantically similar question was already answered: stream the cached reply
        response = sr
        this_hist = ''
        for c in response:
            this_hist += c
            time.sleep(0.001)
            yield this_hist
    else:
        # retrieve context, generate with the LLM, cache the answer, then stream it character by character
        context, url = searcher.search_text(message)
        prompt = [{"role": "system", "content": systemins}, {"role": "user", "content": f"This is the context information to reply to my prompt:\n\n{context}"}, {"role": "user", "content": message}]
        results = pipe(prompt, temp, topp, ntokens, rep_pen)
        results = results.split("<|im_start|>assistant\n")[1]
        response = results.replace("<|im_end|>", "")
        semantic_cache.upload_to_cache(message, response)
        this_hist = ''
        for c in response:
            this_hist += c
            time.sleep(0.001)
            yield this_hist


def direct_search(input_text):
    context, url = searcher.search_text(input_text)
    return context, f"Reference website [here]({url})"


demo2 = gr.Interface(fn=direct_search, inputs=gr.Textbox(label="Search Query", placeholder="Input your search query here..."), outputs=[gr.Textbox(label="Retrieved Content"), gr.Markdown(label="URL")], title="Search your URLs")

user_max_new_tokens = gr.Slider(0, 4096, value=512, label="Max new tokens", info="Select max output tokens (higher number of tokens will result in a longer latency)")
user_max_temperature = gr.Slider(0, 1, value=0.1, step=0.1, label="Temperature", info="Select generation temperature")
user_max_rep_pen = gr.Slider(0, 10, value=1.2, step=0.1, label="Repetition penalty", info="Select repetition penalty")
user_top_p = gr.Slider(0.1, 1, value=1, step=0.1, label="top_p", info="Select top_p for the generation")
system_ins = gr.Textbox(label="System Prompt", info="Insert your system prompt here", value="You are a helpful web searching assistant. You reply based on the contextual information you are provided with and on your knowledge.")
additional_accordion = gr.Accordion(label="Parameters to be set before you start chatting", open=True)
demo1 = gr.ChatInterface(fn=reply, title="Chat with your URLs", additional_inputs=[user_max_new_tokens, user_max_temperature, user_max_rep_pen, user_top_p, system_ins], additional_inputs_accordion=additional_accordion)

my_theme = gr.themes.Soft(primary_hue=gr.themes.colors.rose, secondary_hue=gr.themes.colors.pink)

demo = gr.TabbedInterface([demo0, demo1, demo2], ["Upload URLs", "Chat with URLs", "Direct Search"], theme=my_theme)

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=7860)
```
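
`app.py` imports `client`, `SemanticCache`, `NeuralSearcher` and the two encoders from a `rag` module (and `pipe` from `texInference`) that this commit does not show. As a reading aid, here is a minimal sketch of the interfaces those call sites assume; the signatures are inferred from usage, and the choice of `fastembed` for the encoders is a guess based on the models listed in the README, not the actual implementation:

```python
# rag.py -- hypothetical sketch; only the signatures are inferred from app.py's and loadUrls.py's call sites
from typing import Dict, Optional, Tuple

from fastembed import SparseTextEmbedding, TextEmbedding  # assumption: encoders come from fastembed
from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")  # the qdrant container from the compose file
dense_encoder = TextEmbedding("nomic-ai/modernbert-embed-base")      # dense model named in the README
sparse_encoder = SparseTextEmbedding("prithivida/Splade_PP_en_v1")   # sparse model named in the README


class NeuralSearcher:
    """Hybrid dense + sparse searcher over a Qdrant collection."""
    def __init__(self, collection: str, client: QdrantClient, dense_encoder, sparse_encoder) -> None:
        ...
    def search_text(self, text: str) -> Tuple[str, str]:
        """Return (best matching chunk, source URL) for a query."""
        ...


class SemanticCache:
    """Question/answer cache keyed by dense-embedding similarity."""
    def __init__(self, client: QdrantClient, dense_encoder, collection: str) -> None:
        ...
    def search_cache(self, question: str) -> Optional[str]:
        """Return a cached answer for a semantically similar question, else None."""
        ...
    def upload_to_cache(self, question: str, answer: str) -> None:
        """Store the question/answer pair for future hits."""
        ...


def upload_text_to_qdrant(client: QdrantClient, collection: str, content: Dict[str, str], idx: int) -> None:
    """Upsert one {"text", "url"} chunk as point `idx` in `collection` (used by loadUrls.to_db)."""
    ...


# texInference.py -- hypothetical sketch: assumed to run Qwen2.5-1.5B-Instruct and return the raw
# ChatML string ("<|im_start|>assistant\n...<|im_end|>") that app.py post-processes.
def pipe(prompt, temperature: float, top_p: float, max_new_tokens: int, repetition_penalty: float) -> str:
    ...
```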

app/loadUrls.py

Lines changed: 29 additions & 0 deletions
New file (29 lines):

```python
from langchain_community.document_loaders.url import UnstructuredURLLoader
from langchain.text_splitter import CharacterTextSplitter
from rag import upload_text_to_qdrant, client
from typing import Dict, List, Union


def urlload(urls: str) -> Union[List[Dict[str, str]], str]:
    """Load comma-separated URLs and split them into ~1000-character chunks.

    Returns a list of {"text", "url"} dicts on success, or an error string on failure.
    """
    links = urls.split(",")
    try:
        loader = UnstructuredURLLoader(
            urls=links, mode="elements",  # the loader's keyword is `mode`; "elements" yields per-element documents
            strategy="fast"
        )
        docs = loader.load()
        text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
        pages = text_splitter.split_documents(docs)
        contents = [{"text": page.page_content, "url": page.metadata["source"]} for page in pages]
        return contents
    except Exception as e:
        return f"An error occurred while parsing the URLs: {e}"


def to_db(contents: List[Dict[str, str]]) -> None:
    # upload each chunk to the "memory" collection with a sequential point id
    for c, content in enumerate(contents):
        upload_text_to_qdrant(client, "memory", content, c)
```
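
For illustration, a hypothetical round trip through these two helpers (the example.com/example.org URLs are placeholders):

```python
from loadUrls import urlload, to_db

docs = urlload("https://example.com,https://example.org")
if isinstance(docs, list):      # urlload returns an error string on failure
    to_db(docs)                 # chunks land in the "memory" Qdrant collection
    print(f"uploaded {len(docs)} chunks")
else:
    print(docs)                 # the error message
```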
