perplex_city

perplex_city is a perplexity like LLM system that makes use of web searches to mitigate hallucinations. The core concept is similar to that of a RAG system except that you don't explicitly have to use vector embeddings.

This repository has two major branches, namely main (with vector embeddings) and norag (without vector embeddings).

The simple architecture diagram of the system without the vector database is as follows:

Here's how the system works:

User asks a question
Multiple questions similar to the original question are produced using query_expander. It results multiple questions similar to the original question of user, from different angle.
web_search is used in order to perform google search and find the most relevant links to the questions produced. It results in a list of URLs.
deduplicator is used in order to remove all the duplicate URLs from the list and normalize them as well.
Each normalized & original URLs are then visited and the contexts (title, url, text) are extracted and stored as context_data.
- (only for vector embeddings)
  - Chunks are prepared from the given context_data, and they are stored as FAISS index.
  - For the retrieval, a similarity search is performed first between the original question and given sets of chunks, followed by reranking in order to retrieve the most relevant document.
The original query of the user and this retrieved contexts are finally passed on to the generator function in order to produce a response.

Tools summarization:

Web search: DuckDuckGo websearch API
Extract web contents: Trafilatura
LLM: gemini-2.5-flash-lite
Vector database: FAISS
Reranker: ms-marco-MiniLM-L-6-v2

Remaining checklist:

Evaluate the system, and list the evaluation params and observations.
Improve performance efficiency of the system and use SOTA tools & practices.
Keep refining the system to improve the performance in accordance with evaluation metrics.

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
app		app
files		files
.env.example		.env.example
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.python-version		.python-version
README.md		README.md
notes.txt		notes.txt
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

perplex_city

Here's how the system works:

Tools summarization:

Remaining checklist:

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

perplex_city

Here's how the system works:

Tools summarization:

Remaining checklist:

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages