🚀 Starknet Agent - An AI-powered search engine for the Starknet Ecosystem 🔎

Credits

This project was originally forked from Perplexica, an open-source AI search engine. We have adapted and expanded their work to create a specialized tool for the Starknet ecosystem, and we are grateful for their contribution, which provided the foundation for Starknet Agent.

Overview

Starknet Agent is an open-source, AI-powered search tool designed specifically for the Starknet Ecosystem. It uses Retrieval-Augmented Generation (RAG) to search the Starknet documentation, the Cairo Book, and other resources, and answers your questions about Starknet and Cairo with citations to those sources.

Preview

Features

RAG-based Search: Uses Retrieval-Augmented Generation to provide accurate, source-cited answers to your questions.
Multiple Focus Modes: Special modes to better answer specific types of questions:
- Starknet Ecosystem: Searches the entire Starknet Ecosystem, including all resources below.
- Cairo Book: Searches the Cairo Book for answers.
- Starknet Docs: Searches the Starknet documentation for answers.
- Starknet Foundry: Searches the Starknet Foundry documentation for answers.
- Cairo By Example: Searches the Cairo By Example resource for answers.
- OpenZeppelin Docs: Searches the OpenZeppelin documentation for Starknet-related information.
Source Citations: All answers include citations to the source material, allowing you to verify the information.
Real-time Streaming: Responses are streamed in real-time as they're generated.
Chat History: Your conversation history is preserved for context in follow-up questions.
Modular Architecture: Easily extensible to include additional documentation sources.

Installation

There are two main ways to install Starknet Agent: with Docker or without Docker. Using Docker is highly recommended.

Getting Started with Docker (Recommended)

Ensure Docker is installed and running on your system.

Clone the Starknet Agent repository:

git clone https://github.com/cairo-book/starknet-agent.git

After cloning, navigate to the directory containing the project files.
Set up your database on MongoDB Atlas.
- Create a new cluster.
- Create a new database, e.g. cairo-chatbot.
- Create a new collection inside the database that will store the embeddings. e.g. chunks.
- Create a vectorSearch index named default on the collection (tab Atlas Search). Example index configuration:
```
{
  "fields": [
    {
      "numDimensions": 2048,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    },
    {
      "path": "source",
      "type": "filter"
    }
  ]
}
```
Inside the packages/backend package, copy the sample.config.toml file to config.toml. For development setups, you only need to fill in the following fields:
- OPENAI: Your OpenAI API key. You only need to fill this if you wish to use OpenAI's models.
- ANTHROPIC: Your Anthropic API key. You only need to fill this if you wish to use Anthropic models.
  
  Note: You can change these after starting Starknet Agent from the settings dialog.
- SIMILARITY_MEASURE: The similarity measure to use (This is filled by default; you can leave it as is if you are unsure about it.)
- Databases:
  - VECTOR_DB: This is the database for the entire Starknet Ecosystem, which aggregates all the other databases. Fill this in with your own database URL. Example:
```
    [VECTOR_DB]
    MONGODB_URI = "mongodb+srv://mongo:..."
    DB_NAME = "cairo-chatbot"
    COLLECTION_NAME = "chunks"
```
- Models: The [HOSTED_MODE] table defines the underlying LLM and embedding model. We recommend using:
```
   [HOSTED_MODE]
   DEFAULT_CHAT_PROVIDER = "anthropic"
   DEFAULT_CHAT_MODEL = "Claude 3.5 Sonnet"
   DEFAULT_EMBEDDING_PROVIDER = "openai"
   DEFAULT_EMBEDDING_MODEL = "Text embedding 3 large"
```
Generate the embeddings for the databases. You can do this by running turbo run generate-embeddings. If you followed the example above, you will need to run the script with option 6 (Everything) to generate embeddings for all the documentation sources.
```
turbo run generate-embeddings
```
Run the development server with turbo.
```
turbo dev
```
Wait a few minutes for the setup to complete. You can access Starknet Agent at http://localhost:3000 in your web browser.

Note: After the containers are built, you can start Starknet Agent directly from Docker without having to open a terminal.

Architecture

Starknet Agent is built around a Retrieval-Augmented Generation (RAG) pipeline that retrieves relevant documentation and uses it to generate source-cited answers to questions about Starknet and Cairo.

Project Structure

The project is organized as a monorepo with multiple packages:

packages/agents/: Core RAG agent implementation
- Contains the pipeline for processing queries, retrieving documents, and generating answers
- Implements the RAG pipeline in a modular, extensible way
packages/backend/: Express server with WebSocket support
- Handles API endpoints and WebSocket connections for real-time communication
- Manages configuration and environment settings
packages/ui/: Next.js frontend application
- Provides a modern, responsive user interface
- Implements real-time streaming of responses
packages/ingester/: Data ingestion tools for documentation sources
- Uses a template method pattern with a BaseIngester abstract class
- Implements source-specific ingesters for different documentation sources
packages/typescript-config/: Shared TypeScript configuration

RAG Pipeline

The RAG pipeline is implemented in the packages/agents/src/pipeline/ directory and consists of several key components:

Query Processor: Processes user queries and prepares them for document retrieval
Document Retriever: Retrieves relevant documents from the vector database
Answer Generator: Generates answers based on the retrieved documents
RAG Pipeline: Orchestrates the entire RAG process
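
To make the division of responsibilities concrete, here is a minimal sketch of how the four components could fit together. Every interface, class, and method name below is an assumption made for illustration; the actual code in packages/agents/src/pipeline/ may be organized differently.

```typescript
// Illustrative sketch only: these names are assumptions, not the project's real API.
interface RetrievedDocument {
  content: string;
  source: string;
}

interface QueryProcessor {
  // Rewrites the raw user query (for example, using chat history) into a search query.
  process(query: string, history: string[]): Promise<string>;
}

interface DocumentRetriever {
  // Returns the documents most similar to the processed query.
  retrieve(processedQuery: string, limit: number): Promise<RetrievedDocument[]>;
}

interface AnswerGenerator {
  // Produces the answer incrementally so it can be streamed to the client.
  generate(query: string, docs: RetrievedDocument[]): AsyncIterable<string>;
}

class RagPipeline {
  private readonly processor: QueryProcessor;
  private readonly retriever: DocumentRetriever;
  private readonly generator: AnswerGenerator;

  constructor(
    processor: QueryProcessor,
    retriever: DocumentRetriever,
    generator: AnswerGenerator,
  ) {
    this.processor = processor;
    this.retriever = retriever;
    this.generator = generator;
  }

  // Orchestrates the three stages and yields answer chunks as they arrive.
  async *run(query: string, history: string[] = []): AsyncGenerator<string> {
    const processed = await this.processor.process(query, history);
    const docs = await this.retriever.retrieve(processed, 5);
    for await (const chunk of this.generator.generate(query, docs)) {
      yield chunk;
    }
  }
}
```

Because each stage sits behind an interface in this sketch, a retriever or generator can be swapped (or stubbed in tests) without touching the orchestration logic.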

Ingestion System

The ingestion system is designed to be modular and extensible, allowing for easy addition of new documentation sources:

BaseIngester: Abstract class that defines the template method pattern for ingestion
Source-specific Ingesters: Implementations for different documentation sources
Ingestion Process: Downloads, processes, and stores documentation in the vector database
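
The template method pattern mentioned above can be sketched as follows. This is a simplified illustration, not the project's actual BaseIngester: the Chunk shape, the method names, the paragraph-based chunking, and the InMemoryIngester subclass are all assumptions.

```typescript
// Illustrative sketch only: the real BaseIngester may expose different steps.
interface Chunk {
  id: string;
  source: string;
  content: string;
}

abstract class BaseIngester {
  protected readonly source: string;

  constructor(source: string) {
    this.source = source;
  }

  // Template method: the order of steps is fixed here, subclasses supply the details.
  async ingest(store: (chunks: Chunk[]) => Promise<void>): Promise<number> {
    const pages = await this.download();
    const chunks: Chunk[] = [];
    pages.forEach((page, pageIndex) => {
      chunks.push(...this.split(page, pageIndex));
    });
    await store(chunks);
    return chunks.length;
  }

  // Source-specific step: fetch the raw documentation pages.
  protected abstract download(): Promise<string[]>;

  // Default step that subclasses may override: one chunk per non-empty paragraph.
  protected split(page: string, pageIndex: number): Chunk[] {
    return page
      .split(/\n\s*\n/)
      .map((paragraph) => paragraph.trim())
      .filter((paragraph) => paragraph.length > 0)
      .map((content, i) => ({
        id: `${this.source}-${pageIndex}-${i}`,
        source: this.source,
        content,
      }));
  }
}

// Hypothetical subclass that "downloads" pages already held in memory.
class InMemoryIngester extends BaseIngester {
  private readonly pages: string[];

  constructor(source: string, pages: string[]) {
    super(source);
    this.pages = pages;
  }

  protected async download(): Promise<string[]> {
    return this.pages;
  }
}
```

In this sketch a new source only has to implement download(), and can override split() when the default chunking does not suit its format.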

Currently supported documentation sources:

Cairo Book
Starknet Docs
Starknet Foundry
Cairo By Example
OpenZeppelin Docs

Database

Starknet Agent uses MongoDB Atlas with vector search capabilities for similarity search:

Vector Database: Stores document embeddings for efficient similarity search
Multiple Databases: Separate databases for different focus modes
Vector Search: Uses cosine similarity to find relevant documents
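
As a rough illustration, the sketch below shows what cosine similarity computes and how a query against the example index from the installation steps might be assembled. The index name default and the embedding and source paths come from that example configuration. The helper names, the content field in the projection, and the 20x candidate multiplier are assumptions, not the project's actual query code.

```typescript
// Cosine similarity: the cosine of the angle between two vectors (1 = same direction).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must have the same non-zero length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  if (denominator === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / denominator;
}

type VectorSearchStage = {
  $vectorSearch: {
    index: string;
    path: string;
    queryVector: number[];
    numCandidates: number;
    limit: number;
    filter?: Record<string, unknown>;
  };
};
type ProjectStage = { $project: Record<string, unknown> };

// Hypothetical helper: builds an aggregation pipeline for Atlas Vector Search.
function buildVectorSearchPipeline(
  queryVector: number[],
  options: { limit?: number; sources?: string[] } = {},
): Array<VectorSearchStage | ProjectStage> {
  const limit = options.limit ?? 5;
  const stage: VectorSearchStage = {
    $vectorSearch: {
      index: "default", // index name from the example configuration
      path: "embedding", // vector field from the example configuration
      queryVector,
      numCandidates: limit * 20, // assumption: oversample candidates for better recall
      limit,
    },
  };
  if (options.sources !== undefined && options.sources.length > 0) {
    // "source" is declared as a filter field in the example index.
    stage.$vectorSearch.filter = { source: { $in: options.sources } };
  }
  return [
    stage,
    // "content" is a hypothetical field name for the stored chunk text.
    { $project: { _id: 0, content: 1, source: 1, score: { $meta: "vectorSearchScore" } } },
  ];
}
```

The returned array could be passed to a MongoDB collection's aggregate() call. Filtering on source is one way a single aggregated collection could serve several focus modes.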

Real-time Communication

The real-time communication system is implemented using WebSockets:

WebSocket Server: Handles real-time communication between the frontend and backend
Message Handler: Processes messages and manages the RAG pipeline
Connection Manager: Manages WebSocket connections and sessions
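
The message-handling step can be sketched independently of any particular WebSocket library. The message shapes and function names below are assumptions for illustration; the real protocol between packages/ui and packages/backend may use different fields.

```typescript
// Illustrative sketch only: message shapes are assumptions, not the real protocol.
type ClientMessage = { type: "message"; id: string; content: string };
type ServerMessage =
  | { type: "chunk"; id: string; data: string }
  | { type: "end"; id: string }
  | { type: "error"; id?: string; data: string };

type Send = (message: ServerMessage) => void;
type AnswerStream = (query: string) => AsyncIterable<string>;

// Validates an incoming frame; returns null for anything malformed.
function parseClientMessage(raw: string): ClientMessage | null {
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value !== "object" || value === null) return null;
    const fields = value as Record<string, unknown>;
    const id = fields["id"];
    const content = fields["content"];
    if (fields["type"] === "message" && typeof id === "string" && typeof content === "string") {
      return { type: "message", id, content };
    }
    return null;
  } catch {
    return null;
  }
}

// Streams answer chunks back to the client, then signals completion.
async function handleMessage(raw: string, answer: AnswerStream, send: Send): Promise<void> {
  const message = parseClientMessage(raw);
  if (message === null) {
    send({ type: "error", data: "Invalid message format" });
    return;
  }
  try {
    for await (const data of answer(message.content)) {
      send({ type: "chunk", id: message.id, data });
    }
    send({ type: "end", id: message.id });
  } catch (err) {
    const data = err instanceof Error ? err.message : "Unknown error";
    send({ type: "error", id: message.id, data });
  }
}
```

With a WebSocket server, send would typically serialize each message with JSON.stringify and write it to the socket, and answer would wrap the RAG pipeline. Keeping both as parameters makes the handler testable without opening a connection.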

Development

For development, you can use the following commands:

Start Development Server: turbo dev
Build for Production: turbo build
Run Tests: turbo test
Generate Embeddings: bun run packages/ingester/scripts/generateEmbeddings.ts

To add a new documentation source:

Create a new ingester by extending the BaseIngester class
Implement the required methods for downloading and processing the documentation
Register the new ingester in the IngesterFactory
Update the configuration to include the new database

Upcoming Features

[✅] Expanding coverage of Starknet-related resources
[✅] Enhanced UI with more customization options
[ ] Improved search algorithms for better document retrieval
[ ] Adding an Autonomous Agent Mode for more precise answers

Contribution

To learn more about the project and how to contribute, please read the CONTRIBUTING.md file.

We welcome contributions in the following areas:

Adding new documentation sources
Improving the RAG pipeline
Enhancing the UI
Adding new features
Fixing bugs
Improving documentation
