📊 RasaGPT Query Results

The following questions were asked against the example dummy data included in the RasaGPT repository. The responses are framed well, but retrieval quality falls short because of how the documents are indexed. Below are some examples of Q&A hits and misses, with some commentary.

☑️ Hits

✅ “Tell me about Pepe Corp”

Pepe Corp is a groundbreaking parody pharmaceutical and technology company inspired by the iconic Pepe the Frog meme. Our mission is to revolutionize the world through the power of dank memes.
🏷️  company,mission

✅ “What is the weather like in London today?” (It’s expected to say I don’t know)

I apologize, I'm not sure how to help with that
🏷️  weather

✅ “What is the purpose of Kekzal?”

Kekzal enhances sexual function.
🏷️  purpose

✅ “What is Kekzal and what are the side effects?”

Kekzal is a medication, but the document does not provide information on its side effects. It only advises to consult a memologist if side effects persist or become bothersome.
🏷️  Kekzal,side effects

The answer probably should not mention “the document”; this can be fixed by modifying the prompt template.
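As a rough illustration of that kind of prompt change, here is a minimal sketch of a template that tells the model not to refer to its sources. The template wording and the names `PROMPT_TEMPLATE` and `build_prompt` are assumptions made up for this example; they are not the template that ships with RasaGPT.

```python
# Hypothetical prompt template; the wording and names are illustrative only
# and are not taken from the RasaGPT codebase.
PROMPT_TEMPLATE = """You are a support assistant for Pepe Corp.
Answer the question using only the context below.
Never refer to "the document", "the context", or your sources; answer directly.
If the context does not contain the answer, reply exactly:
"I apologize, I'm not sure how to help with that."

Context:
{context}

Question: {question}
Answer:"""


def build_prompt(context: str, question: str) -> str:
    """Fill the template with the retrieved context and the user question."""
    return PROMPT_TEMPLATE.format(context=context.strip(), question=question.strip())
```

The resulting string would then be passed to whichever LLM client is in use in place of the current prompt.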

✅ “What should I consider before taking Kekzal?”

Before taking Kekzal, consider the following precautions: Do not use Kekzal if you have a known allergy to dank memes or an aversion to Pepe-themed roman.
🏷️  precautions

This is technically a hit, but the chunking settings cut off the original text, so the answer says “roman” instead of “romantic..”. This is easily fixed.

Ideally it should also include “Kekzal” in the tags.
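One way to avoid mid-word cuts like this is to chunk on sentence boundaries rather than at a fixed character offset. The sketch below is a plain-Python illustration of that idea, not the chunker RasaGPT actually uses; the function name `chunk_sentences` and its `max_chars` and `overlap` parameters are assumptions for this example.

```python
import re


def chunk_sentences(text: str, max_chars: int = 200, overlap: int = 1) -> list[str]:
    """Pack whole sentences into chunks of roughly max_chars characters.

    The last `overlap` sentences of a chunk are repeated at the start of the
    next chunk. max_chars is a soft limit: a sentence is never split, so a
    chunk can exceed it when a single sentence (plus overlap) is long.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            # Close the current chunk and carry the overlap forward.
            chunks.append(" ".join(current))
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because every chunk ends on a sentence boundary, a retrieved passage cannot stop in the middle of a word, and the overlap keeps some context from the previous chunk.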

✅ “What are the overdose symptoms of Kekzal?”

In case of an overdose, symptoms may include: 1. An insatiable desire to create
🏷️  overdose,symptoms

This is technically a hit, but the answer is cut off again during indexing, most likely because of the configured output limits. This is easily fixed.

✅ “Who is the CEO?”

The CEO of Pepe Corp is Amanda Gigglestein.
🏷️  CEO

✅ “What are the 2023 plans of Pepe Corp?”

Pepe Corp plans to invest in research and development to create cutting-edge products that address the needs of our diverse customer base. Our R&D team will explore new areas of meme science and technology, developing products that meet the needs of our customers.
🏷️  plans,research and development,cutting-edge products,meme science,technology,customer needs

✅ “What is the revenue for Q1 FY2023?”

$235 million
🏷️  revenue,Q1,FY2023

✅ “Tell me about the early days of PepeCorp?”

In the early 21st century, a group of visionary memelords gathered in a dimly lit basement, fueled by their passion for dank memes and the iconic Pepe the Frog.
🏷️  company history,early days

❌ Misses

❌ “What are the side effects of Kekzal?”

Consult your memologist if side effects persist or become bothersome.
🏷️  Kekzal,side effects

The side effects are in the documentation, but the answer misses them entirely. Unlike in earlier sessions, it did tag the product name. Index optimization can fix the miss.

❌ “Who is Dr. Olivia Dankstein?”

I apologize, I'm not sure how to help with that.
🏷️  unknown

Not good: she is in the Board of Directors document and is Vice-Chair and Chief Scientific Officer of Pepe Corp. Better indexing strategies should fix this.
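A keyword index is one such strategy, because an exact match on a rare surname does not depend on embedding similarity. The following is a minimal sketch of an inverted index in plain Python; the function names, the stopword list, and the document ids `board` and `company` are assumptions for illustration, and the sample texts only paraphrase facts stated on this page.

```python
import re
from collections import defaultdict

# Small illustrative stopword list; a real one would be larger.
STOPWORDS = frozenset({"who", "is", "the", "what", "a", "of"})


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_keyword_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map every token to the ids of the documents that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def keyword_search(index: dict[str, set[str]], query: str) -> list[str]:
    """Rank documents by how many non-stopword query tokens they contain."""
    scores: dict[str, int] = defaultdict(int)
    for token in tokenize(query) - STOPWORDS:
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical document ids; texts paraphrase facts from this page.
docs = {
    "board": "Dr. Olivia Dankstein is Vice-Chair and Chief Scientific Officer of Pepe Corp.",
    "company": "Pepe Corp is a parody pharmaceutical and technology company.",
}
index = build_keyword_index(docs)
print(keyword_search(index, "Who is Dr. Olivia Dankstein?"))  # ['board']
```

In practice a keyword lookup like this would probably be combined with the existing vector search rather than replace it.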

❌ “What is the list of products provided by Pepe Corp?”

Pepe Corp has a diverse portfolio of products, ranging from meme-inspired pharmaceuticals to advanced brain implants and mobile applications.
🏷️  products

Although it did answer, the answer is insufficient because it does not actually list the products. This is where a tree index with summarization would help.

🔬 Retrieval optimizations

So how could all of this be improved, by what I estimate to be more than 70%? Here are a few solutions I intend to explore:

- Better index tuning at the LLM library level, specifically using llama-index, which specializes in retrieval, instead of LangChain
- Use Query Router abstractions to decide which search strategy to use (one-shot vs few-shot)
- Explore other indexing methods such as tree indexes and keyword indexes
- Add chat history for immediate recall and context setting
- Add a secondary adversarial agent (Dual pattern model) with the following potential functionality:
  - Determine whether the question has been answered and, if not, re-optimize the search strategy
  - Ensure prompt injection is not occurring
- Improve baseline similarity search by exploring:
  - Regularly generating “fake” document embeddings based on historical queries and linking them to actual documents via the HyDE pattern
  - Regularly generating “fake” user queries based on documents and linking them to the actual documents, so that user search input and “fake” queries match better
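To make the last idea more concrete, here is a toy sketch of matching a user query against pre-generated “fake” queries that are linked to documents. Everything in it is an assumption for illustration: the bag-of-words `embed` function stands in for a real embedding model, the synthetic queries are hard-coded rather than generated by an LLM, and the document ids are made up.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical synthetic queries; in practice an LLM would generate these
# from each document on a schedule. The document ids are made up.
SYNTHETIC_QUERIES = {
    "kekzal-side-effects": [
        "What are the side effects of Kekzal?",
        "Does Kekzal have side effects?",
    ],
    "board-of-directors": [
        "Who is on the board of directors?",
        "Who is the Chief Scientific Officer?",
    ],
}


def route_to_document(user_query: str, synthetic=SYNTHETIC_QUERIES) -> tuple[str, float]:
    """Return the document id whose synthetic query best matches, with its score."""
    query_vec = embed(user_query)
    return max(
        (
            (doc_id, cosine(query_vec, embed(fake_query)))
            for doc_id, fake_queries in synthetic.items()
            for fake_query in fake_queries
        ),
        key=lambda pair: pair[1],
    )
```

The point of the design is that a short user question tends to resemble another question more closely than it resembles a long passage, so matching query to query and then following the link to the document may retrieve better than matching query to chunk directly.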
