Issues with Persisting and Retrieving Keyword Index in PostgreSQL #14348
Replies: 1 comment
-
Hey there, @kwyszynski-cervello! I'm here to help you out with any bugs, questions, or contributions. Let's tackle this issue together! 🤖

To ensure that keyword indexes are properly stored and retrieved across sessions in LlamaIndex version 0.10.42 when using a PostgreSQL database, you need to implement a persistent storage mechanism for the keyword table itself. Here is an example of how you can achieve this:
```python
import json

import psycopg2

from llama_index.core import KeywordTableIndex
from llama_index.core.data_structs import KeywordTable


def save_keyword_table_to_postgres(keyword_table: KeywordTable, connection_string: str) -> None:
    conn = psycopg2.connect(connection_string)
    cursor = conn.cursor()
    # Convert the keyword table to a JSON string. KeywordTable.table maps
    # keywords to sets of node ids, so the sets must become lists first.
    serializable = {keyword: list(node_ids) for keyword, node_ids in keyword_table.table.items()}
    keyword_table_json = json.dumps(serializable)
    # Insert or update the keyword table in the database
    cursor.execute(
        """
        INSERT INTO keyword_tables (id, table_data)
        VALUES (%s, %s)
        ON CONFLICT (id) DO UPDATE SET table_data = EXCLUDED.table_data
        """,
        ("keyword_table_id", keyword_table_json),
    )
    conn.commit()
    cursor.close()
    conn.close()


def load_keyword_table_from_postgres(connection_string: str) -> KeywordTable:
    conn = psycopg2.connect(connection_string)
    cursor = conn.cursor()
    # Retrieve the keyword table from the database
    cursor.execute("SELECT table_data FROM keyword_tables WHERE id = %s", ("keyword_table_id",))
    result = cursor.fetchone()
    cursor.close()
    conn.close()
    if result:
        keyword_table_dict = json.loads(result[0])
        # Restore the sets of node ids that were stored as lists.
        return KeywordTable(table={keyword: set(node_ids) for keyword, node_ids in keyword_table_dict.items()})
    return KeywordTable(table={})


class PersistentKeywordTableIndex(KeywordTableIndex):
    def __init__(self, connection_string: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.connection_string = connection_string
        # Only replace the freshly built index struct if a keyword table was
        # previously persisted; otherwise keep what was just built from nodes.
        loaded = load_keyword_table_from_postgres(self.connection_string)
        if loaded.table:
            self._index_struct = loaded

    def save(self) -> None:
        save_keyword_table_to_postgres(self._index_struct, self.connection_string)
```

With these implementations, you can ensure that the keyword table is properly stored in a PostgreSQL database and retrieved across sessions. You can call the `save` method whenever the index is built or updated to write the keyword table back to PostgreSQL, and it will be reloaded automatically the next time the index is constructed with the same connection string.
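A minimal usage sketch follows; the `keyword_tables` schema, the connection string, and the sample keywords are illustrative assumptions (LlamaIndex does not create this table for you):

```python
import psycopg2

# Hypothetical connection string -- replace with your own.
CONNECTION_STRING = "postgresql://user:password@localhost:5432/mydb"

# One-time setup: create the backing table used by the helpers above.
conn = psycopg2.connect(CONNECTION_STRING)
cur = conn.cursor()
cur.execute(
    "CREATE TABLE IF NOT EXISTS keyword_tables (id TEXT PRIMARY KEY, table_data TEXT)"
)
conn.commit()
cur.close()
conn.close()

# Round-trip a small keyword table to verify persistence across sessions.
table = KeywordTable(table={"postgres": {"node-1", "node-2"}, "llamaindex": {"node-3"}})
save_keyword_table_to_postgres(table, CONNECTION_STRING)
restored = load_keyword_table_from_postgres(CONNECTION_STRING)
print(restored.table)  # keywords mapped back to sets of node ids
```

When using `PersistentKeywordTableIndex`, pass the connection string at construction time and call `save()` after building or updating the index so that the next session can pick the table up again.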
-
Hello,
I am using LlamaIndex version 0.10.42 and have encountered an issue where keyword indexes are not being stored or retrieved from a PostgreSQL database across different sessions. My documents and nodes are processed and stored in both docstore and vector store without issues, and other data types are indexing correctly within the same database setup.
Issue Description:
When I ingest and attempt to persist the keyword index within the same session or execution, everything functions correctly—I can retrieve the required keywords and their links to individual nodes. However, if I close the session and later attempt to retrieve the keyword index in a new session, the retrieval process returns nothing. There seems to be no keyword data stored persistently in either the docstore or the vector store, despite no error messages being produced.
Code for Processing and Persisting the Keyword Index:
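In rough outline (simplified here with placeholder paths and variable names; `storage_context` is the Postgres-backed storage context shown under Database Configuration below), the ingestion step builds the keyword index like this:

```python
from llama_index.core import KeywordTableIndex, SimpleDirectoryReader

# Placeholder document loading -- the real pipeline parses my own sources.
documents = SimpleDirectoryReader("./data").load_data()

# Building a KeywordTableIndex extracts keywords with the configured LLM and
# writes the parsed nodes into the docstore attached to the storage context.
index = KeywordTableIndex.from_documents(documents, storage_context=storage_context)
```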
Code for Retrieving the Keyword Index:
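In a new session the index object is recreated against the same storage context and queried (again simplified, with placeholder names):

```python
from llama_index.core import KeywordTableIndex

# Recreate the index against the same Postgres-backed storage context and query it.
index = KeywordTableIndex(nodes=[], storage_context=storage_context)
retriever = index.as_retriever()
nodes = retriever.retrieve("example query")
print(len(nodes))  # comes back empty in the new session
```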
Database Configuration:
I have set up separate configurations for the vector store and the docstore which work perfectly in other applications. Here is the corrected configuration:
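(The snippet below is a simplified stand-in with placeholder connection values rather than my real settings; it assumes the llama-index-vector-stores-postgres and llama-index-storage-docstore-postgres packages are installed.)

```python
from sqlalchemy import make_url

from llama_index.core import StorageContext
from llama_index.storage.docstore.postgres import PostgresDocumentStore
from llama_index.vector_stores.postgres import PGVectorStore

# Placeholder connection string.
connection_string = "postgresql://user:password@localhost:5432/llamaindex"
url = make_url(connection_string)

vector_store = PGVectorStore.from_params(
    database=url.database,
    host=url.host,
    password=url.password,
    port=url.port,
    user=url.username,
    table_name="keyword_demo_vectors",
    embed_dim=1536,  # must match the embedding model in use
)

docstore = PostgresDocumentStore.from_uri(uri=connection_string)

storage_context = StorageContext.from_defaults(
    docstore=docstore,
    vector_store=vector_store,
)
```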
Could you provide any insights or suggestions on how to ensure that keyword indexes are properly stored and retrieved across sessions? Is there additional configuration required for keyword data that might differ from other types of data?
Thank you for your help.