- Introduced optional lexical graph configuration for SimpleKGPipeline, enhancing flexibility in customizing node labels and relationship types in the lexical graph.
- Made `relations` and `potential_schema` optional in `SchemaBuilder`.
- Added a check to prevent the use of deprecated Cypher syntax for Neo4j versions 5.23.0 and above.
- Added a `LexicalGraphBuilder` component to enable the import of the lexical graph (document, chunks) without performing entity and relation extraction.
- Added a `Neo4jChunkReader` component to be able to read chunk text from the database.
- Vector and Hybrid retrievers used with `return_properties` now also return the node labels (`nodeLabels`) and the node's element ID (`id`).
- `HybridRetriever` now filters out the embedding property index in `self.vector_index_name` from the retriever result by default.
- Removed support for neo4j.AsyncDriver in the KG creation pipeline, affecting Neo4jWriter and related components.
- Updated examples and unit tests to reflect the removal of async driver support.
- Resolved issue with `AzureOpenAIEmbeddings` incorrectly inheriting from `OpenAIEmbeddings`, now inherits from `BaseOpenAIEmbeddings`.
- Introduced a `fail_if_exist` option to index creation functions to control behavior when an index already exists.
- Added Qdrant retriever in neo4j_graphrag.retrievers.
- Comprehensive rewrite of the README to improve clarity and provide detailed usage examples.
- Fixed a bug where the `openai` Python client and `numpy` were required to import any embedder or LLM.
- The value associated with the enum field `OnError.IGNORE` has been changed from "CONTINUE" to "IGNORE" to stick to the convention and match the field name.
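Code that stored or compared against the old string value may need a small migration shim. The sketch below uses a stand-in `OnError` enum rather than the library's own class; the `RAISE` member, the `_LEGACY_VALUES` mapping, and the `parse_on_error` helper are assumptions made for illustration.

```python
from enum import Enum


class OnError(Enum):
    """Stand-in for the library enum; members are assumed for illustration."""

    RAISE = "RAISE"
    IGNORE = "IGNORE"  # this member's value used to be "CONTINUE"


# Hypothetical mapping from legacy stored values to the current ones.
_LEGACY_VALUES = {"CONTINUE": "IGNORE"}


def parse_on_error(value: str) -> OnError:
    """Convert a stored string to OnError, accepting the legacy value."""
    return OnError(_LEGACY_VALUES.get(value, value))
```

Looking the member up by name (`OnError["IGNORE"]`) is unaffected by this kind of value change, which is one reason to prefer names over values in stored configuration.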
- Added `SinglePropertyExactMatchResolver` component to merge entities that share exactly the same property value (e.g. name).
- Added the `SimpleKGPipeline` class, a simplified abstraction layer to streamline knowledge graph building processes from text documents.
- Added `SinglePropertyExactMatchResolver` component to merge entities that share exactly the same property value (e.g. name).
- Added AzureOpenAILLM and AzureOpenAIEmbeddings to support Azure served OpenAI models
- Added `template` validation in `PromptTemplate` class upon construction.
- Examples demonstrating the use of Mistral embeddings and LLM in RAG pipelines.
- Added feature to include kwargs in `Text2CypherRetriever.search()` that will be injected into a custom prompt, if provided.
- Added validation to `custom_prompt` parameter of `Text2CypherRetriever` to ensure that `query_text` placeholder exists in prompt.
- Introduced a fixed size text splitter component for splitting text into specified fixed size chunks with overlap. Updated examples and tests to utilize this new component.
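To make "fixed size chunks with overlap" concrete, here is a minimal character-based sketch. It is not the library's component: the function name `split_fixed_size` and its parameters are hypothetical, and a real splitter may count tokens or respect word boundaries instead.

```python
from typing import List


def split_fixed_size(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

Requiring the overlap to be strictly smaller than the chunk size keeps the window moving forward, so the loop always terminates.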
- Introduced Vertex AI LLM class for integrating Vertex AI models.
- Added unit tests for the Vertex AI LLM class.
- Added support for Cohere LLM and embeddings - added optional dependency to `cohere`.
- Added support for Anthropic LLM - added optional dependency to `anthropic`.
- Added support for MistralAI LLM - added optional dependency to `mistralai`.
- Added support for Qdrant - added optional dependency to `qdrant-client`.
- Resolved import issue with the Vertex AI Embeddings class.
- Fixed bug in `Text2CypherRetriever` using `custom_prompt` arg where the `search` method would not inject the `query_text` content.
- `custom_prompt` arg is now converted to `Text2CypherTemplate` class within the `Text2CypherRetriever.get_search_results` method.
- `Text2CypherTemplate` and `RAGTemplate` prompt templates now require `query_text` arg and will error if it is not present. Previous `query_text` aliases may be used, but will warn of deprecation.
- Resolved issue where Neo4jWriter component would raise an error if the start or end node ID was not defined properly in the input.
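The `query_text` placeholder requirement on prompt templates can be checked ahead of time in user code. The following is a sketch only, assuming templates use `str.format`-style `{name}` placeholders; `require_placeholder` is a hypothetical helper, not a function provided by the library.

```python
from string import Formatter


def require_placeholder(template: str, name: str = "query_text") -> None:
    """Raise ValueError if the template has no {name} placeholder."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion)
    fields = {field for _, field, _, _ in Formatter().parse(template) if field}
    if name not in fields:
        raise ValueError(f"template must contain a '{{{name}}}' placeholder")
```

For example, `require_placeholder("Question: {query_text}")` returns silently, while a template that only contains `{question}` raises `ValueError`.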
- Resolved issue where relationship types were not escaped in the insert Cypher query.
- Improved query performance in Neo4jWriter: created nodes now have a generic `__KGBuilder__` label and an index is created on the `__KGBuilder__.id` property. Moreover, insertion queries are now batched. Batch size can be controlled using the `batch_size` parameter in the `Neo4jWriter` component.
- Moved the Embedder class to the neo4j_graphrag.embeddings directory for better organization alongside other custom embedders.
- Removed query argument from the GraphRAG class' `.search` method; users must now use `query_text`.
- Neo4jWriter component now runs a single query to merge node and set its embeddings if any.
- Nodes created by the `Neo4jWriter` now have an extra `__KGBuilder__` label. Nodes from the entity graph also have an `__Entity__` label.
- Dropped support for Python 3.8 (end of life).
- Updated documentation links in README.
- Renamed deprecated package references in documentation.
- Introduction page to the documentation content tree.
- Introduced a new Vertex AI embeddings class for generating text embeddings using Vertex AI.
- Updated documentation to include OpenAI and Vertex AI embeddings classes.
- Added google-cloud-aiplatform as an optional dependency for Vertex AI embeddings.
- Made `pygraphviz` an optional dependency - it is now only required when calling `pipeline.draw`.
- Moved pygraphviz to optional dependencies under [tool.poetry.extras] in pyproject.toml to resolve an issue where pip install neo4j-graphrag incorrectly required pygraphviz as a mandatory dependency.
- Officially renamed neo4j-genai to neo4j-graphrag. For the final release version of neo4j-genai, please visit https://pypi.org/project/neo4j-genai/.
- The `neo4j-genai` package is now deprecated. Users are advised to switch to the new package `neo4j-graphrag`.
- Ability to visualise pipeline with `my_pipeline.draw("pipeline.png")`.
- `LexicalGraphBuilder` component to create the lexical graph without entity-relation extraction.
- Pipelines now return correct results when the same pipeline is run in parallel.
- Pipeline run method now returns a PipelineResult object.
- Improved parameter validation for pipelines (#124). Pipelines now raise an error before a run starts if:
  - the same parameter is mapped twice
  - or a parameter is defined in the mapping but is not a valid component input
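The two validation rules above can be sketched as a pre-run check. This is a simplified model, not the pipeline's real data structures: mappings are represented here as `(parameter, source)` pairs, and `validate_mappings` is a hypothetical helper.

```python
from typing import Iterable, Set, Tuple


def validate_mappings(
    valid_inputs: Set[str], mappings: Iterable[Tuple[str, str]]
) -> None:
    """Raise ValueError for unknown or doubly mapped parameters."""
    seen: Set[str] = set()
    for param, source in mappings:
        if param not in valid_inputs:
            raise ValueError(
                f"'{param}' (mapped from '{source}') is not a valid component input"
            )
        if param in seen:
            raise ValueError(f"'{param}' is mapped more than once")
        seen.add(param)
```

Running such a check before any component executes means a misconfigured pipeline fails immediately instead of partway through a run.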
- PDF-to-graph pipeline for knowledge graph construction in experimental mode
- Introduced support for Component/Pipeline flexible architecture.
- Added new components for knowledge graph construction, including text splitters, schema builders, entity-relation extractors, and Neo4j writers.
- Implemented end-to-end tests for the new knowledge graph builder pipeline.
- When saving the lexical graph in a KG creation pipeline, the document is also saved as a specific node, together with relationships between each chunk and the document they were created from.
- Corrected the hybrid retriever query to ensure proper normalization of scores in vector search results.
- Add optional custom_prompt arg to the Text2CypherRetriever class.
- `GraphRAG.search` method first parameter has been renamed `query_text` (was `query`) for consistency with the retrievers interface.
- Made `GraphRAG.search` method backwards compatible with the query parameter, raising warnings to encourage using query_text instead.
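A common way to keep a renamed parameter backwards compatible is to accept the old name and emit a deprecation warning. The standalone function below is a sketch of that pattern only; it is not the `GraphRAG.search` implementation, and its signature and return value are assumptions for illustration.

```python
import warnings
from typing import Optional


def search(query_text: str = "", query: Optional[str] = None) -> str:
    """Return the effective query, accepting the deprecated 'query' alias."""
    if query is not None:
        warnings.warn(
            "'query' is deprecated; use 'query_text' instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        query_text = query_text or query  # the new name wins if both are given
    if not query_text:
        raise ValueError("query_text is required")
    return query_text
```

Callers that still pass `query=` keep working but see a `DeprecationWarning`, which gives them time to migrate before the alias is removed.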
- Corrected initialization to allow specifying the embedding model name.
- Removed sentence_transformers from `embeddings/__init__.py` to avoid ImportError when the package is not installed.
- Stopped embeddings from being returned when searching with `VectorRetriever`. Added `nodeLabels` and `id` to the metadata of `VectorRetriever` results.
- Added `upsert_vector` utility function for attaching vectors to node properties.
- Introduced `Neo4jInsertionError` for handling insertion failures in Neo4j.
- Included Pinecone and Weaviate retrievers in neo4j_graphrag.retrievers.
- Introduced the GraphRAG object, enabling a full RAG (Retrieval-Augmented Generation) pipeline with context retrieval, prompt formatting, and answer generation.
- Added PromptTemplate and RagTemplate for customizable prompt generation.
- Added LLMInterface with implementation for OpenAI LLM.
- Updated project configuration to support multiple Python versions (3.8 to 3.12) in CI workflows.
- Improved developer experience by copying the docstring from the `Retriever.get_search_results` method to the `Retriever.search` method.
- Support for specifying database names in index handling methods and retrievers.
- User Guide in documentation.
- Introduced result_formatter argument to all retrievers, allowing custom formatting of retriever results.
- Refactored import paths for retrievers to neo4j_graphrag.retrievers.
- Implemented exception chaining for all re-raised exceptions to improve stack trace readability.
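Exception chaining in Python is done with `raise ... from ...`, which stores the original error on `__cause__` so both tracebacks are shown. The sketch below demonstrates the pattern with a hypothetical `ExampleSearchError` class and `parse_top_k` helper; neither is part of the library.

```python
class ExampleSearchError(Exception):
    """Hypothetical domain-specific error used only for this illustration."""


def parse_top_k(raw: str) -> int:
    """Parse a top_k value, re-raising parse failures as a domain error."""
    try:
        return int(raw)
    except ValueError as e:
        # 'from e' keeps the original exception attached as __cause__
        raise ExampleSearchError(f"invalid top_k: {raw!r}") from e
```

Without `from e`, Python still records the original error as implicit context, but the explicit form states that the new exception was directly caused by the old one.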
- Made error messages in `index.py` more consistent.
- Renamed `Retriever._get_search_results` to `Retriever.get_search_results`.
- Updated retrievers and index handling methods to accept optional database names.
- Removed Pinecone and Weaviate retrievers from `__init__.py` to prevent ImportError when optional dependencies are not installed.
- Moved few-shot examples in `Text2CypherRetriever` to the constructor for better initialization and usage. Updated unit tests and example script accordingly.
- Fixed regex warnings in E2E tests for Weaviate and Pinecone retrievers.
- Corrected HuggingFaceEmbeddings import in E2E tests.
- Introduced custom exceptions for improved error handling, including `RetrieverInitializationError`, `SearchValidationError`, `FilterValidationError`, `EmbeddingRequiredError`, `RecordCreationError`, `Neo4jIndexError`, and `Neo4jVersionError`.
- Retriever that integrates with a Weaviate vector database: `WeaviateNeo4jRetriever`.
- New return types that help with getting retriever results: `RetrieverResult` and `RetrieverResultItem`.
- Supported wrapper embedder object for sentence-transformers embeddings: `SentenceTransformerEmbeddings`.
- `Text2CypherRetriever` object which allows for the retrieval of records from a Neo4j database using natural language.
- Replaced `ValueError` with custom exceptions across various modules for clearer and more specific error messages.
- Updated documentation to include new custom exceptions.
- Improved the use of Pydantic for input data validation for retriever objects.