[TST] add schema proptest to python #5823

jairad26 · 2025-11-06T16:22:31Z

Description of changes

Summarize the changes made by this PR.

Improvements & Bug fixes
- Adds property tests for distributed and local Chroma covering the end-to-end create_collection and get_collection flow. They check that the reconciliation logic for the vector index config is correct, and that schema index configs (inverted indexes on metadata keys and sparse vector indexes) are created with the correct parameters
New functionality
- ...

Test plan

How are these changes tested?

Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Migration plan

Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?

Observability plan

What is the plan to instrument and monitor this change?

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?

github-actions · 2025-11-06T16:22:39Z

jairad26 · 2025-11-06T16:22:49Z

This stack of pull requests is managed by Graphite.

propel-code-bot · 2025-11-06T18:02:32Z

Add Hypothesis-based schema reconciliation tests for collection creation

Introduces comprehensive property-based tests covering collection creation and retrieval across local and distributed configurations. The new suites validate reconciliation between metadata, explicit configuration, and schema-provided vector/index settings for both HNSW and SPANN modes, ensuring collection state (configuration blocks, schema defaults/keys, and embedding function metadata) matches expectations derived from randomized inputs.

Key Changes

• Added test_vector_index_configuration_create_collection Hypothesis test to validate active vector index configuration against metadata/config/schema permutations for both HNSW and SPANN backends
• Added test_schema_create_and_get_collection Hypothesis test to verify persistence of schema-defined inverted and sparse indexes through create/get flows
• Extended chromadb/test/property/strategies.py with strategies for vector index configs, sparse index configs, schema construction, metadata/configuration combinations, and supporting utilities (CollectionInputCombination, non_none_items, vector_index_to_dict)
• Registered new SimpleIpEmbeddingFunction (default space ip) for property tests and wired deterministic sparse embedding support into strategies
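The kind of reconciliation property these tests exercise can be illustrated with a much smaller, dependency-free sketch. Everything in it is hypothetical: `reconcile_hnsw`, `TOY_DEFAULTS`, and the precedence rule (explicit configuration over metadata over defaults) are assumptions made for illustration and are not Chroma's actual logic. `random` with a fixed seed stands in for Hypothesis strategies, and `non_none_items` only borrows the name of the helper listed above; its body is a guess.

```python
import random
from typing import Any, Dict, Optional

# Hypothetical defaults for a toy index; not Chroma's real HNSW defaults.
TOY_DEFAULTS: Dict[str, Any] = {
    "space": "l2",
    "ef_construction": 100,
    "max_neighbors": 16,
}


def non_none_items(d: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the entries of d whose values are not None (guessed behaviour)."""
    return {k: v for k, v in (d or {}).items() if v is not None}


def reconcile_hnsw(
    metadata: Optional[Dict[str, Any]],
    configuration: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Toy rule (an assumption): configuration beats metadata, which beats defaults."""
    result = dict(TOY_DEFAULTS)
    result.update(non_none_items(metadata))
    result.update(non_none_items(configuration))
    return result


def random_partial_config(rng: random.Random) -> Optional[Dict[str, Any]]:
    """Produce None, or a dict in which each key may be absent or set to None."""
    if rng.random() < 0.2:
        return None
    choices = {
        "space": ["l2", "ip", "cosine", None],
        "ef_construction": [50, 100, 200, None],
        "max_neighbors": [8, 16, 32, None],
    }
    return {k: rng.choice(v) for k, v in choices.items() if rng.random() < 0.7}


def check_reconciliation_property(trials: int = 500, seed: int = 0) -> int:
    """Check the precedence rule on randomly generated inputs; return the trial count."""
    rng = random.Random(seed)
    for _ in range(trials):
        metadata = random_partial_config(rng)
        configuration = random_partial_config(rng)
        result = reconcile_hnsw(metadata, configuration)
        cfg = non_none_items(configuration)
        meta = non_none_items(metadata)
        for key, default in TOY_DEFAULTS.items():
            # Expected value is computed independently of reconcile_hnsw.
            expected = cfg.get(key, meta.get(key, default))
            assert result[key] == expected
        assert set(result) == set(TOY_DEFAULTS)
    return trials
```

With Hypothesis, `random_partial_config` would typically become a strategy and the loop a `@given`-decorated test, which adds shrinking of failing inputs.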

Affected Areas

• chromadb/test/property/test_schema.py
• chromadb/test/property/strategies.py

This summary was automatically generated by @propel-code-bot

chromadb/test/property/strategies.py

propel-code-bot · 2025-11-10T17:18:04Z

chromadb/test/property/test_schema.py

+                for key, value in hnsw_non_none.items():
+                    if value is not None and value != HNSW_DEFAULTS[key]:


[BestPractice]

Potential KeyError risk: The code accesses HNSW_DEFAULTS[key] on lines 128 and 197 without checking if the key exists in the dictionary. If a key from hnsw_non_none is not present in HNSW_DEFAULTS, this will raise a KeyError. Add a safety check:

    if key in HNSW_DEFAULTS and value is not None and value != HNSW_DEFAULTS[key]:
        should_try_metadata = False

This prevents crashes when unexpected configuration keys are encountered.
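A self-contained version of the guarded comparison, for illustration. `HNSW_DEFAULTS` here is a stand-in dictionary with made-up values, and `differs_from_defaults` is a hypothetical helper name; only the guard condition itself comes from the suggestion above.

```python
from typing import Any, Dict

# Stand-in values for illustration only; not the real HNSW_DEFAULTS.
HNSW_DEFAULTS: Dict[str, Any] = {"space": "l2", "ef_search": 100}


def differs_from_defaults(hnsw_non_none: Dict[str, Any]) -> bool:
    """Return True if any known key carries a non-None, non-default value."""
    for key, value in hnsw_non_none.items():
        # Membership is checked first, so unknown keys are skipped
        # instead of raising KeyError on the lookup.
        if key in HNSW_DEFAULTS and value is not None and value != HNSW_DEFAULTS[key]:
            return True
    return False
```

One trade-off worth weighing: silently skipping unknown keys can hide a typo in a test strategy, so in test code letting the `KeyError` surface may be the more useful failure.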

Context for Agents

File: chromadb/test/property/test_schema.py Line: 128

propel-code-bot · 2025-12-03T16:53:52Z

chromadb/test/property/test_schema.py

+def _compute_expected_config(
+    spann_active: bool,
+    metadata: Optional[CollectionMetadata],
+    configuration: Optional[CreateCollectionConfiguration],
+    schema_vector_index_config: Optional[Dict[str, Any]],
+) -> Dict[str, Any]:


[Maintainability] The function _compute_expected_config is quite long and handles two distinct logical paths based on spann_active. To improve readability and maintainability, consider splitting this function into two smaller, focused helpers: one for the SPANN logic and one for the HNSW logic.

This would make the complex reconciliation logic easier to understand and maintain for each vector index type.

Example structure:

    def _compute_expected_config_spann(
        metadata: Optional[CollectionMetadata],
        configuration: Optional[CreateCollectionConfiguration],
        schema_vector_index_config: Optional[Dict[str, Any]],
    ) -> Dict[str, Any]:
        # ... logic from the `if spann_active:` block ...
        ...

    def _compute_expected_config_hnsw(
        metadata: Optional[CollectionMetadata],
        configuration: Optional[CreateCollectionConfiguration],
        schema_vector_index_config: Optional[Dict[str, Any]],
    ) -> Dict[str, Any]:
        # ... logic from the `else:` block ...
        ...

    def _compute_expected_config(
        spann_active: bool,
        metadata: Optional[CollectionMetadata],
        configuration: Optional[CreateCollectionConfiguration],
        schema_vector_index_config: Optional[Dict[str, Any]],
    ) -> Dict[str, Any]:
        if spann_active:
            return _compute_expected_config_spann(
                metadata, configuration, schema_vector_index_config
            )
        else:
            return _compute_expected_config_hnsw(
                metadata, configuration, schema_vector_index_config
            )

Context for Agents

File: chromadb/test/property/test_schema.py Line: 97

jairad26 mentioned this pull request Nov 6, 2025

[CLN] schema: build default with config ef & default_knn_index, remove #document population in defaults #5775

Merged

jairad26 mentioned this pull request Nov 6, 2025

[BUG] is_default_schema does not do a space check for defaults or #embedding #5787

Merged

1 task

jairad26 force-pushed the jai/schema-prop-tests branch from 394633e to 801ec5a Compare November 6, 2025 18:00

jairad26 force-pushed the jai/fix-default-path-reconcile branch from 061ee06 to fe04c8f Compare November 6, 2025 18:00

jairad26 marked this pull request as ready for review November 6, 2025 18:01

jairad26 changed the title ~~[TST] add schema proptest~~ [TST] add schema proptest to python Nov 6, 2025

propel-code-bot bot reviewed Nov 6, 2025


chromadb/test/property/strategies.py Outdated

jairad26 force-pushed the jai/schema-prop-tests branch 2 times, most recently from 46bddae to 148fe32 Compare November 7, 2025 19:26

jairad26 force-pushed the jai/fix-default-path-reconcile branch from fe04c8f to db3b74e Compare November 7, 2025 19:26


jairad26 force-pushed the jai/schema-prop-tests branch from 148fe32 to ab35cf0 Compare November 10, 2025 17:10

jairad26 force-pushed the jai/fix-default-path-reconcile branch from db3b74e to f8cfb53 Compare November 10, 2025 17:10

propel-code-bot bot reviewed Nov 10, 2025


jairad26 force-pushed the jai/schema-prop-tests branch from ab35cf0 to ed59ae9 Compare November 10, 2025 17:37

jairad26 force-pushed the jai/fix-default-path-reconcile branch from f8cfb53 to 2b76659 Compare November 10, 2025 17:37

jairad26 force-pushed the jai/schema-prop-tests branch from ed59ae9 to fa7744a Compare November 10, 2025 18:04

jairad26 force-pushed the jai/fix-default-path-reconcile branch 2 times, most recently from e0d238f to 5fafa38 Compare November 10, 2025 23:43

jairad26 force-pushed the jai/schema-prop-tests branch from fa7744a to 19836ec Compare November 10, 2025 23:43

jairad26 force-pushed the jai/fix-default-path-reconcile branch from 5fafa38 to 7c682b1 Compare November 10, 2025 23:47

jairad26 force-pushed the jai/schema-prop-tests branch from 19836ec to f1b64d9 Compare November 10, 2025 23:47

jairad26 mentioned this pull request Nov 11, 2025

[TST] add proptest for config & schema reconciliation #5847

Open

1 task

jairad26 force-pushed the jai/schema-prop-tests branch from f1b64d9 to 95c36fd Compare November 12, 2025 23:56

jairad26 force-pushed the jai/fix-default-path-reconcile branch from 7c682b1 to 8671ea1 Compare November 12, 2025 23:56

jairad26 force-pushed the jai/schema-prop-tests branch from 95c36fd to 6cd92fc Compare November 13, 2025 00:15

jairad26 force-pushed the jai/fix-default-path-reconcile branch from 8671ea1 to c129192 Compare November 13, 2025 00:15

jairad26 changed the base branch from jai/fix-default-path-reconcile to graphite-base/5823 November 13, 2025 19:03

jairad26 force-pushed the graphite-base/5823 branch from c129192 to 3d9843b Compare November 13, 2025 19:03

jairad26 force-pushed the jai/schema-prop-tests branch from 6cd92fc to a021c46 Compare November 13, 2025 19:03

jairad26 changed the base branch from graphite-base/5823 to main November 13, 2025 19:03

[TST] add schema proptest

f8f6b6f

jairad26 force-pushed the jai/schema-prop-tests branch from a021c46 to f8f6b6f Compare December 3, 2025 16:45

propel-code-bot bot reviewed Dec 3, 2025



github-actions bot commented Nov 6, 2025

Reviewer Checklist

Testing, Bugs, Errors, Logs, Documentation

System Compatibility

Quality
