Commit 35eb8d3

ibro45 and claude authored
feat: Make Delete Operator Strict, Improve Error Messages, Remove Color Formatting, and Make Local Raw References Lazy (#5)
* Remove idempotent terminology and make delete operator error on missing keys

* Refactor location tracking terminology and enhance merge context

Rename SourceLocation -> Location and MetadataRegistry -> LocationRegistry to better reflect their purpose. Introduce MergeContext dataclass to consolidate location tracking during config merging, enabling better error messages with source locations for delete operator failures.

Key changes:
- Rename src/sparkwheel/metadata.py -> src/sparkwheel/locations.py
- Update all references to SourceLocation to use Location
- Update all references to MetadataRegistry to use LocationRegistry
- Add MergeContext class for threading location info through merge operations
- Enhance delete operator error messages with source location context
- Track locations for individual keys during YAML loading

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

* Drop formatters

* Improve error message in instantiationerror

* Add more operator tests

* Improve errors with locations and better messages

* Improve error messages and add comprehensive test coverage for preprocessor

## Error Message Improvements

### Enhanced delete operator error messages (config.py)
- Show relevant available keys based on context (top-level vs nested)
- For nested keys like `model::missing`, display parent container's keys
- Use `parent_key` parameter for better error formatting
- Extract child key name for clearer nested error messages

### Enhanced raw reference error messages (preprocessor.py)
- Add source file location tracking for all raw reference errors
- Show specific missing key instead of full path (e.g., "Key 'data' not found" vs "Key 'data' not found at path 'data'")
- Display available keys at point of failure (up to 10 keys)
- Consistent error formatting for dict keys, list indices, and type errors
- Use custom exceptions (ConfigKeyError, CircularReferenceError) for proper newline formatting
- Thread LocationRegistry through external file loads for complete error context

## Test Coverage

### Preprocessor tests (test_preprocessor.py)
- Added 14 new tests covering:
  - Missing keys (first level and nested)
  - Invalid list indices (first level and nested)
  - Type errors with proper context
  - Raw reference errors with location tracking
  - External file references (success and failure)
  - Circular reference detection with locations
  - Available keys truncation
  - Raw reference expansion scenarios
- Total: 19 tests, all passing

### Config tests (test_config.py)
- test_delete_nonexistent_top_level_key_shows_available_keys
- test_delete_nonexistent_nested_key_shows_parent_keys
- test_delete_nested_key_when_parent_doesnt_exist
- test_update_from_file_with_nested_paths_merges_locations

### Operator tests (test_operators.py)
- Fixed contradictory comments in test_validate_skips_dict_under_remove

## Type Safety

### Type annotations (preprocessor.py)
- Added TYPE_CHECKING import block to avoid circular imports
- Properly typed `locations` parameter as `Optional["LocationRegistry"]` in:
  - process_raw_refs()
  - _process_raw_refs_recursive()
  - _expand_raw_ref()
- Removed all `# type: ignore` comments

## Documentation

### Updated operators.md
- Removed misleading "idempotent delete" section
- Clarified delete operator is strict (raises on missing keys)
- Added three approaches for writing portable/reusable configs:
  1. Document required keys
  2. Use composition/override instead of delete
  3. Conditional deletion with lists
- Updated examples to reflect strict delete semantics

## Results
- 633 tests passing (1 skipped)
- Zero regressions
- Improved error messages show exact file:line and relevant suggestions
- Complete type safety for location tracking

* Extract get_by_id to path_utils and improve error consistency

- Move _get_by_id from Preprocessor to standalone get_by_id function in path_utils
- Config._get_by_id now delegates to the shared get_by_id function
- Change ValueError to TypeError when indexing non-dict/list values
- Improve InstantiationError handling to preserve source location and suggestion
- Move related tests from test_preprocessor.py to test_path_utils.py

* Add a test for instantiation error

* Make local raw references lazy to support CLI overrides

Local % refs (%key) are now expanded during resolve() instead of update(), allowing CLI overrides to affect values used by local raw references. External file refs (%file.yaml::key) remain eager since external files are frozen.

This fixes the surprising behavior where CLI overrides were silently ignored for values referenced by local % refs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

* Update the test

* Update README

---------

Co-authored-by: Claude <[email protected]>
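
The change to lazy local raw references described in the commit message can be illustrated with a toy model. This is a sketch only, not Sparkwheel's implementation: `get_by_path` and `expand_raw_refs` are hypothetical helpers operating on plain dicts, assuming just that a `%path` string copies the raw value found at `path` and that `::` separates path components.

```python
from copy import deepcopy


def get_by_path(config, path):
    """Walk a nested dict using '::'-separated keys (toy helper)."""
    node = config
    for part in path.split("::"):
        node = node[part]
    return node


def expand_raw_refs(node, root):
    """Return a copy of node with every '%path' string replaced by the raw value at path."""
    if isinstance(node, dict):
        return {key: expand_raw_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_raw_refs(value, root) for value in node]
    if isinstance(node, str) and node.startswith("%"):
        return deepcopy(get_by_path(root, node[1:]))
    return node


base = {
    "dataset": {"num_classes": 10},
    "model": {"out_features": "%dataset::num_classes"},
}

# Eager: expand at load time, then apply the override. The override is silently lost.
eager = expand_raw_refs(base, base)
eager["dataset"]["num_classes"] = 100
eager_result = eager["model"]["out_features"]

# Lazy: apply the override first and expand only when the value is resolved.
lazy = deepcopy(base)
lazy["dataset"]["num_classes"] = 100
lazy_result = expand_raw_refs(lazy, lazy)["model"]["out_features"]

print(eager_result, lazy_result)
```

In the eager variant the reference is frozen before the override arrives, which mirrors the "silently ignored" behavior the commit message says it fixes; deferring expansion lets the override take effect.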
1 parent 936bfc2 commit 35eb8d3

32 files changed, +1808 -913 lines changed

.github/workflows/release.yml

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+name: Release
+
+on:
+  push:
+    tags:
+      - '*'
+
+permissions:
+  contents: write
+
+jobs:
+  release:
+    name: Create GitHub Release
+    runs-on: ubuntu-latest
+    timeout-minutes: 5
+    if: startsWith(github.ref, 'refs/tags')
+
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Verify tag is on main branch
+        run: |
+          git fetch origin main
+          if ! git merge-base --is-ancestor ${{ github.sha }} origin/main; then
+            echo "Error: Tag is not on the main branch"
+            exit 1
+          fi
+
+      - name: Create Release
+        uses: softprops/action-gh-release@v2
+        with:
+          generate_release_notes: true
+          draft: false
+          prerelease: false

README.md

Lines changed: 23 additions & 70 deletions
@@ -10,103 +10,56 @@
 <a href="https://github.com/project-lighter/sparkwheel/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-Apache%202.0-blue.svg"></a>
 <a href="https://project-lighter.github.io/sparkwheel"><img alt="Documentation" src="https://img.shields.io/badge/docs-latest-olive"></a>
 </p>
-<br/>
 
-<p align="center">⚙️ YAML configuration meets Python 🐍</p>
+<h3 align="center">YAML configuration meets Python</h3>
 <p align="center">Define Python objects in YAML. Reference, compose, and instantiate them effortlessly.</p>
 <br/>
 
-## What is Sparkwheel?
+## Quick Start
 
-Stop hardcoding parameters. Define complex Python objects in clean YAML files, compose them naturally, and instantiate with one line.
+```bash
+pip install sparkwheel
+```
 
 ```yaml
 # config.yaml
+dataset:
+  num_classes: 10
+  batch_size: 32
+
 model:
   _target_: torch.nn.Linear
   in_features: 784
-  out_features: "%dataset::num_classes" # Reference other values
+  out_features: "%dataset::num_classes" # Reference
 
-dataset:
-  num_classes: 10
+training:
+  steps_per_epoch: "$10000 // @dataset::batch_size" # Expression
 ```
 
 ```python
 from sparkwheel import Config
 
 config = Config()
 config.update("config.yaml")
-model = config.resolve("model") # Actual torch.nn.Linear(784, 10) instance!
-```
-
-## Key Features
-
-- **Declarative Object Creation** - Instantiate any Python class from YAML with `_target_`
-- **Smart References** - `@` for resolved values, `%` for raw YAML
-- **Composition by Default** - Configs merge naturally (dicts merge, lists extend)
-- **Explicit Operators** - `=` to replace, `~` to delete when needed
-- **Python Expressions** - Compute values dynamically with `$` prefix
-- **Schema Validation** - Type-check configs with Python dataclasses
-- **CLI Overrides** - Override any value from command line
-
-## Installation
 
-```bash
-pip install sparkwheel
+model = config.resolve("model") # Actual torch.nn.Linear(784, 10)
 ```
 
-**[→ Get Started in 5 Minutes](https://project-lighter.github.io/sparkwheel/getting-started/quickstart/)**
+## Features
 
-## Coming from Hydra/OmegaConf?
-
-Sparkwheel builds on similar ideas but adds powerful features:
-
-| Feature | Hydra/OmegaConf | Sparkwheel |
-|---------|-----------------|------------|
-| Config composition | Explicit (`+`, `++`) | **By default** (dicts merge, lists extend) |
-| Replace semantics | Default | Explicit with `=` operator |
-| Delete keys | Not idempotent | Idempotent `~` operator |
-| References | OmegaConf interpolation | `@` (resolved) + `%` (raw YAML) |
-| Python expressions | Limited | Full Python with `$` |
-| Schema validation | Structured Configs | Python dataclasses |
-| List extension | Lists replace | **Lists extend by default** |
-
-**Composition by default** means configs merge naturally without operators:
-```yaml
-# base.yaml
-model:
-  hidden_size: 256
-  dropout: 0.1
-
-# experiment.yaml
-model:
-  hidden_size: 512 # Override
-  # dropout inherited
-```
-
-## Documentation
+- **Declarative Objects** - Instantiate any Python class with `_target_`
+- **Smart References** - `@` for resolved values, `%` for raw YAML
+- **Composition by Default** - Dicts merge, lists extend automatically
+- **Explicit Control** - `=` to replace, `~` to delete
+- **Python Expressions** - Dynamic values with `$`
+- **Schema Validation** - Type-check with dataclasses
 
-- [Full Documentation](https://project-lighter.github.io/sparkwheel/)
-- [Quick Start Guide](https://project-lighter.github.io/sparkwheel/getting-started/quickstart/)
-- [Core Concepts](https://project-lighter.github.io/sparkwheel/user-guide/basics/)
-- [API Reference](https://project-lighter.github.io/sparkwheel/reference/)
+**[Get Started](https://project-lighter.github.io/sparkwheel/getting-started/quickstart/)** · **[Documentation](https://project-lighter.github.io/sparkwheel/)** · **[Quick Reference](https://project-lighter.github.io/sparkwheel/user-guide/quick-reference/)**
 
 ## Community
 
-- [Discord Server](https://discord.gg/zJcnp6KrUp) - Chat with the community
-- [YouTube Channel](https://www.youtube.com/channel/UCef1oTpv2QEBrD2pZtrdk1Q) - Tutorials and demos
-- [GitHub Issues](https://github.com/project-lighter/sparkwheel/issues) - Bug reports and feature requests
-
-## Contributing
-
-We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and guidelines.
+- [Discord](https://discord.gg/zJcnp6KrUp) · [YouTube](https://www.youtube.com/channel/UCef1oTpv2QEBrD2pZtrdk1Q) · [Issues](https://github.com/project-lighter/sparkwheel/issues)
 
 ## About
 
-Sparkwheel is a hard fork of [MONAI Bundle](https://github.com/Project-MONAI/MONAI/tree/dev/monai/bundle)'s configuration system, refined and expanded for general-purpose use. We're deeply grateful to the MONAI team for their excellent foundation.
-
-Sparkwheel powers [Lighter](https://project-lighter.github.io/lighter/), our configuration-driven deep learning framework built on PyTorch Lightning.
-
-## License
-
-Apache License 2.0 - See [LICENSE](LICENSE) for details.
+Sparkwheel is a hard fork of [MONAI Bundle](https://github.com/Project-MONAI/MONAI/tree/dev/monai/bundle)'s config system, with the goal of making a more general-purpose configuration library for Python projects. It combines the best of MONAI Bundle and [Hydra](http://hydra.cc/)/[OmegaComf](https://omegaconf.readthedocs.io/), while introducing new features and improvements not found in either.

docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -199,7 +199,7 @@ Sparkwheel has two types of references with distinct purposes:
 - **Composition-by-default** - Configs merge/extend naturally, no operators needed for common case
 - **List extension** - Lists extend by default (unique vs Hydra!)
 - **`=` replace operator** - Explicit control when you need replacement
-- **`~` delete operator** - Remove inherited keys cleanly (idempotent!)
+- **`~` delete operator** - Remove inherited keys explicitly
 - **Python expressions with `$`** - Compute values dynamically
 - **Dataclass validation** - Type-safe configs without boilerplate
 - **Dual reference system** - `@` for resolved values, `%` for raw YAML

docs/user-guide/advanced.md

Lines changed: 0 additions & 2 deletions
@@ -221,8 +221,6 @@ config.update({"~plugins": [0, 2]}) # Remove list items
 config.update({"~dataloaders": ["train", "test"]}) # Remove dict keys
 ```
 
-**Note:** The `~` directive is idempotent - it doesn't error if the key doesn't exist, enabling reusable configs.
-
 ### Programmatic Updates
 
 Apply operators programmatically:

docs/user-guide/cli.md

Lines changed: 1 addition & 1 deletion
@@ -111,7 +111,7 @@ Three operators for fine-grained control:
 |----------|--------|----------|---------|
 | **Compose** (default) | `key=value` | Merges dicts, extends lists | `model::lr=0.001` |
 | **Replace** | `=key=value` | Completely replaces value | `=model={'_target_': 'ResNet'}` |
-| **Delete** | `~key` | Removes key (idempotent) | `~debug` |
+| **Delete** | `~key` | Removes key (errors if missing) | `~debug` |
 
 !!! info "Type Inference"
     Values are automatically typed using `ast.literal_eval()`:
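
For readers unfamiliar with literal-based type inference, the hunk above refers to the standard library's `ast.literal_eval()`. The following is a minimal sketch of how such inference can behave; `infer_type` is a hypothetical helper, not Sparkwheel's function, and the fallback to a plain string when parsing fails is an assumption.

```python
import ast


def infer_type(raw):
    """Parse a CLI value as a Python literal, falling back to the raw string."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Not a valid literal (for example a bare word like resnet50): keep it as text.
        return raw


examples = ["0.001", "32", "True", "[1, 2, 3]", "{'_target_': 'ResNet'}", "resnet50"]
parsed = [infer_type(value) for value in examples]
for raw, value in zip(examples, parsed):
    print(f"{raw!r} -> {type(value).__name__}")
```

Because `ast.literal_eval()` only accepts literals, it never executes arbitrary code, which makes it a reasonable fit for untrusted command-line input.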

docs/user-guide/operators.md

Lines changed: 43 additions & 39 deletions
@@ -126,11 +126,14 @@ Remove keys or list items with `~key`:
 ### Delete Entire Keys
 
 ```yaml
-# Remove keys (idempotent - no error if missing!)
+# Remove keys explicitly
 ~old_param: null
 ~debug_settings: null
 ```
 
+!!! warning "Key Must Exist"
+    The delete operator will raise an error if the key doesn't exist. This helps catch typos and configuration mistakes.
+
 ### Delete Dict Keys
 
 Use path notation for nested keys:
@@ -214,28 +217,6 @@ dataloaders:
 
 **Why?** Path notation is designed for dict keys, not list indices. The batch syntax handles index normalization and processes deletions correctly (high to low order).
 
-### Idempotent Delete
-
-Delete operations don't error if the key doesn't exist:
-
-```yaml
-# production.yaml - Remove debug settings if they exist
-~debug_mode: null
-~dev_logger: null
-~test_data: null
-# No errors if these don't exist!
-```
-
-This enables **reusable configs** that work with multiple bases:
-
-```yaml
-# production.yaml works with ANY base config
-~debug_settings: null
-~verbose_logging: null
-database:
-  pool_size: 100
-```
-
 ## Combining Operators
 
 Mix composition, replace, and delete:
@@ -298,7 +279,7 @@ config.update({"model": {"hidden_size": 1024}})
 # Replace explicitly
 config.update({"=optimizer": {"type": "sgd", "lr": 0.1}})
 
-# Delete keys (idempotent)
+# Delete keys
 config.update({
     "~training::old_param": None,
     "~model::dropout": None
@@ -454,17 +435,40 @@ model:
 
 ### Write Reusable Configs
 
-Use idempotent delete for portable configs:
+!!! warning "Delete Requires Key Existence"
+    The delete operator (`~`) is **strict** - it raises an error if the key doesn't exist. This helps catch typos and configuration mistakes.
 
+When writing configs that should work with different base configurations, you have a few options:
+
+**Option 1: Document required keys**
 ```yaml
-# production.yaml - works with ANY base!
-~debug_mode: null # Remove if exists
-~verbose_logging: null # No error if missing
+# production.yaml
+# Requires: base config must have debug_mode and verbose_logging
+~debug_mode: null
+~verbose_logging: null
 database:
   pool_size: 100
   ssl: true
 ```
 
+**Option 2: Use composition order**
+```yaml
+# production.yaml - override instead of delete
+debug_mode: false # Overrides if exists, sets if not
+verbose_logging: false
+database:
+  pool_size: 100
+  ssl: true
+```
+
+**Option 3: Conditional deletion with lists**
+```yaml
+# Delete multiple optional keys - fails only if ALL are missing
+~: [debug_mode, verbose_logging] # At least one must exist
+database:
+  pool_size: 100
+```
+
 ## Common Mistakes
 
 ### Using `=` When Not Needed
@@ -519,17 +523,17 @@ plugins: [cache]
 |---------|-------|------------|
 | Dict merge default | Yes ✅ | Yes ✅ |
 | List extend default | No ❌ | **Yes** ✅ |
-| Operators in YAML | No ❌ | Yes ✅ (`=`, `~`) |
-| Operator count | 4 (`+`, `++`, `~`) | **2** (`=`, `~`) ✅ |
-| Delete dict keys | No ❌ | Yes |
-| Delete list items | No ❌ | Yes |
-| Idempotent delete | N/A | Yes ✅ |
-
-Sparkwheel goes beyond Hydra with:
-- Full composition-first philosophy (dicts **and** lists)
-- Operators directly in YAML files
-- Just 2 simple operators
-- Delete operations for fine-grained control
+| Operators in YAML | CLI-only | **Yes** ✅ (YAML + CLI) |
+| Operator count | 4 (`=`, `+`, `++`, `~`) | **2** (`=`, `~`) ✅ |
+| Delete dict keys | CLI-only (`~foo.bar`) | **Yes** ✅ (YAML + CLI) |
+| Delete list items | No ❌ | **Yes** ✅ (by index) |
+
+Sparkwheel differs from Hydra:
+- **Full composition philosophy**: Both dicts AND lists compose by default
+- **Operators in YAML files**: Not just CLI overrides
+- **Simpler operator set**: Just 2 operators (`=`, `~`) vs 4 (`=`, `+`, `++`, `~`)
+- **List deletion**: Delete items by index with `~plugins: [0, 2]`
+- **Flexible delete**: Use `~` anywhere (YAML, CLI, programmatic)
 
 ## Next Steps
 
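
The strict delete semantics documented in this file's diff can be sketched with a toy function. This is an illustration under stated assumptions, not Sparkwheel's code: `strict_delete` is a hypothetical helper on plain dicts, the built-in `KeyError` stands in for the library's custom exceptions, and the message wording is invented for the example.

```python
def strict_delete(config, path):
    """Delete the key at a '::'-separated path, raising if any part is missing."""
    *parents, key = path.split("::")
    node = config
    for part in parents:
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"Cannot delete '{path}': parent '{part}' not found")
        node = node[part]
    if not isinstance(node, dict) or key not in node:
        # List the sibling keys so a typo is easy to spot.
        available = ", ".join(sorted(node)) if isinstance(node, dict) else "(not a dict)"
        raise KeyError(f"Cannot delete '{key}': key not found. Available keys: {available}")
    del node[key]


config = {"debug": True, "model": {"hidden_size": 256, "dropout": 0.1}}
strict_delete(config, "model::dropout")  # key exists, so it is removed

try:
    strict_delete(config, "model::dropuot")  # the typo is reported, not ignored
except KeyError as err:
    message = err.args[0]
    print(message)
```

Reporting the parent container's keys for a nested path follows the behavior the commit message describes for `model::missing`; an idempotent delete would have hidden the misspelled key instead.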