# Meta-Learning Test Suite - Index

## Quick Start

```bash
# Verify test structure (no dependencies required)
python3 tests/meta_learning/verify_test_structure.py

# Run full test suite (requires dependencies)
python3 tests/meta_learning/manual_test_prompt_evolution.py
```

## Documentation Files

### πŸ“‹ README_TESTS.md
**What it covers:**
- How to run the tests
- Test coverage breakdown
- Environment variables
- Troubleshooting guide

**When to read:**
- First time running tests
- Setting up test environment
- Debugging test failures

### πŸ“Š TEST_SUMMARY.md
**What it covers:**
- Complete test coverage overview
- Test scenario details
- Mock data structure
- Success metrics

**When to read:**
- Understanding test scope
- Evaluating test quality
- Planning test additions

### πŸ—οΈ TEST_ARCHITECTURE.md
**What it covers:**
- Visual component diagrams
- Data flow illustrations
- Test execution flow
- Assertion coverage map

**When to read:**
- Understanding test design
- Modifying test structure
- Adding new test scenarios

## Test Files

### βœ… manual_test_prompt_evolution.py (533 lines)
**Primary test file for the prompt evolution tool**

**Components:**
- `MockAgent` class - Simulates the `Agent` class with realistic data (sketched below)
- `test_basic_functionality()` - 16 core test scenarios
- `test_edge_cases()` - 3 error handling tests

**Test Coverage:**
- Configuration validation
- Meta-analysis execution
- LLM integration
- Memory storage
- Auto-apply functionality
- Version control integration
- Edge cases and errors
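
For orientation, a minimal sketch of the `MockAgent` pattern (method names follow the Integration Points section below; the constructor and canned response are assumptions, not the suite's actual code):

```python
# Hedged sketch of the MockAgent pattern; method names mirror the
# Integration Points section below, but the constructor and canned
# response are illustrative, not the suite's actual code.
class MockAgent:
    def __init__(self, history: list[dict]):
        self.history = history        # canned conversation messages
        self.utility_calls = []       # recorded for later assertions

    async def call_utility_model(self, system: str, message: str) -> str:
        """Return a canned meta-analysis instead of calling a real LLM."""
        self.utility_calls.append((system, message))
        return '{"prompt_refinements": []}'

    def read_prompt(self, name: str) -> str:
        return f"<contents of {name}>"
```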

### βœ“ verify_test_structure.py
**Standalone verification script**

**Purpose:**
- Validates test file syntax
- Analyzes test structure
- Counts assertions and scenarios
- No dependencies required

**Use Cases:**
- CI/CD validation
- Quick structure check
- Documentation generation
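
A structure check of this kind needs only the standard library; a rough sketch of the approach (the real script's internals may differ):

```python
# Rough sketch of a dependency-free structure check; the real
# verify_test_structure.py may be organized differently.
import ast
from pathlib import Path

src = Path("tests/meta_learning/manual_test_prompt_evolution.py").read_text()
tree = ast.parse(src)  # raises SyntaxError if the file is invalid

funcs = [n.name for n in ast.walk(tree)
         if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
         and n.name.startswith("test_")]
asserts = sum(isinstance(n, ast.Assert) for n in ast.walk(tree))
print(f"{len(funcs)} test functions, {asserts} assert statements")
```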

### βœ“ manual_test_versioning.py (157 lines)
**Tests for the prompt versioning system**

**Coverage:**
- Snapshot creation
- Version comparison
- Rollback operations
- Change application
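
The rough shape of such a test, with `PromptVersionManager` method names taken from the Integration Points section below (argument shapes are guesses):

```python
# Assumed shape of a rollback test; apply_change/rollback appear in the
# Integration Points section, but these signatures are not confirmed.
def test_rollback_restores_snapshot(manager, prompt_path):
    original = prompt_path.read_text()

    manager.apply_change(prompt_path, "refined instruction text")
    assert prompt_path.read_text() != original, "change should be applied"

    manager.rollback(prompt_path)
    assert prompt_path.read_text() == original, "rollback should restore snapshot"
```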

## Test Statistics

| Metric | Value |
|--------|-------|
| Total Test Files | 2 |
| Test Scenarios | 19 |
| Code Lines (main test file) | 533 |
| Assertions | 30+ |
| Mock Messages | 28 |
| Environment Variables Tested | 5 |
| Integration Points | 3 |

## Directory Structure

```
tests/meta_learning/
β”œβ”€β”€ manual_test_prompt_evolution.py # Main test file
β”œβ”€β”€ manual_test_versioning.py # Versioning tests
β”œβ”€β”€ verify_test_structure.py # Structure validation
β”œβ”€β”€ README_TESTS.md # Usage guide
β”œβ”€β”€ TEST_SUMMARY.md # Coverage summary
β”œβ”€β”€ TEST_ARCHITECTURE.md # Visual diagrams
└── INDEX.md # This file
```

## Quick Reference

### Run Specific Test
```bash
# Just structure verification
python3 tests/meta_learning/verify_test_structure.py

# Just versioning tests
python3 tests/meta_learning/manual_test_versioning.py

# Just evolution tests
python3 tests/meta_learning/manual_test_prompt_evolution.py

# Both test suites
python3 tests/meta_learning/manual_test_versioning.py && \
python3 tests/meta_learning/manual_test_prompt_evolution.py
```

### Environment Variables
```bash
# Run with custom configuration
export ENABLE_PROMPT_EVOLUTION=true
export PROMPT_EVOLUTION_MIN_INTERACTIONS=20
export PROMPT_EVOLUTION_CONFIDENCE_THRESHOLD=0.8
export AUTO_APPLY_PROMPT_EVOLUTION=false
python3 tests/meta_learning/manual_test_prompt_evolution.py
```
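
Presumably the tool reads these at startup roughly like so (the fallback defaults shown here are assumptions, not documented values):

```python
# Sketch of reading the configuration above; the fallback defaults
# are assumptions, not the tool's documented behavior.
import os

enabled = os.getenv("ENABLE_PROMPT_EVOLUTION", "false").lower() == "true"
min_interactions = int(os.getenv("PROMPT_EVOLUTION_MIN_INTERACTIONS", "20"))
confidence = float(os.getenv("PROMPT_EVOLUTION_CONFIDENCE_THRESHOLD", "0.8"))
auto_apply = os.getenv("AUTO_APPLY_PROMPT_EVOLUTION", "false").lower() == "true"
```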

### Expected Runtime
- **verify_test_structure.py**: < 1 second
- **manual_test_versioning.py**: 2-5 seconds
- **manual_test_prompt_evolution.py**: 5-10 seconds

## Test Scenarios at a Glance

### Basic Functionality (16 tests)
1. Environment setup
2. Mock agent creation
3. Tool initialization
4. Insufficient history detection
5. Disabled meta-learning check
6. Full meta-analysis execution
7. Utility model verification
8. Analysis storage
9. Confidence threshold filtering
10. Auto-apply functionality
11. History formatting
12. Summary generation
13. Storage formatting
14. Default prompt structure
15. Version manager integration
16. Rollback functionality

### Edge Cases (3 tests)
1. Empty history handling (sketched below)
2. Malformed LLM response
3. LLM error handling
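
As an illustration of scenario 1, an empty-history test might look roughly like this, reusing the `MockAgent` sketch above (the import path follows the Related Files section, and the result shape is assumed):

```python
# Hedged sketch of the empty-history edge case; the import path follows
# the Related Files section, and the result shape is an assumption.
from python.tools.prompt_evolution import PromptEvolution  # assumed path

async def test_empty_history():
    agent = MockAgent(history=[])          # no conversation at all
    tool = PromptEvolution(agent=agent)
    result = await tool.execute()
    # The tool should decline gracefully rather than raise.
    assert "insufficient" in str(result).lower()
```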

## Mock Data Overview

### Conversation History (28 messages)
- **Success patterns:** Code execution, memory operations
- **Failure patterns:** Search timeouts, tool confusion
- **Gaps detected:** Email capability, file vs memory distinction
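
A plausible shape for those messages (illustrative only; the suite's actual schema may differ):

```python
# Illustrative shape only; the suite's real message schema may differ.
mock_history = [
    {"role": "user", "content": "Run this script and save the result."},
    {"role": "assistant", "content": "Code executed successfully."},  # success pattern
    {"role": "user", "content": "Search the web for recent papers."},
    {"role": "assistant", "content": "Search timed out after 30s."},  # failure pattern
]
```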

### Meta-Analysis Response
- **Failure patterns:** 2 detected
- **Success patterns:** 2 identified
- **Missing instructions:** 2 gaps
- **Tool suggestions:** 2 new tools
- **Prompt refinements:** 3 improvements (0.75-0.92 confidence)
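
The canned response presumably carries these categories in a structure along the following lines (field names are illustrative):

```python
# Field names are assumptions derived from the categories above,
# not the suite's actual mock payload.
mock_analysis = {
    "failure_patterns": ["search timeouts", "tool confusion"],
    "success_patterns": ["code execution", "memory operations"],
    "missing_instructions": ["email capability", "file vs memory distinction"],
    "tool_suggestions": ["send_email", "file_browser"],
    "prompt_refinements": [
        {"change": "clarify file vs memory usage", "confidence": 0.92},
        {"change": "add search timeout guidance", "confidence": 0.83},
        {"change": "tighten tool selection rules", "confidence": 0.75},
    ],
}
```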

## Integration Points

```
PromptEvolution Tool
β”œβ”€β”€ Agent.call_utility_model()
β”œβ”€β”€ Agent.read_prompt()
β”œβ”€β”€ Memory.get()
β”œβ”€β”€ Memory.insert_text()
β”œβ”€β”€ PromptVersionManager.apply_change()
└── PromptVersionManager.rollback()
```
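
A compressed sketch of how one analysis pass might thread through these calls (`format_history`, `parse_analysis`, and `format_for_storage` are hypothetical helpers; no signature here is confirmed):

```python
# Compressed, assumed flow through the integration points above;
# format_history, parse_analysis, and format_for_storage are
# hypothetical helpers, and no signature here is confirmed.
async def run_analysis(agent, memory, version_manager, threshold=0.8):
    system = agent.read_prompt("meta_learning.analyze.sys.md")
    raw = await agent.call_utility_model(system, format_history(agent.history))
    analysis = parse_analysis(raw)

    await memory.insert_text(format_for_storage(analysis))
    for refinement in analysis["prompt_refinements"]:
        if refinement["confidence"] >= threshold:
            version_manager.apply_change(refinement)
```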

## Success Indicators

When all tests pass, you should see:

```
βœ… ALL TESTS PASSED
βœ“ 16 basic functionality tests
βœ“ 3 edge case tests
βœ“ 30+ assertions
βœ“ 0 errors
βœ“ Clean cleanup

πŸŽ‰ COMPREHENSIVE TEST SUITE PASSED
```

## Maintenance Checklist

When updating `prompt_evolution.py`:

- [ ] Add test scenario for new feature
- [ ] Update mock data if needed
- [ ] Add new assertions for validation
- [ ] Update TEST_SUMMARY.md
- [ ] Update environment variables if added
- [ ] Run full test suite
- [ ] Update documentation

## Related Files

### Source Code
- `/python/tools/prompt_evolution.py` - Tool being tested
- `/python/helpers/prompt_versioning.py` - Version manager
- `/python/helpers/tool.py` - Tool base class
- `/python/helpers/memory.py` - Memory system

### Prompts
- `/prompts/meta_learning.analyze.sys.md` - Analysis system prompt
- `/prompts/agent.system.*.md` - Various agent prompts

### Documentation
- `/docs/extensibility.md` - Extension system
- `/docs/architecture.md` - System architecture

## Common Issues

### "ModuleNotFoundError"
**Solution:** Install dependencies
```bash
pip install -r requirements.txt
```

### "Permission denied" during cleanup
**Solution:** Check temp directory permissions
```bash
chmod -R 755 /tmp/test_prompt_evolution_*
```

### Tests hang or timeout
**Solution:** Check async operations
- Ensure mock methods are async when needed
- Verify asyncio.run() usage
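
To illustrate both points, a self-contained example of the correct pattern:

```python
import asyncio

class MockAgent:
    # The caller awaits this method, so the mock must be a coroutine
    # function; a plain `def` would fail at the await expression.
    async def call_utility_model(self, system: str, message: str) -> str:
        return '{"prompt_refinements": []}'

async def main():
    agent = MockAgent()
    print(await agent.call_utility_model("sys", "msg"))

asyncio.run(main())  # one top-level asyncio.run(); never nest it
```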

## Contributing

To add new test scenarios:

1. **Add test function** in `manual_test_prompt_evolution.py` (skeleton below)
2. **Update documentation** in relevant .md files
3. **Add assertions** to validate behavior
4. **Update TEST_SUMMARY.md** with new coverage
5. **Run full suite** to ensure no regressions
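
A skeleton for step 1 (the setup and assertion are placeholders to adapt; the helper and constructor are assumed):

```python
# Skeleton for a new scenario; replace the setup and the assertion
# with whatever the new feature actually requires.
async def test_my_new_feature():
    agent = MockAgent(history=build_mock_history())  # hypothetical helper
    tool = PromptEvolution(agent=agent)              # assumed constructor
    result = await tool.execute()
    assert result is not None, "new feature should produce a result"
```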

## Version History

- **v1.0** (2026-01-05) - Initial test suite creation
- 19 test scenarios
- 30+ assertions
- Comprehensive documentation

## Contact & Support

For questions about the test suite:
- Review this INDEX.md for overview
- Check README_TESTS.md for usage
- See TEST_ARCHITECTURE.md for design details
- Examine TEST_SUMMARY.md for coverage info