Commit be644a9

tooling: changelog automatic reporting
Signed-off-by: Andres Taylor <[email protected]>
1 parent 1baf02b commit be644a9

9 files changed (+800, -0 lines)

changelog/tooling/README.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
# Vitess Release Documentation Tooling

This directory contains automated tools and methodologies for analyzing Vitess pull requests and generating comprehensive release documentation.

## Contents

### Core Documentation
- **`automated-pr-analysis-guide.md`** - Complete methodology for analyzing hundreds of PRs efficiently using specialized agents
- **`pr-flag-metric-tracker.md`** - Agent definition file for automated PR analysis (copy to `.claude/agents/`)

### Examples
- **`examples/sample-pr-report.md`** - Example of individual PR analysis output
- **`examples/sample-final-report.md`** - Example of comprehensive release documentation
- **`examples/milestone-url-examples.md`** - How to find GitHub milestone URLs

### Scripts
- **`scripts/analyze-milestone.sh`** - Automated setup script for milestone analysis
- **`scripts/count-progress.sh`** - Progress monitoring during analysis

### Templates
- **`templates/release-notes-template.md`** - Template for final release documentation

## Quick Start

1. **Setup**: Copy `pr-flag-metric-tracker.md` to your `.claude/agents/` directory
2. **Find milestone**: Use `examples/milestone-url-examples.md` to locate your milestone URL
3. **Run setup**: Execute `scripts/analyze-milestone.sh <milestone-id>` (e.g., `85` for v23)
4. **Monitor progress**: Use `scripts/count-progress.sh` to track completion
5. **Generate report**: Follow the guide in `automated-pr-analysis-guide.md`

**Expected time**: 4-6 hours for ~276 PRs
**Expected cost**: $40-50 using Claude Code

## Prerequisites

- GitHub CLI (`gh`) authenticated
- Claude Code with agent support
- Repository access permissions
- ~5GB free disk space for reports

## Output

- Individual PR reports (`PR{number}.md`)
- Comprehensive API changes report
- Structured tables for release notes
Lines changed: 222 additions & 0 deletions
@@ -0,0 +1,222 @@
# Automated PR Analysis for Release Documentation - Methodology & Approach

## Overview

This document describes an efficient methodology for analyzing large numbers of pull requests (PRs) to generate comprehensive release documentation that focuses on public-facing API changes, flag modifications, and breaking changes.

_Key Innovation: Specialized Agent Architecture_

### Core Concept

Instead of manually reviewing hundreds of PRs, we used specialized `pr-flag-metric-tracker` agents that work in parallel to analyze PRs systematically for specific types of changes. The agent definition is in the file `pr-flag-metric-tracker.md`, which can be copied into the `.claude/agents/` directory.

### Agent Capabilities

- Automatically fetch PR content using GitHub CLI (`gh pr view`, `gh pr diff`, `gh api`)
- Parse code changes for flag additions/deletions/modifications
- Identify metric changes (Prometheus counters, gauges, etc.)
- Detect API changes (gRPC/HTTP endpoints)
- Find parser modifications (SQL syntax changes)
- Spot query planning behavior changes
- Generate standardized reports

## Methodology: Three-Phase Approach

### Phase 1: Bulk PR Discovery

#### Get all PRs from milestone
```bash
# The issues endpoint returns issues and PRs alike; keep only entries with a pull_request key
gh api 'repos/org/repo/issues?milestone=X&state=all' --paginate --jq '.[] | select(.pull_request) | .number'
```

### Phase 2: Parallel Analysis with Merge Filtering

Key Innovation: Check merge status BEFORE analysis to avoid wasting time on unmerged PRs.

#### Check if PR was actually merged (not just closed)
```bash
gh pr view PR_URL --json state,mergedAt
```

Decision Tree:
- If `mergedAt` is null → Create simple "PR not merged" file
- If `mergedAt` has a date → Perform full analysis
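
The decision tree above could be sketched as a pair of shell functions. This is a minimal sketch, not the agent's actual logic: the names `merged_at` and `triage_pr` are hypothetical, and it assumes that mapping a null `mergedAt` to an empty string is an acceptable way to signal "not merged".

```bash
# Hypothetical helper: print the merge timestamp of a PR, or nothing if it
# was never merged. `// ""` turns a null mergedAt into an empty string.
merged_at() {
  gh pr view "$1" --json mergedAt --jq '.mergedAt // ""'
}

# Hypothetical helper: given a PR number and its mergedAt value, either write
# the stub report and print "skip", or print "analyze".
triage_pr() {
  local pr="$1" merged="$2"
  if [ -z "$merged" ]; then
    echo "PR not merged" > "PR${pr}.md"
    echo "skip"
  else
    echo "analyze"
  fi
}

# Usage (needs an authenticated gh):
#   triage_pr 18520 "$(merged_at https://github.com/vitessio/vitess/pull/18520)"
```

Keeping the `gh` call separate from the decision makes the decision itself easy to check without network access.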

### Phase 3: Batched Agent Deployment

Deploy agents on batches of 5-10 PRs at a time to maximize parallelism while staying under rate limits.
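
One way to form such batches, sketched under the assumption that the PR numbers sit one per line in a file (the Replication Instructions use `all_pr_numbers.txt`). The function name `make_batches` and the default batch size of 5 are illustrative choices.

```bash
# Hypothetical helper: group the PR numbers in file $1 (one per line) into
# comma-separated batches of at most $2 numbers, one batch per output line.
make_batches() {
  xargs -n "${2:-5}" < "$1" | sed 's/ /, /g'
}

# Usage:
#   make_batches all_pr_numbers.txt 5
```

Each output line can then be handed to one agent call.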

#### Template Standardization

A template keeps the agents from being wordy and from filling the reports with unnecessary information that readers would then spend time skimming past. The agent profile describes the specific template to use.

### Key Principles

- Focus only on user-facing changes
- No PR metadata or implementation details
- Standardized sections for easy parsing
- "No public changes" for PRs with only internal modifications

## Efficiency Optimizations

1. Batch Processing

- Process 5-10 PRs per agent call
- Parallel execution across multiple agents
- Reduces API calls and context switching

2. Smart Filtering

- Merge status check first - eliminates ~30% of PRs immediately
- Public-facing focus - skip internal refactoring and test-only changes
- Template enforcement - consistent output format for easy aggregation

3. Pre-approved Command Strategy

Ensure agents only use pre-approved GitHub commands to avoid permission prompts:
- `gh pr view` (approved)
- `gh pr diff` (approved)
- `gh api` (approved)

4. Progressive Refinement

- Start with checking for closed and merged PRs
- Only analyze merged PRs
- Avoid re-work through systematic tracking

### Scalability Lessons

#### What Worked Well

1. Agent specialization - Single-purpose agents are more reliable than general-purpose ones
2. Parallel execution - 5x faster than sequential analysis
3. Standardized templates - Easy to aggregate and parse results
4. Merge filtering - Eliminates ~30% of work upfront
5. Batching - Reduces overhead and improves throughput

#### What to Avoid

- Verbose reports with implementation details
- Analyzing non-merged PRs
- Sequential processing
- Inconsistent report formats
- Re-analyzing already completed work

## Output Processing

### Individual PR Reports

Each PR gets a focused report following the standard template, making it easy to:
- Scan for breaking changes
- Identify new features
- Track deprecations
- Generate migration guides

### Aggregate Reporting

Parse all individual reports to create comprehensive release documentation with:
- Structured tables by change category
- Component-wise organization
- Breaking change highlights
- Migration timelines

## Replication Instructions

### Prerequisites

- **GitHub CLI (`gh`)**: Authenticated and configured (`gh auth login`)
- **Claude Code**: With agent support enabled
- **Repository access**: Read permissions to target repository
- **Agent setup**: Copy `pr-flag-metric-tracker.md` to your `.claude/agents/` directory
- **Disk space**: ~5GB for storing individual PR reports
- **Time estimate**: 4-6 hours for 276 PRs
- **Cost estimate**: $40-50 in Claude API usage

### Step-by-Step Process

#### 1. Initial Setup
```bash
# Authenticate GitHub CLI if not already done
gh auth status || gh auth login

# Copy agent definition to Claude agents directory (created if missing)
mkdir -p ~/.claude/agents
cp pr-flag-metric-tracker.md ~/.claude/agents/

# Create working directory
mkdir -p release-analysis && cd release-analysis
```

#### 2. Fetch Milestone PRs
```bash
# Get milestone ID from GitHub UI, then fetch all PR numbers.
# The issues endpoint also returns plain issues, so keep only entries with a pull_request key.
gh api 'repos/vitessio/vitess/issues?milestone=MILESTONE_ID&state=all' --paginate \
  --jq '.[] | select(.pull_request) | .number' > all_pr_numbers.txt

# Verify count
echo "Total PRs to analyze: $(wc -l < all_pr_numbers.txt)"
```

#### 3. Launch Batched Analysis
Launch agents in batches of 5 PRs at a time using this prompt template:

```
Analyze these 5 PRs in batch. For each:
1. Check merge: gh pr view https://github.com/vitessio/vitess/pull/XXXX --json state,mergedAt
2. If NOT merged: Create PRXXXX.md with just "PR not merged"
3. If MERGED: Create full analysis with template focusing on public-facing changes

PRs: 18520, 18521, 18522, 18523, 18524

Use ONLY: gh pr view, gh pr diff, gh api, Edit tool. Focus on flags, metrics, APIs, parser changes, query planning.
```
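
Filling in the template by hand for every batch is error-prone, so the prompt could be rendered from a list of PR numbers. The sketch below is an illustration only: `batch_prompt` is a hypothetical name, and the prompt text is copied from the template above.

```bash
# Hypothetical helper: print the batch prompt for the PR numbers passed as
# arguments, following the template above.
batch_prompt() {
  local count="$#" prs
  prs="$(printf '%s, ' "$@")"
  prs="${prs%, }"   # drop the trailing separator
  cat <<EOF
Analyze these ${count} PRs in batch. For each:
1. Check merge: gh pr view https://github.com/vitessio/vitess/pull/XXXX --json state,mergedAt
2. If NOT merged: Create PRXXXX.md with just "PR not merged"
3. If MERGED: Create full analysis with template focusing on public-facing changes

PRs: ${prs}

Use ONLY: gh pr view, gh pr diff, gh api, Edit tool. Focus on flags, metrics, APIs, parser changes, query planning.
EOF
}

# Usage:
#   batch_prompt 18520 18521 18522 18523 18524
```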

#### 4. Monitor Progress
```bash
# Check completion status
ls PR*.md | wc -l
echo "Progress: $(ls PR*.md | wc -l)/$(wc -l < all_pr_numbers.txt)"
```
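
A count alone does not say which PRs are still pending, and `ls PR*.md` prints an error while no report exists yet. The sketch below lists the PRs that still lack a report. It assumes reports are named `PR<number>.md` in the current directory and that the list file holds one number per line; `remaining_prs` is a hypothetical name.

```bash
# Hypothetical helper: print every PR number from the list file ($1) that has
# no PR<number>.md report in the current directory yet.
remaining_prs() {
  while read -r pr; do
    [ -n "$pr" ] || continue          # skip blank lines
    [ -e "PR${pr}.md" ] || echo "$pr"
  done < "$1"
}

# Usage:
#   remaining_prs all_pr_numbers.txt > todo.txt
#   echo "Remaining: $(wc -l < todo.txt)"
```

Feeding only the remaining numbers into the next batches avoids re-analyzing completed work.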

#### 5. Generate Final Report
Once all PRs are analyzed, create comprehensive release documentation by parsing the individual reports into structured tables.
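
As a starting point for that parsing step, the sketch below pulls one section out of every report and drops the placeholder entries. It assumes the report layout shown in the sample PR report (`## <Section>` headings, with `No ... changes` as the placeholder body); `collect_section` is a hypothetical name, and turning the collected lines into tables is left to the reader.

```bash
# Hypothetical helper: print the body lines of one "## <section>" from each
# report passed as an argument, prefixed with the report's filename, skipping
# blank lines and "No ... changes" placeholders.
collect_section() {
  local section="$1"; shift
  awk -v want="## $section" '
    FNR == 1 { in_section = 0 }   # never carry state across files
    /^## /   { in_section = ($0 == want || index($0, want " ") == 1); next }
    in_section && NF && $0 !~ /^No .* changes$/ { print FILENAME ": " $0 }
  ' "$@"
}

# Usage:
#   collect_section Flags PR*.md
#   collect_section "Parser Changes" PR*.md
```

The prefix match on the heading lets `Parser Changes` also match a heading such as `## Parser Changes (go/vt/sqlparser)`.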

## Troubleshooting

### Common Issues

**Permission Errors with GitHub CLI**:
- Ensure `gh pr view`, `gh pr diff`, `gh api` are pre-approved in Claude Code
- Check GitHub token permissions

**Agent Rate Limiting**:
- Reduce batch size from 5 to 3 PRs
- Add delays between batches if needed

**Inconsistent Report Formats**:
- Emphasize template adherence in agent prompts
- Review and correct agent instructions

**Missing PRs**:
- Some numbers may not resolve to a PR (normal in GitHub, which numbers issues and PRs in one sequence)
- Agents will handle this gracefully with "PR not found" reports

### Performance Tips

- **Batch size**: 5-10 PRs per agent call is optimal
- **Parallel agents**: Launch multiple agent batches simultaneously
- **Template enforcement**: Be strict about output format for easier parsing
- **Merge filtering**: Always check merge status first to avoid wasted analysis

## Expected Results

**Time Performance**:
- **Manual approach**: 2-3 minutes per PR = 9-14 hours for 276 PRs
- **Automated approach**: 4-6 hours total including setup
- **Efficiency gain**: roughly 33-71% time reduction across those ranges (about 57% at the midpoints)

**Output Quality**:
- Standardized format across all reports
- Focus on user-impacting changes only
- Easy to parse for release documentation
- Comprehensive coverage with no missed PRs
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
# Finding Vitess Milestone URLs

## Vitess Release Milestones

### Recent Releases

- **v23 milestone**: https://github.com/vitessio/vitess/milestone/85?closed=1
- **v22 milestone**: https://github.com/vitessio/vitess/milestone/84?closed=1
- **v21 milestone**: https://github.com/vitessio/vitess/milestone/83?closed=1

### Pattern for Future Releases

**Pattern**: `https://github.com/vitessio/vitess/milestone/NUMBER?closed=1`

### How to Find Vitess Milestone URLs

1. **Navigate to**: https://github.com/vitessio/vitess
2. **Click "Issues"** tab
3. **Click "Milestones"** link
4. **Find your milestone**: Look for the release milestone (e.g., "v23.0.0")
5. **Click milestone name**: This opens the milestone page
6. **Add `?closed=1`** to the URL to see closed/merged PRs
7. **Copy full URL**: Use this URL with the analysis tools

### Milestone ID vs URL

**You can use either**:
- **Milestone ID**: `85` (for API calls)
- **Milestone URL**: `https://github.com/vitessio/vitess/milestone/85?closed=1` (for web scraping)

### API Equivalent

```bash
# Using milestone ID directly; select(.pull_request) drops plain issues from the result
gh api 'repos/vitessio/vitess/issues?milestone=85&state=all' --paginate --jq '.[] | select(.pull_request) | .number'
```
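
When only the milestone URL is at hand, the ID for the API call can be derived from it. A minimal sketch, assuming URLs follow the `.../milestone/NUMBER?closed=1` pattern shown above; `milestone_id_from_url` is a hypothetical name.

```bash
# Hypothetical helper: extract the numeric milestone ID from a milestone URL.
milestone_id_from_url() {
  local id="${1%%[?]*}"   # drop any query string such as ?closed=1
  id="${id%/}"            # drop a trailing slash, if present
  id="${id##*/}"          # keep the last path segment
  case "$id" in
    ''|*[!0-9]*) echo "not a milestone URL: $1" >&2; return 1 ;;
    *) echo "$id" ;;
  esac
}

# Usage:
#   id="$(milestone_id_from_url 'https://github.com/vitessio/vitess/milestone/85?closed=1')"
#   gh api "repos/vitessio/vitess/issues?milestone=${id}&state=all" --paginate --jq '.[].number'
```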

### Tips

- **Include `?closed=1`** in URLs to list the milestone's closed items, which is where merged PRs appear
- **Milestone numbers** are sequential integers assigned by GitHub
- **Check milestone status** before starting analysis to ensure it's complete
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
# Vitess v23.0.0 API Changes Report

## Summary

This report documents all public-facing API changes, flag modifications, metric additions/removals, and parser enhancements that were merged into Vitess v23.0.0. It is based on an analysis of 276 pull requests from the v23 milestone.

### Table of Contents

- **[Major Changes](#major-changes)**
- **[Flag Standardization](#flag-standardization)**
- **[New Flags](#new-flags)**
- **[New Metrics](#new-metrics)**

---

## <a id="major-changes">Major Changes</a>

### <a id="flag-standardization">Flag Standardization</a>

**The most significant change in v23 is the systematic migration of CLI flags from underscore (`_`) to dash (`-`) notation.** This affects more than 1,000 flags across all Vitess components.

#### Key Flag Migration PRs

| PR | Component Focus | Flags Migrated | Description | Breaking Change |
|:--:|:---------------:|:--------------:|:------------|:---------------:|
| [#18009](https://github.com/vitessio/vitess/pull/18009) | gRPC | 234 | gRPC authentication, TLS, keepalive flags | ⚠️ Yes |
| [#18280](https://github.com/vitessio/vitess/pull/18280) | All Components | 1,170+ | **MEGA MIGRATION** - Most comprehensive flag refactor | ⚠️ Yes |

### <a id="new-flags">New Flags</a>

| Component | Flag Name | Type | Description | PR |
|:---------:|:---------:|:----:|:------------|:--:|
| vtgate, vttablet, vtcombo | `--querylog-time-threshold` | duration | Execution time threshold for query logging | [#18520](https://github.com/vitessio/vitess/pull/18520) |
| vtorc | `--allow-recovery` | bool | Allow VTOrc recoveries to be disabled from startup | [#18005](https://github.com/vitessio/vitess/pull/18005) |

### <a id="new-metrics">New Metrics</a>

#### VTGate

| Name | Dimensions | Description | PR |
|:----:|:----------:|:-----------:|:--:|
| `TransactionsProcessed` | `TransactionType`, `ShardDistribution` | Track transactions by type and shard distribution | [#18171](https://github.com/vitessio/vitess/pull/18171) |

---

*Generated from analysis of all v23 milestone pull requests*
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
# PR 18520 - Public Changes Analysis

## Flags
- Added `--querylog-time-threshold` (duration) to vtgate, vttablet, vtcombo for time-based query logging

## Metrics
No metric changes

## Public APIs
No API changes

## Parser Changes (go/vt/sqlparser)
No parser changes

## Query Planning
No query planning changes

## Summary
New time-based query logging flag added to match MySQL slow query log functionality.