feat(pipeline): add database schema, repositories, services, and loader for pipeline data #101
base: main
- Add test:unit script excluding *.integration.test.ts
- Add test:integration script for integration tests only
- Add turbo test:unit task with caching
- Rename data-anomaly.test.ts to integration test (requires extracted data)
Workflow: Ralph loop uses fast test:unit, pre-merge uses full test suite.
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Implement CacheService for R2-backed caching with conditional fetching
- Add ETag and Last-Modified support for cache validation
- Extend StatsService to use CacheService for efficient data fetching
- Add comprehensive tests for cache service functionality
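As a rough illustration of the conditional fetching mentioned in this commit, the sketch below builds `If-None-Match` and `If-Modified-Since` request headers from stored validators and treats a 304 response as permission to reuse the cached body. The names `CachedValidators`, `conditionalHeaders`, and `canReuseCache` are hypothetical and are not taken from the PR's CacheService.

```typescript
// Hypothetical shape of the validators a cache entry might remember
// from the last upstream response. Not the PR's actual type.
interface CachedValidators {
  etag?: string;
  lastModified?: string;
}

// Build conditional request headers from whatever validators are stored.
function conditionalHeaders(v: CachedValidators): Record<string, string> {
  const headers: Record<string, string> = {};
  if (v.etag !== undefined) headers["If-None-Match"] = v.etag;
  if (v.lastModified !== undefined) headers["If-Modified-Since"] = v.lastModified;
  return headers;
}

// A 304 Not Modified response means the cached body is still valid.
function canReuseCache(status: number): boolean {
  return status === 304;
}
```

With no stored validators the function returns an empty header set, so the first request for a resource is an ordinary unconditional fetch.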
8e01a1e to 43e026b
Code Review
Summary
This PR implements the database schema, repositories, services, and loader for the pipeline data ingestion system. It adds comprehensive support for storing and managing lacrosse stats from multiple leagues including player identities, stats, teams, games, standings, and scrape runs. The implementation follows Effect-TS patterns with typed errors, includes a caching layer with R2 backing, and provides services for player identity resolution and stats aggregation.
Title Convention
✅ Title follows conventional commit format correctly (all lowercase).
Feedback
Strengths:
Issues:
Suggestions
Overall
Strong implementation with excellent Effect-TS patterns and comprehensive test coverage. The architecture is well-designed with clear separation of concerns. Main concerns are the type safety violation (must fix per project rules), potential performance issues with large dataset loads, and some missing error handling. The caching strategy is well-thought-out and aligns with the spec. After addressing the type safety issue and improving error handling, this will be production-ready.
Recommendation: Request changes for type safety fix, then approve after verification.
Auto-generated review by Claude

Add Effect Schema definitions and RPC contracts for:
- Stats: LeaderboardEntry, GetLeaderboardInput, LeaderboardResponse
- Players: CanonicalPlayer, SourcePlayer, GetPlayerInput, GetPlayersInput
- Teams: TeamDetails, TeamWithRoster, GetTeamInput, GetTeamsInput
Co-Authored-By: Claude <[email protected]>

Add Effect-based repos and services for pipeline data access:
- StatsRepo/StatsService: leaderboard queries with cursor pagination
- PlayersRepo/PlayersService: canonical player lookup with source records
- TeamsRepo/TeamsService: team details with roster information
Uses PgDrizzle for database queries with proper error handling.
Co-Authored-By: Claude <[email protected]>

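For readers unfamiliar with cursor pagination, here is a minimal in-memory sketch of the idea behind the leaderboard queries: the cursor records the last row already returned, and the next page starts strictly after it in sort order. The `Entry`, `Cursor`, and `getPage` names, and the ordering by points then player id, are assumptions for illustration; the PR's StatsRepo runs its queries through PgDrizzle and may order and encode cursors differently.

```typescript
// Hypothetical row and cursor shapes, not the PR's LeaderboardEntry schema.
interface Entry { playerId: number; points: number; }
interface Cursor { points: number; playerId: number; }
interface Page { items: Entry[]; nextCursor: Cursor | null; }

// Assumed ordering: points descending, then playerId ascending as a tie-breaker.
// An entry "comes after" the cursor if it sorts strictly later than it.
function comesAfter(e: Entry, c: Cursor): boolean {
  return e.points < c.points || (e.points === c.points && e.playerId > c.playerId);
}

function getPage(entries: readonly Entry[], limit: number, cursor: Cursor | null): Page {
  const sorted = [...entries].sort((a, b) => b.points - a.points || a.playerId - b.playerId);
  const remaining = cursor === null ? sorted : sorted.filter((e) => comesAfter(e, cursor));
  const items = remaining.slice(0, limit);
  const last = items[items.length - 1];
  // Only hand out a cursor when more rows exist beyond this page.
  const nextCursor =
    remaining.length > limit && last !== undefined
      ? { points: last.points, playerId: last.playerId }
      : null;
  return { items, nextCursor };
}
```

The tie-breaker column matters: without it, two players with equal points could be skipped or repeated across pages.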
Add Effect RPC handlers for stats, players, and teams:
- StatsRpcs: getLeaderboard with multi-league filtering
- PlayersRpcs: getPlayer, getPlayers with canonical resolution
- TeamsRpcs: getTeam, getTeams with roster data
Integrates with LaxdbRpc group and handlers.
Co-Authored-By: Claude <[email protected]>

Add REST API for public stats access:
- POST /api/stats/leaderboard - paginated leaderboard with league filtering
- StatsApiGroup with HttpApiBuilder handlers
- Integrates with LaxdbApi definition
Co-Authored-By: Claude <[email protected]>

Add lacrosse stats leaderboard page:
- URL state management with Effect Schema validation
- StatsTable with sortable columns (points, goals, assists)
- LeagueFilter checkboxes (PLL, NLL, MLL, MSL, WLA)
- Cursor-based pagination component
- TanStack Query integration with prefetching
Co-Authored-By: Claude <[email protected]>

Add league season configuration:
- LEAGUE_SEASONS with start/end dates for each league
- isInSeason helper handling year-boundary cases (NLL: Dec-May)
- getActiveLeagues returns currently in-season leagues
- getAllActiveLeagues returns non-historical leagues
Used by cron worker to determine which leagues to extract.
Co-Authored-By: Claude <[email protected]>

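The year-boundary case is the subtle part of a season check: a December to May window has a start month greater than its end month, so a plain range comparison fails. A minimal sketch follows, assuming seasons are expressed as calendar months; `SeasonWindow` and `isMonthInSeason` are hypothetical names, and the PR's LEAGUE_SEASONS may well store full dates instead.

```typescript
// Hypothetical season window in calendar months (1 = January, 12 = December).
interface SeasonWindow {
  startMonth: number;
  endMonth: number;
}

function isMonthInSeason(window: SeasonWindow, date: Date): boolean {
  const month = date.getUTCMonth() + 1;
  if (window.startMonth <= window.endMonth) {
    // Window sits inside one calendar year.
    return month >= window.startMonth && month <= window.endMonth;
  }
  // Window wraps past December (e.g. Dec-May): in season if the month is
  // at or after the start, OR at or before the end.
  return month >= window.startMonth || month <= window.endMonth;
}
```

For a Dec-May window, January and December both count as in season while July does not.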
Add Cloudflare Workers scheduled handler:
- Hourly cron trigger for data extraction
- Season-aware league selection (only active leagues)
- Parallel extraction with error isolation per league
- Cache invalidation after successful loads
- Cloudflare worker types in tsconfig
Note: Extract/load functions are placeholders for MVP.
Co-Authored-By: Claude <[email protected]>
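"Parallel extraction with error isolation per league" can be sketched with plain promises: each league's extraction is wrapped so that a failure becomes a result value rather than a rejection, and one failing league cannot abort the rest. This is an illustrative sketch only. `extractAll`, `LeagueResult`, and the `extract` callback are hypothetical, and the PR itself uses Effect rather than raw promises.

```typescript
// Hypothetical per-league outcome: either a record count or an error message.
type LeagueResult =
  | { league: string; ok: true; records: number }
  | { league: string; ok: false; error: string };

async function extractAll(
  leagues: readonly string[],
  extract: (league: string) => Promise<number>,
): Promise<LeagueResult[]> {
  // All extractions start concurrently; each one catches its own error,
  // so Promise.all never sees a rejection.
  return Promise.all(
    leagues.map(async (league): Promise<LeagueResult> => {
      try {
        return { league, ok: true, records: await extract(league) };
      } catch (err) {
        return { league, ok: false, error: err instanceof Error ? err.message : String(err) };
      }
    }),
  );
}
```

The caller can then invalidate caches only for leagues whose result has `ok: true`.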

Configure Alchemy deployment for pipeline MVP:
- Add pipelineKV namespace for caching
- Enable API worker with PIPELINE_KV binding
- Add hourly cron trigger (0 * * * *)
- Update PR preview comments to include API URL
Co-Authored-By: Claude <[email protected]>

- Add @laxdb/pipeline workspace dependency to api package
- Remove unused PLLTeamDetail import from pll.extractor.ts
- Update bun.lock
Co-Authored-By: Claude <[email protected]>

Document simplified 8-item MVP plan for public /stats page:
- Stats/Players/Teams RPC endpoints
- /stats route with URL state management
- League filtering and pagination
- Unified cron worker with season awareness
- Alchemy deployment configuration
Co-Authored-By: Claude <[email protected]>

Code Review
Summary
This PR adds comprehensive database schema, repositories, services, and a loader for pipeline data. Key additions include:
This represents significant infrastructure for the MVP pipeline feature. The PR description correctly notes this is WIP with potential lint/test failures.
Title Convention
✅ Title follows conventional commit format:
Feedback
Critical Issues
Medium Priority Issues
Minor Issues
Positive Observations
Suggestions
Overall
Status: ✅ Approve with Comments
This is a well-architected, comprehensive foundation for the pipeline MVP. The code demonstrates:
Blocking Issues:
Recommended Before Merge:
Can Be Addressed in Follow-up PRs:
The PR correctly notes this is WIP with potential lint/test failures. The architecture and patterns are solid; the issues are mostly about tightening type safety and completing TODOs rather than fundamental problems. Great work on a complex feature!
Auto-generated review by Claude

- Use Effect Array.head + Option for safe array access
- Replace type assertions with type-safe iteration using LEAGUE_KEYS
- Remove unnecessary type assertion in stats.repo.ts
Co-Authored-By: Claude <[email protected]>

Code Review
Summary
This PR adds foundational pipeline infrastructure including 10 database tables for leagues/seasons/players/stats, Cloudflare Worker cron bindings, RPC/HTTP API endpoints for stats/players/teams, and supporting services. The PR establishes comprehensive database schema with proper indexing and foreign keys. However, several critical issues must be addressed before merging, particularly around incomplete Effect-TS integration in the cron handler and placeholder implementations.
Title Convention
✅ Title follows conventional commit format correctly: "feat(pipeline): add database schema, repositories, services, and loader for pipeline data" - lowercase with proper scope.
Feedback
Critical Issues - Must Fix Before Merge:
High Priority Issues:
Database Schema Issues:
Security Concerns:
Architecture Strengths:
Suggestions
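// Fragment intended to run inside an Effect.gen generator.
// A tagged ExtractionError is turned into a per-league result value,
// so one league failing does not abort the remaining leagues.
const results = yield* Effect.forEach(activeLeagues, (league) =>
  extractLeague(league, env).pipe(
    Effect.catchTag("ExtractionError", (err) =>
      Effect.succeed({ league, error: err })
    )
  )
);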
Overall
This is a well-structured foundation for the pipeline system with solid database design and proper architectural layering. However, it should not be merged in its current state due to:
The database schema is well-designed with appropriate indexes and constraints (with minor improvements needed). Once the Effect integration is corrected, placeholders are removed, and tests are added, this will be a strong addition to the codebase.
Recommendation: Mark as WIP and address critical issues (1-3) before requesting re-review.
Auto-generated review by Claude

- Move helper functions to module scope to fix consistent-function-scoping:
  - incremental.service.ts: normalizeOptions
  - season-config.ts: isTimestampStale
  - cache.service.ts: serialize
  - stats.service.ts: computeAggregatedTotals, groupStatsByLeague
  - mll.client.ts: normalizeTeamName, extractTeamId, parseDate, parseScore, getScheduleUrlPriority
- Fix no-negated-condition by flipping ternary expressions:
  - stats.service.ts: seasonId checks
  - nll.schema.ts: null checks for weight, team_id, winningSquadId
  - msl.client.ts: team_id undefined checks
- Fix max-depth in mll.client.ts by using early continues
- Add exhaustive case for 'unknown' in cache.service.ts switch
- Add compare function to toSorted() in cache.service.ts
- Remove unused imports across multiple files
Remaining warnings are:
- Test file false positives (unbound-method for mock assertions)
- Necessary type assertions for Drizzle query results
Co-Authored-By: Claude <[email protected]>

Code Review
Summary
This PR adds comprehensive infrastructure for pipeline data extraction and querying. It introduces:
The database schema is well-designed with proper indexes and foreign key constraints. The service/repo/contract structure follows Effect-TS patterns correctly. However, the cron handler has placeholder implementations (TODOs) and some type safety concerns exist.
Title Convention
✅ Title follows conventional commit format:
Feedback
Type Safety Concerns
Effect-TS Pattern Issues
Schema/Configuration
Missing Implementation
Test Coverage
SQL/Query Considerations
Suggestions
Overall
This is a well-structured foundation for the pipeline infrastructure. The database schema is properly normalized with good indexing strategy. The service/repo separation is clean and follows Effect-TS patterns in most places. The API surface (RPC + HTTP) is well-designed. However, there are blocking issues that should be addressed before merge:
Recommendation: Request changes to fix type safety issues and add test coverage. The placeholder TODOs should either be implemented or the PR scope adjusted to clearly indicate this is infrastructure-only (no data extraction yet).
Auto-generated review by Claude

Refactored mock KV setup to expose mock functions directly as separate variables (getMock, putMock, deleteMock, listMock) instead of accessing them via mockKV.kv.method, which caused unbound-method lint violations.
Co-Authored-By: Claude Opus 4.5 <[email protected]>
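The refactor described here can be pictured with a small framework-free sketch: the mock functions are created as standalone values first, then assembled into the `kv` object, so tests can inspect `getMock` and `putMock` directly instead of reading a method off the object (the pattern the unbound-method rule flags). `createSpy` and `createMockKV` are hypothetical stand-ins; the PR's tests presumably use their test framework's own mock functions, and only two of the four mocks are shown.

```typescript
// Minimal hand-rolled spy, standing in for a test-framework mock function.
function createSpy<A extends unknown[], R>(impl: (...args: A) => R) {
  const calls: A[] = [];
  const fn = (...args: A): R => {
    calls.push(args);
    return impl(...args);
  };
  return { fn, calls };
}

function createMockKV() {
  const store = new Map<string, string>();
  const getMock = createSpy((key: string) => store.get(key) ?? null);
  const putMock = createSpy((key: string, value: string) => {
    store.set(key, value);
  });
  // kv is built from standalone functions, so assertions can use
  // getMock/putMock without detaching a method from the kv object.
  const kv = { get: getMock.fn, put: putMock.fn };
  return { kv, getMock, putMock };
}
```

Code under test receives `kv`, while assertions read `getMock.calls` and `putMock.calls`.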
Code Review
Summary
This PR adds comprehensive pipeline data infrastructure including database schemas for 11 tables (leagues, seasons, teams, players, games, stats, standings, scrape runs, player identities, canonical players, and team seasons), along with repositories, services, and loader logic. It also adds CacheService with R2-backed caching, ETag support, RPC/HTTP API endpoints for stats/players/teams, cron-based scheduled data extraction, and extensive test coverage.
Title Convention
✅ Title follows conventional commit format:
Feedback
Positive:
Issues:
Suggestions
Overall
This is a substantial and well-structured addition to the pipeline infrastructure. The database schema design is solid, Effect patterns are correctly applied, and test coverage is comprehensive. However, the PR is explicitly marked WIP with placeholder implementations in the cron handler, which should not be merged to production. Once the TODOs are resolved, tests pass, and the architectural concerns (cross-package dependencies, missing DB connections) are addressed, this will be ready for merge.
Recommendation: Request changes - resolve WIP status, implement or remove TODO placeholders, fix cross-package schema dependencies.
Auto-generated review by Claude

- Fix unbound-method lint errors in cache.service.test.ts by exposing mock functions directly
- Add RpcStatsClient for typed stats API access
- Fix TypeScript errors in stats page with proper response typing
- Apply formatter fixes across pipeline and web packages
Co-Authored-By: Claude Opus 4.5 <[email protected]>

Code Review
Summary
This PR implements a comprehensive data pipeline infrastructure for ingesting and serving statistics from professional lacrosse leagues (PLL, NLL, MLL, MSL, WLA). The implementation includes:
Title Convention
✅ Title follows conventional commit format:
Feedback
🔴 Critical Issues (3)
🟡 Medium Issues (4)
🟢 Low Issues / Observations (3)
Positive Observations
✅ Excellent Effect-TS Pattern Compliance
✅ Strong Database Design
✅ Well-Architected Caching Strategy
✅ Solid Test Coverage
✅ Security Best Practices
Suggestions
Overall
Quality Score: 8.5/10
This is a well-architected, production-ready implementation that demonstrates strong engineering practices:
The three critical issues are straightforward to address and don't require architectural changes. The LoaderService duplication is acceptable for MVP but should be refactored post-merge to improve maintainability.
Recommendation: Request changes - address the 3 critical blockers (LoaderService tests, cron placeholder implementation/removal, rate limiting), then approve.
Auto-generated review by Claude

Summary
WIP
This is a work-in-progress PR - lint/tests may not pass yet.