GCP BigQuery Pipeline: Core Pipeline Implementation #51
Open
Labels
enhancement, gcp, pipeline
Description
Overview
Implement the core GCP BigQuery billing data pipeline with incremental loading, DuckDB integration, and CLI interface.
Tasks
- Create vendors/gcp/ directory structure
- Implement BigQuery to DuckDB data pipeline
- Add incremental loading with _PARTITIONTIME watermark
- Create state management for GCP imports
- Integrate with existing database backend abstraction
- Add error handling and retry logic
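The watermark and state-management tasks above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the planned implementation: the function names, the example table ID, and the JSON state file are all hypothetical, and the real pipeline may well keep its state in DuckDB the way the AWS pipeline does. The query uses the `@watermark` named-parameter syntax, which the BigQuery client would need to receive as a TIMESTAMP query parameter.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical table ID, for illustration only.
EXAMPLE_TABLE = "my-project.billing_export.gcp_billing_export_v1"


def build_incremental_query(
    table: str, watermark: Optional[datetime]
) -> Tuple[str, Dict[str, datetime]]:
    """Return (sql, params) selecting only partitions newer than the watermark.

    With no watermark (first run) the query reads every partition.
    """
    sql = f"SELECT *, _PARTITIONTIME AS partition_time FROM `{table}`"
    params: Dict[str, datetime] = {}
    if watermark is not None:
        sql += " WHERE _PARTITIONTIME > @watermark"
        params["watermark"] = watermark
    return sql, params


def advance_watermark(
    current: Optional[datetime], seen: Iterable[datetime]
) -> Optional[datetime]:
    """Return the newest partition time among the current watermark and `seen`."""
    candidates = list(seen)
    if current is not None:
        candidates.append(current)
    return max(candidates) if candidates else None


def load_watermark(path: str) -> Optional[datetime]:
    """Read the stored watermark, or None if no import has run yet."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as fh:
        raw = json.load(fh).get("watermark")
    return datetime.fromisoformat(raw) if raw else None


def save_watermark(path: str, watermark: datetime) -> None:
    """Write the watermark atomically so a crash cannot leave a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"watermark": watermark.isoformat()}, fh)
    os.replace(tmp, path)
```

One design point to settle: a strict `>` comparison skips the newest partition already loaded, so if billing rows can arrive late for recent partitions, the pipeline may need to re-read and replace that partition rather than skip it.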
Acceptance Criteria
- Pipeline can extract billing data from BigQuery
- Data is loaded into DuckDB following existing patterns
- Incremental loading prevents duplicate imports
- Import state is tracked the same way as in the AWS pipeline
- Proper error handling and logging
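For the error-handling and logging criterion, one possible shape is a small retry helper with exponential backoff. This is a sketch only: `with_retries`, its parameters, and the logger name are hypothetical, and which exceptions count as transient would depend on the BigQuery client errors the pipeline actually sees.

```python
import logging
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")
logger = logging.getLogger("gcp_pipeline")  # hypothetical logger name


def with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exceptions with exponential backoff.

    The delay doubles after each failure: base_delay, 2 * base_delay, ...
    The last failure is logged and re-raised unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == attempts:
                logger.error("giving up after %d attempts: %s", attempts, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning(
                "attempt %d/%d failed (%s); retrying in %.1fs",
                attempt, attempts, exc, delay,
            )
            sleep(delay)
    raise AssertionError("unreachable")  # the loop always returns or raises
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. A narrow `retry_on` tuple avoids retrying errors that will never succeed, such as bad credentials or a missing table.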
Dependencies
- Issue #49: GCP BigQuery Pipeline: Authentication & Configuration
- Issue #50: GCP BigQuery Pipeline: Data Discovery & Schema Analysis
Architecture
- Follow vendors/aws/ pattern for consistency
- Single DuckDB table with partitioning: gcp_billing.billing_data
- Reuse existing database backend interfaces