GCP BigQuery Pipeline: Core Pipeline Implementation #51

@JGrubb


Overview

Implement the core GCP BigQuery billing data pipeline with incremental loading, DuckDB integration, and CLI interface.

Tasks

  • Create vendors/gcp/ directory structure
  • Implement BigQuery to DuckDB data pipeline
  • Add incremental loading with _PARTITIONTIME watermark
  • Create state management for GCP imports
  • Integrate with existing database backend abstraction
  • Add error handling and retry logic
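The watermark and state-management tasks above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the state file name, its JSON layout, and the helper names (`load_watermark`, `save_watermark`, `build_incremental_query`) are all hypothetical. It only builds the query text and a parameter mapping; actually running the query against BigQuery is left out.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, Optional, Tuple

# Hypothetical location for the import state; the real pipeline may store
# state elsewhere (for example in the database backend).
STATE_FILE = Path("gcp_import_state.json")


def load_watermark(state_file: Path = STATE_FILE) -> Optional[datetime]:
    """Return the last imported _PARTITIONTIME, or None on the first run."""
    if not state_file.exists():
        return None
    raw = json.loads(state_file.read_text())["last_partition_time"]
    return datetime.fromisoformat(raw)


def save_watermark(ts: datetime, state_file: Path = STATE_FILE) -> None:
    """Persist the newest _PARTITIONTIME that was fully imported."""
    state_file.write_text(json.dumps({"last_partition_time": ts.isoformat()}))


def build_incremental_query(
    table: str, watermark: Optional[datetime]
) -> Tuple[str, Dict[str, Any]]:
    """Build a query that reads only partitions newer than the watermark.

    `table` is a fully qualified BigQuery table name supplied by the caller.
    The watermark is passed as the named parameter @watermark rather than
    being interpolated into the SQL text.
    """
    sql = f"SELECT *, _PARTITIONTIME AS partition_time FROM `{table}`"
    params: Dict[str, Any] = {}
    if watermark is not None:
        sql += " WHERE _PARTITIONTIME > @watermark"
        params["watermark"] = watermark
    return sql, params
```

The strict `>` comparison assumes a partition is complete once it has been imported. If rows can still arrive in the newest partition, a variant that uses `>=` and replaces that partition on reload would be needed instead.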

Acceptance Criteria

  • Pipeline can extract billing data from BigQuery
  • Data is loaded into DuckDB following existing patterns
  • Incremental loading prevents duplicate imports
  • State tracking similar to AWS pipeline
  • Proper error handling and logging
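One possible shape for the error handling and logging criterion is a small retry wrapper with exponential backoff. The sketch below is an assumption about how it might look, not the pipeline's actual code: the `with_retry` helper, the logger name, and the default set of retryable exceptions are all hypothetical, and a real implementation would likely catch the BigQuery client's own transient error types.

```python
import logging
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")
logger = logging.getLogger("gcp_pipeline")  # hypothetical logger name


def with_retry(
    fn: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the listed exceptions with exponential backoff.

    Exceptions not listed in `retryable` propagate immediately. After the
    final attempt the last exception is logged and re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retryable as exc:
            if attempt == attempts:
                logger.error("giving up after %d attempts: %s", attempt, exc)
                raise
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning(
                "attempt %d failed (%s); retrying in %.1fs", attempt, exc, delay
            )
            sleep(delay)
    raise AssertionError("unreachable")  # loop always returns or raises
```

The `sleep` argument is injectable so that tests can record the delays instead of waiting for them.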

Dependencies

Architecture

  • Follow vendors/aws/ pattern for consistency
  • Single DuckDB table with partitioning: gcp_billing.billing_data
  • Reuse existing database backend interfaces
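To make the single-table design concrete, here is a hedged sketch of how one partition's rows might be replaced in `gcp_billing.billing_data` so that re-importing the same partition does not duplicate data. The `partition_date` column, the `staging_rows` relation, and the `partition_replace_statements` helper are assumptions for illustration only; the issue does not specify the table's columns. The function only returns SQL text and parameters, which the existing database backend would then execute.

```python
from datetime import date
from typing import Any, List, Tuple

TARGET_TABLE = "gcp_billing.billing_data"  # table name taken from the issue


def partition_replace_statements(
    partition_date: date, staging: str = "staging_rows"
) -> List[Tuple[str, Tuple[Any, ...]]]:
    """Return (sql, params) pairs that swap in one partition's rows.

    Assumes the target table has a `partition_date` column (hypothetical)
    and that freshly extracted rows are available in `staging`.
    Deleting before inserting, inside one transaction, keeps a re-run of
    the same partition from producing duplicate rows.
    """
    return [
        ("BEGIN TRANSACTION", ()),
        (f"DELETE FROM {TARGET_TABLE} WHERE partition_date = ?", (partition_date,)),
        (f"INSERT INTO {TARGET_TABLE} SELECT * FROM {staging}", ()),
        ("COMMIT", ()),
    ]
```

Each pair could be passed to a connection's execute call in order; on failure the caller would issue a rollback instead of the final commit.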


Labels

  • enhancement: New feature or request
  • gcp: Google Cloud Platform related issues
  • pipeline: Data pipeline related issues
