@fivertran-karunveluru
Collaborator

AdMob Connector

Created: 2025-10-31

Business Owner: Marketing Analytics & Revenue Operations Team

Technical Owner: Data Engineering Team

Last Updated: 2025-10-31

Business Context

  • Data Source: Google AdMob API for mobile app advertising performance and revenue data
  • Business Criticality: High - supports revenue optimization, advertising performance analysis, and publisher monetization strategies
  • Data Consumers: Marketing teams, revenue operations, product managers, mobile app developers, executive leadership
  • Business SLAs: Data must be fresh within 4 hours for revenue reporting, 24 hours for historical analysis
  • Compliance Requirements: Privacy compliance for user data, revenue reporting accuracy for financial statements
  • Budget Constraints: AdMob API access is free, rate limits based on standard Google Cloud quotas

Technical Context

  • API Documentation: https://developers.google.com/admob/api
  • Authentication Method: OAuth2 with client credentials and refresh token
  • Rate Limits: Standard Google Cloud API quotas (varies by project), typically 1,000+ requests/hour
  • Data Volume:
    • Publisher Accounts: 1-10+ accounts per integration
    • Network Reports: 365+ days of daily reports per account
    • Ad Units: 10-1,000+ ad units per app
    • Apps: 1-100+ apps per publisher account
    • Countries: 50-200+ countries tracked per report
    • Daily Report Records: 100-50,000+ records per day per account
  • Data Velocity: Reports updated daily, accounts updated on configuration changes, real-time metrics available
  • Data Quality: Structured JSON with consistent schema, some fields may be null for incomplete data
  • Network Considerations: HTTPS only, RESTful API with Google Cloud infrastructure, global CDN

Operational Context

  • Deployment Environment: Development (sandbox), staging, and production environments
  • Monitoring Requirements: Alert on >2% error rate, >2 hour sync time, revenue data discrepancies
  • Maintenance Windows: Off-peak hours for non-critical updates, immediate deployment for revenue-critical fixes
  • Team Structure: Data Engineering team, Marketing Analytics, Revenue Operations, Mobile App Development
  • Escalation Path: Data Engineer → Team Lead → Marketing Director → CMO

API-Specific Details

  • Base Endpoint: https://admob.googleapis.com/v1
  • OAuth Token Endpoint: https://oauth2.googleapis.com/token
  • Authentication: Bearer token in Authorization header (OAuth2)
  • Pagination: Report responses are streamed (a header, then one entry per row, then a footer) and processed row by row rather than fetched through page tokens
  • Date Format: ISO 8601 (e.g., 2024-01-15T10:30:00Z) and YYYY-MM-DD for report dates
  • Response Format: JSON with nested objects and arrays; monetary metric values are expressed in micros (1/1,000,000 of a currency unit)
  • Key Endpoints:
    • /accounts - Publisher account information and settings
    • /accounts/{publisher_id}/networkReport:generate - Network performance reports
    • Future endpoints for mediation reports and app-level analytics
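
To make the report endpoint concrete, here is a minimal sketch of how a request for it could be assembled. It only builds the URL and JSON body and performs no network call. The helper name `build_network_report_request`, the `reportSpec` body shape, and the dimension and metric enum names are assumptions based on the public AdMob API reference, not taken from this PR's code, and should be checked against that reference.

```python
from datetime import date

BASE_URL = "https://admob.googleapis.com/v1"

# Enum names are assumptions; confirm them against the AdMob API reference.
DIMENSIONS = ["DATE", "AD_UNIT", "APP", "COUNTRY", "PLATFORM", "FORMAT"]
METRICS = ["ESTIMATED_EARNINGS", "AD_REQUESTS", "MATCHED_REQUESTS",
           "SHOW_RATE", "IMPRESSIONS", "CLICKS"]


def _date_obj(d: date) -> dict:
    # Report date ranges are assumed to use year/month/day objects, not strings.
    return {"year": d.year, "month": d.month, "day": d.day}


def build_network_report_request(publisher_id: str, start: date, end: date):
    """Return (url, json_body) for a networkReport:generate POST (hypothetical helper)."""
    if start > end:
        raise ValueError("start must not be after end")
    url = f"{BASE_URL}/accounts/{publisher_id}/networkReport:generate"
    body = {
        "reportSpec": {
            "dateRange": {"startDate": _date_obj(start), "endDate": _date_obj(end)},
            "dimensions": DIMENSIONS,
            "metrics": METRICS,
        }
    }
    return url, body
```

The body would be sent with an `Authorization: Bearer <access token>` header, as described under Authentication above.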

Data Schema Overview

  • accounts: Publisher account profile, currency settings, and reporting timezone
  • network_reports: Daily advertising performance metrics with dimensional breakdowns
    • Dimensions: Date, Ad Unit, App, Country, Platform, Ad Type
    • Metrics: Estimated Earnings, Ad Requests, Matched Requests, Show Rate, Impressions, Clicks

Data Replication Expectations

  • Initial Sync: Last 90 days of network reports by default (configurable up to 365 days)
  • Incremental Sync: Data since last successful sync timestamp using date-based cursors
  • Sync Frequency:
    • Production: Every 4 hours for network reports, daily for account updates
    • Development: Daily for all data types
  • Data Retention: 2+ years of historical report data for trend analysis
  • Backfill Capability: Full historical data available based on AdMob retention policies (typically 2 years)
  • Data Consistency: Daily updates with 4-hour maximum lag for operational reporting
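
The initial and incremental sync rules above can be sketched as a single date-window calculation. This is an illustration, not the connector's actual code: the state key `last_report_date` and the function name are hypothetical, and re-reading the cursor date itself (so that revised estimates for that day are picked up) is an assumption.

```python
from datetime import date, timedelta


def compute_sync_window(state: dict, today: date, initial_sync_days: int = 90):
    """Return (start, end) report dates for the next sync (hypothetical helper)."""
    # Clamp the configured look-back to the documented 1-365 day range.
    days = max(1, min(initial_sync_days, 365))
    last = state.get("last_report_date")  # hypothetical state key, "YYYY-MM-DD"
    if last is None:
        # Initial sync: go back the configured number of days.
        start = today - timedelta(days=days)
    else:
        # Incremental sync: resume from the cursor date, inclusive.
        start = date.fromisoformat(last)
    return start, today
```

Because the cursor date is read again on each run, this approach relies on upserts keyed by the primary key to stay idempotent.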

Operational Requirements

  • Uptime SLA: 99.5% availability during business hours (revenue reporting critical)
  • Performance SLA:
    • Initial sync: <4 hours for 90 days of report data
    • Incremental sync: <30 minutes for daily updates
  • Error Handling:
    • Automatic retry with exponential backoff and jitter
    • Dead letter queue for failed report records
    • Alert on consecutive sync failures during reporting periods
  • Monitoring:
    • API response times and error rates
    • Report record count trends and anomaly detection
    • Revenue data completeness validation
    • OAuth token refresh success rates
  • Security:
    • OAuth tokens refreshed automatically (1-hour expiration)
    • Access logs maintained for 2 years (audit compliance)
    • Publisher account data handling per privacy regulations
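
As an illustration of the automatic token refresh listed under Security, the sketch below caches an access token and renews it shortly before the one-hour expiry. The class `AccessTokenCache`, the injected `fetch` callable, and the 60-second safety margin are assumptions for illustration; in practice `fetch` would POST the form fields from `refresh_payload` to the token endpoint and return the parsed JSON response.

```python
import time
from typing import Callable, Optional

TOKEN_URL = "https://oauth2.googleapis.com/token"


def refresh_payload(client_id: str, client_secret: str, refresh_token: str) -> dict:
    # Standard OAuth2 refresh-token grant form fields.
    return {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }


class AccessTokenCache:
    """Reuse an access token until shortly before it expires (hypothetical helper)."""

    def __init__(self, fetch: Callable[[], dict], clock=time.monotonic, skew_seconds: int = 60):
        self._fetch = fetch      # returns {"access_token": ..., "expires_in": ...}
        self._clock = clock
        self._skew = skew_seconds
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            resp = self._fetch()
            self._token = resp["access_token"]
            # Refresh slightly early so a token never expires mid-request.
            self._expires_at = now + int(resp.get("expires_in", 3600)) - self._skew
        return self._token
```

Injecting `fetch` and `clock` keeps the cache testable without network access and keeps the token value out of any logging done by the HTTP layer.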

Rate Limiting Strategy

  • Standard Quotas: Assumed 1,000 requests/hour and 10,000 requests/day; actual quotas vary by Google Cloud project
  • Quota Management: A 429 status code indicates that the rate limit was exceeded; honor the Retry-After header when provided, otherwise back off exponentially
  • Jitter: Add random jitter of 10-30% of the wait time so that concurrent syncs do not retry in lockstep (thundering herd)
  • Monitoring: Track rate limit utilization and plan for quota increases if needed
  • Retry Strategy: Default 3 retry attempts with exponential backoff, configurable per request
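
The backoff rule described in this section can be sketched as a small delay calculator. The function name, the one-second base, and the 60-second cap are assumptions for illustration; the sketch also assumes that Retry-After arrives as a number of seconds rather than as an HTTP date.

```python
import random


def backoff_delay(attempt: int, retry_after=None, base: float = 1.0,
                  cap: float = 60.0, rng=random.random) -> float:
    """Seconds to wait before retry number `attempt` (0-based); hypothetical helper."""
    if retry_after is not None:
        # Assumes the header carries seconds, not an HTTP date.
        wait = float(retry_after)
    else:
        # Exponential growth: base, 2*base, 4*base, ... up to the cap.
        wait = min(cap, base * 2 ** attempt)
    # Add 10-30% of the wait as jitter so parallel clients spread out.
    jitter = wait * (0.10 + 0.20 * rng())
    return wait + jitter
```

With the default of 3 retry attempts, a caller would sleep for `backoff_delay(0)`, `backoff_delay(1)`, and `backoff_delay(2)` before giving up and raising the last error.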

Data Quality Considerations

  • Required Fields: publisher_id, id (composite key), date, estimated_earnings
  • Optional Fields: ad_unit_name, app_name, country, platform, ad_type, matched_requests
  • Data Validation:
    • Publisher IDs must be valid AdMob account identifiers
    • Report IDs must be unique (composite: publisher_id + date + ad_unit_id)
    • Dates must be valid and within supported range
    • Earnings values must be non-negative and converted from micros (divided by 1,000,000)
    • Metric values must be numeric (integer or float)
  • Data Completeness:
    • Accounts: 100% have basic publisher information
    • Network Reports: 95%+ have complete metric data for active ad units
    • Historical Reports: 90%+ completeness for date ranges
  • Duplicate Handling: Primary key constraints prevent duplicate report records (publisher_id + date + ad_unit_id)
  • Data Transformation:
    • Micros values (earnings) converted to decimal currency format
    • ISO 8601 timestamps for all date/time fields
    • UTC timezone normalization for consistency
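
The validation rules above could be checked per record along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, records are assumed to be already flattened and converted from micros, and the `pub-` followed by digits pattern is an assumed shape for publisher IDs.

```python
import re
from datetime import date
from decimal import Decimal

PUBLISHER_ID = re.compile(r"^pub-\d+$")  # assumed shape of a publisher ID
REQUIRED = ("publisher_id", "id", "date", "estimated_earnings")


def validate_report_row(row: dict) -> list:
    """Return a list of problems found in one flattened report record."""
    errors = [f"missing {name}" for name in REQUIRED if row.get(name) in (None, "")]

    pub = row.get("publisher_id")
    if pub and not PUBLISHER_ID.match(str(pub)):
        errors.append("invalid publisher_id")

    day = row.get("date")
    if day:
        try:
            date.fromisoformat(str(day))
        except ValueError:
            errors.append("invalid date")

    earnings = row.get("estimated_earnings")
    if earnings is not None:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(earnings, bool) or not isinstance(earnings, (int, float, Decimal)):
            errors.append("non-numeric estimated_earnings")
        elif earnings < 0:
            errors.append("negative estimated_earnings")
    return errors
```

Records that return a non-empty list would be candidates for the dead letter queue mentioned under Error Handling.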

Integration Points

  • Fivetran Destinations: Snowflake, BigQuery, Redshift, PostgreSQL, Databricks
  • Downstream Systems:
    • Marketing analytics platforms (Tableau, Looker, Power BI)
    • Revenue operations systems
    • Mobile app analytics dashboards
    • Financial reporting systems
    • Ad performance optimization tools
  • Data Dependencies: None - standalone advertising data source
  • External Dependencies: Google AdMob API availability, OAuth token refresh service

Disaster Recovery

  • Backup Strategy: Daily snapshots of all account and report tables
  • Recovery Time Objective: 4 hours for full data recovery
  • Recovery Point Objective: 2 hours maximum data loss for revenue-critical reports
  • Failover: Automatic failover to backup OAuth credentials
  • Testing: Monthly disaster recovery drills with revenue operations team validation

Compliance & Security

  • Data Classification: Revenue data - financially sensitive, publisher account data - business sensitive
  • Retention Policy: 2 years for report data (analytics), 3 years for account data (audit)
  • Access Controls: Strict role-based access with principle of least privilege
  • Audit Trail: All data access logged and monitored for compliance audits
  • Encryption: Data encrypted in transit (HTTPS) and at rest with enterprise-grade security
  • Privacy: Publisher account data privacy compliance, revenue data accuracy for financial reporting
  • OAuth Security: Refresh tokens stored securely, access tokens never logged or exposed

Performance Optimization

  • Streaming Processing: Generator-based data processing prevents memory accumulation
  • Checkpointing: Incremental state checkpoints every N records (configurable, default 100)
  • Caching: Account data cached during sync to avoid redundant API calls
  • Indexing: Publisher ID, date, ad unit ID, and app ID columns indexed for efficient querying
  • Partitioning: Report data partitioned by date and publisher for efficient querying
  • Account Processing: Multiple publisher accounts are processed sequentially rather than in parallel, to respect rate limits
  • Memory Management: Streaming approach prevents memory issues with large report datasets
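
The streaming and checkpointing points above can be illustrated with a generator-driven loop. This is not the connector's actual code: `upsert` and `checkpoint` are injected callables standing in for whatever write and state-saving operations the connector uses, and the state key `last_report_date` is hypothetical. Rows are assumed to arrive ordered by date.

```python
from typing import Callable, Iterable, Iterator


def iter_report_rows(pages: Iterable[list]) -> Iterator[dict]:
    # Yield rows one at a time so a full report is never held in memory.
    for page in pages:
        yield from page


def sync_rows(rows: Iterable[dict], upsert: Callable[[dict], None],
              checkpoint: Callable[[dict], None], every: int = 100) -> int:
    """Write rows as they arrive and save state every `every` records."""
    count = 0
    last_date = None
    for row in rows:
        upsert(row)
        count += 1
        last_date = row["date"]
        if count % every == 0:
            checkpoint({"last_report_date": last_date})
    if count % every != 0:
        # Save state for the final partial batch.
        checkpoint({"last_report_date": last_date})
    return count
```

A checkpoint taken in the middle of a day is only safe if the next sync re-reads that day and the writes are idempotent upserts.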

Troubleshooting Guide

  • Common Issues:
    • OAuth token refresh failed: Verify client_id, client_secret, and refresh_token validity
    • Rate limit exceeded: Reduce sync frequency, implement backoff delays, or request quota increase
    • Missing report data: Check publisher ID validity and account access permissions
    • Timeout errors: Increase timeout values (default 30 seconds) or reduce batch size
    • Revenue discrepancies: Validate micros-to-currency conversion (division by 1,000,000)
    • Date range errors: Verify initial_sync_days configuration and AdMob data availability
    • Network errors: Check API endpoint availability and network connectivity
  • Debug Mode: Enable detailed logging via enable_debug_logging configuration parameter
  • Support Contacts:
    • Technical: Data Engineering team
    • Business: Marketing Analytics team
    • Vendor: Google AdMob support (for API and account issues)
    • Revenue Operations: Revenue Operations team (for data accuracy validation)

Configuration Parameters

  • Required Parameters:
    • client_id: OAuth2 client identifier from Google Cloud Console
    • client_secret: OAuth2 client secret from Google Cloud Console
    • refresh_token: OAuth2 refresh token for automatic access token renewal
  • Optional Parameters:
    • sync_frequency_hours: Incremental sync frequency (default: 4 hours)
    • initial_sync_days: Historical data range for initial sync (default: 90 days, max: 365)
    • max_records_per_page: Batch size for checkpointing (default: 100, range: 1-1000)
    • request_timeout_seconds: HTTP request timeout (default: 30 seconds)
    • retry_attempts: Number of retry attempts for failed requests (default: 3)
    • enable_incremental_sync: Enable date-based incremental sync (default: true)
    • enable_debug_logging: Enable detailed logging (default: false)
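
A minimal sketch of how these parameters might be parsed and validated follows. It assumes configuration values arrive as strings and need to be converted; the function name `load_config` is hypothetical, and the lower bounds for parameters without a documented range are assumptions.

```python
REQUIRED = ("client_id", "client_secret", "refresh_token")

# name: (default, minimum, maximum); bounds not stated in the list above are assumptions.
INT_PARAMS = {
    "sync_frequency_hours": (4, 1, None),
    "initial_sync_days": (90, 1, 365),
    "max_records_per_page": (100, 1, 1000),
    "request_timeout_seconds": (30, 1, None),
    "retry_attempts": (3, 0, None),
}
BOOL_PARAMS = {"enable_incremental_sync": True, "enable_debug_logging": False}


def _to_bool(value) -> bool:
    text = str(value).strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise ValueError(f"not a boolean: {value!r}")


def load_config(raw: dict) -> dict:
    """Apply defaults, convert types, and reject out-of-range values."""
    missing = [name for name in REQUIRED if not raw.get(name)]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    cfg = {name: raw[name] for name in REQUIRED}
    for name, (default, low, high) in INT_PARAMS.items():
        value = int(raw.get(name, default))
        if value < low or (high is not None and value > high):
            raise ValueError(f"{name}={value} is out of range")
        cfg[name] = value
    for name, default in BOOL_PARAMS.items():
        cfg[name] = _to_bool(raw[name]) if name in raw else default
    return cfg
```

Failing fast on a bad value at startup is cheaper than discovering it partway through a multi-hour initial sync.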

Data Transformation Details

  • Metric Conversions:
    • Estimated Earnings: Micros value divided by 1,000,000 to get currency amount
    • All monetary values converted from micros to decimal format
  • Timestamp Handling:
    • All timestamps converted to UTC timezone
    • ISO 8601 format for all date/time fields
    • Date-only fields stored as YYYY-MM-DD strings
  • Schema Mapping:
    • API dimensionValues and metricValues flattened to columnar format
    • Display labels extracted where available for human-readable names
    • Composite primary keys generated from publisher_id, date, and ad_unit_id
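
The flattening and conversion steps above can be sketched for a single report row. The response field names used here (`value`, `displayLabel`, `microsValue`, `integerValue`, `doubleValue`), the `YYYYMMDD` form of the DATE dimension, and the `|` key separator are assumptions for illustration and should be confirmed against the AdMob API reference and the connector code.

```python
from decimal import Decimal

MICROS = Decimal(1_000_000)


def _metric(values: dict, name: str):
    # Each metric cell is assumed to carry exactly one typed value.
    cell = values.get(name, {})
    if "microsValue" in cell:
        return Decimal(cell["microsValue"]) / MICROS  # micros -> currency units
    if "integerValue" in cell:
        return int(cell["integerValue"])
    if "doubleValue" in cell:
        return float(cell["doubleValue"])
    return None


def flatten_row(publisher_id: str, row: dict) -> dict:
    """Turn one nested report row into a flat, columnar record."""
    dims = row.get("dimensionValues", {})
    mets = row.get("metricValues", {})
    raw = dims["DATE"]["value"]                    # assumed "YYYYMMDD"
    day = f"{raw[:4]}-{raw[4:6]}-{raw[6:8]}"       # stored as YYYY-MM-DD
    ad_unit = dims.get("AD_UNIT", {})
    return {
        "id": f"{publisher_id}|{day}|{ad_unit.get('value')}",  # composite key
        "publisher_id": publisher_id,
        "date": day,
        "ad_unit_id": ad_unit.get("value"),
        "ad_unit_name": ad_unit.get("displayLabel"),
        "country": dims.get("COUNTRY", {}).get("value"),
        "estimated_earnings": _metric(mets, "ESTIMATED_EARNINGS"),
        "impressions": _metric(mets, "IMPRESSIONS"),
        "show_rate": _metric(mets, "SHOW_RATE"),
    }
```

Using `Decimal` for the micros division avoids binary floating-point rounding in monetary values; a destination column type would need to be chosen to match.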

Checklist

Some tips and links to help validate your PR:

  • Tested the connector with fivetran debug command.
  • Added/Updated example specific README.md file, refer here for template.
  • Followed Python Coding Standards, refer here

@fivertran-karunveluru fivertran-karunveluru requested review from a team as code owners November 1, 2025 01:17
@fivertran-karunveluru fivertran-karunveluru added the hackathon For all the PRs related to the internal Fivetran 2025 Connector SDK Hackathon. label Nov 1, 2025
@github-actions github-actions bot added the size/XL PR size: extra large label Nov 1, 2025
@github-actions

github-actions bot commented Nov 1, 2025

🧹 Python Code Quality Check

✅ No issues found in Python Files.

This comment is auto-updated with every commit.
