@fivertran-karunveluru

Marqeta Connector

Created: 2025-10-31

Business Owner: Payments & Card Operations Team

Technical Owner: Data Engineering Team

Last Updated: 2025-10-31

Business Context

  • Data Source: Marqeta Core API for card issuing, payment processing, and transaction data
  • Business Criticality: Critical - supports payment processing, transaction reconciliation, compliance reporting, and fraud monitoring
  • Data Consumers: Payments operations, finance teams, compliance officers, fraud analysts, product teams, executive leadership
  • Business SLAs: Transaction data must be fresh within 1 hour for real-time monitoring, 4 hours for reporting and reconciliation
  • Compliance Requirements: PCI DSS compliance for card data, GDPR compliance for user data, KYC/AML reporting requirements, financial regulatory compliance
  • Budget Constraints: Marqeta API access is tied to the subscription tier; rate limits vary by plan and usage volume

Technical Context

  • API Documentation: https://www.marqeta.com/docs
  • Authentication Method: HTTP Basic Authentication (username and password)
  • Rate Limits: Vary by Marqeta subscription plan and API endpoint; the connector handles 429 responses by honoring the Retry-After header
  • Data Volume:
    • Users: 100-100,000+ cardholders per program
    • Businesses: 10-10,000+ business entities per program
    • Transactions: 1,000-10,000,000+ transactions per month per program
    • Real-time transaction data with high velocity updates
  • Data Velocity: Users and businesses updated on changes, transactions updated in near real-time (within seconds of processing)
  • Data Quality: Structured JSON with a consistent schema; some fields may be null for incomplete records; the connector serializes nested objects as JSON strings
  • Network Considerations: HTTPS only, RESTful API with standard reliability, global infrastructure with regional endpoints

Operational Context

  • Deployment Environment: Sandbox (for testing), staging, and production environments
  • Monitoring Requirements: Alert on >2% error rate, >1 hour sync time for transactions, transaction data discrepancies, rate limit violations
  • Maintenance Windows: Off-peak hours for non-critical updates, immediate deployment for payment-critical fixes
  • Team Structure: Data Engineering team, Payments Operations, Compliance officers, Fraud analysts, Product teams
  • Escalation Path: Data Engineer → Team Lead → Payments Director → CTO/CFO

API-Specific Details

  • Base Endpoint: https://api.marqeta.com/v3
  • Sandbox Endpoint: Same base URL with sandbox credentials
  • Authentication: HTTP Basic Auth (username and password in Authorization header)
  • Pagination: Index-based pagination using start_index and count parameters (max 500 per page, default 100)
  • Date Format: ISO 8601 (e.g., 2024-01-15T10:30:00Z)
  • Response Format: JSON with nested objects and arrays (the connector serializes nested objects to JSON strings before loading)
  • Key Endpoints:
    • /users - User profiles and cardholder information
    • /businesses - Business entities and company information
    • /transactions - Transaction records and payment processing data
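
A minimal sketch of how the authentication and pagination details above fit together. The helper names (basic_auth_header, page_request) are hypothetical and are not taken from the connector code; the base URL, the start_index and count parameters, and the 500-per-page ceiling come from the list above.

```python
import base64

BASE_URL = "https://api.marqeta.com/v3"  # base endpoint listed above
MAX_COUNT = 500  # page-size ceiling listed above


def basic_auth_header(username: str, password: str) -> dict:
    """Build the HTTP Basic Authorization header (hypothetical helper)."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def page_request(endpoint: str, start_index: int = 0, count: int = 100) -> tuple:
    """Return (url, query_params) for one page of an endpoint such as 'users'."""
    if start_index < 0:
        raise ValueError("start_index must be >= 0")
    if not 1 <= count <= MAX_COUNT:
        raise ValueError(f"count must be between 1 and {MAX_COUNT}")
    url = f"{BASE_URL}/{endpoint.strip('/')}"
    return url, {"start_index": start_index, "count": count}
```

The returned URL, parameters, and header can be passed to any HTTP client. Validating count before the request is sent surfaces the "pagination errors" case locally instead of as an API error.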

Data Schema Overview

  • users: User profiles with personal information, addresses, account status, and metadata
  • businesses: Business entities with legal names, DBA names, EINs, addresses, and registration details
  • transactions: Transaction records with amounts, currencies, merchant information, network details, settlement data, and processing metadata

Data Replication Expectations

  • Initial Sync: Configurable historical data fetch (default 90 days, up to 365 days) for compliance baseline
  • Incremental Sync: Data since last successful sync timestamp using last_sync_time state parameter
  • Sync Frequency:
    • Production: Every 1 hour for transaction data (critical for real-time monitoring), 4 hours for users and businesses
    • Development/Sandbox: Every 4 hours for all data types
  • Data Retention: 7 years of historical transaction data for compliance requirements (PCI DSS, financial regulations)
  • Backfill Capability: Full historical data available based on Marqeta retention policies and initial sync configuration
  • Data Consistency: Transaction updates land in the destination within 1 hour of processing; user/business updates within 4 hours
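
The initial and incremental sync rules above can be sketched as a pair of small functions. This is an illustration under assumptions, not the connector's actual code: sync_start and next_state are hypothetical names, and only the last_sync_time state key, the 90-day default, and the 365-day ceiling come from the list above.

```python
from datetime import datetime, timedelta, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # ISO 8601, matching the API date format


def sync_start(state: dict, now: datetime, initial_days: int = 90) -> str:
    """Return the timestamp from which this sync should fetch data."""
    last = state.get("last_sync_time")
    if last:
        # Incremental sync: resume from the last successful sync.
        return last
    # Initial sync: look back a configurable window, clamped to 1..365 days.
    days = max(1, min(initial_days, 365))
    return (now - timedelta(days=days)).strftime(ISO_FORMAT)


def next_state(now: datetime) -> dict:
    """State to checkpoint once a sync has completed successfully."""
    return {"last_sync_time": now.strftime(ISO_FORMAT)}
```

Capturing now once at the start of the sync and checkpointing that same value afterwards means records created while the sync is running are picked up by the next run rather than skipped.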

Operational Requirements

  • Uptime SLA: 99.9% availability (critical for payment processing operations)
  • Performance SLA:
    • Initial sync: <8 hours for 90 days of transaction data
    • Incremental sync: <30 minutes for hourly transaction updates, <45 minutes for user/business updates
  • Error Handling:
    • Automatic retry with exponential backoff and jitter
    • Dead letter queue for failed transaction records
    • Alert on consecutive sync failures during peak payment periods
    • Rate limit handling with Retry-After header support
  • Monitoring:
    • API response times and error rates
    • Transaction count trends and anomaly detection
    • Payment data completeness validation
    • Rate limit utilization tracking
  • Security:
    • HTTP Basic Auth credentials encrypted in transit and at rest
    • Access logs maintained for 7 years (compliance)
    • Cardholder PII handling per PCI DSS and privacy regulations
    • No storage of sensitive card data (tokenized values only)

Rate Limiting Strategy

  • Sandbox Plan: Lower rate limits for development and testing
  • Production Plans: Rate limits vary by subscription tier and program volume
  • Recommended: Implement exponential backoff with jitter for 429 responses
  • Error Handling: 429 status code indicates rate limit exceeded, respect Retry-After header
  • Monitoring: Track rate limit utilization and plan for subscription upgrades as volume grows
  • Mitigation: Adjust sync frequency or batch size if rate limits are consistently hit
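
One way to combine the Retry-After header with exponential backoff and jitter is sketched below. The function name retry_delay and the base and cap defaults are assumptions chosen for illustration; the variant shown is "full jitter", where the wait is drawn uniformly between zero and the exponential ceiling.

```python
import random
from typing import Callable, Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Prefer the server's hint when it is a plain number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; fall back to backoff.
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling  # full jitter: uniform in [0, ceiling)
```

A caller would sleep for the returned value after each 429 response and give up after a fixed number of attempts. Passing rng as a parameter keeps the function deterministic under test.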

Data Quality Considerations

  • Required Fields: token (primary key for all tables), created_time for timestamp tracking
  • Optional Fields: Various fields may be null depending on data completeness (e.g., address2, middle_name, metadata)
  • Data Validation:
    • Tokens must be unique within each table
    • Email addresses should be valid format when present
    • Transaction amounts must be numeric and non-negative
    • Currency codes must be valid ISO 4217 codes
    • Timestamps must be valid ISO 8601 format
  • Data Completeness:
    • Users: 100% have basic profile data (name, email, token)
    • Businesses: 95%+ have complete registration information
    • Transactions: 100% have core transaction data (amount, currency, timestamp)
  • Duplicate Handling: Primary key constraints prevent duplicate records by token
  • Nested Data: Complex nested objects (merchant, card_acceptor, acquirer, issuer, response, metadata) serialized as JSON strings
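
The nested-data handling and validation rules above might look roughly like this. The function names are hypothetical, and the amount and currency_code field names are assumptions about the transaction payload. The currency check only verifies the three-letter uppercase shape; it is not a lookup against the full ISO 4217 list.

```python
import json


def flatten_record(record: dict) -> dict:
    """Serialize nested objects and arrays to JSON strings; keep scalars as-is."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, (dict, list)):
            flat[key] = json.dumps(value, sort_keys=True)
        else:
            flat[key] = value
    return flat


def validate_transaction(record: dict) -> list:
    """Return a list of problems found; an empty list means the record passed."""
    problems = []
    if not record.get("token"):
        problems.append("missing token")
    amount = record.get("amount")
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount < 0:
        problems.append("amount must be numeric and non-negative")
    currency = record.get("currency_code")
    if not (isinstance(currency, str) and len(currency) == 3
            and currency.isalpha() and currency.isupper()):
        problems.append("currency_code must be a three-letter uppercase code")
    return problems
```

Records that fail validation could be routed to the dead letter queue mentioned under Error Handling rather than dropped.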

Integration Points

  • Fivetran Destinations: Snowflake, BigQuery, Redshift, PostgreSQL, Databricks
  • Downstream Systems:
    • Payment processing and reconciliation platforms
    • Fraud detection and monitoring systems
    • Financial reporting and analytics tools
    • Compliance and regulatory reporting systems
    • Business intelligence and data warehouses
  • Data Dependencies: None - standalone payment processing data source
  • External Dependencies: Marqeta API availability, payment processing infrastructure

Disaster Recovery

  • Backup Strategy: Daily snapshots of all transaction, user, and business tables
  • Recovery Time Objective: 2 hours for full data recovery
  • Recovery Point Objective: 1 hour maximum data loss for transaction-critical data
  • Failover: Automatic failover to backup API credentials
  • Testing: Monthly disaster recovery drills with payments team validation

Compliance & Security

  • Data Classification:
    • Transaction data - financial sensitive, PCI DSS regulated
    • User PII - highly sensitive, GDPR/CCPA regulated
    • Business data - moderately sensitive
  • Retention Policy: 7 years for transaction data (compliance), 3 years for operational user/business data
  • Access Controls: Strict role-based access with principle of least privilege
  • Audit Trail: All data access logged and monitored for compliance audits
  • Encryption: Data encrypted in transit (HTTPS) and at rest with enterprise-grade security
  • Privacy:
    • GDPR compliance for EU users
    • CCPA compliance for CA users
    • PCI DSS compliance for card data handling
    • KYC/AML compliance for user verification data

Performance Optimization

  • Parallel Processing: Multiple API calls for different data types (users, businesses, transactions)
  • Caching: User and business profile data cached for 1 hour
  • Indexing: Token (primary key), created_time, user_token, business_token, card_token, settlement_date columns indexed
  • Partitioning: Transaction data partitioned by date and settlement_date for efficient querying
  • Compression: Historical transaction data compressed for storage efficiency
  • Memory Efficiency: Streaming generator-based processing prevents memory accumulation for large datasets
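
The streaming, generator-based approach can be illustrated with a short sketch. The fetch_page callable is a hypothetical stand-in for the real HTTP call, and the data and is_more keys are assumptions about the paginated response envelope.

```python
from typing import Callable, Iterator


def iter_records(fetch_page: Callable[[int, int], dict], count: int = 100) -> Iterator[dict]:
    """Yield records one at a time so only a single page is held in memory."""
    start_index = 0
    while True:
        page = fetch_page(start_index, count)
        records = page.get("data", [])
        for record in records:
            yield record
        # Stop when the API reports no further pages or returns an empty page.
        if not page.get("is_more") or not records:
            break
        start_index += len(records)
```

Because the generator advances start_index by the number of records actually returned, it stays correct even if the API returns a short page.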

Troubleshooting Guide

  • Common Issues:
    • Rate limit exceeded: Reduce sync frequency, adjust batch size, or upgrade Marqeta plan
    • Authentication failed: Verify HTTP Basic Auth credentials and account permissions
    • Missing transaction data: Check date range filters and transaction state filters
    • Timeout errors: Increase timeout values or reduce batch size
    • Transaction reconciliation discrepancies: Validate settlement_date handling and currency conversion
    • User/business data incomplete: Check API permissions and data completeness in source
    • Pagination errors: Verify start_index and count parameters are within valid ranges
  • Debug Mode: Enable detailed logging via enable_debug_logging configuration parameter
  • Support Contacts:
    • Technical: Data Engineering team
    • Business: Payments Operations team
    • Vendor: Marqeta support (for API and account issues)
    • Compliance: Legal/Compliance team (for privacy and regulatory issues)

Checklist

Some tips and links to help validate your PR:

  • Tested the connector with fivetran debug command.
  • Added/Updated example specific README.md file, refer here for template.
  • Followed Python Coding Standards, refer here

@fivertran-karunveluru fivertran-karunveluru requested a review from a team as a code owner November 1, 2025 00:25
@fivertran-karunveluru fivertran-karunveluru added the hackathon For all the PRs related to the internal Fivetran 2025 Connector SDK Hackathon. label Nov 1, 2025
@github-actions github-actions bot added the size/XL PR size: extra large label Nov 1, 2025
@github-actions

github-actions bot commented Nov 1, 2025

🧹 Python Code Quality Check

✅ No issues found in Python Files.

This comment is auto-updated with every commit.
