@fivertran-karunveluru

Meltwater API Connector

Created: 2025-10-31

Business Owner: Marketing & Communications Team

Technical Owner: Data Engineering Team

Last Updated: 2025-10-31

Business Context

  • Data Source: Meltwater API for media monitoring, webhook configurations, and content tagging
  • Business Criticality: High - supports brand monitoring, media intelligence, and real-time alerting
  • Data Consumers: Marketing teams, PR specialists, social media managers, brand managers, communications directors
  • Business SLAs: Data must be fresh within 4 hours for real-time monitoring alerts, 24 hours for reporting and analytics
  • Compliance Requirements: GDPR compliance for media mentions, brand protection requirements, competitive intelligence
  • Budget Constraints: Meltwater API access based on subscription tier, rate limits vary by plan

Technical Context

  • API Documentation: https://developer.meltwater.com/
  • Authentication Method: Bearer token authentication (API key)
  • Rate Limits: Varies by Meltwater plan, typically 1000-10000 requests/hour
  • Data Volume:
    • Hooks: 10-500+ webhook configurations per account
    • Searches: 50-1000+ saved search definitions per account
    • Tags: 100-5000+ tag definitions per account
    • Updates: Continuous updates for active searches and webhook configurations
  • Data Velocity: Webhook configurations updated on changes, searches updated daily, tags updated as users modify taxonomies
  • Data Quality: Structured JSON with consistent schema, some fields may be null for incomplete records
  • Network Considerations: HTTPS only, RESTful API with standard reliability, global infrastructure

Operational Context

  • Deployment Environment: Development (sandbox), staging, and production environments
  • Monitoring Requirements: Alert when the error rate exceeds 2%, when a sync runs longer than 2 hours, or when webhook deliveries fail
  • Maintenance Windows: Weekends for non-critical updates, immediate deployment for monitoring-critical fixes
  • Team Structure: Data Engineering team, Marketing Operations, Brand Management, PR teams
  • Escalation Path: Data Engineer → Team Lead → Marketing Director → CMO

API-Specific Details

  • Base Endpoint: https://api.meltwater.com/v2
  • Authentication: Bearer token in Authorization header
  • Pagination: Offset-based pagination with limit and offset parameters (max 1000 per page, default 100)
  • Date Format: ISO 8601 (e.g., 2024-01-15T10:30:00Z)
  • Response Format: JSON with nested objects and arrays
  • Key Endpoints:
    • /hooks - Webhook configurations and event subscriptions
    • /searches - Saved search definitions and query parameters
    • /tags - Tag definitions and categorization metadata
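
The pagination and authentication details above can be sketched as follows. This is a minimal illustration, not the connector's actual code: `fetch_page` is a hypothetical callable that wraps the HTTP GET, and stopping on a short page is an assumption, since the response envelope is not described here.

```python
from typing import Any, Callable, Dict, Iterator, List

BASE_URL = "https://api.meltwater.com/v2"  # base endpoint listed in this description
MAX_PAGE_SIZE = 1000  # documented per-page maximum


def auth_headers(api_key: str) -> Dict[str, str]:
    """Build the Bearer-token Authorization header."""
    return {"Authorization": f"Bearer {api_key}", "Accept": "application/json"}


def paginate(
    fetch_page: Callable[[int, int], List[Dict[str, Any]]],
    limit: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield records one by one, advancing the offset page by page.

    fetch_page(offset, limit) is a hypothetical helper that returns the list
    of records for one page. Assumption: a page shorter than `limit`
    (including an empty page) means there is nothing left to read.
    """
    limit = max(1, min(limit, MAX_PAGE_SIZE))  # clamp to the documented range
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:
            break
        offset += len(page)
```

In a real connector, `fetch_page` would issue a GET against a path such as `f"{BASE_URL}/hooks"` with `limit` and `offset` query parameters and the headers from `auth_headers`. Because `paginate` is a generator, only one page is held in memory at a time.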

Data Schema Overview

  • hooks: Webhook configurations with URLs, event types, and subscription status
  • searches: Saved search definitions with queries, language filters, source types, and active status
  • tags: Tag definitions with names, descriptions, colors, categories, and usage counts

Data Replication Expectations

  • Initial Sync: Last 90 days of historical data by default (configurable up to 365 days)
  • Incremental Sync: Data since last successful sync timestamp using updated_since parameter
  • Sync Frequency:
    • Production: Every 4 hours for all data types
    • Development: Daily for all data types
  • Data Retention: 2 years of historical data for trend analysis and compliance
  • Data Consistency: Destination data lags the source by at most 4 hours for monitoring operations
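
The initial and incremental sync rules above could be combined into one helper that picks the `updated_since` value. This is a sketch under assumptions: the state key `last_sync` is hypothetical, and the cursor is assumed to be stored as an ISO 8601 string.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict

DEFAULT_LOOKBACK_DAYS = 90  # initial sync default from this description
MAX_LOOKBACK_DAYS = 365  # documented upper bound for the lookback


def to_iso8601(ts: datetime) -> str:
    """Format a timezone-aware datetime as UTC ISO 8601, e.g. 2024-01-15T10:30:00Z."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def updated_since(
    state: Dict[str, Any],
    now: datetime,
    lookback_days: int = DEFAULT_LOOKBACK_DAYS,
) -> str:
    """Return the value to send as the updated_since parameter.

    Incremental sync: reuse the timestamp saved after the last successful
    sync ("last_sync" is a hypothetical state key).
    Initial sync: go back lookback_days, clamped to 1..365 days.
    """
    last_sync = state.get("last_sync")
    if last_sync:
        return last_sync
    days = max(1, min(lookback_days, MAX_LOOKBACK_DAYS))
    return to_iso8601(now - timedelta(days=days))
```

After a successful run, the connector would save the sync start time under the same key so the next run resumes from it. Saving the start time rather than the end time avoids missing records updated while the sync was in progress.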

Operational Requirements

  • Uptime SLA: 99.5% availability during business hours (monitoring critical)
  • Performance SLA:
    • Initial sync: <2 hours for 90 days of historical data
    • Incremental sync: <30 minutes for regular updates
  • Error Handling:
    • Automatic retry with exponential backoff and jitter
    • Dead letter queue for failed webhook/search records
    • Alert on consecutive sync failures during critical monitoring periods
  • Monitoring:
    • API response times and error rates
    • Search count trends and anomaly detection
    • Webhook configuration completeness validation
    • Tag usage and categorization quality
  • Security:
    • API keys rotated every 90 days
    • Access logs maintained for 2 years (compliance)
    • Webhook URLs validated and secured

Rate Limiting Strategy

  • Starter Plan: 1,000 requests/hour, 10,000 requests/day
  • Professional Plan: 5,000 requests/hour, 50,000 requests/day
  • Enterprise Plan: 10,000 requests/hour, 100,000 requests/day
  • Recommended: Implement exponential backoff with jitter for 429 responses
  • Error Handling: 429 status code indicates rate limit exceeded, respect Retry-After header
  • Monitoring: Track rate limit utilization and plan for subscription upgrades
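
One way to compute the wait time after a 429 response is sketched below. It prefers the `Retry-After` header when present and otherwise falls back to exponential backoff with full jitter. The `base` and `cap` values are illustrative assumptions, not Meltwater recommendations, and only the delta-seconds form of `Retry-After` is handled.

```python
import random
from typing import Callable, Mapping


def retry_delay(
    attempt: int,
    headers: Mapping[str, str],
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Return how many seconds to wait before retrying a 429 response.

    attempt is zero-based. If Retry-After carries a number of seconds, it
    wins. Otherwise the delay is drawn uniformly from
    [0, min(cap, base * 2**attempt)], the "full jitter" variant.
    """
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

The caller would sleep for the returned value and retry, giving up after a fixed number of attempts. A plain `dict` is case-sensitive, so pass a case-insensitive mapping (such as the headers object of the HTTP client in use) when the header casing is not guaranteed.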

Data Quality Considerations

  • Required Fields: id, name, created_at, updated_at
  • Optional Fields: description, tags, color, category, usage_count
  • Data Validation:
    • IDs must be unique within account
    • Webhook URLs must be valid format
    • Search queries must be valid syntax
    • Tag names must be non-empty strings
    • Timestamps must be valid ISO 8601 format
  • Data Completeness:
    • Hooks: 100% have basic configuration data
    • Searches: 95%+ have complete query definitions
    • Tags: 90%+ have categorization information
  • Duplicate Handling: Primary key constraints prevent duplicate records
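
The validation rules above might be checked per record before it is written to the destination. The sketch below is illustrative only: the `url` field name for hooks is an assumption, and `datetime.fromisoformat` accepts a subset of ISO 8601 rather than the full standard.

```python
from datetime import datetime
from typing import Any, Dict, List
from urllib.parse import urlparse

REQUIRED_FIELDS = ("id", "name", "created_at", "updated_at")


def is_iso8601(value: Any) -> bool:
    """Return True if value parses as an ISO 8601 timestamp such as 2024-01-15T10:30:00Z."""
    if not isinstance(value, str):
        return False
    try:
        # Older Python versions do not accept a trailing "Z", so normalise it.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def validate_record(record: Dict[str, Any], table: str) -> List[str]:
    """Return a list of problems found in one record; an empty list means valid."""
    problems = [
        f"missing {field}"
        for field in REQUIRED_FIELDS
        if record.get(field) in (None, "")
    ]
    for field in ("created_at", "updated_at"):
        if record.get(field) and not is_iso8601(record[field]):
            problems.append(f"{field} is not ISO 8601")
    if table == "hooks":
        # Assumption: the webhook target is stored under a "url" key.
        parsed = urlparse(str(record.get("url", "")))
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("url is not a valid HTTP(S) URL")
    return problems
```

Records with a non-empty problem list could be logged and routed to the dead letter queue mentioned under Operational Requirements instead of failing the whole sync.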

Integration Points

  • Fivetran Destinations: Snowflake, BigQuery, Redshift, PostgreSQL
  • Downstream Systems:
    • Business intelligence and analytics platforms
    • Brand monitoring dashboards
    • Competitive intelligence systems
    • Marketing automation platforms
    • Social media management tools
    • PR and communications platforms
  • Data Dependencies: None - standalone media monitoring data source
  • External Dependencies: Meltwater API availability, webhook endpoint accessibility

Disaster Recovery

  • Backup Strategy: Daily snapshots of all webhook, search, and tag tables
  • Recovery Time Objective: 4 hours for full data recovery
  • Recovery Point Objective: 4 hours maximum data loss for monitoring-critical data
  • Failover: Automatic failover to backup API credentials
  • Testing: Monthly disaster recovery drills with Marketing team validation

Compliance & Security

  • Data Classification: Media mentions - public data, webhook configurations - operational sensitive, tag taxonomies - business sensitive
  • Retention Policy: 2 years for historical monitoring data (compliance), 1 year for operational configuration data
  • Access Controls: Strict role-based access with principle of least privilege
  • Audit Trail: All data access logged and monitored for compliance audits
  • Encryption: Data encrypted in transit (HTTPS only) and at rest
  • Privacy: GDPR compliance for EU mentions, CCPA compliance for California mentions, brand protection requirements

Performance Optimization

  • Parallel Processing: Multiple API calls for different data types (hooks, searches, tags)
  • Caching: Tag definitions cached for 24 hours
  • Indexing: ID, name, created_at, updated_at, and status columns indexed
  • Partitioning: Historical data partitioned by date for efficient querying
  • Compression: Historical configuration data compressed for storage efficiency
  • Streaming: Generators yield records page by page, so large datasets are never held in memory all at once
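
The parallel-processing item above could look roughly like this, with one worker thread per data type. `fetch_table` is a hypothetical callable that returns all records for one endpoint. Unlike the streaming approach, this sketch collects each table into a list, so it suits the smaller configuration tables rather than very large result sets.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

TABLES = ("hooks", "searches", "tags")  # the three data types in this description


def fetch_all(
    fetch_table: Callable[[str], List[Dict[str, Any]]],
) -> Dict[str, List[Dict[str, Any]]]:
    """Fetch every table concurrently and return the records keyed by table name.

    fetch_table(name) is a hypothetical helper that downloads one endpoint.
    Any exception raised by a worker is re-raised here by result().
    """
    with ThreadPoolExecutor(max_workers=len(TABLES)) as pool:
        futures = {name: pool.submit(fetch_table, name) for name in TABLES}
        return {name: future.result() for name, future in futures.items()}
```

All three workers draw on the same hourly and daily request quota, so running them in parallel shortens the sync but does not increase the number of requests available.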

Troubleshooting Guide

  • Common Issues:
    • Rate limit exceeded: Reduce sync frequency or upgrade Meltwater plan
    • API key expired: Verify token validity and permissions; rotate the key if it is past the 90-day rotation window
    • Missing search data: Check search active status and query validity
    • Timeout errors: Increase timeout values or reduce batch size
    • Webhook delivery failures: Validate webhook URLs and event type subscriptions
    • Pagination issues: Verify offset/limit parameters and total record counts
  • Debug Mode: Enable detailed logging for webhook, search, and tag data troubleshooting
  • Support Contacts:
    • Technical: Data Engineering team
    • Business: Marketing Operations team
    • Vendor: Meltwater support (for API and account issues)
    • Compliance: Legal/Compliance team (for privacy and regulatory issues)

Checklist

Some tips and links to help validate your PR:

  • Tested the connector with fivetran debug command.
  • Added/Updated example specific README.md file, refer here for template.
  • Followed Python Coding Standards, refer here

@fivertran-karunveluru requested review from a team as code owners on October 31, 2025 22:19
@fivertran-karunveluru added the hackathon label (for all the PRs related to the internal Fivetran 2025 Connector SDK Hackathon) on Oct 31, 2025
@github-actions bot added the size/XL label (PR size: extra large) on Oct 31, 2025
@github-actions

🧹 Python Code Quality Check

✅ No issues found in Python Files.
