@fivertran-karunveluru

Humaans Connector

Created: 2025-10-31

Business Owner: HR & People Operations Team

Technical Owner: Data Engineering Team

Last Updated: 2025-10-31

Business Context

  • Data Source: Humaans API for HR, people management, and organizational data
  • Business Criticality: High - supports HR operations, document management, and recruitment analytics
  • Data Consumers: HR teams, people operations, recruitment teams, finance teams, executive leadership
  • Business SLAs: Data must be fresh within 6 hours for HR operations, 24 hours for reporting and analytics
  • Compliance Requirements: GDPR compliance for employee data, document retention policies, data privacy regulations
  • Budget Constraints: Humaans API access included with subscription, rate limits based on plan tier

Technical Context

  • API Documentation: https://app.humaans.io/api (API endpoint and documentation)
  • Authentication Method: Bearer token authentication with API token
  • Rate Limits: Varies by Humaans plan, typically 1000-5000 requests/hour
  • Data Volume:
    • Companies: 1-100+ companies per integration
    • Documents: 100-50,000+ documents per organization
    • Job Roles: 10-500+ job roles per organization
    • Documents per person: 5-50+ documents per employee
  • Data Velocity: Company data updated on changes, documents updated on upload/modification, job roles updated quarterly or on hiring needs
  • Data Quality: Structured JSON with consistent schema, some fields may be null for incomplete records
  • Network Considerations: HTTPS only, RESTful API with standard reliability, cloud-based infrastructure

Operational Context

  • Deployment Environment: Development (sandbox), staging, and production environments
  • Monitoring Requirements: Alert on >2% error rate, >3 hour sync time, document data discrepancies
  • Maintenance Windows: Weekends for non-critical updates, immediate deployment for HR-critical fixes
  • Team Structure: Data Engineering team, HR Operations, People Operations, Recruitment teams
  • Escalation Path: Data Engineer → Team Lead → HR Director → CHRO

API-Specific Details

  • Base Endpoint: https://app.humaans.io/api
  • Authentication: Bearer token in Authorization header
  • Pagination: Offset-based using $skip and $limit parameters (max 250 per page, default 100)
  • Date Format: ISO 8601 (e.g., 2024-01-15T10:30:00Z)
  • Response Format: JSON with nested objects and arrays, paginated responses in data field
  • Key Endpoints:
    • /companies - Company information, addresses, and contact details
    • /documents - Employee documents with metadata, file information, and expiry dates
    • /job-roles - Job role definitions with salary ranges, requirements, and department information

Data Schema Overview

  • company: Company profile, addresses, contact information, and organizational details
  • document: Employee documents with metadata, file information, expiry dates, and tags
  • job_role: Job role definitions with salary ranges, requirements, responsibilities, and benefits

Data Replication Expectations

  • Initial Sync: Last 90 days of company, document, and job role data for baseline (configurable)
  • Incremental Sync: Data since last successful sync timestamp using updated_at field
  • Sync Frequency:
    • Production: Every 6 hours for company and document data, daily for job roles
    • Development: Daily for all data types
  • Data Retention: 7 years of historical document data for compliance requirements
  • Backfill Capability: Full historical data available based on Humaans retention policies
  • Data Consistency: Eventually consistent, with a maximum lag of 6 hours for HR operations
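
A rough sketch of the incremental logic above, assuming each record carries an ISO 8601 `updated_at` string and that the cursor is stored as a string between syncs. `select_changed` and `parse_ts` are illustrative names, and the filtering is shown client-side because this description does not say whether the API can filter by `updated_at` on the server.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional, Tuple

INITIAL_LOOKBACK_DAYS = 90  # baseline window for the first sync (configurable)


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-01-15T10:30:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def select_changed(records: Iterable[dict],
                   cursor: Optional[str],
                   now: datetime) -> Tuple[List[dict], str]:
    """Return records updated after the cursor, plus the advanced cursor.

    With no cursor (first sync) the window starts 90 days before `now`,
    which must be timezone-aware.
    """
    since = parse_ts(cursor) if cursor else now - timedelta(days=INITIAL_LOOKBACK_DAYS)
    changed = [r for r in records if parse_ts(r["updated_at"]) > since]
    newest = max((parse_ts(r["updated_at"]) for r in changed), default=since)
    return changed, newest.isoformat()
```

The returned cursor would only be persisted after the batch has been written successfully, so a failed sync re-reads the same window instead of skipping records.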

Operational Requirements

  • Uptime SLA: 99.5% availability during business hours (critical for HR operations)
  • Performance SLA:
    • Initial sync: <4 hours for 90 days of data
    • Incremental sync: <1 hour for daily updates
  • Error Handling:
    • Automatic retry with exponential backoff
    • Dead letter queue for failed document/company records
    • Alert on consecutive sync failures during business hours
  • Monitoring:
    • API response times and error rates
    • Document count trends and anomaly detection
    • Company and job role data completeness validation
  • Security:
    • API tokens refreshed as needed
    • Access logs maintained for 3 years (compliance)
    • Employee PII handling per privacy regulations

Rate Limiting Strategy

  • Standard Plan: 1,000 requests/hour, 10,000 requests/day
  • Professional Plan: 3,000 requests/hour, 30,000 requests/day
  • Enterprise Plan: 5,000 requests/hour, 50,000 requests/day
  • Recommended: Implement exponential backoff with jitter for 429 responses
  • Error Handling: 429 status code indicates rate limit exceeded, respect Retry-After header
  • Monitoring: Track rate limit utilization and plan for subscription upgrades
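
One way to combine the Retry-After header with exponential backoff and jitter is sketched below. `retry_delay` and its `base` and `cap` defaults are assumptions chosen for illustration; only the numeric (seconds) form of Retry-After is handled here.

```python
import random
from typing import Optional


def retry_delay(attempt: int,
                retry_after: Optional[str] = None,
                base: float = 1.0,
                cap: float = 60.0) -> float:
    """Seconds to wait before retrying after a 429 response.

    attempt is 0 for the first retry. A numeric Retry-After value takes
    precedence over the computed backoff.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed in this sketch; fall back
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)  # full jitter: anywhere up to the ceiling
```

The caller would sleep for the returned value and give up after a fixed number of attempts, which is where the dead letter queue and alerting described earlier would take over.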

Data Quality Considerations

  • Required Fields:
    • Company: id, name, created_at, updated_at
    • Document: id, person_id, name, created_at, updated_at
    • Job Role: id, name, created_at, updated_at
  • Optional Fields:
    • Company: description, website, phone, address fields
    • Document: description, type, filename, file_size, mime_type, url, expiry_date, tags
    • Job Role: description, department, location, employment_type, salary_range, requirements, responsibilities, benefits, is_active
  • Data Validation:
    • Company IDs must be unique
    • Document IDs must be unique within organization
    • Job role IDs must be unique
    • Timestamps must be valid ISO 8601 format
    • File sizes must be non-negative
    • Salary ranges must have valid min/max values
  • Data Completeness:
    • Companies: 100% have basic identification data
    • Documents: 95%+ have file metadata and person association
    • Job Roles: 90%+ have basic role information
  • Duplicate Handling: Primary key constraints prevent duplicate records
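
The per-record rules above could be checked along these lines. This is a sketch under assumptions: `validate` is an illustrative name, and `salary_range` is assumed to be an object with `min` and `max` keys, which this description does not confirm.

```python
from datetime import datetime
from typing import List

REQUIRED_FIELDS = {
    "company": ["id", "name", "created_at", "updated_at"],
    "document": ["id", "person_id", "name", "created_at", "updated_at"],
    "job_role": ["id", "name", "created_at", "updated_at"],
}


def _is_iso8601(value) -> bool:
    """True if the value parses as an ISO 8601 timestamp."""
    try:
        datetime.fromisoformat(str(value).replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def validate(table: str, record: dict) -> List[str]:
    """Return a list of problems found in one record (empty if clean)."""
    errors = [f"missing {f}" for f in REQUIRED_FIELDS[table]
              if record.get(f) is None]
    for field in ("created_at", "updated_at"):
        if record.get(field) is not None and not _is_iso8601(record[field]):
            errors.append(f"invalid timestamp in {field}")
    size = record.get("file_size")
    if size is not None and size < 0:
        errors.append("negative file_size")
    salary = record.get("salary_range")
    if isinstance(salary, dict):  # assumed shape: {"min": ..., "max": ...}
        low, high = salary.get("min"), salary.get("max")
        if low is not None and high is not None and low > high:
            errors.append("salary_range min exceeds max")
    return errors
```

Records that return a non-empty list are candidates for the dead letter queue rather than the destination table.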

Integration Points

  • Fivetran Destinations: Snowflake, BigQuery, Redshift, PostgreSQL
  • Downstream Systems:
    • HR information systems (HRIS)
    • Document management platforms
    • Recruitment and applicant tracking systems
    • People analytics platforms
    • Financial reporting systems
  • Data Dependencies: None - standalone HR/organizational data source
  • External Dependencies: Humaans API availability, document storage infrastructure

Disaster Recovery

  • Backup Strategy: Daily snapshots of all company, document, and job role tables
  • Recovery Time Objective: 6 hours for full data recovery
  • Recovery Point Objective: 3 hours maximum data loss for HR-critical data
  • Failover: Automatic failover to backup API credentials
  • Testing: Monthly disaster recovery drills with HR team validation

Compliance & Security

  • Data Classification: Employee PII (highly sensitive), document data (confidential), job role data (internal)
  • Retention Policy: 7 years for document data (compliance), 3 years for operational HR data
  • Access Controls: Strict role-based access with principle of least privilege
  • Audit Trail: All data access logged and monitored for compliance audits
  • Encryption: Data encrypted in transit and at rest with enterprise-grade security
  • Privacy: GDPR compliance for EU employees, CCPA compliance for CA employees, document privacy regulations

Performance Optimization

  • Parallel Processing: Multiple API calls for different data types (companies, documents, job roles)
  • Caching: Company data cached for 24 hours
  • Indexing: Document ID, company ID, person ID, job role ID, and date columns indexed
  • Partitioning: Document data partitioned by date and person for efficient querying
  • Compression: Historical document metadata compressed for storage efficiency
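
The parallel fetch of the three data types might look like the sketch below, using the standard library thread pool. `fetch_endpoint` is a hypothetical callable that returns all records for one endpoint path, and `fetch_all` is an illustrative name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

ENDPOINTS = ("/companies", "/documents", "/job-roles")


def fetch_all(fetch_endpoint: Callable[[str], List[dict]],
              endpoints: Sequence[str] = ENDPOINTS) -> Dict[str, List[dict]]:
    """Fetch each endpoint in its own thread and collect results by path.

    fetch_endpoint(path) is a hypothetical callable that returns every
    record for one endpoint.
    """
    with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
        futures = {path: pool.submit(fetch_endpoint, path) for path in endpoints}
        # result() re-raises any exception from the worker thread
        return {path: future.result() for path, future in futures.items()}
```

Parallel calls presumably draw on the same rate limit budget, so the worker count is worth keeping small.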

Troubleshooting Guide

  • Common Issues:
    • Rate limit exceeded: Reduce sync frequency or upgrade Humaans plan
    • API token expired: Verify token validity and permissions
    • Missing document data: Check person ID and document active status
    • Timeout errors: Increase timeout values or reduce batch size
    • Job role data discrepancies: Validate employment type and salary range handling
    • Document expiry date issues: Verify date format parsing and timezone handling
  • Debug Mode: Enable detailed logging for company, document, and job role data troubleshooting
  • Support Contacts:
    • Technical: Data Engineering team
    • Business: HR Operations team
    • Vendor: Humaans support (for API and account issues)
    • Compliance: Legal/Compliance team (for privacy and regulatory issues)
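
For the document expiry date issue listed above, a small normalisation helper can make the timezone handling explicit. This is a sketch: `parse_expiry` is an illustrative name, the exact format of `expiry_date` is not confirmed in this description, and treating naive timestamps as UTC is an assumption.

```python
from datetime import date, datetime, timezone
from typing import Optional


def parse_expiry(value: Optional[str]) -> Optional[date]:
    """Normalise an expiry value to a calendar date in UTC."""
    if not value:
        return None
    if len(value) == 10:  # date-only form, e.g. 2024-12-31
        return date.fromisoformat(value)
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc).date()
```

A timestamp late in the day with a negative offset rolls over to the next UTC date, which is a common source of off-by-one expiry discrepancies.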

Checklist

Some tips and links to help validate your PR:

  • Tested the connector with the fivetran debug command.
  • Added or updated the example-specific README.md file, following the template.
  • Followed the Python Coding Standards.

@fivertran-karunveluru fivertran-karunveluru requested review from a team as code owners November 1, 2025 00:18
@fivertran-karunveluru fivertran-karunveluru added the hackathon For all the PRs related to the internal Fivetran 2025 Connector SDK Hackathon. label Nov 1, 2025
@github-actions github-actions bot added the size/XL PR size: extra large label Nov 1, 2025
github-actions bot commented Nov 1, 2025

🧹 Python Code Quality Check

✅ No issues found in Python Files.

This comment is auto-updated with every commit.
