Skip to content

Conversation

@fivetran-surabhisingh
Copy link
Collaborator

Jira ticket

Closes <ADD TICKET LINK HERE, EACH PR MUST BE LINKED TO A JIRA TICKET>

Description of Change

<MENTION A SHORT DESCRIPTION OF YOUR CHANGES HERE>

Testing

<MENTION ABOUT YOUR TESTING DETAILS HERE, ATTACH SCREENSHOTS IF NEEDED (WITHOUT PII)>

Checklist

Some tips and links to help validate your PR:

  • Tested the connector with fivetran debug command.
  • Added/Updated example specific README.md file, refer here for template.
  • Followed Python Coding Standards, refer here

@fivetran-surabhisingh fivetran-surabhisingh self-assigned this Oct 31, 2025
@fivetran-surabhisingh fivetran-surabhisingh requested a review from a team as a code owner October 31, 2025 18:23
@fivetran-surabhisingh fivetran-surabhisingh added the hackathon For all the PRs related to the internal Fivetran 2025 Connector SDK Hackathon. label Oct 31, 2025
@github-actions github-actions bot added the size/S PR size: small label Oct 31, 2025
@github-actions
Copy link

🧹 Python Code Quality Check

⚠️ Flake8 has detected issues, please fix the issues before merging:

📎 Download full report from workflow artifacts.

📌 Only Python files changed in this PR were checked.

🔍 See how this check works

This comment is auto-updated with every commit.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new DagWorks connector that fetches pipeline metadata and run status from the DagWorks API. However, the implementation uses an incorrect API pattern that is incompatible with the Fivetran Connector SDK.

Key Changes:

  • Adds a new connector at connectors/DagWorks/connector.py using decorator-based configuration
  • Implements a single table schema for pipeline runs
  • Fetches data from a DagWorks API endpoint

Comment on lines +6 to +7
import requests
from fivetran_connector_sdk import connector, config, state, records, log, schema
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BLOCKER: The imports are using an incorrect SDK API. The Fivetran Connector SDK does not expose connector, config, state, records, or schema as importable modules with these APIs. The correct SDK v2+ imports should be: from fivetran_connector_sdk import Connector, from fivetran_connector_sdk import Logging as log, and from fivetran_connector_sdk import Operations as op. See the template at template_example_connector/connector.py lines 11-17 for the correct import pattern.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +9 to +12
CONFIG = config.Config(
base_url=config.StringField(description="DagWorks API base URL"),
api_key=config.SecretField()
)
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BLOCKER: The SDK does not use a config.Config() API for configuration. Configuration should be defined in a separate configuration.json file with fields like base_url and api_key, which the SDK will auto-validate. Configuration values are then accessed in the update() function via the configuration parameter dictionary (e.g., configuration['base_url']).

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +14 to +23
SCHEMA = schema.Schema(
name="dagworks_runs",
columns={
"id": schema.StringColumn(),
"dag_id": schema.StringColumn(),
"status": schema.StringColumn(),
"start_time": schema.StringColumn(),
"end_time": schema.StringColumn(),
}
)
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BLOCKER: The SDK does not use a schema.Schema() API with column type classes. Schema should be defined in a schema() function that returns a list of table dictionaries. For example: return [{"table": "dagworks_runs", "primary_key": ["id"], "columns": {"id": "STRING", "dag_id": "STRING", "status": "STRING", "start_time": "STRING", "end_time": "STRING"}}]. See template_example_connector/connector.py lines 65-82 for the correct pattern.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +25 to +31
@connector(
name="DagWorksConnector",
version="0.1.0",
config=CONFIG,
schema=SCHEMA,
)
def run_connector(ctx: state.Context):
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BLOCKER: The SDK does not use a decorator-based @connector API. The correct pattern is to define an update(configuration: dict, state: dict) function and optionally a schema(configuration: dict) function, then create a connector object using connector = Connector(update=update, schema=schema). The update function should accept configuration and state dictionaries, not a Context object. See template_example_connector/connector.py lines 85-154 for the correct structure.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +32 to +34
headers = {"Authorization": f"Bearer {ctx.config.api_key}"}
response = requests.get(f"{ctx.config.base_url}/runs", headers=headers)
response.raise_for_status()
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BLOCKER: No retry logic is implemented for the API request. Network calls must include retry logic with exponential backoff to handle transient failures. Wrap the request in a try-except block catching specific exceptions like requests.Timeout, requests.ConnectionError, and implement retries with a pattern like: for attempt in range(__MAX_RETRIES): try: ... except (...) as e: if attempt == __MAX_RETRIES - 1: raise; sleep_time = min(60, 2 ** attempt); time.sleep(sleep_time).

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +1 to +4
"""
DagWorks connector for Fivetran Connector SDK.
Fetches pipeline metadata and run status from DagWorks API.
"""
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The module docstring is missing the required reference to technical documentation. It should follow the template format: Include a description and add references like See the Technical Reference documentation (https://fivetran.com/docs/connectors/connector-sdk/technical-reference#update) and the Best Practices documentation (https://fivetran.com/docs/connectors/connector-sdk/best-practices) for details. See template_example_connector/connector.py lines 1-5.

Copilot generated this review using guidance from repository custom instructions.
Fetches pipeline metadata and run status from DagWorks API.
"""

import requests
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The requests import is missing an explanatory comment. Every import must have an inline comment explaining its purpose. This should be: import requests # For making HTTP API requests (NOTE: provided by SDK runtime). Additionally, requests should never be added to requirements.txt as it's provided by the SDK runtime.

Copilot generated this review using guidance from repository custom instructions.
for run in response.json().get("runs", []):
records.write("dagworks_runs", run)

return ctx.update_state({"last_sync": "now"})
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state value "now" is not a valid timestamp format. State should use ISO 8601 format timestamps (e.g., "2024-01-15T10:30:00Z") or other meaningful cursor values. Using "now" as a string literal is not actionable for incremental syncs and doesn't provide proper cursor-based pagination.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +1 to +9
"""
DagWorks connector for Fivetran Connector SDK.
Fetches pipeline metadata and run status from DagWorks API.
"""

import requests
from fivetran_connector_sdk import connector, config, state, records, log, schema

CONFIG = config.Config(
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The connector is missing the required main block for debugging. At the end of the file, you must include: connector = Connector(update=update, schema=schema) followed by an if __name__ == "__main__": block that loads configuration from a JSON file and calls connector.debug(configuration=configuration). See template_example_connector/connector.py lines 153-166 for the exact pattern and required comments.

Copilot generated this review using guidance from repository custom instructions.
"""

import requests
from fivetran_connector_sdk import connector, config, state, records, log, schema
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'log' is not used.

Suggested change
from fivetran_connector_sdk import connector, config, state, records, log, schema
from fivetran_connector_sdk import connector, config, state, records, schema

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hackathon For all the PRs related to the internal Fivetran 2025 Connector SDK Hackathon. size/S PR size: small

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant