perf: Parallelize all items by central encapsulation into base classes #761
Conversation
Pull request overview
This PR refactors the item publishing/unpublishing pipeline to use a centralized, class-based ItemPublisher abstraction, enabling per-item-type parallelism while preserving cross-type ordering and dependency handling. It also formalizes item types and feature flags via enums and consolidates various constants and path/format mappings.
Changes:
- Introduces `ItemType`, `FeatureFlag`, `OperationType`, and related mappings (`SERIAL_ITEM_PUBLISH_ORDER`, `UNPUBLISH_FLAG_MAPPING`, `EXCLUDE_PATH_REGEX_MAPPING`, etc.) to centralize item/flag metadata (see the sketch after this list).
- Replaces function-style item publishers (`publish_*`) with `ItemPublisher`-based classes, adding configurable parallelism, dependency-aware ordering (Dataflow/DataPipeline), and optional async post-publish checks (Environment).
- Updates `publish_all_items`/`unpublish_all_orphan_items` to orchestrate publishing/unpublishing via the new publisher framework, adjusts validation for selective includes, and adapts tests and docs accordingly.
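As a rough illustration of the centralized metadata, a minimal sketch is below. Only the names quoted in this summary (`ItemType`, `FeatureFlag`, `SERIAL_ITEM_PUBLISH_ORDER`, the MirroredDatabase-before-Lakehouse ordering, and the `enable_debug_mode` flag) come from the PR; the remaining members and values are illustrative assumptions, not the actual fabric-cicd source.

```python
# Illustrative sketch only; members and ordering beyond those quoted in
# the PR summary are assumptions, not the actual fabric-cicd constants.
from enum import Enum


class ItemType(Enum):
    MIRRORED_DATABASE = "MirroredDatabase"
    LAKEHOUSE = "Lakehouse"
    ENVIRONMENT = "Environment"
    NOTEBOOK = "Notebook"
    DATAFLOW = "Dataflow"
    # ... one member per supported Fabric item type


class FeatureFlag(Enum):
    ENABLE_DEBUG_MODE = "enable_debug_mode"
    # ... one member per optional feature flag


# Cross-type ordering stays serial (the PR notes MirroredDatabase must
# publish before Lakehouse); within a type, items may run in parallel.
SERIAL_ITEM_PUBLISH_ORDER = [
    ItemType.MIRRORED_DATABASE,
    ItemType.LAKEHOUSE,
    ItemType.ENVIRONMENT,
    ItemType.NOTEBOOK,
    ItemType.DATAFLOW,
]
```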
Reviewed changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/test_shortcut_exclude.py | Adapts shortcut exclusion tests to use the new ShortcutPublisher class instead of the removed process_shortcuts function. |
| tests/test_publish.py | Updates publish/unpublish tests to mock concrete publisher classes and to validate type scoping and MirroredDatabase-before-Lakehouse ordering under the new orchestration. |
| tests/test_environment_publish.py | Switches environment tests to exercise EnvironmentPublisher.publish_all, keeping the previous behavior under the refactored class-based model. |
| src/fabric_cicd/publish.py | Replaces the large publish_*/check_environment_publish_state dispatch block with ItemPublisher-based orchestration using SERIAL_ITEM_PUBLISH_ORDER and new unpublish ordering/orphan detection helpers. |
| src/fabric_cicd/fabric_workspace.py | Uses ItemType/FeatureFlag enums in repository/deployed item refresh and _publish_item folder-move logic, aligning workspace internals with the new constants. |
| src/fabric_cicd/constants.py | Introduces ItemType, FeatureFlag, OperationType, publish/unpublish order and flag mappings, path/format mappings, and rewrites ACCEPTED_ITEM_TYPES, shell-only lists, and property mappings to use enum values. |
| src/fabric_cicd/_parameter/_utils.py | Switches Dataflow-specific logic in _extract_item_attribute to compare against ItemType.DATAFLOW.value rather than a hard-coded string. |
| src/fabric_cicd/_items/_warehouse.py | Converts warehouse publishing into WarehousePublisher(ItemPublisher) with per-item publish_one suitable for parallel execution. |
| src/fabric_cicd/_items/_variablelibrary.py | Refactors variable library publishing into VariableLibraryPublisher and keeps activate_value_set as a post-publish per-item action. |
| src/fabric_cicd/_items/_userdatafunction.py | Replaces function-based user data function publishing with a minimal UserDataFunctionPublisher class. |
| src/fabric_cicd/_items/_sqldatabase.py | Wraps SQL Database publishing in SQLDatabasePublisher, enabling parallel publishing while preserving skip/published logging semantics. |
| src/fabric_cicd/_items/_sparkjobdefinition.py | Implements SparkJobDefinitionPublisher using the centralized API_FORMAT_MAPPING for the V2 format. |
| src/fabric_cicd/_items/_semanticmodel.py | Moves core semantic model publishing into SemanticModelPublisher, using shared exclude-path mappings and adding a post_publish_all hook to bind models to connections. |
| src/fabric_cicd/_items/_report.py | Converts report publishing into ReportPublisher, leveraging EXCLUDE_PATH_REGEX_MAPPING and ItemType.SEMANTIC_MODEL for path-to-ID resolution. |
| src/fabric_cicd/_items/_orgapp.py | Replaces org app publishing with an OrgAppPublisher stub class. |
| src/fabric_cicd/_items/_notebook.py | Refactors notebook publishing into a simple NotebookPublisher(ItemPublisher) implementation. |
| src/fabric_cicd/_items/_mounteddatafactory.py | Introduces MountedDataFactoryPublisher for Mounted Data Factory items. |
| src/fabric_cicd/_items/_mlexperiment.py | Converts ML experiment shell-only publishing into MLExperimentPublisher. |
| src/fabric_cicd/_items/_mirroreddatabase.py | Wraps Mirrored Database publishing in MirroredDatabasePublisher to participate in ordered, possibly parallel publishing. |
| src/fabric_cicd/_items/_lakehouse.py | Refactors lakehouse publishing into LakehousePublisher and introduces ShortcutPublisher to encapsulate shortcut processing, including exclusion and orphan handling. |
| src/fabric_cicd/_items/_kqlqueryset.py | Moves KQL Queryset publishing into KQLQuerysetPublisher, adds a pre_publish_all refresh hook, and uses ItemType for type checks and deployed KQL database lookups. |
| src/fabric_cicd/_items/_kqldatabase.py | Provides KQLDatabasePublisher as the item-type-specific publisher class. |
| src/fabric_cicd/_items/_kqldashboard.py | Refactors KQL Dashboard publishing into KQLDashboardPublisher with a pre_publish_all deployed-item refresh and ItemType-based type checks. |
| src/fabric_cicd/_items/_graphqlapi.py | Replaces GraphQL API publishing with GraphQLApiPublisher. |
| src/fabric_cicd/_items/_eventstream.py | Wraps Eventstream publishing in an EventstreamPublisher class. |
| src/fabric_cicd/_items/_eventhouse.py | Re-implements Eventhouse publishing as EventhousePublisher and draws its exclude-path pattern from EXCLUDE_PATH_REGEX_MAPPING. |
| src/fabric_cicd/_items/_environment.py | Refactors environment publishing into EnvironmentPublisher with async state checks, while keeping compute update logic but moving it under the class-based model. |
| src/fabric_cicd/_items/_datapipeline.py | Converts Data Pipeline publishing to DataPipelinePublisher with explicit dependency tracking, ordered (non-parallel) publishing, and unpublish ordering based on set_unpublish_order. |
| src/fabric_cicd/_items/_dataflowgen2.py | Implements DataflowPublisher using ParallelConfig(ordered_items_func=...) to enforce dependency-based sequential ordering, and updates replacement logic to use ItemType/repository lookups. |
| src/fabric_cicd/_items/_dataagent.py | Introduces DataAgentPublisher with shared exclude-path regexes for .pbi artifacts. |
| src/fabric_cicd/_items/_copyjob.py | Wraps Copy Job publishing in a minimal CopyJobPublisher. |
| src/fabric_cicd/_items/_base_publisher.py | Adds the core Publisher/ItemPublisher abstractions, ParallelConfig, and PublishError, centralizing publish orchestration, parallelism, dependency hooks, and unpublish/orphan helper methods (see the sketch after this table). |
| src/fabric_cicd/_items/_apacheairflowjob.py | Converts Apache Airflow Job publishing into ApacheAirflowJobPublisher. |
| src/fabric_cicd/_items/_activator.py | Replaces Reflex/Activator publishing with ActivatorPublisher. |
| src/fabric_cicd/_items/__init__.py | Simplifies the _items public surface to export only the new base publisher primitives instead of individual publish_* functions. |
| src/fabric_cicd/_common/_validate_input.py | Adds validate_items_to_include, centralizing feature-flag checks and warnings for selective publish/unpublish operations. |
| src/fabric_cicd/__init__.py | Re-exports FeatureFlag and ItemType at the package root to make the new enums part of the public API. |
| docs/how_to/optional_feature.md | Documents the new enable_debug_mode feature flag alongside existing optional feature flags. |
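For orientation, here is a minimal sketch of how the `ItemPublisher`/`ParallelConfig` abstractions described above could fit together. The hook names (`publish_one`, `pre_publish_all`, `post_publish_all`, `ordered_items_func`) are taken from this summary; the fields, threading details, and everything else are assumptions rather than the actual `_base_publisher.py` source.

```python
# Hypothetical sketch of the base publisher abstractions; only the hook
# names quoted in the file summary above are taken from the PR.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


class PublishError(Exception):
    """Raised when one or more items of a type fail to publish."""


@dataclass
class ParallelConfig:
    max_workers: int = 4
    # When set, items publish sequentially in the returned order (used by
    # Dataflow/DataPipeline for dependency-aware ordering).
    ordered_items_func: Optional[Callable[[list], list]] = None


class ItemPublisher:
    item_type = None                   # overridden by each concrete publisher
    parallel_config = ParallelConfig()

    def pre_publish_all(self, workspace) -> None:
        """Optional hook, e.g. refreshing deployed items first."""

    def publish_one(self, workspace, item_name: str) -> None:
        """Per-item logic; every concrete publisher overrides this."""
        raise NotImplementedError

    def post_publish_all(self, workspace) -> None:
        """Optional hook, e.g. binding semantic models to connections."""

    def publish_all(self, workspace, item_names: list) -> None:
        self.pre_publish_all(workspace)
        if self.parallel_config.ordered_items_func:
            # Dependency-aware types publish one item at a time, in order.
            for name in self.parallel_config.ordered_items_func(item_names):
                self.publish_one(workspace, name)
        else:
            with ThreadPoolExecutor(self.parallel_config.max_workers) as pool:
                futures = [pool.submit(self.publish_one, workspace, n)
                           for n in item_names]
                for future in futures:
                    future.result()    # surface any per-item failure
        self.post_publish_all(workspace)
```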
Comments suppressed due to low confidence (1)
src/fabric_cicd/_items/_environment.py:160
- The documentation for `_publish_environment_metadata` is inconsistent with the actual behavior: it claims the process "involves two steps" and lists three, and still refers to "Check for ongoing publish" even though that responsibility has been moved to `_check_environment_publish_state`/`EnvironmentPublisher.pre_publish_all`. Please update the docstring to accurately describe the current steps performed by this function and remove the outdated reference to the pre-publish check.
```python
def _publish_environment_metadata(fabric_workspace_obj: FabricWorkspace, item_name: str) -> None:
    """
    Updates compute settings and publishes compute settings and libraries for a given environment item.

    This process involves two steps:
    1. Check for ongoing publish.
    2. Updating the compute settings.
    3. Publish the updated settings and libraries.

    Args:
        fabric_workspace_obj: The FabricWorkspace object.
        item_name: Name of the environment item whose compute settings are to be published.
        is_excluded: Flag indicating if Sparkcompute.yml was excluded from definition deployment.
    """
    item_type = ItemType.ENVIRONMENT.value
    item_guid = fabric_workspace_obj.repository_items[item_type][item_name].guid

    # Update compute settings
    _update_compute_settings(fabric_workspace_obj, item_guid, item_name)

    # Publish updated settings - compute settings and libraries (long-running operation)
    # https://learn.microsoft.com/en-us/rest/api/fabric/environment/items/publish-environment
    fabric_workspace_obj.endpoint.invoke(
        method="POST",
        url=f"{fabric_workspace_obj.base_api_url}/environments/{item_guid}/staging/publish?beta=False",
        poll_long_running=False,
    )

    logger.info(f"{constants.INDENT}Publish Submitted for Environment '{item_name}'")
```
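A possible docstring revision along the lines the comment requests is sketched below. This is a suggestion, not part of the PR; it also drops the stray `is_excluded` entry from Args, which does not appear in the function signature.

```python
# Suggested revision only; the body is unchanged from the quoted function.
def _publish_environment_metadata(fabric_workspace_obj, item_name: str) -> None:
    """
    Updates compute settings and publishes compute settings and libraries for a given environment item.

    This process involves two steps:
    1. Updating the compute settings.
    2. Publishing the updated settings and libraries.

    Args:
        fabric_workspace_obj: The FabricWorkspace object.
        item_name: Name of the environment item whose compute settings are to be published.
    """
    ...  # body unchanged from the quoted function above
```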
@shirasassoon - Please see the changes to the logger for legibility despite parallelism:

[Two screenshots attached: logger output during a parallel deployment]
Description
Parallelizes deployments within a given item type.
This required changes to all item files in order to push as much logic into the base classes as possible. Item-specific logic remains in the implementations via overrides.
We do not parallelize across item types, as the dependencies are non-deterministic at this time.
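Schematically, the orchestration might reduce to a loop like the one below. `SERIAL_ITEM_PUBLISH_ORDER` and `publish_all` come from the summary above; the registry lookup and repository access are illustrative assumptions:

```python
# Hypothetical orchestration: item types run serially in a fixed order,
# while items of the same type run in parallel inside publish_all.
def publish_all_items(workspace) -> None:
    for item_type in SERIAL_ITEM_PUBLISH_ORDER:           # cross-type: serial
        publisher = PUBLISHER_REGISTRY[item_type]()       # assumed lookup
        names = list(workspace.repository_items.get(item_type.value, {}))
        if names:
            publisher.publish_all(workspace, names)       # within-type: parallel
```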
Linked Issue
Test
Tested with `devtools/debug_trace_deployment.py` across 3 workspaces: