Implement ORB AWS EC2 Worker Adapter #525
Open
magniloquency wants to merge 88 commits into finos:main from magniloquency:orb
+1,071
−8
88 commits
4506147 Implement ORB Worker Adapter (magniloquency)
783c10c Add submit_tasks example to documentation and CI skip list (magniloquency)
9133f52 Merge branch 'main' into orb (magniloquency)
a1ff776 move orb/ and ami/ to driver/ (magniloquency)
a46af96 remove superfluous checks in orb config (magniloquency)
e8ae3fc move import (magniloquency)
2312d66 add a check for self._orb before returning machines (magniloquency)
89d964b adjust security group rules (magniloquency)
248f2f7 move method (magniloquency)
0333ee7 rename no random worker ids to deterministic worker ids (magniloquency)
ce17847 add comment (magniloquency)
2796ead run submit tasks in ci (magniloquency)
389a7b6 fix help text (magniloquency)
9eae58d make _filter_data a static method (magniloquency)
d333899 refactor orb worker adapter polling to use constants (magniloquency)
562ec6a flake8 (magniloquency)
7a61c0e don't touch skip examples file (magniloquency)
594342d Merge branch 'main' of https://github.com/finos/opengris-scaler into orb (magniloquency)
350e840 bump minor version (magniloquency)
8ad0a61 Merge branch 'main' into orb (magniloquency)
a2d3856 docs: add worker adapter tutorials and update ORB integration details (magniloquency)
0329716 Merge branch 'orb' of https://github.com/magniloquency/scaler into orb (magniloquency)
c50fddb refactor: move orb and ami drivers to src/scaler/drivers (magniloquency)
90f608a Delete test.toml (magniloquency)
711e9e9 Merge branch 'main' into orb (magniloquency)
f789743 Refactor ORB worker adapter to use ZMQ-based protocol (magniloquency)
287cd7c fix type error (magniloquency)
1235fae fix type error (magniloquency)
8fda46c Output ORB templates in snake_case (magniloquency)
5c5a76e Simplify ORB driver config files (magniloquency)
2130431 Document default no-scaling policy and vanilla scaler example in work… (magniloquency)
2b54a3a Merge branch 'main' into orb (sharpener6)
4956370 Merge branch 'main' into orb (sharpener6)
31de2fd import documentation changes from #574 (magniloquency)
8d893fe Merge branch 'main' into orb (magniloquency)
759f21a Merge main into orb, adopting worker_manager naming convention (magniloquency)
2928bef Refactor ORB worker adapter to worker_manager_adapter naming and impr… (magniloquency)
d31f8ad Refactor ORB config handling and simplify worker manager (magniloquency)
830d35f Rename run_worker_adapter_orb to run_worker_manager_orb (magniloquency)
da6338c Fix ORB template missing instance_types and broken region injection (magniloquency)
940f0d0 Use subnet_ids list field instead of subnet_id in ORB template (magniloquency)
ef54411 Add name field to ORBMachine to fix TypeError on deserialization (magniloquency)
e90f85c Filter unknown keys when deserializing ORBMachine from dict (magniloquency)
15f3c42 Fix duplicate commands sent to adapter while previous command is in-f… (magniloquency)
1e931e1 Merge origin/main into orb, resolving conflicts (magniloquency)
31af0c7 upgrade to orb 1.2 (magniloquency)
2e591d8 Migrate ORB worker manager from CLI subprocess to Python SDK (magniloquency)
1d50e13 Remove unused ORBMachine and ORBRequest types (magniloquency)
9948bf6 Fix WorkerAdapterConfig -> WorkerManagerConfig rename (magniloquency)
342735e Work around ORB SDK app_config timing bug by writing temp config file (magniloquency)
983589f Use ORB_CONFIG_DIR env var to inject config into ORB singleton (magniloquency)
c086599 Add template_id, image_id, provider_api to configuration dict for ORB… (magniloquency)
721acc5 Switch ORB storage from sql to json (SQLQueryBuilder is abstract in i… (magniloquency)
b2235e7 Monkey-patch ORB TemplateRepositoryImpl.get_by_id to accept plain str (magniloquency)
380831d Fix ORB 1.2.2 missing add() method on TemplateRepositoryImpl (magniloquency)
67d1710 Patch Template.get_domain_events/clear_domain_events missing in ORB 1… (magniloquency)
d77718c Merge origin/main into orb, resolving conflicts (magniloquency)
071d4bc Upgrade ORB dependency to 1.3 and adopt context-manager SDK API (magniloquency)
feee439 Update ORB worker manager adapter for post-WorkerGroup protocol (magniloquency)
e785ca9 Add opengris-scaler 1.15.0 AMI and move packer files to orb adapter d… (magniloquency)
af01940 Fix ORB create_template call to use flat kwargs instead of nested con… (magniloquency)
89e01a7 Add validate_template call and logging after create_template in ORB s… (magniloquency)
1c3821e Rename _worker_groups to _workers in ORB worker adapter (magniloquency)
6b69334 Remove hardcoded --num-of-workers from ORB cluster launch script (magniloquency)
949348e Remove inaccurate worker ID tracking comment from ORB cluster launch … (magniloquency)
321ed08 Remove hardcoded attribute metadata from ORB create_template call (magniloquency)
90633b7 Merge main into orb branch (magniloquency)
3bd5c9c Merge branch 'main' into orb (sharpener6)
3f025de Fix ymq import in ORB worker manager after e921fff3 refactor (magniloquency)
81d3cef Merge main into orb, resolving documentation reorganization conflicts (magniloquency)
a79a3f6 Update orb-py dependency to 1.5.1 (magniloquency)
325dcd4 Merge branch 'main' into orb (magniloquency)
1c034c1 Merge origin/main into orb (magniloquency)
602b736 Fix import order in orb worker_manager (magniloquency)
2f95e5b Add orb worker manager support to unified entry points (magniloquency)
183f998 Remove dedicated scaler_worker_manager_orb entry point (magniloquency)
9fdcb20 Work around ORB skipping strategy defaults when config_dict is provided (magniloquency)
672ad72 Merge origin/main into orb (magniloquency)
2d50d26 Remove run_worker_manager_orb script (magniloquency)
1e8603b Suppress repeated StartWorkers requests after TooManyWorkers (magniloquency)
4da853c Lazy-import orb to fix CI test failures (magniloquency)
1177f26 Update ORB user data to use scaler_worker_manager with --mode fixed (magniloquency)
e254eac fix io threads (magniloquency)
68d7e68 Add AMI 1.26.4 to docs and fix build.sh version path (magniloquency)
95f4fe7 Fix zero-worker default on single-core machines (magniloquency)
1fe9667 Fix TooManyWorkers suppression not working during EC2 boot (magniloquency)
d1d9633 Merge remote-tracking branch 'origin/main' into orb (magniloquency)
a0c9198 Rename ORB worker manager to orb_aws_ec2 (magniloquency)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,145 @@ | ||
| ORB AWS EC2 Worker Adapter | ||
| ========================== | ||
|
|
||
| The ORB AWS EC2 worker adapter allows Scaler to dynamically provision workers on AWS EC2 instances using the ORB (Open Resource Broker) system. This is particularly useful for scaling workloads that require significant compute resources or specialized hardware available in the cloud. | ||
|
|
||
| This tutorial describes the steps required to get up and running with the ORB AWS EC2 adapter. | ||
|
|
||
| Requirements | ||
| ------------ | ||
|
|
||
| Before using the ORB AWS EC2 worker adapter, ensure the following requirements are met on the machine that will run the adapter: | ||
|
|
||
| 1. **orb-py and boto3**: The ``orb-py`` and ``boto3`` packages must be installed. These can be installed using the ``orb_aws_ec2`` optional dependency of Scaler: | ||
|
|
||
| .. code-block:: bash | ||
|
|
||
| pip install "opengris-scaler[orb_aws_ec2]" | ||
|
|
||
| 2. **AWS CLI**: The AWS Command Line Interface must be installed and configured with a default profile that has permissions to launch, describe, and terminate EC2 instances. | ||
|
|
||
| 3. **Network Connectivity**: The adapter must be able to communicate with AWS APIs and the Scaler scheduler. | ||
|
|
||
| Getting Started | ||
| --------------- | ||
|
|
||
| To start the ORB AWS EC2 worker adapter, use the ``scaler_worker_manager orb_aws_ec2`` subcommand: | ||
|
|
||
| .. code-block:: bash | ||
|
|
||
| scaler_worker_manager orb_aws_ec2 tcp://<SCHEDULER_EXTERNAL_IP>:8516 \ | ||
| --object-storage-address tcp://<OSS_EXTERNAL_IP>:8517 \ | ||
| --image-id ami-0528819f94f4f5fa5 \ | ||
| --instance-type t3.medium \ | ||
| --aws-region us-east-1 \ | ||
| --logging-level INFO \ | ||
| --task-timeout-seconds 60 | ||
|
|
||
| Equivalent configuration using a TOML file with ``scaler``: | ||
|
|
||
| .. code-block:: toml | ||
|
|
||
| # stack.toml | ||
|
|
||
| [scheduler] | ||
| scheduler_address = "tcp://<SCHEDULER_EXTERNAL_IP>:8516" | ||
|
|
||
| [[worker_manager]] | ||
| type = "orb_aws_ec2" | ||
| scheduler_address = "tcp://<SCHEDULER_EXTERNAL_IP>:8516" | ||
| object_storage_address = "tcp://<OSS_EXTERNAL_IP>:8517" | ||
| image_id = "ami-0528819f94f4f5fa5" | ||
| instance_type = "t3.medium" | ||
| aws_region = "us-east-1" | ||
| logging_level = "INFO" | ||
| task_timeout_seconds = 60 | ||
|
|
||
| .. code-block:: bash | ||
|
|
||
| scaler stack.toml | ||
|
|
||
| * ``tcp://<SCHEDULER_EXTERNAL_IP>:8516`` is the address workers will use to connect to the scheduler. | ||
| * ``tcp://<OSS_EXTERNAL_IP>:8517`` is the address workers will use to connect to the object storage server. | ||
| * New workers will be launched using the specified AMI and instance type. | ||
|
|
||
| Networking Configuration | ||
| ------------------------ | ||
|
|
||
| Workers launched by the ORB AWS EC2 adapter run on EC2 instances, so the scheduler and object storage server addresses passed to the adapter must be reachable from those instances. | ||
|
|
||
| * **Internal Communication**: If the machine running the scheduler is another EC2 instance in the same VPC, you can use EC2 private IP addresses. | ||
| * **Public Internet**: If communicating over the public internet, restrict access with tight security group rules and/or a VPN to protect the cluster. | ||
|
|
||
| Publicly Available AMIs | ||
| ----------------------- | ||
|
|
||
| We regularly publish publicly available Amazon Machine Images (AMIs) with Python and ``opengris-scaler`` pre-installed. | ||
|
|
||
| .. list-table:: Available Public AMIs | ||
| :widths: 15 15 20 20 30 | ||
| :header-rows: 1 | ||
|
|
||
| * - Scaler Version | ||
| - Python Version | ||
| - Amazon Linux 2023 Version | ||
| - Date (MM/DD/YYYY) | ||
| - AMI ID (us-east-1) | ||
| * - 1.14.2 | ||
| - 3.13 | ||
| - 2023.10.20260120 | ||
| - 01/30/2026 | ||
| - ``ami-0528819f94f4f5fa5`` | ||
| * - 1.15.0 | ||
| - 3.13 | ||
| - 2023.10.20260302.1 | ||
| - 03/16/2026 | ||
| - ``ami-044265172bea55d51`` | ||
| * - 1.26.4 | ||
| - 3.13 | ||
| - 2023.10.20260302.1 | ||
| - 03/26/2026 | ||
| - ``ami-0b76605999d8f5d2b`` | ||
|
|
||
| New AMIs will be added to this list as they become available. | ||
|
|
||
| Supported Parameters | ||
| -------------------- | ||
|
|
||
| .. note:: | ||
| For more details on how to configure Scaler, see the :doc:`../configuration` section. | ||
|
|
||
| The ORB AWS EC2 worker adapter supports ORB-specific configuration parameters as well as common worker adapter parameters. | ||
|
|
||
| ORB AWS EC2 Template Configuration | ||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| * ``--image-id`` (Required): AMI ID for the worker instances. | ||
| * ``--instance-type``: EC2 instance type (default: ``t2.micro``). | ||
| * ``--aws-region``: AWS region (default: ``us-east-1``). | ||
| * ``--key-name``: AWS key pair name for the instances. If not provided, a temporary key pair will be created and deleted on cleanup. | ||
| * ``--subnet-id``: AWS subnet ID where the instances will be launched. If not provided, the adapter attempts to discover the default subnet in the default VPC. | ||
| * ``--security-group-ids``: Comma-separated list of AWS security group IDs. | ||
| * ``--allowed-ip``: IP address to allow in the security group (if created automatically). Defaults to the adapter's external IP. | ||
| * ``--orb-config-path``: Path to the ORB root directory (default: ``src/scaler/drivers/orb``). | ||
|
|
||
| Common Parameters | ||
| ~~~~~~~~~~~~~~~~~ | ||
|
|
||
| For a full list of common parameters including networking, worker configuration, and logging, see :doc:`common_parameters`. | ||
|
|
||
| Cleanup | ||
| ------- | ||
|
|
||
| The ORB AWS EC2 worker adapter is designed to be self-cleaning, but it is important to be aware of the resources it manages: | ||
|
|
||
| * **Key Pairs**: If a ``--key-name`` is not provided, the adapter creates a temporary AWS key pair. | ||
| * **Security Groups**: If ``--security-group-ids`` are not provided, the adapter creates a temporary security group to allow communication. | ||
| * **Launch Templates**: ORB may additionally create EC2 Launch Templates as part of the machine provisioning process. | ||
|
|
||
| The adapter attempts to delete these temporary resources and terminate all launched EC2 instances when it shuts down gracefully. However, in the event of an ungraceful crash or network failure, some resources may persist in your AWS account. | ||
|
|
||
| .. tip:: | ||
| It is recommended to periodically check your AWS console for any orphaned resources (instances, security groups, key pairs, or launch templates) and clean them up manually if necessary to avoid unexpected costs. | ||
|
|
||
| .. warning:: | ||
| **Subnet and Security Groups**: Currently, specifying ``--subnet-id`` or ``--security-group-ids`` via configuration might not have the intended effect as the adapter is designed to auto-discover or create these resources. Specifically, the adapter may still attempt to use default subnets or create its own temporary security groups regardless of these parameters. |
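
The documentation in this diff presents the same settings twice, once as command-line flags and once as TOML keys, where each snake_case key appears to correspond to a kebab-case flag. A minimal sketch of that mapping follows. It is an illustration, not code from this PR. `build_orb_command` and `KNOWN_OPTIONS` are hypothetical names, the example addresses are placeholders, and keys the TOML example does not show (`key_name`, `subnet_id`, `security_group_ids`, `allowed_ip`, `orb_config_path`) are assumed to follow the same naming pattern as the documented flags.

```python
import shlex
from typing import Dict, List, Optional

# Option names derived from the flags documented in this diff.
# Assumption: every snake_case key maps to "--kebab-case" on the CLI.
KNOWN_OPTIONS = (
    "object_storage_address",
    "image_id",
    "instance_type",
    "aws_region",
    "key_name",
    "subnet_id",
    "security_group_ids",
    "allowed_ip",
    "orb_config_path",
    "logging_level",
    "task_timeout_seconds",
)


def build_orb_command(
    scheduler_address: str, options: Dict[str, Optional[object]]
) -> List[str]:
    """Build an argv list for `scaler_worker_manager orb_aws_ec2` (hypothetical helper)."""
    unknown = sorted(set(options) - set(KNOWN_OPTIONS))
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(unknown))
    # The docs mark --image-id as the only required template parameter.
    if not options.get("image_id"):
        raise ValueError("image_id is required (--image-id)")

    command = ["scaler_worker_manager", "orb_aws_ec2", scheduler_address]
    for key, value in options.items():
        if value is None:
            continue  # leave the flag out so the adapter's own default applies
        command += ["--" + key.replace("_", "-"), str(value)]
    return command


if __name__ == "__main__":
    # Placeholder addresses; substitute the real scheduler and storage endpoints.
    argv = build_orb_command(
        "tcp://10.0.0.5:8516",
        {
            "object_storage_address": "tcp://10.0.0.5:8517",
            "image_id": "ami-0528819f94f4f5fa5",
            "instance_type": "t3.medium",
            "aws_region": "us-east-1",
        },
    )
    print(shlex.join(argv))
```

The helper only builds and prints the command line, so it can be used to check a configuration without launching any EC2 instances. Passing the resulting list to `subprocess.run` would start the adapter, assuming `scaler_worker_manager` is installed and on the `PATH`.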