Testing/manager #14

Open
wants to merge 72 commits into
base: feature/coverage
78ef619
Include celerymanager and update celeryadapter to check the status of…
ryannova Jul 23, 2024
8561a18
Fixed issue where the update status was outside of if statement for c…
ryannova Jul 23, 2024
1120dd7
Include worker status stop and add template for merlin restart
ryannova Aug 1, 2024
f41938f
Added comment to the CeleryManager init
ryannova Aug 2, 2024
690115e
Increment db_num instead of being fixed
ryannova Aug 2, 2024
de4ffd0
Added other subprocess parameters and created a linking system for re…
ryannova Aug 2, 2024
67e9268
Implemented stopping of celery workers and restarting workers properly
ryannova Aug 6, 2024
406e4c2
Update stopped to stalled for when the worker doesn't respond to restart
ryannova Aug 6, 2024
78e4525
Working merlin manager run but start and stop not working properly
ryannova Aug 7, 2024
eca74ac
Made fix for subprocess to start new shell and fixed manager start an…
ryannova Aug 7, 2024
ec8aa78
Added comments and update changelog
ryannova Aug 7, 2024
3f04d24
Include style fixes
ryannova Aug 7, 2024
5538f4b
Fix style for black
ryannova Aug 7, 2024
b6bcd33
Revert launch_job script that was edited when doing automated lint
ryannova Aug 7, 2024
9b97f8b
Move importing of CONFIG to be within redis_connection due to error o…
ryannova Aug 7, 2024
c9dfd31
Added space to fix style
ryannova Aug 7, 2024
a9bd865
Revert launch_jobs.py:
ryannova Aug 7, 2024
ddc7614
Update import of all merlin.config to be in the function
ryannova Aug 7, 2024
353a66b
suggested changes plus beginning work on monitor/manager collab
bgunnar5 Aug 17, 2024
1a4d416
move managers to their own folder and fix ssl problems
bgunnar5 Aug 22, 2024
875f137
final PR touch ups
bgunnar5 Sep 3, 2024
9020aa0
Merge pull request #2 from bgunnar5/monitor_manager_collab
ryannova Sep 3, 2024
58da9bc
Fix lint style changes
ryannova Sep 3, 2024
e75dcc2
Fixed issue with context manager
ryannova Sep 4, 2024
11f9e7c
Reset file that was incorrect changed
ryannova Sep 4, 2024
7204e46
Check for ssl cert before applying to Redis connection
ryannova Sep 4, 2024
53d8f32
Comment out Active tests for celerymanager
ryannova Sep 4, 2024
a5ccb2d
Fix lint issue with unused import after commenting out Active celery …
ryannova Sep 9, 2024
2b0e8a6
Fixed style for import
ryannova Sep 9, 2024
e49f378
Fixed kwargs being modified when making a copy for saving to redis wo…
ryannova Sep 12, 2024
352e7df
Added password check and omit if a password doesn't exist
ryannova Sep 13, 2024
75a9972
change testing log level to debug
bgunnar5 Sep 16, 2024
c27a208
add debug statement for redis_connection
bgunnar5 Sep 17, 2024
97a9cf1
change debug log to info so github ci will display it
bgunnar5 Sep 17, 2024
ce8bf37
attempt to fix password missing from Namespace error
bgunnar5 Sep 17, 2024
5851d9d
run checks for all necessary configurations
bgunnar5 Sep 17, 2024
97d075e
convert stop-workers tests to pytest format
bgunnar5 Sep 20, 2024
04e9122
update github wf and comment out stop-workers tests in definitions.py
bgunnar5 Sep 20, 2024
f93c7f6
add missing key to GH wf file
bgunnar5 Sep 20, 2024
835399c
fix invalid syntax in definitions.py
bgunnar5 Sep 20, 2024
176ff4d
comment out stop_workers tests
bgunnar5 Sep 24, 2024
e38cc93
playing with new caches for workflow CI
bgunnar5 Sep 24, 2024
c136058
fix yaml syntax error
bgunnar5 Sep 24, 2024
56a6a05
fix typo for getting runner os
bgunnar5 Sep 24, 2024
f45a798
fix test and add python version to CI cache
bgunnar5 Sep 24, 2024
290d350
add in common-setup step again with caches this time
bgunnar5 Sep 24, 2024
8a1bc14
run fix-style
bgunnar5 Sep 24, 2024
58622ba
update CHANGELOG
bgunnar5 Sep 24, 2024
917f8d7
fix remaining style issues
bgunnar5 Sep 24, 2024
91c7505
run without caches to compare execution time of test suite
bgunnar5 Sep 24, 2024
c7adb96
resolve merge conflict
bgunnar5 Sep 24, 2024
608e00e
allow redis config to not use ssl
bgunnar5 Sep 25, 2024
bf41a2d
remove stop-workers and query-workers tests from definitions.py
bgunnar5 Sep 26, 2024
630c9c9
create helper_funcs file with common testing functions
bgunnar5 Sep 26, 2024
17889fd
move query-workers to pytest and add base class w/ stop-workers tests
bgunnar5 Sep 26, 2024
5c28b49
update CHANGELOG
bgunnar5 Sep 26, 2024
643b4d1
final changes for the stop-workers & query-workers tests
bgunnar5 Sep 27, 2024
0f0264c
run fix-style
bgunnar5 Sep 27, 2024
6340604
move stop and query workers tests to the same file
bgunnar5 Sep 30, 2024
19c4bf7
run fix-style
bgunnar5 Sep 30, 2024
99257d8
go back to original cache setup
bgunnar5 Sep 30, 2024
f947eae
try new cache for singularity install
bgunnar5 Sep 30, 2024
beafb22
fix syntax issue in github workflow
bgunnar5 Sep 30, 2024
fac2892
attempt to fix singularity cache
bgunnar5 Sep 30, 2024
c49660a
remove ls statement that breaks workflow
bgunnar5 Sep 30, 2024
ecb1762
revert back to no common setup
bgunnar5 Sep 30, 2024
39d09d6
remove unnecessary dependency
bgunnar5 Sep 30, 2024
5f4673b
update github actions versions to use latest
bgunnar5 Sep 30, 2024
70e540f
update action versions that didn't save
bgunnar5 Sep 30, 2024
94f8f72
merge in new tests for stop and query workers commands
bgunnar5 Sep 30, 2024
d2e85ec
run fix-style
bgunnar5 Sep 30, 2024
9c4fca4
move distributed test suite actions back to v2
bgunnar5 Oct 3, 2024
84 changes: 75 additions & 9 deletions .github/workflows/push-pr_workflow.yml
@@ -10,7 +10,7 @@ jobs:

steps:
- name: Checkout code
- uses: actions/checkout@v2
+ uses: actions/checkout@v4
with:
fetch-depth: 0 # Checkout the whole history, in case the target is way far behind

@@ -40,14 +40,14 @@ jobs:
MAX_COMPLEXITY: 15

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- name: Set up Python
- uses: actions/setup-python@v2
+ uses: actions/setup-python@v5
with:
python-version: '3.x'

- name: Check cache
- uses: actions/cache@v2
+ uses: actions/cache@v4
with:
path: ~/.cache/pip
key: ${{ hashFiles('requirements/release.txt') }}-${{ hashFiles('requirements/dev.txt') }}
@@ -95,14 +95,14 @@ jobs:
python-version: ['3.7', '3.8', '3.9', '3.10', '3.11']

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
- uses: actions/setup-python@v2
+ uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}

- name: Check cache
- uses: actions/cache@v2
+ uses: actions/cache@v4
with:
path: ${{ env.pythonLocation }}
key: ${{ env.pythonLocation }}-${{ hashFiles('requirements/release.txt') }}-${{ hashFiles('requirements/dev.txt') }}
@@ -112,8 +112,7 @@ jobs:
python3 -m pip install --upgrade pip
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
pip3 install -r requirements/dev.txt
- pip freeze


- name: Install singularity
run: |
sudo apt-get update && sudo apt-get install -y \
@@ -153,6 +152,73 @@ jobs:
run: |
python3 tests/integration/run_tests.py --verbose --local

Integration-tests:
runs-on: ubuntu-latest
env:
GO_VERSION: 1.18.1
SINGULARITY_VERSION: 3.9.9
OS: linux
ARCH: amd64

strategy:
matrix:
python-version: ['3.7', '3.8', '3.9', '3.10', '3.11']

steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}

- name: Check cache
uses: actions/cache@v4
with:
path: ${{ env.pythonLocation }}
key: ${{ env.pythonLocation }}-${{ hashFiles('requirements/release.txt') }}-${{ hashFiles('requirements/dev.txt') }}

- name: Install dependencies
run: |
python3 -m pip install --upgrade pip
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
pip3 install -r requirements/dev.txt

- name: Install merlin
run: |
pip3 install -e .
merlin config

- name: Install singularity
run: |
sudo apt-get update && sudo apt-get install -y \
build-essential \
libssl-dev \
uuid-dev \
libgpgme11-dev \
squashfs-tools \
libseccomp-dev \
pkg-config
wget https://go.dev/dl/go$GO_VERSION.$OS-$ARCH.tar.gz
sudo tar -C /usr/local -xzf go$GO_VERSION.$OS-$ARCH.tar.gz
rm go$GO_VERSION.$OS-$ARCH.tar.gz
export PATH=$PATH:/usr/local/go/bin
wget https://github.com/sylabs/singularity/releases/download/v$SINGULARITY_VERSION/singularity-ce-$SINGULARITY_VERSION.tar.gz
tar -xzf singularity-ce-$SINGULARITY_VERSION.tar.gz
cd singularity-ce-$SINGULARITY_VERSION
./mconfig && \
make -C ./builddir && \
sudo make -C ./builddir install

- name: Install CLI task dependencies generated from the 'feature demo' workflow
run: |
merlin example feature_demo
pip3 install -r feature_demo/requirements.txt

# TODO remove the --ignore statement once those tests are fixed
- name: Run integration test suite for distributed tests
run: |
pytest --ignore tests/integration/test_celeryadapter.py tests/integration/

Distributed-test-suite:
runs-on: ubuntu-latest
services:
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]
### Added
- Merlin manager capability to monitor celery workers.
- Several new unit tests for the following subdirectories:
- `merlin/common/`
- `merlin/config/`
@@ -21,6 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Split the `start_server` and `config_server` functions of `merlin/server/server_commands.py` into multiple functions to make testing easier
- Split the `create_server_config` function of `merlin/server/server_config.py` into two functions to make testing easier
- Combined `set_snapshot_seconds` and `set_snapshot_changes` methods of `RedisConfig` into one method `set_snapshot`
- Moved stop-workers and query-workers integration tests to pytest tests

## [1.12.2b1]
### Added
4 changes: 3 additions & 1 deletion merlin/celery.py
@@ -114,9 +114,11 @@ def route_for_task(name, args, kwargs, options, task=None, **kw): # pylint: dis
BROKER_URI = None
RESULTS_BACKEND_URI = None

app_name = "merlin_test_app" if os.getenv("CELERY_ENV") == "test" else "merlin"

# initialize app with essential properties
app: Celery = patch_celery().Celery(
- "merlin",
+ app_name,
broker=BROKER_URI,
backend=RESULTS_BACKEND_URI,
broker_use_ssl=BROKER_SSL,
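
Review note: the hunk above picks the Celery app name from the `CELERY_ENV` environment variable. A minimal sketch of that conditional in isolation follows. `select_app_name` and its `environ` parameter are hypothetical helpers for illustration and are not part of `merlin/celery.py`.

```python
import os


def select_app_name(environ=None):
    """Return the app name using the same conditional as the diff.

    `environ` is a hypothetical parameter so the logic can be exercised
    without touching the real process environment.
    """
    env = os.environ if environ is None else environ
    # Only the exact value "test" switches to the test app name.
    return "merlin_test_app" if env.get("CELERY_ENV") == "test" else "merlin"


if __name__ == "__main__":
    print(select_app_name({"CELERY_ENV": "test"}))
    print(select_app_name({}))
```

Any value other than the exact string `test`, including an unset variable, falls back to `merlin`.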
6 changes: 3 additions & 3 deletions merlin/examples/dev_workflows/multiple_workers.yaml
@@ -46,11 +46,11 @@ merlin:
resources:
workers:
step_1_merlin_test_worker:
- args: -l INFO
+ args: -l INFO --concurrency 1
steps: [step_1]
step_2_merlin_test_worker:
- args: -l INFO
+ args: -l INFO --concurrency 1
steps: [step_2]
other_merlin_test_worker:
- args: -l INFO
+ args: -l INFO --concurrency 1
steps: [step_3, step_4]
101 changes: 100 additions & 1 deletion merlin/main.py
@@ -57,6 +57,7 @@
from merlin.server.server_commands import config_server, init_server, restart_server, start_server, status_server, stop_server
from merlin.spec.expansion import RESERVED, get_spec_with_expansion
from merlin.spec.specification import MerlinSpec
from merlin.study.celerymanageradapter import run_manager, start_manager, stop_manager
from merlin.study.status import DetailedStatus, Status
from merlin.study.status_constants import VALID_RETURN_CODES, VALID_STATUS_FILTERS
from merlin.study.status_renderers import status_renderer_factory
@@ -359,7 +360,7 @@ def stop_workers(args):
LOG.warning(f"Worker '{worker_name}' is unexpanded. Target provenance spec instead?")

# Send stop command to router
- router.stop_workers(args.task_server, worker_names, args.queues, args.workers)
+ router.stop_workers(args.task_server, worker_names, args.queues, args.workers, args.level.upper())


def print_info(args):
@@ -400,6 +401,35 @@ def process_example(args: Namespace) -> None:
setup_example(args.workflow, args.path)


def process_manager(args: Namespace):
"""
Process the command for managing the workers.

This function interprets the command provided in the `args` namespace and
executes the corresponding manager function. It supports three commands:
"run", "start", and "stop".

:param args: parsed CLI arguments
"""
if args.command == "run":
run_manager(query_frequency=args.query_frequency, query_timeout=args.query_timeout, worker_timeout=args.worker_timeout)
elif args.command == "start":
try:
start_manager(
query_frequency=args.query_frequency, query_timeout=args.query_timeout, worker_timeout=args.worker_timeout
)
LOG.info("Manager started successfully.")
except Exception as e:
LOG.error(f"Unable to start manager.\n{e}")
elif args.command == "stop":
if stop_manager():
LOG.info("Manager stopped successfully.")
else:
LOG.error("Unable to stop manager.")
else:
print("Run manager with a command. Try 'merlin manager -h' for more details")
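
Review note: the branching in `process_manager` can be exercised without a running manager. The sketch below mirrors the three branches with stand-in functions. The stub bodies, the `manager_options` helper, and the `dispatch` name are assumptions made for illustration; the real `run_manager`, `start_manager`, and `stop_manager` live in `merlin.study.celerymanageradapter` and are not reproduced here.

```python
from argparse import Namespace


def run_manager(query_frequency, query_timeout, worker_timeout):
    """Hypothetical stand-in for the real run_manager."""
    return ("run", query_frequency, query_timeout, worker_timeout)


def start_manager(query_frequency, query_timeout, worker_timeout):
    """Hypothetical stand-in for the real start_manager."""
    return ("start", query_frequency, query_timeout, worker_timeout)


def stop_manager():
    """Hypothetical stand-in; the diff treats a truthy return as success."""
    return True


def manager_options(args):
    """Collect the three options shared by 'run' and 'start'."""
    return {
        "query_frequency": args.query_frequency,
        "query_timeout": args.query_timeout,
        "worker_timeout": args.worker_timeout,
    }


def dispatch(args):
    """Route a parsed namespace along the same branches as process_manager."""
    if args.command == "run":
        return run_manager(**manager_options(args))
    if args.command == "start":
        return start_manager(**manager_options(args))
    if args.command == "stop":
        return stop_manager()
    # No subcommand given: the diff prints a usage hint in this case.
    return None


if __name__ == "__main__":
    ns = Namespace(command="run", query_frequency=60, query_timeout=0.5, worker_timeout=180)
    print(dispatch(ns))
```

Note that the `stop` namespace carries none of the timing options, which is why the sketch reads them only inside the `run` and `start` branches.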


def process_monitor(args):
"""
CLI command to monitor merlin workers and queues to keep
@@ -898,6 +928,75 @@ def generate_worker_touching_parsers(subparsers: ArgumentParser) -> None:
help="regex match for specific workers to stop",
)

# merlin manager
manager: ArgumentParser = subparsers.add_parser(
"manager",
help="Watchdog application to manage workers",
description="A daemon process that helps to restart and communicate with workers while running.",
formatter_class=ArgumentDefaultsHelpFormatter,
)
manager.set_defaults(func=process_manager)

def add_manager_options(manager_parser: ArgumentParser):
"""
Add shared options for manager subcommands.

The `manager run` and `manager start` subcommands have the same options.
Rather than writing duplicate code for these we'll use this function
to add the arguments to these subcommands.

:param manager_parser: The ArgumentParser object to add these options to
"""
manager_parser.add_argument(
"-qf",
"--query_frequency",
action="store",
type=int,
default=60,
help="The frequency at which workers will be queried for response.",
)
manager_parser.add_argument(
"-qt",
"--query_timeout",
action="store",
type=float,
default=0.5,
help="The timeout for the query response that are sent to workers.",
)
manager_parser.add_argument(
"-wt",
"--worker_timeout",
action="store",
type=int,
default=180,
help="The sum total (query_frequency*tries) time before an attempt is made to restart worker.",
)

manager_commands: ArgumentParser = manager.add_subparsers(dest="command")
manager_run = manager_commands.add_parser(
"run",
help="Run the daemon process",
description="Run manager",
formatter_class=ArgumentDefaultsHelpFormatter,
)
add_manager_options(manager_run)
manager_run.set_defaults(func=process_manager)
manager_start = manager_commands.add_parser(
"start",
help="Start the daemon process",
description="Start manager",
formatter_class=ArgumentDefaultsHelpFormatter,
)
add_manager_options(manager_start)
manager_start.set_defaults(func=process_manager)
manager_stop = manager_commands.add_parser(
"stop",
help="Stop the daemon process",
description="Stop manager",
formatter_class=ArgumentDefaultsHelpFormatter,
)
manager_stop.set_defaults(func=process_manager)
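
Review note: the parser wiring above can be checked in isolation. The sketch below is a standalone approximation, not the code in this PR: it shares the options through argparse's `parents=` mechanism instead of the `add_manager_options` helper, and `build_parser` is a hypothetical name. The option strings and defaults are copied from the diff.

```python
from argparse import ArgumentDefaultsHelpFormatter, ArgumentParser


def build_parser():
    """Build a reduced 'merlin manager' parser for experimentation."""
    # Parent parser holding the options shared by 'run' and 'start'.
    shared = ArgumentParser(add_help=False)
    shared.add_argument("-qf", "--query_frequency", type=int, default=60,
                        help="The frequency at which workers will be queried for response.")
    shared.add_argument("-qt", "--query_timeout", type=float, default=0.5,
                        help="The timeout for the query responses sent to workers.")
    shared.add_argument("-wt", "--worker_timeout", type=int, default=180,
                        help="Total time before an attempt is made to restart a worker.")

    parser = ArgumentParser(prog="merlin")
    subparsers = parser.add_subparsers(dest="main")
    manager = subparsers.add_parser("manager", formatter_class=ArgumentDefaultsHelpFormatter)
    commands = manager.add_subparsers(dest="command")
    for name in ("run", "start"):
        commands.add_parser(name, parents=[shared], formatter_class=ArgumentDefaultsHelpFormatter)
    # 'stop' takes no timing options.
    commands.add_parser("stop", formatter_class=ArgumentDefaultsHelpFormatter)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["manager", "start", "-qf", "30", "-wt", "90"])
    print(args.command, args.query_frequency, args.query_timeout, args.worker_timeout)
```

Two behaviours are worth noting. Running `merlin manager` with no subcommand leaves `command` as `None`, which is the case the final `else` branch of `process_manager` handles. And if the help text's `query_frequency*tries` reading is right, the defaults of 60 and 180 would allow three queries before a restart attempt; how the manager actually counts tries is not shown in this diff.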

# merlin monitor
monitor: ArgumentParser = subparsers.add_parser(
"monitor",
Empty file added merlin/managers/__init__.py
Empty file.