docs/architecture/03_module_layer.md (68 additions, 63 deletions)
@@ -11,12 +11,14 @@ Modules define both specifications and compute graphs without performing computa
The module layer offers significant advantages over directly using CLI commands:

### Practical Benefits

- **Standardization** - Consistent interfaces across different algorithm types
- **Composability** - Modules can be connected into larger workflows
- **Containerization** - Built-in container specification for reproducibility
- **Documentation** - Structured approach to capturing metadata and citations

### Architectural Significance

- **Backend Independence** - Run the same module on different execution systems
- **Inspection** - Examine inputs, outputs, and operations before execution
- **Automated UI Generation** - Specifications support interface generation
@@ -27,16 +29,17 @@ The module layer offers significant advantages over directly using CLI commands:
The module layer's defining characteristic is its dual focus on specifications and compute graphs:

1. **Specification (Spec)** - Defines what a module does:
    - Input ports with types, descriptions, and validation rules
    - Output ports with types and descriptions
    - Documentation and metadata
    - Parameter constraints and defaults

2. **Compute Graph** - Defines how operations should be structured:
    - Container configurations
    - Command construction
    - Input/output relationships
    - Execution sequence

Crucially, modules define computation but don't perform it. This separation enables inspection and modification before execution, and allows the same module to run on different platforms without changing its definition.
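This separation can be sketched in a few lines of Python. All class and method names below are hypothetical stand-ins, not StarryNight's actual API: the point is only that a module exposes a spec and builds a compute graph, and nothing runs until some backend consumes that graph.

```python
from dataclasses import dataclass, field


@dataclass
class PortSpec:
    """Describes one input or output port: name, type, and documentation."""
    name: str
    dtype: str
    description: str = ""


@dataclass
class ModuleSpec:
    """The 'what': inputs, outputs, and metadata -- no computation here."""
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


@dataclass
class ComputeNode:
    """The 'how': a container image plus the command to run inside it."""
    image: str
    command: list


class IllumCalcModule:
    """Defines computation (spec + graph) but never performs it."""

    def __init__(self, images_path: str):
        self.images_path = images_path

    def spec(self) -> ModuleSpec:
        return ModuleSpec(
            inputs=[PortSpec("images", "path", "Raw images to process")],
            outputs=[PortSpec("illum_profile", "path", "Illumination profile")],
        )

    def compute_graph(self) -> list:
        # Building the graph is cheap and side-effect free: the returned
        # nodes can be inspected or rewritten before any backend runs them.
        return [ComputeNode(image="cellprofiler:4.2",
                            command=["calc-illum", self.images_path])]


module = IllumCalcModule("/data/batch1")
graph = module.compute_graph()
```

Because `compute_graph()` only returns data, a caller can inspect or modify the container image or command before handing the graph to any execution backend.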
@@ -47,35 +50,35 @@ Just as the CLI layer is organized into command groups that map to algorithm set
Each module set typically follows a consistent pattern with three types of modules:

1. **Load Data Modules** - Generate data loading configurations
    - Define which images to process
    - Create CSV files for CellProfiler to locate images
    - Organize data by batch, plate, well, and site

2. **Pipeline Generation Modules** - Generate CellProfiler pipeline files
    - Configure pipeline parameters based on experiment settings
    - Define processing operations

3. **Execution Modules** - Execute the pipeline on prepared data
    - Run pipelines with appropriate parallelism
    - Manage resource allocation
    - Organize outputs according to experimental structure

This pattern mirrors the organization of algorithms and CLI commands, but adds the standardized abstraction and container orchestration capabilities of the module layer.
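As a rough, runnable sketch of this pattern (class names, methods, and return values are illustrative, not the real module set), each stage consumes the artifacts produced by the previous one:

```python
# Hypothetical module set for one CellProfiler step: load data ->
# pipeline generation -> execution, each stage consuming the last.
class LoadDataModule:
    def run_plan(self, images):
        # Produces a CSV telling CellProfiler where each image lives.
        return {"load_data_csv": "load_data.csv", "images": images}


class PipelineGenModule:
    def run_plan(self, load_data):
        # Produces a .cppipe pipeline file configured for the experiment.
        return {**load_data, "pipeline": "illum_calc.cppipe"}


class ExecModule:
    def run_plan(self, prepared):
        # Would launch CellProfiler over the prepared inputs; here we
        # just report what would run.
        return f"run {prepared['pipeline']} with {prepared['load_data_csv']}"


plan = ExecModule().run_plan(
    PipelineGenModule().run_plan(LoadDataModule().run_plan(["p1_w1_s1.tiff"]))
)
```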
@@ -255,9 +258,9 @@ This example illustrates several important aspects of module implementation:
1. **Module Identity** - The `uid()` method provides a unique identifier
2. **Module Specification** - The `_spec()` method defines inputs, outputs, and metadata, and also sets default values
3. **Compute Graph Creation** - The module uses the `_create_pipe` function to generate a Pipecraft pipeline
4. **CLI Command Construction** - The module constructs a CLI command that will be executed in a container
5. **Container Specification** - The module defines the container image and execution environment
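Put together, the five aspects might look like the following skeleton. All identifiers, the container image, and the CLI command below are hypothetical stand-ins for the real implementation:

```python
class CalcIllumModule:
    """Sketch of the five implementation aspects listed above."""

    @staticmethod
    def uid() -> str:
        # 1. Module identity: a stable, unique identifier.
        return "cp_calc_illum_invoke"

    def _spec(self) -> dict:
        # 2. Specification: inputs, outputs, metadata, and defaults.
        return {
            "inputs": {"images_path": {"type": "path"}},
            "outputs": {"illum_dir": {"type": "path", "default": "illum/"}},
        }

    def _cli_command(self, images_path: str) -> list:
        # 4. CLI command construction: the command run inside the container.
        # (Illustrative subcommand, not the documented CLI surface.)
        return ["starrynight", "illum", "calc", "-i", images_path]

    def _create_pipe(self, images_path: str) -> dict:
        # 3. Compute-graph creation, wrapping 5. the container specification.
        return {
            "container": "ghcr.io/example/starrynight:latest",  # hypothetical image
            "cmd": self._cli_command(images_path),
        }


pipe = CalcIllumModule()._create_pipe("/data/batch1")
```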
### Module Configuration
@@ -305,12 +308,14 @@ The Bilayers project enables algorithm developers to write a single configuratio
StarryNight leverages the Bilayers specification system to standardize its module interfaces. The integration works through several mechanisms:

1. **Schema Download and Synchronization**: StarryNight maintains a local copy of the Bilayers validation schema, which is automatically downloaded from the Bilayers repository.

2. **Standardization**: All modules follow the same specification format, making them predictable and easy to understand.

3. **Interoperability**: Because StarryNight uses the Bilayers specification, there's potential for:
    - Importing Bilayers-compatible tools from other projects
    - Exporting StarryNight modules for use in other Bilayers-compatible systems
    - Leveraging the broader Bilayers ecosystem of tools and interfaces

4. **Automatic UI Generation**: While StarryNight doesn't currently generate Gradio or Jupyter interfaces from these specs, the Bilayers-compliant specifications make this possible in the future.

5. **Validation**: The LinkML-based schema provides robust validation of module specifications, catching configuration errors early.

6. **Documentation**: The structured format ensures that all modules have consistent documentation for their inputs, outputs, and parameters.
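To make the idea of early validation concrete, here is a toy checker, not Bilayers' LinkML machinery, that rejects malformed specs before anything runs:

```python
def validate_spec(spec: dict) -> list:
    """Return a list of human-readable errors; an empty list means the spec passes."""
    errors = []
    for key in ("inputs", "outputs"):
        if key not in spec:
            errors.append(f"missing required section: {key}")
            continue
        for port in spec[key]:
            # Every port must at least name itself and declare a type.
            if "name" not in port or "type" not in port:
                errors.append(f"{key} port must define 'name' and 'type': {port}")
    return errors


good = {
    "inputs": [{"name": "images", "type": "path"}],
    "outputs": [{"name": "illum", "type": "path"}],
}
bad = {"inputs": [{"name": "images"}]}  # no type, and no outputs section
```

Running `validate_spec(bad)` surfaces both problems at configuration time, long before a container is launched, which is the practical payoff of schema-backed specs.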
@@ -511,26 +516,26 @@ In pipeline composition, modules are created and configured individually, but th
Creating a new module set involves implementing classes for each stage of processing:

1. **Plan the Module Set**:
    - Identify the algorithm set to wrap
    - Determine inputs, outputs, and parameters
    - Design the module structure (typically load data, pipeline generation, and execution)

2. **Create Module Classes**:
    - Implement subclasses of `StarryNightModule` for each stage
    - Define unique identifiers and specifications
    - Implement `from_config` methods
    - Create pipeline generation methods

3. **Define Specifications**:
    - Use Bilayers to define inputs, outputs, and parameters
    - Document parameters with clear descriptions
    - Define validation rules
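A `from_config` method from step 2 might be sketched like this (hypothetical class and config keys; the real `StarryNightModule` signature may differ):

```python
class GenLoadDataModule:
    """One stage of a hypothetical module set, built from shared config."""

    def __init__(self, dataset_path: str, out_path: str):
        self.dataset_path = dataset_path
        self.out_path = out_path

    @classmethod
    def from_config(cls, data_config: dict, experiment=None):
        # Derive all module inputs from the shared configuration objects,
        # so callers never wire paths together by hand.
        return cls(
            dataset_path=data_config["dataset_path"],
            out_path=data_config["workspace_path"] + "/loaddata",
        )


mod = GenLoadDataModule.from_config(
    {"dataset_path": "/data/exp1", "workspace_path": "/scratch/exp1"}
)
```

The design intent is that every stage can be constructed from the same configuration objects, keeping module wiring declarative.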
docs/architecture/08_practical_integration.md (11 additions, 8 deletions)
@@ -3,7 +3,7 @@
This document provides a concrete example of how StarryNight's architectural layers work together in practice by examining `exec_pcp_generic_pipe.py`, an example pipeline implementation file that demonstrates the PCP Generic workflow. While the [architecture overview](00_architecture_overview.md) and individual layer documents ([Algorithm](01_algorithm_layer.md), [CLI](02_cli_layer.md), [Module](03_module_layer.md), [Pipeline](04_pipeline_layer.md), [Execution](05_execution_layer.md), [Configuration](06_configuration_layer.md)) explain each architectural layer conceptually, this walkthrough shows how these components integrate in a real workflow.

!!!note "Pedagogical Approach"
    This document deliberately uses the step-by-step implementation in `exec_pcp_generic_pipe.py` to clearly demonstrate individual components and their interactions. This approach:

    - Allows researchers to inspect intermediate results between pipeline stages
    - Matches biological research workflows where verification at each stage is crucial
The three-phase pattern described below (Generate Load Data → Generate Pipeline File → Execute Pipeline) is specific to how StarryNight integrates with CellProfiler. This pattern isn't a requirement of the StarryNight architecture, but rather a practical approach for this particular integration. Other tools may use different patterns while still adhering to the module abstraction.

With the experiment configured, we can now examine one complete pipeline step (CP calculate illumination). Each step follows a consistent three-phase pattern:
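The three phases can be expressed as a small control-flow sketch. The functions below are illustrative stand-ins, not the actual `exec_pcp_generic_pipe.py` code:

```python
def run_step(configure, backend):
    """Run one step as three phases, returning each phase's artifact."""
    load_data = backend(configure("load_data"))               # Phase 1: LoadData CSV
    pipeline = backend(configure("cppipe"))                   # Phase 2: pipeline file
    result = backend(configure("run", load_data, pipeline))   # Phase 3: execute
    return load_data, pipeline, result


# Stand-in configure/backend pair so the control flow is runnable here.
def fake_configure(kind, *deps):
    return (kind, deps)


def fake_backend(node):
    kind, deps = node
    return f"{kind}-artifact"


artifacts = run_step(fake_configure, fake_backend)
```

Each phase's artifact feeds the next, which is why the execution module must locate both the LoadData file and the pipeline file produced earlier.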
@@ -228,12 +228,14 @@ This module finds both the LoadData file and the pipeline file created in the pr
Looking at this example, we can see how all the architecture layers work together across the two main phases:

### Pipeline Composition Phase

1. **Configuration Layer**: `DataConfig` and experiment configuration drive behavior across all layers
2. **Module Layer**: Defines standardized components (like `CPCalcIllumInvokeCPModule`) with specifications and compute graphs
3. **Pipeline Layer**: In this example, we're executing modules one by one, but they can be composed into a complete pipeline as seen in `create_pcp_generic_pipeline`
@@ -346,13 +348,14 @@ This approach enables complex parallel execution patterns, where CP and SBS proc
When implementing your own modules, follow these patterns:

!!!note "Module vs. Algorithm Extension"
    This section focuses on extending StarryNight with new **modules** rather than new algorithms. Modules provide standardized interfaces to existing algorithms, whether those algorithms are part of StarryNight's core or from external tools. To add your own algorithms to StarryNight, see the ["Adding a New Algorithm"](#adding-a-new-algorithm) section below.

1. **Module Structure**: Consider your module's specific requirements:
    - For CellProfiler integrations, use the three-phase pattern shown earlier
    - For other tools, design appropriate module structures based on tool requirements
    - Ensure your modules have clear inputs, outputs, and containerized execution specifications

2. **Registry Integration**: Define a unique ID and register your module in the registry:
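A minimal sketch of such a registry follows; the decorator and dict here are hypothetical, standing in for StarryNight's actual registry mechanism:

```python
MODULE_REGISTRY = {}


def register_module(cls):
    """Class decorator: index the module by its unique ID for later lookup."""
    uid = cls.uid()
    if uid in MODULE_REGISTRY:
        raise ValueError(f"duplicate module id: {uid}")
    MODULE_REGISTRY[uid] = cls
    return cls


@register_module
class MyToolInvokeModule:
    @staticmethod
    def uid():
        return "mytool_invoke"


# Pipelines and UIs can now discover the module by ID instead of importing it.
looked_up = MODULE_REGISTRY["mytool_invoke"]
```

Registering by a stable ID is what lets pipeline composition and UI generation find modules without hard-coding imports.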