address comments from internal review
backmari committed Dec 5, 2024
1 parent 450c9e7 commit 56e23c5
Showing 3 changed files with 106 additions and 47 deletions.
117 changes: 85 additions & 32 deletions docs/developer/architecture/communication_flows.rst
@@ -1,5 +1,9 @@
.. _communication_flows:

.. Note that the mermaid diagrams are styled using some ugly CSS since styling of sequence diagrams
is an open issue: https://github.com/mermaid-js/mermaid/issues/523
CSS hack from: https://stackoverflow.com/questions/63587556/color-change-of-one-element-in-a-mermaid-sequence-diagram
Communication Flows
===================

@@ -30,6 +34,7 @@ the high volume of PV updates, DASMON writes PV:s directly to the PostgreSQL dat
DASMON->>Dasmon listener: Instrument status
Dasmon listener->>Workflow DB: Instrument status
end
%%{init:{'themeCSS':'g:nth-of-type(6) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(2) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };'}}%%

Run status updates
..................
@@ -50,6 +55,7 @@ and when the Streaming Translation Client (STC) completes translation to NeXus.
Dasmon listener->>Workflow DB: Run status
SMS->>Dasmon listener: Translation succeeded
Dasmon listener->>Workflow DB: Run status
%%{init:{'themeCSS':'g:nth-of-type(2) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(5) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };'}}%%


Experiment data post-processing
@@ -76,29 +82,30 @@ configured sequence and the steps included depending on the instrument.
sequenceDiagram
participant STC
participant Workflow Manager
participant Autoreducer
participant Post-Processing Agent
participant ONCat
participant HFIR/SNS File Archive

STC->>Workflow Manager: POSTPROCESS.DATA_READY
Workflow Manager->>Autoreducer: CATALOG.ONCAT.DATA_READY
Autoreducer->>Workflow Manager: CATALOG.ONCAT.STARTED
Autoreducer->>ONCat: pyoncat
Autoreducer->>Workflow Manager: CATALOG.ONCAT.COMPLETE
Workflow Manager->>Autoreducer: REDUCTION.DATA_READY
Autoreducer->>Workflow Manager: REDUCTION.STARTED
Autoreducer->>HFIR/SNS File Archive: reduced data, reduction log
Autoreducer->>Workflow Manager: REDUCTION.COMPLETE
Workflow Manager->>Autoreducer: REDUCTION_CATALOG.DATA_READY
Autoreducer->>Workflow Manager: REDUCTION_CATALOG.STARTED
Autoreducer->>ONCat: pyoncat
Autoreducer->>Workflow Manager: REDUCTION_CATALOG.COMPLETE
Workflow Manager->>Post-Processing Agent: CATALOG.ONCAT.DATA_READY
Post-Processing Agent->>Workflow Manager: CATALOG.ONCAT.STARTED
Post-Processing Agent->>ONCat: pyoncat
Post-Processing Agent->>Workflow Manager: CATALOG.ONCAT.COMPLETE
Workflow Manager->>Post-Processing Agent: REDUCTION.DATA_READY
Post-Processing Agent->>Workflow Manager: REDUCTION.STARTED
Post-Processing Agent->>HFIR/SNS File Archive: reduced data, reduction log
Post-Processing Agent->>Workflow Manager: REDUCTION.COMPLETE
Workflow Manager->>Post-Processing Agent: REDUCTION_CATALOG.DATA_READY
Post-Processing Agent->>Workflow Manager: REDUCTION_CATALOG.STARTED
Post-Processing Agent->>ONCat: pyoncat
Post-Processing Agent->>Workflow Manager: REDUCTION_CATALOG.COMPLETE
%%{init:{'themeCSS':'g:nth-of-type(2) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(5) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(6) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(7) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(10) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(11) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };'}}%%
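
The hand-off between tasks in the diagram above can be sketched as a small lookup: each
``<task>.COMPLETE`` message leads to the ``DATA_READY`` message of the next configured task. This is
only an illustration, not the Workflow Manager implementation. The function ``next_message`` and the
constant ``DEFAULT_TASKS`` are hypothetical names, and the task order is copied from the diagram
(the real sequence is configured per instrument).

```python
from typing import Optional, Sequence

# Task order as drawn in the diagram (assumption: the real order is configured per instrument).
DEFAULT_TASKS = ("CATALOG.ONCAT", "REDUCTION", "REDUCTION_CATALOG")


def next_message(received: str, tasks: Sequence[str] = DEFAULT_TASKS) -> Optional[str]:
    """Return the DATA_READY message to send in response to `received`, if any."""
    if received == "POSTPROCESS.DATA_READY":
        # A new run is ready: start with the first configured task.
        return f"{tasks[0]}.DATA_READY" if tasks else None
    if not received.endswith(".COMPLETE"):
        # STARTED (and any other) messages do not trigger a new task.
        return None
    finished = received[: -len(".COMPLETE")]
    if finished not in tasks:
        return None
    idx = list(tasks).index(finished)
    # The last task in the sequence has no successor.
    return f"{tasks[idx + 1]}.DATA_READY" if idx + 1 < len(tasks) else None
```

With a shorter configured sequence such as ``("REDUCTION",)``, the same function returns ``None``
after ``REDUCTION.COMPLETE``, which mirrors an instrument that skips the cataloging steps.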

Configuring the autoreduction
.............................

In addition to run post-processing, the autoreducers handle updating instrument reduction script
parameters for instruments that have implemented
In addition to run post-processing, Post-Processing Agent handles updating instrument reduction
script parameters for instruments that have implemented
:doc:`autoreduction parameter configuration<../instruction/autoreduction>` at
`monitor.sns.gov/reduction/<instrument>/ <https://monitor.sns.gov/reduction/cncs/>`_.

@@ -107,12 +114,13 @@ parameters for instruments that have implemented
sequenceDiagram
actor Instrument Scientist
participant WebMon
participant Autoreducer
participant Post-Processing Agent
participant HFIR/SNS File archive

Instrument Scientist->>WebMon: Submit form with parameter values
WebMon->>Autoreducer: REDUCTION.CREATE_SCRIPT
Autoreducer->>HFIR/SNS File archive: Update instrument reduction script
WebMon->>Post-Processing Agent: REDUCTION.CREATE_SCRIPT
Post-Processing Agent->>HFIR/SNS File archive: Update instrument reduction script
%%{init:{'themeCSS':'g:nth-of-type(5) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(9) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };'}}%%

Live data visualization
--------------------------
@@ -141,32 +149,38 @@ script can make the results available in WebMon by publishing plots to Live Data
Livereduce->>Livereduce: run processing script
Livereduce->>Live Data Server: HTTP POST
end
%%{init:{'themeCSS':'g:nth-of-type(2) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(6) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };'}}%%

Publish to Live Data Server from autoreduction script
.....................................................

The instrument-specific autoreduction script can include a step to publish plots (in either JSON
format or HTML div) to Live Data Server. The Post-processing Agent repository includes some
format or HTML div) to Live Data Server. The Post-Processing Agent repository includes some
convenience functions for generating and publishing plots in `publish_plot.py
<https://github.com/neutrons/post_processing_agent/blob/main/postprocessing/publish_plot.py>`_.

.. mermaid::

sequenceDiagram
participant Workflow Manager
participant Autoreducer
participant Post-Processing Agent
participant Live Data Server

Workflow Manager->>Autoreducer: REDUCTION.DATA_READY
Workflow Manager->>Post-Processing Agent: REDUCTION.DATA_READY
opt Publish plot
Autoreducer->>Live Data Server: HTTP POST
Post-Processing Agent->>Live Data Server: HTTP POST
end

Display plot from Live Data Server
..................................

Run overview pages (``monitor.sns.gov/report/<instrument>/<run number>/``) will query the Live
Data Server for a plot for that instrument and run number and display it if available.
Data Server for a plot for that instrument and run number and display it, if available.

The Live Data Server database stores a single plot for each combination of instrument and run
number. Publishing a new plot automatically replaces the previous plot. WebMon therefore always
displays the most recently published plot, whether it was published by Livereduce during the run
or by autoreduction after the run has finished.
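
The replace-on-publish behavior can be illustrated with a toy in-memory store keyed by instrument
and run number. This is a sketch of the semantics only, not the Live Data Server code or its REST
API. The class ``PlotStore`` and its methods are hypothetical names.

```python
from typing import Dict, Optional, Tuple

PlotKey = Tuple[str, int]  # (instrument, run number)


class PlotStore:
    """Toy stand-in for the Live Data Server plot table (hypothetical)."""

    def __init__(self) -> None:
        self._plots: Dict[PlotKey, str] = {}

    def publish(self, instrument: str, run_number: int, plot: str) -> None:
        # Publishing overwrites any plot already stored under the same key.
        self._plots[(instrument, run_number)] = plot

    def fetch(self, instrument: str, run_number: int) -> Optional[str]:
        # Returns None when no plot has been published for this run.
        return self._plots.get((instrument, run_number))


store = PlotStore()
store.publish("CNCS", 1234, "<div>live plot</div>")     # e.g. from Livereduce during the run
store.publish("CNCS", 1234, "<div>reduced plot</div>")  # e.g. from autoreduction afterwards
latest = store.fetch("CNCS", 1234)                      # the later plot replaced the earlier one
```

Because the key carries no publisher or timestamp, the store cannot tell the two sources apart,
which is why the last writer wins.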

.. mermaid::

@@ -191,25 +205,64 @@ Diagnostics information is primarily collected by Dasmon listener.
Heartbeats from services
........................

Dasmon listener subscribes to heartbeats from the other services. There is a mechanism for alerting
admins by email when a service has missed heartbeats (needs to be verified that this still works).
Dasmon listener subscribes to heartbeat messages from the other services and stores the last
received status for each service in the database. Post-Processing Agent and Workflow Manager
also include their process ID (PID) in the heartbeat message.

.. mermaid::

flowchart LR
SMS["SMS (per beamline)"]
PVSD["PVSD (per beamline)"]
DASMON["DASMON (per beamline)"]
STC
Autoreducers
PostProcessingAgent["Post-Processing Agent"]
DasmonListener
WorkflowDB[(DB)]
SMS-->|heartbeat|DasmonListener
PVSD-->|heartbeat|DasmonListener
DASMON-->|heartbeat|DasmonListener
STC-->|heartbeat|DasmonListener
Autoreducers-->|heartbeat|DasmonListener
WorkflowManager-->|heartbeat|DasmonListener
DasmonListener-->|heartbeat|DasmonListener
PostProcessingAgent-->|heartbeat, PID|DasmonListener
WorkflowManager-->|heartbeat, PID|DasmonListener
DasmonListener-->WorkflowDB
DasmonListener-.->|if missed 3 heartbeats|InstrumentScientist
classDef externalStyle fill:#faf2e6, stroke:#f2e3cb
class SMS,PVSD,DASMON externalStyle

subgraph Legend
direction LR
Internal["Internal resource"]
External["External resource"]
Internal ~~~ External
end
WorkflowManager ~~~ Internal
style Legend fill:#FFFFFF,stroke:#000000
class External externalStyle

Dasmon listener treats any message sent to a message broker topic with the string "STATUS" in its
name as a heartbeat message. For example, Workflow Manager sends a heartbeat message to
``SNS.COMMON.STATUS.WORKFLOW.0`` every 5 seconds. Dasmon listener also records heartbeats from the
beamline-specific services; for example, the PVSD service at the HFIR beamline CG3 sends heartbeat
messages to the topic ``HFIR.CG3.STATUS.PVSD``. Table 2 lists the services that send heartbeats to
Dasmon listener, along with their message broker topic and heartbeat frequency.

.. list-table:: Table 2: Service heartbeat messages
:widths: 40 40 20
:header-rows: 1

* - Service
- Message broker topic
- Frequency
* - Workflow Manager
- SNS.COMMON.STATUS.WORKFLOW.0
- 5 s
* - Post-Processing Agent
- SNS.COMMON.STATUS.AUTOREDUCE.0
- 30 s
* - DASMON
- <facility>.<instrument>.STATUS.DASMON
- 5 s
* - PVSD
- <facility>.<instrument>.STATUS.PVSD
- 5 s
* - SMS
- <facility>.<instrument>.STATUS.SMS
- 5 s
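
The topic convention and the intervals in Table 2 can be combined into a simple check for services
that have gone quiet. The sketch below is an illustration under stated assumptions, not the Dasmon
listener implementation. The helper names are hypothetical, and interpreting "missed 3 heartbeats"
as "no message for more than three intervals" is an assumption.

```python
from typing import Dict, List, Optional

# Expected heartbeat intervals in seconds, copied from Table 2.
HEARTBEAT_PERIOD_S = {
    "WORKFLOW": 5.0,
    "AUTOREDUCE": 30.0,
    "DASMON": 5.0,
    "PVSD": 5.0,
    "SMS": 5.0,
}
MISSED_LIMIT = 3  # assumption: alert after three intervals without a message


def is_heartbeat_topic(topic: str) -> bool:
    """A topic counts as a heartbeat topic if its name contains "STATUS"."""
    return "STATUS" in topic


def service_name(topic: str) -> Optional[str]:
    """Return the topic component that follows STATUS, e.g. "PVSD"."""
    parts = topic.split(".")
    if "STATUS" not in parts:
        return None
    idx = parts.index("STATUS")
    return parts[idx + 1] if idx + 1 < len(parts) else None


def overdue(last_seen: Dict[str, float], now: float) -> List[str]:
    """Return the topics whose last heartbeat is older than MISSED_LIMIT intervals."""
    late = []
    for topic, seen in last_seen.items():
        period = HEARTBEAT_PERIOD_S.get(service_name(topic) or "")
        if period is not None and now - seen > MISSED_LIMIT * period:
            late.append(topic)
    return sorted(late)
```

For example, with every service last seen at ``t = 100`` and ``now = 120``, a service on a 5 s
interval is overdue, while Post-Processing Agent on its 30 s interval is not.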
30 changes: 15 additions & 15 deletions docs/developer/architecture/overview.rst
@@ -27,7 +27,7 @@ The arrows represent relationships between these services and resources.
WorkflowManager[Workflow Manager]
DasmonListener[Dasmon listener]
Database[(Workflow DB)]
Autoreducers-->ONCat
PostProcessingAgent[Post-Processing Agent]-->ONCat
LiveDataServer-->WebMon
LiveDataServer<-->LiveDataDB[(LiveData DB)]
LiveReduce
@@ -38,16 +38,16 @@ The arrows represent relationships between these services and resources.
DASMON-->DasmonListener
DASMON-->Database
WorkflowManager-->Database
WorkflowManager<-->Autoreducers
Autoreducers-->LiveDataServer
WorkflowManager<-->PostProcessingAgent
PostProcessingAgent-->LiveDataServer
Database-->WebMon
ONCat-->WebMon
LiveReduce-->LiveDataServer
DasmonListener-->Database
TranslationService-->FileArchive
FileArchive<-->Autoreducers
FileArchive<-->PostProcessingAgent
style DAS fill:#D3D3D3, stroke-dasharray: 5 5
classDef externalStyle fill:#FAEFDE, stroke:#B08D55
classDef externalStyle fill:#faf2e6, stroke:#f2e3cb
class DASMON,TranslationService,SMS,FileArchive,ONCat externalStyle

subgraph Legend
@@ -61,12 +61,12 @@ The arrows represent relationships between these services and resources.
class External externalStyle

The gray box labeled "DAS" contains services managed by the Data Acquisition System team that pass
information to WebMon. The autoreducers interact with the HFIR/SNS file archive to access
instrument-specific reduction scripts and experiment data files. The autoreducers also write reduced
data files and reduction log files back to the file archive.
information to WebMon. Post-Processing Agent interacts with the HFIR/SNS file archive to access
instrument-specific reduction scripts and experiment data files. Post-Processing Agent also writes
reduced data files and reduction log files back to the file archive.

Another external component is the experiment data catalog, `ONCat <https://oncat.ornl.gov/>`_, where
the autoreducers catalog experiment metadata. The WebMon frontend retrieves and displays this
Post-Processing Agent catalogs experiment metadata. The WebMon frontend retrieves and displays this
metadata from ONCat.

The section :ref:`communication_flows` includes sequence diagrams that show how the services
@@ -77,8 +77,8 @@ Inter-service communication

WebMon uses an `ActiveMQ <https://activemq.apache.org/>`_ message broker as the main method of
communication between services. The message broker also serves as a load balancer by distributing
post-processing jobs to the available autoreducers in a round-robin fashion. Communication with Live
Data Server and ONCat occurs via their respective REST API:s.
post-processing jobs to the available instances of Post-Processing Agent in a round-robin fashion.
Communication with Live Data Server and ONCat occurs via their respective REST APIs.
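
Round-robin distribution can be pictured as dealing jobs to consumers in turn. The sketch below
shows only the idea; it does not use ActiveMQ, and actual broker dispatch also depends on broker
and consumer settings. The function ``distribute`` and the consumer names are hypothetical.

```python
from itertools import cycle
from typing import Dict, List, Sequence


def distribute(jobs: Sequence[str], consumers: Sequence[str]) -> Dict[str, List[str]]:
    """Assign jobs to consumers in turn, wrapping around when the list is exhausted."""
    assignment: Dict[str, List[str]] = {consumer: [] for consumer in consumers}
    # cycle() repeats the consumer list indefinitely; zip() stops when the jobs run out.
    for job, consumer in zip(jobs, cycle(consumers)):
        assignment[consumer].append(job)
    return assignment


result = distribute(
    ["run1", "run2", "run3", "run4", "run5"],
    ["autoreducer1", "autoreducer2"],
)
```

Here ``autoreducer1`` receives runs 1, 3 and 5 and ``autoreducer2`` receives runs 2 and 4, so the
load is spread evenly regardless of how long each job takes.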

Table 1 lists the type of communication between pairs of services, which are loosely categorized as
"client" and "server" in that interaction.
@@ -90,13 +90,13 @@ Table 1 lists the type of communication between pairs services, which are loosel
* - "Client"
- "Server"
- Communication type
* - Autoreducers
* - Post-Processing Agent
- Dasmon Listener
- Message queue
* - Autoreducers
* - Post-Processing Agent
- Live Data Server
- REST API
* - Autoreducers
* - Post-Processing Agent
- ONCat
- REST API
* - DASMON
@@ -130,7 +130,7 @@ Table 1 lists the type of communication between pairs services, which are loosel
- Workflow Manager
- Message queue
* - Workflow Manager
- Autoreducers
- Post-Processing Agent
- Message queue
* - Workflow Manager
- Dasmon Listener
6 changes: 6 additions & 0 deletions docs/glossary.rst
@@ -75,3 +75,9 @@ Glossary
Workflow
Experiment data post-processing workflow. The available tasks are cataloging, autoreduction
and reduced data cataloging.

Post-Processing Agent
Service that performs post-processing tasks like cataloging and autoreduction.

Autoreducer
Server where an instance of :term:`Post-Processing Agent` is running.
