Initial workflow defs #163
Open: ladinesa wants to merge 7 commits into develop from add-workflows
Changes from 2 commits
0530c27 Initial workflow defs (ladinesa)
4e35a66 Linting fix (ladinesa)
9232147 Add method and results (ladinesa)
0b39f46 Refactor def and normalizer (ladinesa)
81182e8 Refactor single point (ladinesa)
3e46ccc Add md (ladinesa)
175b7af Add phonon workflow (ladinesa)
@@ -0,0 +1,54 @@
import numpy as np
from nomad.datamodel.data import ArchiveSection
from nomad.datamodel.metainfo.annotations import ELNAnnotation
from nomad.metainfo import Datetime, Quantity


class Time(ArchiveSection):
    """
    Contains time-related quantities.
    """

    datetime_end = Quantity(
        type=Datetime,
        description="""
        The date and time when this computation ended.
        """,
        a_eln=ELNAnnotation(component='DateTimeEditQuantity'),
    )

    cpu1_start = Quantity(
        type=np.float64,
        unit='second',
        description="""
        The starting time of the computation on the (first) CPU 1.
        """,
        a_eln=ELNAnnotation(component='NumberEditQuantity'),
    )

    cpu1_end = Quantity(
        type=np.float64,
        unit='second',
        description="""
        The end time of the computation on the (first) CPU 1.
        """,
        a_eln=ELNAnnotation(component='NumberEditQuantity'),
    )

    wall_start = Quantity(
        type=np.float64,
        unit='second',
        description="""
        The internal wall-clock time from the start of the computation.
        """,
        a_eln=ELNAnnotation(component='NumberEditQuantity'),
    )

    wall_end = Quantity(
        type=np.float64,
        unit='second',
        description="""
        The internal wall-clock time from the end of the computation.
        """,
        a_eln=ELNAnnotation(component='NumberEditQuantity'),
    )
@@ -0,0 +1,4 @@
from .general import SimulationWorkflow
from .geometry_optimization import GeometryOptimization
from .gw import DFTGWWorkflow
from .single_point import SinglePoint
@@ -0,0 +1,71 @@
from nomad.datamodel import EntryArchive
from nomad.datamodel.metainfo.workflow import Link, Task, Workflow
from nomad.metainfo.util import MSubSectionList
from structlog.stdlib import BoundLogger

INCORRECT_N_TASKS = 'Incorrect number of tasks found.'


class SimulationWorkflow(Workflow):
    """
    Base class for simulation workflows.
    """

    def normalize(self, archive: EntryArchive, logger: BoundLogger):
        """
        Generate tasks from the archive data outputs.
        """
        if not archive.data or not archive.data.outputs:
            return

        # generate tasks from outputs
        if not self.tasks:
            # default to serial execution
            times: list[tuple[float, float]] = list(
                [
                    (o.wall_start or n, o.wall_end or n)
                    for n, o in enumerate(archive.data.outputs)
                ]
            )
            times.sort(key=lambda x: x[0])
            # current parent task
            parent_n = 0
            parent_outputs: list[Link] = []
            for n, time in enumerate(times):
                task = Task(
                    outputs=[
                        Link(
                            name='Output',
                            section=archive.data.outputs[n],
                        )
                    ],
                )
                self.tasks.append(task)
                # link tasks based on overlap in execution time
                if time[0] >= times[parent_n][1]:
                    # if no overlap, assign outputs of parent as input to next task
                    task.inputs.extend(
                        [
                            Link(name='Input', section=output.section)
                            for output in parent_outputs or task.outputs
                        ]
                    )
                    # assign first parent outputs as workflow inputs
                    if not self.inputs:
                        self.inputs.extend(task.inputs)
                    # assign as new parent
                    parent_n = n
                    # reset outputs
                    parent_outputs = list(task.outputs)
                else:
                    parent_outputs.extend(task.outputs)
                    # if overlap, assign parent outputs to task inputs
                    task.inputs.extend(
                        [
                            Link(name='Input', section=output.section)
                            for output in self.tasks[parent_n or n].outputs
                        ]
                    )
            if not self.outputs:
                # assign parent outputs as workflow outputs
                self.outputs.extend(parent_outputs)
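The overlap rule used by this normalizer can be illustrated without NOMAD. The following is a simplified sketch, not the code in this PR: `link_by_overlap` and the task names are hypothetical, tasks are identified by name instead of `Link` objects, and the first task is assumed to have no inputs.

```python
from typing import Optional


def link_by_overlap(
    intervals: dict[str, tuple[float, float]],
) -> dict[str, list[str]]:
    """Map each task name to the names of the tasks whose outputs it consumes.

    Hypothetical helper: `intervals` holds (start, end) wall times per task.
    """
    ordered = sorted(intervals, key=lambda name: intervals[name][0])
    inputs: dict[str, list[str]] = {}
    parent: Optional[str] = None    # task that opened the current stage
    stage_outputs: list[str] = []   # tasks whose outputs feed the next stage
    for name in ordered:
        start, _ = intervals[name]
        if parent is None:
            # assumption: the earliest task has no upstream task
            inputs[name] = []
            parent, stage_outputs = name, [name]
        elif start >= intervals[parent][1]:
            # no overlap with the parent: consume everything the stage produced
            inputs[name] = list(stage_outputs)
            parent, stage_outputs = name, [name]
        else:
            # overlap with the parent: consume the parent's output and
            # contribute to what the next serial task receives
            inputs[name] = [parent]
            stage_outputs.append(name)
    return inputs


example = link_by_overlap(
    {
        'relax': (0.0, 10.0),
        'scf': (10.0, 20.0),
        'bands': (12.0, 18.0),
        'dos': (20.0, 25.0),
    }
)
print(example)
# {'relax': [], 'scf': ['relax'], 'bands': ['scf'], 'dos': ['scf', 'bands']}
```

Here `bands` starts before `scf` ends, so it is attached to `scf` instead of being chained after it, and `dos` receives the outputs of both.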
33 changes: 33 additions & 0 deletions
src/nomad_simulations/schema_packages/workflow/geometry_optimization.py
@@ -0,0 +1,33 @@
from nomad.datamodel import EntryArchive
from nomad.datamodel.metainfo.workflow import Link, Task
from structlog.stdlib import BoundLogger

from .general import SimulationWorkflow


class GeometryOptimization(SimulationWorkflow):
    """
    Definitions for geometry optimization workflow.
    """

    def normalize(self, archive: EntryArchive, logger: BoundLogger) -> None:
        """
        Specify the inputs and outputs of the tasks as the model system.
        """
        super().normalize(archive, logger)

        def to_system_links(task: Task) -> None:
            task.inputs = [
                Link(name='Input system', section=link.section.model_system_ref)
                for link in task.inputs
                if link.section and link.section.model_system_ref
            ]
            task.outputs = [
                Link(name='Output system', section=link.section.model_system_ref)
                for link in task.outputs
                if link.section and link.section.model_system_ref
            ]

        to_system_links(self)
        for task in self.tasks:
            to_system_links(task)
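The relinking step can be tried in isolation with plain dataclasses. This is a minimal sketch under stated assumptions: every `Sketch*` class is a hypothetical stand-in for the corresponding NOMAD section, and inputs and outputs are each mapped from their own list of links.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchSystem:
    """Hypothetical stand-in for a model system section."""
    label: str


@dataclass
class SketchOutputs:
    """Hypothetical stand-in for an outputs section that references a system."""
    model_system_ref: Optional[SketchSystem] = None


@dataclass
class SketchLink:
    """Hypothetical stand-in for a workflow link."""
    name: str
    section: Optional[object] = None


@dataclass
class SketchTask:
    """Hypothetical stand-in for a task with input and output links."""
    inputs: list[SketchLink] = field(default_factory=list)
    outputs: list[SketchLink] = field(default_factory=list)


def system_links(links: list[SketchLink], name: str) -> list[SketchLink]:
    """Replace each link by a link to the system its section references."""
    return [
        SketchLink(name=name, section=link.section.model_system_ref)
        for link in links
        if link.section is not None and link.section.model_system_ref is not None
    ]


def to_system_links(task: SketchTask) -> None:
    """Point the inputs and outputs of a task at model systems."""
    # build both lists before assigning, so neither depends on the other
    new_inputs = system_links(task.inputs, 'Input system')
    new_outputs = system_links(task.outputs, 'Output system')
    task.inputs, task.outputs = new_inputs, new_outputs
```

Links whose section has no system reference are dropped, which mirrors the filter in the list comprehensions above.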
@@ -0,0 +1,37 @@
from nomad.datamodel import EntryArchive
from structlog.stdlib import BoundLogger

from .general import INCORRECT_N_TASKS, SimulationWorkflow


class DFTGWWorkflow(SimulationWorkflow):
    """
    Definitions for GW calculation based on DFT workflow.
    """

    def normalize(self, archive: EntryArchive, logger: BoundLogger) -> None:
        """
        Link the DFT and GW single point workflows in the DFT-GW workflow.
        """
        super().normalize(archive, logger)

        if not self.name:
            self.name: str = 'DFT+GW'

        if len(self.tasks) != 2:
            logger.error(INCORRECT_N_TASKS)
            return

        if not self.inputs:
            # set inputs to inputs of DFT
            self.inputs.extend(self.tasks[0].task.inputs)

        if not self.outputs:
            # set outputs to outputs of GW
            self.outputs.extend(self.tasks[1].task.outputs)

        # link dft and gw workflows
        self.tasks[0].inputs = self.inputs
        self.tasks[0].outputs = self.tasks[0].task.outputs
        self.tasks[1].inputs = self.tasks[0].outputs
        self.tasks[1].outputs = self.outputs
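The chaining of the two steps can be sketched with plain dataclasses. This is an illustration only: `SketchWorkflow`, `SketchTaskReference`, and `link_dft_gw` are hypothetical names, and links are reduced to strings.

```python
from dataclasses import dataclass, field


@dataclass
class SketchWorkflow:
    """Hypothetical stand-in for a single point workflow entry."""
    name: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


@dataclass
class SketchTaskReference:
    """Hypothetical stand-in for a task that points at another workflow."""
    task: SketchWorkflow
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


def link_dft_gw(
    dft: SketchWorkflow, gw: SketchWorkflow
) -> tuple[SketchTaskReference, SketchTaskReference]:
    """Chain DFT into GW: the DFT outputs become the GW inputs."""
    dft_ref = SketchTaskReference(task=dft)
    gw_ref = SketchTaskReference(task=gw)
    # the combined workflow starts from the DFT inputs ...
    dft_ref.inputs = list(dft.inputs)
    dft_ref.outputs = list(dft.outputs)
    # ... and ends with the GW outputs, with DFT feeding GW in between
    gw_ref.inputs = list(dft_ref.outputs)
    gw_ref.outputs = list(gw.outputs)
    return dft_ref, gw_ref
```

In this sketch the inputs that the GW workflow declares for itself are ignored on the reference, since the reference only describes how GW sits inside the combined workflow.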
40 changes: 40 additions & 0 deletions
src/nomad_simulations/schema_packages/workflow/single_point.py
@@ -0,0 +1,40 @@
from nomad.datamodel import EntryArchive
from nomad.datamodel.metainfo.workflow import Link
from structlog.stdlib import BoundLogger

from .general import INCORRECT_N_TASKS, SimulationWorkflow


class SinglePoint(SimulationWorkflow):
    """
    Definitions for single point workflow.
    """

    def normalize(self, archive: EntryArchive, logger: BoundLogger) -> None:
        """
        Specify the method and system as inputs.
        """
        super().normalize(archive, logger)
        if len(self.tasks) != 1:
            logger.error(INCORRECT_N_TASKS)
            return

        if not self.inputs:
            self.inputs.extend(self.tasks[0].inputs)

        inps: list[Link] = []
        for inp in self.inputs:
            if inp.section and inp.section.model_system_ref:
                inps.append(
                    Link(name='Input system', section=inp.section.model_system_ref)
                )
            if inp.section and inp.section.model_method_ref:
                inps.append(
                    Link(name='Input method', section=inp.section.model_method_ref)
                )
        self.inputs.clear()
        self.inputs.extend(inps)

        # reconnect the task inputs, since the workflow inputs were redefined
        self.tasks[0].inputs.clear()
        self.tasks[0].inputs.extend(inps)
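The expansion of each input into a system link and a method link can be sketched as follows. The names `SketchOutputs` and `expand_inputs` are hypothetical, references are reduced to strings, and links are reduced to `(name, target)` pairs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchOutputs:
    """Hypothetical stand-in for an outputs section with two references."""
    model_system_ref: Optional[str] = None
    model_method_ref: Optional[str] = None


def expand_inputs(
    sections: list[Optional[SketchOutputs]],
) -> list[tuple[str, str]]:
    """Turn each referenced section into up to two (link name, target) pairs."""
    links: list[tuple[str, str]] = []
    for section in sections:
        if section is None:
            continue  # a link without a section contributes nothing
        if section.model_system_ref is not None:
            links.append(('Input system', section.model_system_ref))
        if section.model_method_ref is not None:
            links.append(('Input method', section.model_method_ref))
    return links
```

A section that references both a system and a method yields two links, so the workflow can end up with more inputs than it started with.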
Why does Outputs need to inherit from Time? Is there some situation where the simulation package is outputting such detailed timing info?
I guess maybe you are thinking of the workflows. Is the plan to reuse the output class directly for workflows?
The workflow tasks are built from the outputs, and the time information is needed to determine the order of the tasks. Several codes output the time for each step of a geometry optimization, for example.
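As a rough illustration of that ordering, here is a sketch that sorts output indices by start time and falls back to the list position when no time is stored. The helper name `execution_order` is hypothetical, and mixing positions with real times is only a placeholder convention.

```python
from typing import Optional


def execution_order(wall_starts: list[Optional[float]]) -> list[int]:
    """Return output indices sorted by start time.

    Hypothetical helper: a missing start time falls back to the list position.
    """
    keyed = [
        (start if start is not None else float(n), n)
        for n, start in enumerate(wall_starts)
    ]
    # sorting the (time, index) pairs keeps each time attached to its output
    return [n for _, n in sorted(keyed)]


print(execution_order([30.0, 0.0, 12.5]))  # [1, 2, 0]
```

Testing with `is not None` rather than truthiness keeps a legitimate start time of `0.0` from being replaced by the fallback, and returning indices lets the caller look up the matching output after sorting.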
Indeed, the task link, at least for the general simulation workflow class, references the output. For geometry optimization, I do a renormalization to extract the system, as I feel it is the natural input and output. Simulation.outputs, as I understand it, will not be extended to cover workflow methods and results, right? So I would still like to put the workflow-related quantities under workflow.method and workflow.results, as in the old def.
Hmm, ok. I am not against this; I am just trying to understand.
I think I will put this on the agenda for tomorrow's meeting, and then we can resolve this discussion clearly. (You can probably continue as you see fit until then; it is unlikely that we will override things that you deem necessary.)
This is for the automated workflows. Sure, we can discuss them in our meeting.