flowlord taskmaster

Flowlord schedules and coordinates task dependencies across workflows. It reads tasks from the done topic. Failed tasks can be configured to retry a set number of times before being sent to Slack and/or a retry_failed topic. Successful tasks start their child tasks.

workflow

A workflow consists of one or more phases. Together the phases define how a set of tasks is scheduled and run, and the dependencies between them.

[[Phase]]
task = "topic_name"
dependsOn = ""
rule = "cron=0 * * * *&offset=-4h&job=t2"
retry = 3
template = "?date={yyyy}-{mm}-{dd}"

[[Phase]]
task = "task1"
dependsOn = "topic_name"
rule = "retry_delay=30s"
retry = 3
template = "?hour={yyyy}-{mm}-{dd}T{hh}"

Phase

task: the name of the topic this task will be sent to. It is also the unique name of the task. In addition, a job can be appended to the task name using a colon (:) as a separator (task:job).
dependsOn:
- the name of the parent task
- this task will start after the parent task has completed successfully
- if left blank, this task will only start based on the rule
rule: rules for the task, encoded as query params
- cron: schedule the task based on the cron pattern (see scheduling)
- offset: offsets the current time passed into the template
- files: used in conjunction with a filewatcher to start tasks after a file is written
- require: used in a child task to start the task only if a value is present
- retry_delay: duration to wait before retrying the task
- batching to create multiple jobs
  - for: create jobs covering the range from the current time + offset to the end of the for duration
  - by: iterator when creating tasks. day (default), hour, month
  - meta: create a task for each item in the array passed in as meta.
  - meta_file: path to a file (json,csv) used to supply meta data to the template; each row will create a task.
retry: the number of times a task is retried before being sent to failed_tasks
template: a URL string that is parsed and put into the task's info string when created
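
Because a rule is encoded as query params, it can be decoded with any URL query parser. The following is a minimal Python sketch for illustration only, not flowlord's own code; the helper names parse_rule and split_task are hypothetical.

```python
from urllib.parse import parse_qsl


def parse_rule(rule):
    """Decode a phase rule the same way a URL query string is decoded."""
    # keep_blank_values keeps a bare "key=" instead of silently dropping it
    return dict(parse_qsl(rule, keep_blank_values=True))


def split_task(name):
    """Split "task:job" into (task, job); job is "" when there is no colon."""
    task, _, job = name.partition(":")
    return task, job


print(parse_rule("cron=0 * * * *&offset=-4h&job=t2"))
print(split_task("worker:job-meta"))
```

Each rule key (cron, offset, job, ...) ends up as one dictionary entry, which makes it easy to check a rule string before adding it to a workflow file.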

Template

Templating is used to create dynamic tasks based on the time of the run or on previously run jobs. Template keywords are enclosed in braces {}.

keyword	definition
{yyyy}	year
{dd}	day of month (1-31)
{mm}	month of year (1-12)
{hh}	hour of day (0-23)
{ts}	full timestamp 20060102T150405
{meta:(\w+)}	insert meta data provided in parent task or rule
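
The keyword table can be read as a simple substitution. The sketch below is an illustration in Python, not flowlord's implementation. It assumes zero-padded values (as in the info string examples in this README) and, as a further assumption, leaves unknown meta keys untouched; the name expand is hypothetical.

```python
import re
from datetime import datetime

# template keyword -> strftime layout (zero-padded, an assumption of this sketch)
TIME_FIELDS = {
    "{yyyy}": "%Y",
    "{mm}": "%m",
    "{dd}": "%d",
    "{hh}": "%H",
    "{ts}": "%Y%m%dT%H%M%S",
}


def expand(template, t, meta=None):
    """Fill the time keywords from t and {meta:key} from the meta dict."""
    out = template
    for keyword, layout in TIME_FIELDS.items():
        out = out.replace(keyword, t.strftime(layout))
    meta = meta or {}
    # Unknown meta keys are left as written (assumption of this sketch)
    return re.sub(r"\{meta:(\w+)\}", lambda m: meta.get(m.group(1), m.group(0)), out)


print(expand("?hour={yyyy}-{mm}-{dd}T{hh}", datetime(2020, 2, 5, 13)))
# prints ?hour=2020-02-05T13
```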

The timestamp is derived from the parent task's info string and supports the following params and formats

field	format	example
day	2006-01-02	?day=2020-02-05
date	2006-01-02	?date=2010-10-12
hour	2006-01-02T15	?hour=2000-01-02T13
time	2006-01-02T15:04:05Z07:00	?time=2000-01-02T13:12:15Z
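
To make the formats above concrete, here is a rough Python sketch that pulls a timestamp out of a parent info string. The layouts are strptime equivalents of the Go layouts in the table. The lookup order when several fields are present is an assumption, and info_time is a hypothetical name.

```python
from datetime import datetime
from urllib.parse import parse_qs, urlsplit

# field name -> strptime layout matching the format column above
LAYOUTS = {
    "day": "%Y-%m-%d",
    "date": "%Y-%m-%d",
    "hour": "%Y-%m-%dT%H",
    "time": "%Y-%m-%dT%H:%M:%S%z",
}


def info_time(info):
    """Return the first timestamp found in the info string, or None."""
    params = parse_qs(urlsplit(info).query)
    for field, layout in LAYOUTS.items():  # lookup order is an assumption
        if field in params:
            return datetime.strptime(params[field][0], layout)
    return None


print(info_time("?hour=2000-01-02T13"))
# prints 2000-01-02 13:00:00
```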

rule

definition of how the phase's task/job is to be created

cron

Schedule a task based on a six-field cron expression: * * * * * * (second minute hour mday month wday). The current time is used in the template to generate time-specific tasks. An offset can be added to change the time used in the template.

Example: schedule a task to run every day at 1 AM UTC. If this ran on 2020-01-02 the info string generated would be "?date=2020-01-01T19"

[[Phase]]
task = "topic"
rule = "cron=0 0 1 * * *&offset=-6h"
template = "?date={yyyy}-{mm}-{dd}T{hh}"
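
The arithmetic behind this example can be checked with a short Python sketch. It is illustrative only: parse_duration is a hypothetical helper that understands just the h, m and s units of a Go-style duration.

```python
import re
from datetime import datetime, timedelta

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Parse a Go-style duration such as "-6h" or "1h30m" (h, m, s only)."""
    sign = -1 if text.startswith("-") else 1
    parts = re.findall(r"(\d+(?:\.\d+)?)([hms])", text)
    seconds = sum(float(number) * UNIT_SECONDS[unit] for number, unit in parts)
    return timedelta(seconds=sign * seconds)


run_time = datetime(2020, 1, 2, 1, 0, 0)  # cron fired at 01:00:00 UTC
template_time = run_time + parse_duration("-6h")
print(template_time.strftime("?date=%Y-%m-%dT%H"))
# prints ?date=2020-01-01T19
```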

files

Schedule a task after a specified file is written. This should be used with the filewatcher taskmaster or GCP file watching service. File matching is done using the stdlib filepath.Match, which does not support ** matching. Flowlord will attempt to pull the timestamp from the filepath, which will be used to populate the date-time keywords in the phase's template: {yyyy}, {dd}, {mm}, {hh}. The matching file can be referenced in the phase's template with {meta:file}. The filename can be referenced with {meta:filename}.

[[Phase]]
task = "topic"
rule = "files=/folder/*/*/*.txt"
template = "{meta:file}?opt=true"
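
Since * never crosses a path separator and ** is not supported, the number of directory levels in the pattern has to equal the number in the file path. The Python sketch below approximates that behavior by matching one path segment at a time. It uses fnmatch, whose character-class syntax differs slightly from filepath.Match, so treat it as an illustration of the depth rule only; match_path and file_info are hypothetical names.

```python
from fnmatch import fnmatchcase


def match_path(pattern, path):
    """Match segment by segment so that * never crosses a "/"."""
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    if len(pattern_parts) != len(path_parts):
        return False
    return all(fnmatchcase(seg, pat) for pat, seg in zip(pattern_parts, path_parts))


def file_info(template, path):
    """Substitute the matched path for {meta:file} in the template."""
    return template.replace("{meta:file}", path)


rule = "/folder/*/*/*.txt"
print(match_path(rule, "/folder/2020/01/data.txt"))     # prints True
print(match_path(rule, "/folder/2020/01/02/data.txt"))  # prints False
print(file_info("{meta:file}?opt=true", "/folder/2020/01/data.txt"))
```

A file written one directory deeper than the pattern does not start the task, so each expected depth needs its own phase.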

require

used to indicate a required field or value before starting a child process.

[[Phase]]
task = "child:job"
rule = "require:{meta:file}"
template = "{meta:file}" 
dependsOn = "parent:job"

The example task 'child:job' will only start if the parent job has file data in its meta field.
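
The check can be pictured as a gate in front of the child task. The Python sketch below is an assumption about the behavior described above, not flowlord's code: it treats every {meta:key} named in the rule as required and starts the child only when the parent's meta has a non-empty value for each. The function names are hypothetical.

```python
import re


def required_keys(rule):
    """Collect the meta keys named in a rule such as "require:{meta:file}"."""
    return re.findall(r"\{meta:(\w+)\}", rule)


def should_start(rule, parent_meta):
    """True when every required key has a non-empty value in the parent's meta."""
    return all(parent_meta.get(key) for key in required_keys(rule))


print(should_start("require:{meta:file}", {"file": "/folder/a.txt"}))  # prints True
print(should_start("require:{meta:file}", {}))                         # prints False
```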

batch

batching is a way to create multiple tasks when the phase is run. This can be done with a date range or through different meta data.

for: a duration, relative to the current time, over which jobs are created. Ex: -60h
by: time iterator. Ex: hour, day, month
meta: comma-separated data associated with a key. Each item will generate a new task. Ex: meta=key:a,b,c|key2:1,2,3
meta_file: a line-delimited data file. Each row (line) will generate a new task.
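
One way to read the meta format is as columns that are paired up by position: the first item of every key forms the first task, the second items form the second task, and so on. A small Python sketch of that reading follows. The name parse_meta is hypothetical, and how flowlord treats keys with lists of different lengths is not covered here; this sketch simply stops at the shortest list.

```python
def parse_meta(value):
    """Turn "name:a,b,c|value:1,2,3" into one dict per generated task."""
    columns = {}
    for group in value.split("|"):
        key, _, items = group.partition(":")
        columns[key] = items.split(",")
    # zip pairs the i-th item of every key and stops at the shortest list
    return [dict(zip(columns, row)) for row in zip(*columns.values())]


for meta in parse_meta("name:a,b,c|value:1,2,3"):
    print("?name={name}&value={value}".format(**meta))
# prints ?name=a&value=1, then ?name=b&value=2, then ?name=c&value=3
```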

# run every day at 2:01:00 for multiple items
# generates 3 tasks with the info of 
# ?name=a&value=1 AND ?name=b&value=2 AND ?name=c&value=3
[[Phase]]
task="worker:job-meta"
rule="cron=0 1 2 * * *&meta=name:a,b,c|value:1,2,3" 
template="?name={meta:name}&value={meta:value}&day={yyyy}-{mm}-{dd}"

# run every day at 5:05:00
# generates a task for every line in the file
[[Phase]]
task="worker:job-file"
rule="cron=0 5 5 * * *&meta_file=data.json" 
template="?name={meta:name}&value={meta:value}&day={yyyy}-{mm}-{dd}"

# run every day for the last week 
# generates 8 tasks from today to 7 days ago
[[Phase]] 
task="worker:lastweek"
rule="cron=0 7 7 * * *&for=-168h&by=day"
template="?day={yyyy}-{mm}-{dd}"
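
The count of 8 in the last example comes from including both ends of the range: 168h is 7 days, and today plus each of the 7 previous days gives 8 tasks. The Python sketch below reproduces that count. It is an illustration only; batch_times is a hypothetical name, the ordering of the generated tasks is an assumption, and the month iterator is left out because it needs calendar arithmetic.

```python
from datetime import datetime, timedelta

STEP = {"hour": timedelta(hours=1), "day": timedelta(days=1)}


def batch_times(now, for_hours, by="day"):
    """List the times from now to now + for_hours, both ends included."""
    end = now + timedelta(hours=for_hours)
    forward = for_hours >= 0
    step = STEP[by] if forward else -STEP[by]
    times = []
    t = now
    while (t <= end) if forward else (t >= end):
        times.append(t)
        t += step
    return times


now = datetime(2020, 1, 8, 7, 7, 0)  # cron "0 7 7 * * *" fires at 07:07:00
days = [t.strftime("?day=%Y-%m-%d") for t in batch_times(now, -168, "day")]
print(len(days), days[0], days[-1])
# prints 8 ?day=2020-01-08 ?day=2020-01-01
```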
