Repository for maintaining filesync
filesync is a harness for running copier.
DISCLAIMER: This code repository is provided for code-sharing purposes and is provided "as-is". Mezmo, Inc. does not guarantee its functionality. Additionally, some pieces must be supplied for builds to succeed, especially the base container images used for building.
During an update, it performs the following actions, in order:
- Validates its configuration
- Ensures the clone-root directory exists to clone repos into
- Clones the template provided as a command line argument
- Loads and validates the template's config
- Using the template config, determines which repos will have the template applied
- For each repo, performs the following
  - Checks the following to see if the repo needs its templated files updated
    - Does the repo already have an open PR for this version of the template?
      - If yes, skip this repo
    - Has the repo been configured to have the template applied to it?
      - If no, skip this repo
    - Has the latest version of the template already been applied to the repo?
      - If yes, skip this repo
  - Finds and closes any unmerged PRs and their associated branches from older versions of the template
  - Creates a new update branch (See Branch Names for details)
  - Runs copier to apply the latest version of the template to the update branch
  - Pushes the changes applied by copier to the update branch
  - Opens a PR from the update branch
- If autoclean is set (it's set by default), removes all repos from the clone-root directory
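The per-repo checks in the update flow can be sketched in Python roughly as follows. This is only an illustration of the decision order listed here, not filesync's actual code: RepoState, skip_reason, and their fields are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RepoState:
    """Hypothetical snapshot of what an update run knows about one repo."""
    configured_for_template: bool
    applied_template_sha: Optional[str]        # sha of the last applied template version, if any
    open_pr_branches: frozenset = frozenset()  # branch names that currently have an open PR


def skip_reason(state: RepoState, template_sha: str, update_branch: str) -> Optional[str]:
    """Return why the repo is skipped, or None if it should be updated."""
    if update_branch in state.open_pr_branches:
        return "open PR already exists for this template version"
    if not state.configured_for_template:
        return "repo is not configured for this template"
    if state.applied_template_sha == template_sha:
        return "latest template version already applied"
    return None
```

A repo for which skip_reason returns None would continue to the remaining steps (closing old PRs, creating the update branch, running copier, pushing, and opening a PR).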
Branch names are deterministic. They follow this pattern:
branch-prefix/repo-template-name/repo-template-sha
- branch-prefix is configurable, as is the branch-separator (though /, shown above, is the default)
- repo-template-name is the name of the template that was applied to generate the branch
- repo-template-sha is the commit hash of the HEAD of the repo template at the time it was applied
Example:
Since filesync uses a deterministic branch name, it can find and clean up branches it has opened that were never merged. It can also identify the PRs associated with those branches and close them before deleting the branches.
The exception to this is if the repo's branch-prefix or branch-separator configuration changes. In that case, filesync may create a new branch for the same version of a repo's template or fail to clean up old branches and PRs, because the branch names it's looking for no longer match.
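To make the naming scheme and its failure mode concrete, here is a minimal Python sketch. The functions branch_name and stale_branches are hypothetical helpers written for this illustration, and the template name and hashes in the usage note are made up. Only the defaults (filesync and /) come from the template config described in this README.

```python
def branch_name(template_name, template_sha, prefix="filesync", separator="/"):
    """Build the deterministic update branch name."""
    return separator.join([prefix, template_name, template_sha])


def stale_branches(branches, template_name, current_sha, prefix="filesync", separator="/"):
    """Return branches that look like updates from older versions of the same template."""
    stem = separator.join([prefix, template_name]) + separator
    current = branch_name(template_name, current_sha, prefix, separator)
    return [b for b in branches if b.startswith(stem) and b != current]
```

With branches such as filesync/my-template/aaa111 and filesync/my-template/bbb222, calling stale_branches(branches, "my-template", "bbb222") returns only the first one. If the separator is later changed to "-", the same call with separator="-" finds nothing, which is why old branches and PRs can be left behind after a configuration change.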
Currently, the commit message is controlled by filesync/commit_template.py:
ci: update files from {template}
bringing common files in template up to date
template: {template}
branch: {branch}
commit: {commit}
- template is the name of the template that was applied
- branch is the branch of the template (main or master typically)
- commit is the hash of HEAD of the branch
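As an illustration of how the three placeholders are filled in, the sketch below renders the message with str.format. The real contents of filesync/commit_template.py are not shown in this README, so the constant name, the function name, the blank lines between paragraphs, and the use of str.format are all assumptions.

```python
# Assumed layout: subject line, body line, then the three metadata fields.
COMMIT_TEMPLATE = (
    "ci: update files from {template}\n"
    "\n"
    "bringing common files in template up to date\n"
    "\n"
    "template: {template}\n"
    "branch: {branch}\n"
    "commit: {commit}\n"
)


def render_commit_message(template, branch, commit):
    """Fill the placeholders with the template name, its branch, and its HEAD hash."""
    return COMMIT_TEMPLATE.format(template=template, branch=branch, commit=commit)
```

For example, render_commit_message("my-template", "main", "abc1234") produces a message whose subject line is "ci: update files from my-template" (the template name and hash here are made up).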
The Jenkinsfile in a template repo should be configured to run filesync's update command on itself any time changes to it are merged to its main / master branch.
docker run <registry>/<org>/filesync:latest --help
Example:
python filesync --help
Requirements are defined in requirements.txt if you want to pip install -r requirements.txt manually. The preferred method of running natively is with pipx:
git clone git@github.com:mezmo/filesync $WORKDIR/tooling-filesync
cd $WORKDIR/tooling-filesync
pipx install -e .
filesync --help
When running inside a container, the following environment variables must all be set:
- GITHUB_TOKEN†
- GIT_AUTHOR_NAME
- GIT_AUTHOR_EMAIL
- GIT_COMMITTER_NAME
- GIT_COMMITTER_EMAIL
When running natively, if you have a .gitconfig with user.name and
user.email defined, and accessible to Python, only GITHUB_TOKEN is
required.
† The name of the token variable (GITHUB_TOKEN by default) is configurable in
the config YAML you pass at runtime. The others are core to git and cannot be changed.
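A quick pre-flight check for these variables might look like the Python sketch below. The function missing_variables and its in_container flag are hypothetical and not part of filesync. The variable names come from the list in this section, and the check treats empty values as missing.

```python
import os

GIT_IDENTITY_VARS = (
    "GIT_AUTHOR_NAME",
    "GIT_AUTHOR_EMAIL",
    "GIT_COMMITTER_NAME",
    "GIT_COMMITTER_EMAIL",
)


def missing_variables(env, token_variable_name="GITHUB_TOKEN", in_container=True):
    """Return the names of required variables that are unset or empty in env."""
    required = [token_variable_name]
    if in_container:
        # Natively, a .gitconfig with user.name and user.email can stand in for these.
        required.extend(GIT_IDENTITY_VARS)
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    print(missing_variables(os.environ))
```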
$ filesync --help
Usage: filesync [OPTIONS] TEMPLATE COMMAND [ARGS]...

Options:
  --autoclean / --no-autoclean    remove clones from disk after running
  -r, --clone-root TEXT           path to clone repos
  -d, --dry-run                   don't push changes to cloned repos
  -b, --template-branch TEXT      branch of the template to sync from
  -t, --template-config TEXT      path inside the template repo where its
                                  config is stored
  -e, --token-variable-name TEXT  name of the environment variable storing the
                                  GitHub token
  -l, --log-level TEXT
  --logging-config FILE           path to logging_config.yaml
  --config FILE                   Read configuration from FILE.
  --version                       Show the version and exit.
  --help                          Show this message and exit.

Commands:
  onboard  onboard a repo to be updated by a template
  update   update repos already configured for a template
filesync is configured in multiple places via YAML.
Anything that can be passed to filesync via command line options can be configured in a YAML file whose path can be passed via the --config flag.
- autoclean: (default: true) determines whether filesync removes the cloned repos from disk after running. You probably want this to be true, because filesync does not attempt to change branches or pull before running. Setting it to false should probably only be used for troubleshooting or inspecting changes.
- clone-root: (default: /tmp/filesync_clones) is the root directory where repos will be cloned. You DO NOT want this to be $WORKDIR, for the reasons stated in the autoclean description.
- dry-run: (default: false) enable dry-run mode for all repos. dry-run mode is special and has its own configuration sub-section below.
- template-branch: (default: main or master depending on which the template repo uses) which branch of the template should be checked out and run to update destination repos
- template-config: (default: filesync.yaml) the path inside the template where the template's filesync config lives (see Template Config)
- token-variable-name: (default: GITHUB_TOKEN) the name of the environment variable where you've stored your GitHub API token.
- log-level: (default: varies) the log level. If dry-run mode is enabled, defaults to DEBUG. If the sub-command is update, defaults to INFO. If the sub-command is onboard, defaults to ERROR.
- logging-config: allows extra control over logging (see Logging Config)
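The varying default for log-level can be summarized with a small Python sketch. The function default_log_level is hypothetical, and it assumes that an explicitly supplied level wins and that dry-run mode takes precedence over the sub-command. The README lists the defaults but does not spell out that precedence.

```python
def default_log_level(command, dry_run=False, explicit=None):
    """Pick the effective log level for a run."""
    if explicit:
        return explicit.upper()  # assumed: a user-supplied level overrides the defaults
    if dry_run:
        return "DEBUG"
    # Raises KeyError for any sub-command other than the two documented ones.
    return {"update": "INFO", "onboard": "ERROR"}[command]
```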
Lives in the template repo (default location filesync.yaml). Config options:
- answers-file: (default: .copier-answers.yml) the path in the destination repo where the config for how this template is applied to it by copier is stored
- autoscan: (default: False) if enabled, autoscan clones every repo in the default org that isn't a fork and isn't archived. If that repo has an answers-file, it is added to the list of repos that will have the template run against them.
- branch-prefix: (default: filesync) See Branch Names above
- branch-separator: (default: /) See Branch Names above
- dry-run: (default: False) run filesync in dry-run mode
- org: the default GitHub organization if one isn't supplied on the CLI or in the repos list for a repo (see Determining Repos and Orgs)
- repos: The list of repos this template should be applied to. Each repo can be just the name of the repo, or a map with its own config custom to it, whose keys match the ones in the top level of this config.
With the exception of dependency-level, these settings match Python standard
config settings for logging.
These settings are explicitly passed to logging.basicConfig, so arbitrary
supported configuration options for the Python standard logging will not be
passed.
- datefmt (default: "%Y-%m-%d %H:%M:%S") allows timestamp format to be changed independently of the rest of the logging format.
- filename (default: not set) path to the file to write logs. By leaving this unset, logs are printed to the console, which is preferable for running in containers and in Jenkins.
- format (default: "%(asctime)s %(levelname)-7s - %(name)s: %(message)s") The format of the logs.
  Example output:
    2021-06-18 11:40:56 WARNING - FileSync: DRY RUN MODE ENABLED!
    2021-06-18 11:40:56 INFO    - FileSync: Nothing will be changed inside repos. No branches will be created, no PRs will be opened. Cloning and cleanup will happen as needed.
    2021-06-18 11:40:56 INFO    - FileSync: started!
- level (default: info) log level for filesync
- dependency-level (default: warn) log level for imported dependencies. Note that these are manually re-configured in the code, so if dependencies are added, they may not be impacted by dependency-level without code changes.
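The following Python sketch shows one way these settings could map onto the standard library. It is not filesync's implementation: configure_logging and its dependencies parameter are hypothetical, and the list of dependency logger names is left to the caller because this README does not name them.

```python
import logging


def configure_logging(level="info", dependency_level="warn",
                      datefmt="%Y-%m-%d %H:%M:%S",
                      fmt="%(asctime)s %(levelname)-7s - %(name)s: %(message)s",
                      filename=None, dependencies=()):
    """Apply the logging config, then set a separate level for dependency loggers."""
    # filename=None makes basicConfig log to the console instead of a file.
    # force=True (Python 3.8+) replaces any handlers configured earlier.
    logging.basicConfig(level=level.upper(), datefmt=datefmt, format=fmt,
                        filename=filename, force=True)
    for name in dependencies:
        # Each dependency logger must be listed explicitly, which mirrors the
        # caveat that newly added dependencies are not covered automatically.
        logging.getLogger(name).setLevel(dependency_level.upper())
```

Calling configure_logging(dependencies=("some_dependency",)) leaves the root logger at INFO and the logger named some_dependency (a made-up name) at WARNING.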
A repo's org can be defined from org at the top level of the template config, or from org in its own repo config, but it can also be defined in-line with the repo name:
org: a-github-org
repos:
  - some-repo:
      org: a-different-github-org
  - a-third-github-org/some-other-repo
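A minimal Python sketch of how the three ways of specifying an org could be resolved is shown below. The function resolve_repo is hypothetical, and it assumes an in-line org takes precedence over the per-repo org, which in turn takes precedence over the top-level org. The README does not state the precedence, so treat that ordering as a guess.

```python
def resolve_repo(entry, default_org=None):
    """Return (org, repo_name) for one item of the repos list."""
    if isinstance(entry, dict):
        # A map entry has exactly one key (the repo name) whose value is its config.
        (name, config), = entry.items()
        config = config or {}
    else:
        name, config = entry, {}
    org = config.get("org", default_org)
    if "/" in name:
        # In-line form: org/repo
        org, name = name.split("/", 1)
    if org is None:
        raise ValueError(f"no org could be determined for {name!r}")
    return org, name
```

Applied to the parsed form of the example above, {"some-repo": {"org": "a-different-github-org"}} resolves to ("a-different-github-org", "some-repo"), and "a-third-github-org/some-other-repo" resolves to ("a-third-github-org", "some-other-repo"). A bare repo name falls back to the top-level org.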