Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Project Doctor #158

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from
Open

Project Doctor #158

wants to merge 3 commits into from

Conversation

grayside
Copy link
Contributor

@grayside grayside commented Mar 5, 2018

Here's a reboot on the project doctor work. This is based on #157 and should be reviewed after that.

With some tweaks to the built-in conditions we could merge this as-is, but there are a few things I'd like to address:

Example

screen shot 2018-03-05 at 1 19 30 pm

Uncomfortable Things

  • Extract the two defined conditions into working examples. I found it really difficult to parameterize for container running checks, but if we go with the sync container name in the outrigger.example.yml it will work fine.

Future Enhancements

  • Add a rig project sync:name command to supply the sync volume and sync container name.
  • Whether as rig commands or as gotemplate macros, it's really tricky to script good checks for whether container is running. I'd like to facilitate that so projects can add checks that their required services are live.
  • Integrate with a future rig project sync:doctor command which verifies the health of the sync process. (I'm planning to tackle that to split the current checking rig project sync does as part of setup into a separately callable API method and command.)
  • Concurrently execution doctor checks. This requires some stuff:
    • Addressing timeouts
    • Not outputting the results of the condition execution even in verbose mode as interleaving will be too confusing.
    • Potentially adding something like rig project doctor --analyze=conditionId to support testing that a condition behaves in isolation.

@grayside
Copy link
Contributor Author

  • Rebase onto changes in the Project Scripts API branch
  • See if we can merge more of the Condition struct particulars into the Script object. E.g., generalized forms of Diagnosis, Prescription, and Severity.
    • This will allow simplifying the code in project_doctor a little and move more handling to project_script.go
    • Secondarily, it will allow project scripts to be configured to have a success-but-warning mode, and have error and corrective messages for better user communication about next steps.
    • Third, while the naming conventions are "cute", standardizing naming might be more consistent & learnable.
    • Doing this will probably require a Condition struct to alias or shadow the Script struct

item2 := &Condition{
Id: "sync-volume-missing",
Name: "Sync Volume is Missing",
Test: []string{fmt.Sprintf("$(id=$(docker container ps -aq --filter 'name=^/%s$'); docker top $id &>/dev/null)", syncName)},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be the exact same test as the one that checks to see if the container is running?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it should be docker volume ls $syncName which is pretty easy to do as a shell test (assuming the volume name would get hard coded in outrigger.yml).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working on a PR which adds a rig project sync:name to get that.

conditions["sync-container-not-running"] = item1
}

item2 := &Condition{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add a test to the effect of ps aux | grep unison | $syncName to check for the process on the local machine?

item1 := &Condition{
Id: "sync-container-not-running",
Name: "Sync Container Not Working",
Test: []string{fmt.Sprintf("$(id=$(docker container ps -aq --filter 'name=^/%s$'); docker top $id &>/dev/null)", syncName)},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some potential shell versions assuming you are ok with the sync volume being hard coded in outrigger.yml.

test -z $(docker container ps -aq --filter 'name=^/projectname-sync$') should tell you if the container exists but not much about if it is running.

test -z $(docker container ps -q --filter 'name=^/projectname-sync$') dropping the -a flag should let you know if it is running but can't tell you anything about the processes inside it. I don't think that is different than what the return code from docker top is doing for you but I'm not positive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is docker container ps never has a failing exit code, whereas docker top does.

After creating these examples of conditions, I've since gone on to work on have a more hard-coded set of checks around project sync, I hope to have a PR ready later today/tomorrow.

With all the issues around unison, I want to be able to tell people to upgrade rig and have a comprehensive solution in place for all projects using unison.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants