feature: add release fetcher and dispatcher #104

Conversation

dipankardas011
Contributor

What type of PR is this?

kind/feature

What this PR does / why we need it:

TBD

Which issue(s) this PR fixes:

Fixes #98

Special notes for your reviewer (optional):

@dipankardas011
Contributor Author

dipankardas011 commented Jun 9, 2024

Questions

  • How are we planning to store the list of supported projects? (an environment variable, or a file in the repo?)
  • Found a hard limit of only 500 GitHub variables.
  • How do we update a GitHub repo variable?

Plans

Will use ${{ vars.<> }} to read the GitHub variable. Once we have the stored version, we'll make a curl request to the project's repo to get the current latest version; if it needs updating, we'll update the variable via the GitHub REST API, and once done we'll trigger the pipeline for the respective project.

For the dispatch we need to call the API directly, and then we can use concurrency with a group that can be shared by the subsequent project benchmarking tests.
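For illustration, a minimal curl sketch of that flow (assuming the FALCO_VERSION repository variable and a GH_TOKEN with permission to read/write repo variables, as discussed later in this thread; not the final implementation):

#!/usr/bin/env bash
OWNER=cncf-tags
REPO=green-reviews-tooling

# 1. read the version we last recorded from the repo variable
stored=$(curl -s -H "Authorization: Bearer $GH_TOKEN" \
  "https://api.github.com/repos/$OWNER/$REPO/actions/variables/FALCO_VERSION" | jq -r '.value')

# 2. ask the upstream project for its latest release tag
latest=$(curl -s "https://api.github.com/repos/falcosecurity/falco/releases/latest" | jq -r '.tag_name')

# 3. if the release is new, write it back via the REST API (and then dispatch the project pipeline)
if [ "$latest" != "$stored" ]; then
  curl -s -X PATCH -H "Authorization: Bearer $GH_TOKEN" \
    "https://api.github.com/repos/$OWNER/$REPO/actions/variables/FALCO_VERSION" \
    -d "{\"name\":\"FALCO_VERSION\",\"value\":\"$latest\"}"
fi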

@rossf7
Contributor

rossf7 commented Jun 10, 2024

How are we planning to store the list of supported projects? (an environment variable, or a file in the repo?)

I think we should use a file. I included an example file in the proposal in the subscribing to releases section, but it may need more data such as the GitHub repo URL.
https://github.com/cncf-tags/green-reviews-tooling/blob/main/docs/proposals/proposal-001-trigger-and-deploy.md#subscribing-to-releases

For the location I think it should be /projects/project.yaml, as underneath the projects dir we will have the per-project manifests, e.g. /projects/falco/ebpf.yaml. WDYT?

Found a hard limit of only 500 GitHub variables.

Good to check this. However if we use 1 variable per project I don't think we'll ever hit this limit. Or if we do hit it our automation will need to be more sophisticated and later we can switch to something else for managing state.

How do we update a GitHub repo variable?

Can you use the REST API?

@dipankardas011
Contributor Author

Can you use the REST API?
Forgot to update: I was able to get that working; it's just that we need to use the same curl command to retrieve the variable.

Planning to use a Python script so that, as much as possible, no dependency install is needed.

@dipankardas011
Contributor Author

For the location I think it should be /projects/project.yaml, as underneath the projects dir we will have the per-project manifests, e.g. /projects/falco/ebpf.yaml. WDYT?

SGTM. I also want to specify the organization of the project as well.

@dipankardas011
Contributor Author

@rossf7 Can you create a PAT token for only this repo so that we can see if it works or not?

@dipankardas011
Contributor Author

dipankardas011 commented Jun 11, 2024

Here are the variables I am looking for

We can still change the naming ;)

secrets.REPO_ACCESS_TOKEN: for read and write of the repo variables
vars.FALCO_VERSION: some random string as value
vars.KEPLER_VERSION: some random string as value

@rossf7
Contributor

rossf7 commented Jun 11, 2024

Planning to use a Python script so that, as much as possible, no dependency install is needed.

@dipankardas011 Reasoning makes sense, but as discussed we need agreement that Python is the higher-level language we want to use. I'll discuss with the other leads.

Can you create a PAT token for only this repo so that we can see if it works or not?

I don't have permission to create tokens. Can you test with your fork for now? Once we're aligned on the python question we can request the token.

@dipankardas011
Contributor Author

I don't have permission to create tokens. Can you test with your fork for now? Once we're aligned on the python question we can request the token.

Tested it works 😁

@dipankardas011 dipankardas011 changed the title feature: add release fetcher and dispatcher feature: add release fetcher and dispatcher [WIP] Jun 11, 2024
@dipankardas011 dipankardas011 changed the title feature: add release fetcher and dispatcher [WIP] feature: add release fetcher and dispatcher Jun 12, 2024
@dipankardas011 dipankardas011 marked this pull request as ready for review June 12, 2024 13:30
@dipankardas011 dipankardas011 requested a review from a team as a code owner June 12, 2024 13:30
@dipankardas011 dipankardas011 force-pushed the 98-release-trigger-dispatcher branch 2 times, most recently from 719e71d to ee41331 on June 14, 2024 03:54
Contributor

@rossf7 rossf7 left a comment

@dipankardas011 Thank you for your efforts here and being flexible on using Go for this rather than Python.

As discussed, ideally we would use bash, but the parsing of the releases endpoint response and error handling make this tricky, so I think using Go makes sense.

Design-wise, my main question is how the deploy step will be triggered. I think we should use the REST API as it's just another endpoint to call.

I'll reply separately on the repo config changes.

@rossf7
Contributor

rossf7 commented Jun 14, 2024

Here are the variables I am looking for

secrets.REPO_ACCESS_TOKEN: for read and write of the repo variables
vars.FALCO_VERSION: some random string as value

For REPO_ACCESS_TOKEN we have an existing secret called FLUX_GITHUB_TOKEN. I think we should rename this to GITHUB_TOKEN and we can use it for both tasks.

https://github.com/cncf-tags/green-reviews-tooling/blob/main/.github/workflows/tofu.yaml#L23

FALCO_VERSION: yes, we will need to add this variable. I checked the endpoint and unfortunately there is no "upsert" support, so this will be a setup task when we onboard new projects, which we'll need to document.
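For example, onboarding a project would need a one-time create call, something like this (a sketch only; the variable name follows the convention above and the initial value is a placeholder until the first run):

# one-time setup per project: create the repository variable
curl -s -X POST \
  -H "Authorization: Bearer $GH_TOKEN" \
  -H "Accept: application/vnd.github+json" \
  https://api.github.com/repos/cncf-tags/green-reviews-tooling/actions/variables \
  -d '{"name":"FALCO_VERSION","value":"0.0.0"}'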

@dipankardas011
Contributor Author

Design wise my main question is how the deploy step will be triggered? I think we should use the REST API as its just another endpoint to call.

Yes, I thought about it, but then another problem can arise: since we are directly calling a different workflow, we can't know for sure which parent called the child workflow. So I thought to use the workflow_call method, which I have used previously and it works nicely.
Not sure if that answers your question :)

@dipankardas011
Contributor Author

I checked the endpoint and unfortunately there is no "upsert" support, so this will be a setup task when we onboard new projects, which we'll need to document.

📓 Keep a note

@dipankardas011
Contributor Author

dipankardas011 commented Jun 14, 2024

For REPO_ACCESS_TOKEN we have an existing secret called FLUX_GITHUB_TOKEN. I think we should rename this to GITHUB_TOKEN and we can use it for both tasks.

Does FLUX_GITHUB_TOKEN contain repo:variable:write perms?

I would say let's keep the perms and role for that PAT token specific, so that we know what it is used for.

Also, a general aka all-in-one token doesn't seem correct, as each token should have a very limited scope.

@dipankardas011 dipankardas011 force-pushed the 98-release-trigger-dispatcher branch 2 times, most recently from fcb5fd5 to 36d49eb on June 14, 2024 14:41
@rossf7
Contributor

rossf7 commented Jun 14, 2024

Yes, I thought about it, but then another problem can arise: since we are directly calling a different workflow, we can't know for sure which parent called the child workflow. So I thought to use the workflow_call method, which I have used previously and it works nicely.

@dipankardas011 This should be implemented via a workflow_dispatch event as was agreed in the proposal.
https://github.com/cncf-tags/green-reviews-tooling/blob/main/docs/proposals/proposal-001-trigger-and-deploy.md#trigger

The 3 inputs mean we know the project, version and which falco driver to deploy.

Does FLUX_GITHUB_TOKEN contain repo:variable:write perms?
I would say let's keep the perms and role for that PAT token specific, so that we know what it is used for.
Also, a general aka all-in-one token doesn't seem correct, as each token should have a very limited scope.

Good point. The current token doesn't have permission and better to scope it to the needed permissions.

So we will need to request a new token.

@dipankardas011
Contributor Author

Yes, I thought about it, but then another problem can arise: since we are directly calling a different workflow, we can't know for sure which parent called the child workflow. So I thought to use the workflow_call method, which I have used previously and it works nicely.

@dipankardas011 This should be implemented via a workflow_dispatch event as was agreed in the proposal.
https://github.com/cncf-tags/green-reviews-tooling/blob/main/docs/proposals/proposal-001-trigger-and-deploy.md#trigger

The 3 inputs mean we know the project, version and which falco driver to deploy.

Does FLUX_GITHUB_TOKEN contain repo:variable:write perms?
I would say let's keep the perms and role for that PAT token specific, so that we know what it is used for.
Also, a general aka all-in-one token doesn't seem correct, as each token should have a very limited scope.

Good point. The current token doesn't have permission and better to scope it to the needed permissions.

So we will need to request a new token.

Okay, so you meant instead of calling like
uses: ./.github/......
we should use the GitHub API call instead to trigger the workflow dispatch?

Maybe it's not that different; still, if we do it, we might be better off calling the workflows from the Go script itself.

@rossf7
Contributor

rossf7 commented Jun 14, 2024

Okay, so you meant instead of calling like
uses: ./.github/......
we should use the GitHub API call instead to trigger the workflow dispatch?

Maybe it's not that different; still, if we do it, we might be better off calling the workflows from the Go script itself.

Yes, it's similar, but we can call the workflow via this endpoint:

https://docs.github.com/en/rest/actions/workflows?apiVersion=2022-11-28#create-a-workflow-dispatch-event

So all the logic is contained in the Go script.
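For reference, the dispatch call would look something like this (a sketch only; the input names are assumptions based on the proposal and the workflow file name is the one used in this PR, not a final contract):

# create a workflow_dispatch event for the deploy workflow
curl -s -X POST \
  -H "Authorization: Bearer $GH_TOKEN" \
  -H "Accept: application/vnd.github+json" \
  https://api.github.com/repos/cncf-tags/green-reviews-tooling/actions/workflows/falco.yml/dispatches \
  -d '{"ref":"main","inputs":{"project":"falco","version":"0.38.0","config":"ebpf"}}'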

@dipankardas011
Contributor Author

dipankardas011 commented Jun 15, 2024

@rossf7 should we also add context to each fmt.Errorf so we know which component failed?

I mean error wrapping.

Member

@leonardpahlke leonardpahlke left a comment

We have some Go 🙌 😄 - added a couple of comments. We need to make sure: 1. the names and functions are clear; 2. the project is split into files in a way that makes sense; 3. the general quality of the code is good. See comments.

Comment on lines 31 to 33
func toPtr[T any](v T) *T {
return &v
}
Member

There is no need for this function. We can just use &v.

Member

ran some checks:

scripts/repo-variable-writer.go:36: File is not `gofumpt`-ed (gofumpt)
	body io.Reader, headers map[string]string) (*http.Response, error) {

scripts/repo-variable-writer.go:105: File is not `gofumpt`-ed (gofumpt)

scripts/repo-variable-writer.go:126: File is not `gofumpt`-ed (gofumpt)
	variableName string, workflowFileName string) (*string, map[string]string, error) {

scripts/repo-variable-writer.go:154: File is not `gofumpt`-ed (gofumpt)

scripts/repo-variable-writer.go:196: File is not `gofumpt`-ed (gofumpt)

scripts/repo-variable-writer.go:231: File is not `gofumpt`-ed (gofumpt)

scripts/repo-variable-writer.go:297: File is not `gofumpt`-ed (gofumpt)

scripts/repo-variable-writer.go:317: File is not `gofumpt`-ed (gofumpt)

scripts/repo-variable-writer.go:35:48: `handleHTTPCall` - `timeOut` always receives `time.Minute` (`60000000000`) (unparam)
func handleHTTPCall(method string, url string, timeOut time.Duration,
                                               ^
scripts/repo-variable-writer.go:104:52: unnecessary leading newline (whitespace)

^
scripts/repo-variable-writer.go:153:83: unnecessary leading newline (whitespace)

^
scripts/repo-variable-writer.go:195:80: unnecessary leading newline (whitespace)

^
scripts/repo-variable-writer.go:230:94: unnecessary leading newline (whitespace)

^
scripts/repo-variable-writer.go:296:14: unnecessary leading newline (whitespace)

^
scripts/repo-variable-writer.go:316:45: unnecessary leading newline (whitespace)

^
scripts/repo-variable-writer.go:352:4: unnecessary trailing newline (whitespace)

^
scripts/repo-variable-writer.go:108:15: fmt.Errorf can be replaced with errors.New (perfsprint)
		return nil, fmt.Errorf("the environment variable for github pat is missing")
		            ^
scripts/repo-variable-writer.go:117:6: ST1003: type githubApiEndpointType should be githubAPIEndpointType (stylecheck)
type githubApiEndpointType string
     ^
scripts/repo-variable-writer.go:124:28: ST1003: method genUrlAndHeaders should be genURLAndHeaders (stylecheck)
func (obj *GithubVariable) genUrlAndHeaders(
                           ^

Member


[/Users/lvsp/Projects/GitHub/cncf-tags/green-reviews-tooling/scripts/repo-variable-writer.go:59] - G304 (CWE-22): Potential file inclusion via variable (Confidence: HIGH, Severity: MEDIUM)
    58: func loadProjectMetadata(locProj string) (resp *Projects, err error) {
  > 59: 	_b, err := os.ReadFile(locProj)
    60: 	if err != nil {



Summary:
  Gosec  : dev
  Files  : 1
  Lines  : 373
  Nosec  : 0
  Issues : 1

Member

Usually it is good to split between models, model functions and core logic, so we may want to split this into three files.

We also may want to move this project to cmd/github-x?/* (idk the project name).

  • The model would be the GitHub data models (structs). Right now, these are all over the file and it's not really clear.
  • The model functions would be the logic that collects or integrates the data.
  • The core logic would be the main function, how we read in flags, etc.

Member

We are missing tests. I would rather not merge any code which is not tested.

Member

Every function needs some kind of documentation describing what it does.

// some descriptive text what the function does and how it is used
func handleHTTPCall(method string, url string, timeOut time.Duration,
	body io.Reader, headers map[string]string) (*http.Response, error) {

Member

I expect that this code only reads and does not write anything, right? It does not make any changes to a repository?

Member

It would be good to create some kind of README that explains how to use the project, where it is used, what it does and why it exists.

Contributor Author

@dipankardas011 dipankardas011 Jun 16, 2024

@leonardpahlke Thanks for your review

Will try to update my nvim conf for gofmt.

Contributor Author

Not able to configure gofumpt in either GoLand or nvim.

@leonardpahlke
Member

We need to add some quality checks to the repo. This can be done in another PR: golint, gosec, etc.

Signed-off-by: Dipankar Das <[email protected]>

added parallel processing

Signed-off-by: Dipankar Das <[email protected]>

Update scripts/repo-variable-writer.go

Co-authored-by: Ross Fairbanks <[email protected]>
Signed-off-by: Dipankar Das <[email protected]>

updated the url generator

Signed-off-by: Dipankar Das <[email protected]>

improved the unnecessary copy of slice in for-range

Signed-off-by: Dipankar Das <[email protected]>

refactor cmd/http.go

Signed-off-by: Dipankar Das <[email protected]>

moved the package to a new folder

Signed-off-by: Dipankar Das <[email protected]>

refactor variable and function names

Signed-off-by: Dipankar Das <[email protected]>
Signed-off-by: Dipankar Das <[email protected]>
@leonardpahlke
Member

@dipankardas011 I will propose some edits in a PR and open it against your repository.

@leonardpahlke
Member

@dipankardas011 What is the expected output of the program? Can you write an outline? My assumption: it takes a list of GitHub repositories and outputs a list of latest releases? A release has a bunch of information; which kind of information is required?

@leonardpahlke
Member

leonardpahlke commented Jun 17, 2024

Alright, I was looking at the code. And also the plan we worked on so far.

My understanding, looking at the code:

  1. Load a project manifest JSON from the green-reviews repo which contains the information about all the projects that we would like to deploy. The file reference can be passed as a flag: --repo-manifest https://github.com/cncf-tags/green-reviews-tooling/blob/main/repo-manifests.json
{
  "falco": {
    "repo": "https://github.com/falcosecurity/falco"
  }
}
  2. Loop over all repositories and ?

We need the Helm charts or some other commands that we can execute to deploy Falco. Falco's Helm charts seem to be located here: https://github.com/falcosecurity/charts/tree/master/charts/falco. If there is a new Falco release, we would deploy the Helm chart. So we need a bit more information.

{
  "falco": {
    "v0.38": { // config used for version v0.38 or newer unless a higher version is specified. in case the deployment changes by release
      "repo": "https://github.com/falcosecurity/falco",
      "deployment": [
        "helm repo add falcosecurity https://falcosecurity.github.io/charts",
        "helm repo update",
        "helm install falco falcosecurity/falco --create-namespace --namespace falco"
      ],
      "verify": [],          // to check if the deployment succeeded
      "collectdata": [],     // idk how this is done. maybe automatically
      "destroy": []          // delete any deployed resources
    }
  }
}

You are right, @dipankardas011, we need some kind of storage to keep track of the latest release measured per project. A repository variable could do the trick: "falco:v0.38,xyz:v0.1".
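As a sketch of how such a packed value could be consumed (illustrative only; the format is just the one suggested above):

RELEASES="falco:v0.38,xyz:v0.1"

# print the version recorded for a given project (empty if the project is not tracked)
version_for() {
  echo "$RELEASES" | tr ',' '\n' | awk -F: -v proj="$1" '$1 == proj {print $2}'
}

version_for falco   # -> v0.38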

Right now, if I read the current plan right (plan), we have one pipeline per project (falcosecurity/falco) and use the workflow-dispatch action to trigger our deployment pipeline. Perhaps it's easier to trigger the deployment pipeline once a day. Don't have multiple pipelines. Don't scatter pipelines across repositories. We do not need to measure the minute a release is cut. This would make it easier. We can always manually trigger the pipeline if we wish to do so, or increase the cadence. IMO, I would try to reduce complexity as much as possible and involve more folks in a process only if that is really necessary.

@leonardpahlke
Member

This could be done in a bash script.

@rossf7
Contributor

rossf7 commented Jun 17, 2024

@leonardpahlke Thank you for reviewing. Adding my thoughts on these design changes.

Step 1: yes, the release trigger is to check if there is a new release of Falco. It should call the deploy workflow being added in #105

This was added so we could subscribe to releases without projects needing to call our deploy workflow from their CI/CD. Although they can do that if they wish.

Step 2: The Falco team wished to deploy using Kustomize and created this repo.
https://github.com/falcosecurity/cncf-green-review-testing

There are 3 Falco drivers they wish to test. For #100 my plan is to create an overlay per driver using kustomize.

This could be done in a bash script.

Yes, when I talked with Niki and Antonio, our preference was to use bash and only use Go if absolutely necessary.

I like your idea of always triggering the pipeline once a day for the latest release. This would be useful for testing, since Falco releases do not happen often, and it means we don't need to store state.

I think simplifying the design so we can use bash is the best way to move forward.

@rossf7
Contributor

rossf7 commented Jun 18, 2024

@leonardpahlke @dipankardas011 I've been thinking more about how we could simplify the design, as I think I made it more complex than was necessary :(

Instead of the projects.json I think we could use inputs and default to the Falco values.

Dipankar, I think your previous suggestion to use workflow_call instead of workflow_dispatch to call the deploy workflow is better and means a PAT is not needed to trigger the deploy.

WDYT?

@dipankardas011
Contributor Author

Dipankar, I think your previous suggestion to use workflow_call instead of workflow_dispatch to call the deploy workflow is better and means a PAT is not needed to trigger the deploy.

Sol 1> For that we can make the Go script write a JSON file, then parse it and set environment variables.

Something like this (previous impl):

jobs:
  get-latest-version:
    runs-on: ubuntu-latest
    outputs:
      is-falco-updated: ${{ steps.check-updates.outputs.FALCO }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '^1.22.3'

      - name: build the binary
        # env:
        #   GH_TOKEN: ${{ secrets.REPO_ACCESS_TOKEN }}
        working-directory: cmd
        run: |
          go get -d
          go build -v -o ../proj-trigger .

      - name: trigger updated projects
        # env:
        #   GH_TOKEN: ${{ secrets.REPO_ACCESS_TOKEN }}
        working-directory: cmd
        run: |
          # run the binary built above; it writes /tmp/updates.json with the projects that have new releases
          ../proj-trigger

      - name: Dispatch
        id: check-updates
        env:
          file: /tmp/updates.json # generated by go program
        run: |
          if jq -e --arg val "falco" '.proj_names | index($val) != null' "$file" > /dev/null; then
            echo "Element 'falco' found in the proj_names array."
            echo "FALCO=true" >> "$GITHUB_OUTPUT"
          else
            echo "Element 'falco' not found in the proj_names array."
            echo "FALCO=false" >> "$GITHUB_OUTPUT"
          fi

  falco-proj:
    uses: ./.github/workflows/falco.yml
    needs: ["get-latest-version"]
    if: ${{ needs.get-latest-version.outputs.is-falco-updated == 'true' }}
    secrets: inherit
    with:
      version: ${{ vars.FALCO_VERSION }} # need to check if it gets the updated value

Sol 2> Another option is to have a separate bash script, or to have the Go program write a bash script containing these echo "FALCO=false" >> $GITHUB_OUTPUT lines.
We would then just source it and call a command like
set_updated_projects_envs in the bash:

go run ......
# next step
source ~/${some source} # written by go
set_updated_projects_envs

By this, I believe the envs will be written to $GITHUB_OUTPUT.
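For illustration, the file written by the Go program could look something like this (just a sketch; the function name and project list are placeholders):

# written by the go program, then sourced by the workflow step
set_updated_projects_envs() {
  echo "FALCO=true" >> "$GITHUB_OUTPUT"
  echo "KEPLER=false" >> "$GITHUB_OUTPUT"
}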

Other than that, I'm not sure if we should move entirely to bash scripts (not very comfortable, would need help) and remove the Go script. The decision is still with you both.

@dipankardas011
Contributor Author

Also, I have made more files than necessary (actually I was trying to understand Leo's inputs); maybe that looks too complicated.

@dipankardas011
Contributor Author

@leonardpahlke I have closed this PR and created a new one, as this one would potentially cause review problems. The new PR uses a bash script, drops the idea of storing project versions in GitHub variables, and instead runs every day for all projects irrespective of whether there is a new version.
#105
