Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement Request] Enabling testing of non-cloud packages with the latest software version. #1902

Open
agithomas opened this issue Jun 12, 2024 · 12 comments

Comments

@agithomas
Copy link
Contributor

agithomas commented Jun 12, 2024

Many of the integration package system tests do not use the product's latest (defined in variants.yml) docker image for testing.

The integration packages must be tested against all the latest software versions to remain compatible.

To tackle this, the below mentioned approach is expected

  • Developers, as part of the local system run (elastic-package system test) are intimated there exist a major version of the product available.
  • Developers are expected to do one of the following approach
    • Update variants.yml with the information of the latest (major version) of the software package. If the test is successful, create a PR to permanently include the major version of the software in the variants.yml
    • Add entry in the validations.yml to suppress the notification.

Current Environment.

  • Elastic-package is the most used tool by an integration developer as part of package development and testing.
  • Elastic-package has capabilities to validate the quality of the integration package. If a specific package does not meet the required capabilities, exclusions can be added to validation.yml file of the specific package
  • Elastic-package tool is aware of the folder structure of the integration package through package-spec

Requirement

  • Include (non-cloud) software version check as part of the Integration system test

Sample

        DOCKERFILE=Dockerfile
        IMAGE_LINE=$(grep -E '^FROM' $DOCKERFILE)
        IMAGE_NAME=$(echo $IMAGE_LINE | awk -F':' '{print $1}' | awk '{print $2}')
        IMAGE_TAG=$(echo $IMAGE_LINE | awk -F':' '{print $2}')

        echo "Image name: $IMAGE_NAME"
        echo "Current version: $IMAGE_TAG"

        LATEST_VERSION=$(curl -s "https://registry.hub.docker.com/v2/repositories/library/$IMAGE_NAME/tags?page_size=100" | jq -r '.results[].name' | grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' | sort -V | tail -n1)

        echo "Latest version: $LATEST_VERSION"

        echo "::set-output name=image-name::$IMAGE_NAME"
        echo "::set-output name=current-version::$IMAGE_TAG"
        echo "::set-output name=latest-version::$LATEST_VERSION"
        
  • Add a new validation, with a validation code such that - if the package is not using the latest version of the software, an error is reported. This error can be ignored by adding a validation exclusion in the validation.yml file
  • Record the test outcome as an artifact file.
--report-format json --report-output file
--report-format xUnit --report-output file

Note:
The existing validation settings in elastic-package make use of elastic's package-spec. Since these settings are only related to test (system test) , making use of package-spec validation approach may not be needed i guess. However, making use of a common (validations.yml) file may be helpful

@jsoriano
Copy link
Member

I don't think we should do this in PR's CI. This would make builds unpredictable.

But we could have jobs explicitly dedicated to this detection. For example:

  • A daily job with the proposed algorithm that pings the team owning a package where an outdated image is found.
  • Use something like updatecli to automatically open PRs with the updates, pinging the package owners as reviewers. Thanks @mrodm for this idea.

In any case we need to take into account that each package can have the versions defined in different ways and places. The image version could be directly in the FROM, in an ARG or in a variants file. There may be docker compose files or Dockerfiles with multiple images referenced, each one with its own version.
We would also need to think in packages using variants what would be the variant to check or update.

We could make enhancements iteratively, first versions hard-coded in FROM, then versions in ARG, and then variants.

We would also need to discuss if this is implemented in Buildkite, or as an elastic-package subcommand. On a first thought I would prefer adding this in Buildkite, at least to start.

@agithomas
Copy link
Contributor Author

@shmsr has worked on a POC to detect - if there exists a major version of a software available.

Output shared below.

Screenshot 2024-06-07 at 11 28 41 AM

Some parts of the codebase of this project can be used here, i guess.

@shmsr , when you are good to share the details, please tag the PR with this issue

@agithomas
Copy link
Contributor Author

I don't think we should do this in PR's CI. This would make builds unpredictable.

The intention is not to block the CI. Adding a new flag may also be considered to run with system test.

A warning message may be shown so that the developers when working on a package are informed can extend the test coverage to the newly available software version. Since only the major version of the package is only targetted, this may not be too much distracting.

The second is elastic-package would have access to private repo of images such as docker.elastic.co, which might be a bit challenging when it comes to buildkite scheduled run.

A daily job with the proposed algorithm that pings the team owning a package where an outdated image is found.

Agree. This is what we have in mind. An artifact file may contain the information.

Use something like updatecli to automatically open PRs with the updates, pinging the package owners as reviewers.

@shmsr is working on this approach, but using dependabot, Github actions i believe.

@shmsr
Copy link
Member

shmsr commented Jul 2, 2024

@jsoriano @agithomas

I'm sorry for not getting back to you sooner. I had a lot on my plate the last couple of weeks, and that, along with the Spacetime in between, caused my late reply.

@jsoriano:

  • A daily job with the proposed algorithm that pings the team owning a package where an outdated image is found.
  • Use something like updatecli to automatically open PRs with the updates, pinging the package owners as reviewers. Thanks @mrodm for this idea.

Agree with this. Yes, how we open PRs is not finalized, I can look at updatecli. At a quick glance, looks like it can help us but not sure how much flexibility is given. Even if it is not suitable for us, we can create our separate GH action with the custom tool.

The problem with images we use is that it is not always semver. If you look at the screenshot in one of the previous comments of the tool I wrote, it gives the version of memcached as 1.16.28-alpine and not something like 1.16.29-bookworm (or bookworm or latest). The tool is capable of matching the custom pattern and sort the list based on that and finding the latest among the same version pattern.

@jsoriano:

In any case we need to take into account that each package can have the versions defined in different ways and places. The image version could be directly in the FROM, in an ARG, or in a variants file. There may be docker compose files or Dockerfiles with multiple images referenced, each one with its own version.

Totally, agreed. Version is defined in different ways in different packages. Initially, I also struggled while developing the tool. So, I decided my tool should have a config file for uncommon cases because no matter how much I wrote custom logic to parse versions in different ways, it sure's gonna break (if not today, but someday). Versions are spread across, FROM, ARG, in docker-compose files. Here is how I managed to create a config file: https://github.com/elastic/obs-integrations-robots/blob/bd9e07f2a13de266eb4f2ba01ad66305b8aba232/packages/svc-ver/hacking/config-subset-maker/main.go#L44 (FROM, ARG). Note that the script is very hacky, I used to quickly create a mapping for packages and service versions.

In my opinion, it's easy to address these issues and ensure consistency across packages. We can either resolve them ourselves or request that the package owners do so. Essentially, we should ask them to strictly use the variants.yaml file. Many packages are already using variants.yaml, so it would be best if the others adopt it and use it in a consistent approach. This way, we will only need to refer to variants.yml and nothing else.

Also, not only that. There are cases, where getting the base image of Dockerfile and just upgrading that won't work as they are not directly related to the service we are testing. For example, see vsphere. Here the base image is golang but what we are interested in upgrading is this. There are not many cases like this, but having a config file to track them could be useful.

@jsoriano:

We would also need to think in packages using variants what would be the variant to check or update.

Currently, I am using the default variant and extracting the version from the same. See: https://github.com/elastic/obs-integrations-robots/blob/bd9e07f2a13de266eb4f2ba01ad66305b8aba232/packages/svc-ver/variants.go#L109

Here is the tool: https://github.com/elastic/obs-integrations-robots/tree/main/packages/svc-ver

The tool can be configured to only look for major upgrades or major.minor upgrades. As major, minor, etc. is associated with semver versions, the update available is only set to true if there is a new version (major/major-minor) and is semver.

@agithomas:

A warning message may be shown so that the developers when working on a package are informed can extend the test coverage to the newly available software version. Since only the major version of the package is only targetted, this may not be too much distracting.
The second is elastic-package would have access to private repo of images such as docker.elastic.co, which might be a bit challenging when it comes to buildkite scheduled run.

Yes, Agi and I agreed that could be a nice addition.

@agithomas:

@shmsr is working on this approach, but using dependabot, Github actions i believe.

Not dependabot. But yes, GH actions. I'll look at the suggestion for updatecli. This could be helpful.

@shmsr
Copy link
Member

shmsr commented Jul 11, 2024

@jsoriano, I'd like to get your thoughts on the proposal. Should parts of https://github.com/elastic/obs-integrations-robots/tree/main/packages/svc-ver be included in the elastic-package so that the elastic-package itself can warn about the latest available major image?

Additionally, what are your thoughts on the proposal to have every package use variants.yml to standardize and enable tools like svc-ver and elastic-package to extract versions correctly?

I haven't had the time to look at updatecli yet, but I will explore it.

@jsoriano
Copy link
Member

https://github.com/elastic/obs-integrations-robots/tree/main/packages/svc-ver

I am getting 404s when accessing this repo, has it been removed or renamed?

@jsoriano, I'd like to get your thoughts on the proposal. Should parts of https://github.com/elastic/obs-integrations-robots/tree/main/packages/svc-ver be included in the elastic-package so that the elastic-package itself can warn about the latest available major image?

Each package will have different ways to update their services, and different places where to check the new versions from. If we include something in elastic-package, it should be generic enough so it can be used in all cases. This means that this should work for versions in any file, and in any format, and for versions obtained from any source. This is mostly supported by updatecli, maybe we could embed or wrap this tool.

We would also need to decide when elastic-package checks these versions. Maybe as a subcommand of elastic-package service?

Another option can be to use svc-ver in a daily job in the integrations repository, for packages adapted to use it.

Additionally, what are your thoughts on the proposal to have every package use variants.yml to standardize and enable tools like svc-ver and elastic-package to extract versions correctly?

I don't see any reliable way to enforce this, and without enforcement we will still need the coordination with the owners of the packages. I would say that any owner team can decide how to keep their version updated.
updatecli is generic enough to make replacements on any file, but it needs to be configured for that.

@shmsr
Copy link
Member

shmsr commented Jul 15, 2024

@jsoriano The repo was mistakenly private; now I have made it internal to Elastic. My bad. You should be able to access it now.

Each package will have different ways to update their services, and different places where to check the new versions from. If we include something in elastic-package, it should be generic enough so it can be used in all cases. This means that this should work for versions in any file, and in any format, and for versions obtained from any source. This is mostly supported by updatecli, maybe we could embed or wrap this tool.

We would also need to decide when elastic-package checks these versions. Maybe as a subcommand of elastic-package service?

Adding a subcommand sounds good to me. I am also exploring updatecli; so I'll check it once myself as well.

I don't see any reliable way to enforce this, and without enforcement we will still need the coordination with the owners of the packages. I would say that any owner team can decide how to keep their version updated.
updatecli is generic enough to make replacements on any file, but it needs to be configured for that.

Hmm, let me revisit this with something more concrete. I don't have strong opinions at the moment.

@shmsr
Copy link
Member

shmsr commented Jul 29, 2024

@jsoriano I did take a look at updatecli last week. Although seems like a nice tool after running and reading about the tool; I don't think we would get much for our use case unless we fork and add functionality ourselves.

For example, the Autodiscovery plugin for Dockerfile and docker-compose are there but the support is limited.

See:

  1. Add Dockerfile Autodiscovery updatecli/updatecli#1003
  2. Add Dockerfile Autodiscovery updatecli/updatecli#1003 (comment) (It does support a couple of few simple cases)

PS: Running update-cli on the root of integrations repo, results in panic in update-cli's process.

What do you think? Is there something that I might have missed?

@jsoriano
Copy link
Member

@shmsr not sure if we need this autodiscovery plugin. I tried a POC for the mysql variant and it seems to be possible to automate the update of versions.

PS: Running update-cli on the root of integrations repo, results in panic in update-cli's process.

We are using updatecli without problems in this and other repos, I haven't seen any panic yet 🤔

You can find the code for the POC here: elastic/integrations#10652
A sample dry-run execution here: https://github.com/elastic/integrations/actions/runs/10159694770/job/28094458530?pr=10652
image

This approach can be probably applied to many services in this repo, independently of how they configure their versions. We could try to find ways to autogenerate these configs as they are going to look mostly the same for many services. Updatecli supports different ways to parameterize its executions, maybe we can have a single file with different values per service.

@shmsr
Copy link
Member

shmsr commented Jul 30, 2024

@jsoriano Thanks for sharing this. I'll take a look again.

@shmsr
Copy link
Member

shmsr commented Aug 9, 2024

Hi @jsoriano, I apologize for the delayed response. I was occupied with other assigned tasks and couldn't dedicate time to this earlier. Thank you for sharing the example.

I took the example you shared and tried playing with it and now I am convinced that we should go with updatecli as it gives us more flexibility (although per package, we have to define the config).

Here's an improved version of the example you shared. It now handles only major version bumps as we wanted and some more improvements:

---
name: Bump latest MySQL service test version
pipelineid: 'bump-latest-mysql-variant-version'

sources:
  latest:
    kind: dockerimage
    spec:
      image: mysql
      architecture: "linux/amd64"
      kind: semver
      tagfilter: '^\d+.\d+.\d+$'
      versionfilter:
        kind: semver

  getCurrentVersion:
    kind: yaml
    spec:
      file: "packages/mysql/_dev/deploy/variants.yml"
      key: '$.variants.mysql_8.IMAGE'
    transformers:
      - trimprefix: 'mysql:'

conditions:
  isCurrentDockerImagePublished:
    name: |
      Is the Docker Image: mysql:{{ source `getCurrentVersion` }} published
    kind: dockerimage
    sourceid: getCurrentVersion
    spec:
      image: mysql
  isLatestDockerImagePublished:
    name: |
      Is the Docker Image: mysql:{{ source `latest` }} published
    kind: dockerimage
    sourceid: latest
    spec:
      image: mysql
  isAMajorUpgrade:
    kind: shell
    disablesourceinput: true
    spec:
      command: |
        [ $(echo {{ source `latest`  }} | cut -d. -f1) -gt $(echo {{ source `getCurrentVersion` }} | cut -d. -f1) ] && exit 0 || exit 1
targets:
  mysqlTag:
    name: "upgrade mysql variant"
    kind: yaml
    disablesourceinput: true
    spec:
      file: "packages/mysql/_dev/deploy/variants.yml"
      key: '$.variants.mysql_8.IMAGE'
      value: 'mysql:{{ source `latest` }}'

Even though updatecli's docs can be difficult to understand and some aspects can only be understood from its source code, considering the flexibility it offers, I believe updatecli is a better option. The tool I have developed is simple to use with minimal configuration. However, the challenge is that if anything changes in the future, the code will need to be rewritten due to its lower flexibility compared to updatecli.

We could try to find ways to autogenerate these configs as they are going to look mostly the same for many services. Updatecli supports different ways to parameterize its executions, maybe we can have a single file with different values per service.

Yes, agreed. Also, I came across this: updatecli/updatecli#914 (comment); similar to what you said. I think we can use templates with control structures (Go templates) like if, etc. to handle simple and custom cases. Should I give it a try and try handling at least 10-15 packages?

@jsoriano
Copy link
Member

jsoriano commented Aug 9, 2024

Should I give it a try and try handling at least 10-15 packages?

Yeah, it would be better to give a try with some packages before fully committing to this solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants