Current Rakudo (possibly MoarVM as well) development process hinders releasing #206

Altai-man · 2020-06-22T08:18:52Z

Here I will describe a couple of situations which has happened during last months to show a particular flaw of the current development process this issue brings up. I have zero intention to bash our volunteer developers, so take it as an offense towards development process / culture in a whole. I am happy people are willing to spend their time and efforts toward making Raku implementation better, but I am sure it can be done in a safer, less painful and thus more enjoyable process for everyone.

Feb 13, this commit to MoarVM bumps version of the dyncall library shipped with MoarVM. The new version contained a critical bug which lead to build failures when linking against musl, glibc alternative notably used by Alpine distro popular for CI containers. It makes into a release and leads to going off to developers, waiting for a patch and re-taking a release, because there were no checks for something we expect to support to detect it.
Mar 15, this PR to MoarVM is merged. This JIT contained a critical bug observable on Windows. Later, revision with this bug is brought in Rakudo and master branch CI checks start to fail. They fail until a fix is provided at Apr 26, that makes a 41 days straight the master branch was broken which (along with the dispatch situation which was solved by a revert) blocked the release completely, because the bump itself was not checked for green lights and then failures were ignored until it was very hard to say where the issue can be.
Apr 3, this commit introduces a regression in relocatable builds, which goes on unnoticed, released and May 8 a solution was given, resulting in a release re-take because we had no checks for something we expect to support to detect it.
Jun 5, this PR resolves a long standing Windows issue. All checks are failed due to, possibly, our setup not supporting changes which involve PRs in MoarVM/nqp/rakudo being needed simultaneously. After merge, Windows builds of master start to fail and again they fail for 14 days until someone provides a two-line patch after some asking for it for a couple of days before release.
Jun 6, this commit uses declarations not compatible with older gcc. This commit offends our Circle CI check after a bump since around here and nobody is bothered by the check failing. When I tried to go for it, the website redirects me to sign-in page. Seeing this unusual behavior (as usually you don't need to do anything to view CI public logs, maybe it wants one to sign-in and enable them?), assuming we migrated to Azure and the check will be dropped soon anyway, we go into release. Now, master fails for 15 days straight and counting. A single person's, release manager, misunderstanding and common ignoring failures on master has lead us to yet another point release.

As it was stated on irc, there are Expectations from our releases. As was said another time, Raku is not "is it a vaporwave?" anymore and it went a long path from a project where a bunch of folks were committing code to something used in production and it is just "expected" by people we support different platforms (even when there are no checks for them).

If we want to ensure our releases met expectations, current development process which is, as shown above, prone to creating problematic situations, must be addressed.

Altai-man · 2020-06-22T08:25:07Z

Possible solutions

There are not so many solutions I can suggest, but I have one.

In the described problems, there are two sources of evil: 1)no check for some case we "suppose" we support; 2)the check shows red and was ignored.

To address these, we need to fix 1 and fix 2, changing current development culture, including:

Development migrates into PR-first mode instead of committing to master. Master branch is protected from a PR merge if the CI checks are not green. Want to change something -> PR -> checks green ?? Review and merge !! Re-take.
A check failure on master branch is considered as an extreme situation and we don't move forward until it is resolved.
Do not rely on "We assume we support some rarer than usual platforms and try not to break them, but there are no real checks around" anymore. Establish a complete list of platforms and tools we Officialy Met Expectations for and add a clear CI check for every missing point of this list.

I know it may bring in some disapproval, saying that such restrictions are not fun for developers anymore, but I am sure that it is certainly not fun for developers to debug issues introduced 40 days ago and it is not fun for people to have troubles with packaging new releases and doing other wiring and it is not fun for us all to spend more time on fixing the consequences rather than spending less time on keeping our master branch healthy preventing the consequences in the first place.

lizmat · 2020-06-22T09:21:27Z

On 22 Jun 2020, at 10:25, Altai-man ***@***.***> wrote: Possible solutions There are not so many solutions I can suggest, but I have one. Change current development culture, including: • Development migrates into PR-first mode instead of committing to master. Master branch is protected from a PR merge if the CI checks are not green. Want to change something -> PR -> checks green ?? Review and merge !! Re-take. • A check failure on master branch is considered as an extreme situation and we don't move forward until it is resolved. • Do not rely on "We assume we support some rarer than usual platforms and try not to break them, but there are no real checks around" anymore. Establish a complete list of platforms and tools we Officialy Met Expectations for and add a clear CI check for every missing point of this list. I know it may bring in some disapproval, saying that such restrictions is not fun for developers anymore, but I am sure that it is certainly not fun for developers to debug issues introduced 40 days ago and it is not fun for people to have troubles with packaging new releases and doing other wiring and it is not fun for us all to spend more time on fixing the consequences rather than spending less time on keeping our master branch healthy preventing the consequences in the first place.

I could live with that, provided that the CI is actually reliable. So far, I have seen *waay* more false positives from CI than I have seen false negatives. It's the false positives (when CI says there's something wrong, and it's the CI that is wrong) that are impeding development.

AlexDaniel · 2020-06-22T10:16:22Z

See rakudo/rakudo#3700 (comment).

Once finished, it should help with these issues: Mar 15, Jun 5, and maybe Jun 6 because CI status should become more helpful.

AlexDaniel · 2020-06-22T10:23:15Z

Do not rely on "We assume we support some rarer than usual platforms and try not to break them, but there are no real checks around" anymore. Establish a complete list of platforms and tools we Officialy Met Expectations for and add a clear CI check for every missing point of this list.

I agree with this, but technically it also means running Blin on all of these platforms. Did you know we support mipsel? We should definitely strive towards more platforms being tested, but it's probably not possible to achieve the perfection here.

melezhik · 2020-06-22T13:28:43Z

We should definitely strive towards more platforms being tested, but it's probably not possible to achieve the perfection here.

Establish a complete list of platforms and tools we Officialy Met Expectations for and add a clear CI check for every missing point of this list.

It's relatively easy with RakuDist. It has pluggable design in mind where spining up a new docker image and adding it to the system is not a big deal. So far it does tests for:

alpine
debian
centos
ubuntu

We could add more images to the list ( including old CentOS and any exotic that docker support, Sparrow is super flexible as well do deal with such a variety )

The question is what kind of test do we want to ensure another commit to MoarWM/rakudo does not break stuff. If one tells me what kind of test we need here I could start implementing this on RakuDist side.

Current test workflow is that:

download Rakudo from whateverable
run zef install for a certain module

But yes, we can support more testing scenarios, including building Rakudo/MoarWM from source, whatever ...

melezhik · 2020-06-22T13:46:38Z

Further thoughts. I am thinking about pottage.raku.org - a service for Rakudo/MoarVM smoke testing for various Linuxes, so that for every commit we:

build moarvm
build rakudo
run lightweight tests to ensure OS compatibly

Every test should run for every OS/distro in the list and should not take more say 5-10 minutes, so we could catch architecture/platform dependent bugs as soon as possible and report any problems as quick as possible.

I have all the components in place (UI/Job runner - Sparky/Backend and CM tool - docker + Sparrow ) - similar to RakuDist, so there is good base to start with ...

AlexDaniel · 2020-06-22T13:53:36Z

Further thoughts. I am thinking about pottage.raku.org - a service for Rakudo/MoarVM smoke testing for various Linuxes, so that for every commit we:
* build moarvm

* build rakudo

* run lightweight tests  to ensure OS compatibly

Isn't it exactly what we do with current CI setups?

melezhik · 2020-06-22T13:59:47Z

@AlexDaniel I don't know, maybe. The idea to do it for as many OS/distros/Envs as possible. Is that the case now?

rba · 2020-06-22T14:12:09Z

AFAIK the overhaul of the build pipelines using Azure CI by @patrickbkr die cover the build of moarvm and rakudo and some testing.

I would like to extend it to automate the star release in the future too.

niner · 2020-06-22T14:28:36Z

We don't need more CI tests. We don't need more target platforms.

What we need is reliable CI tests. They get ignored because most of the time we look at those results is wasted because it was yet another false positive. We've had Travis reporting to the IRC channel and we'd jump whenever it reported failure. But most of them were because of Travis not being able to clone a repository or other benign issues. So someone wrote an IRC bot to tell us whether a failure was a false positive or not but that vanished, too.

What we also need is more cooperation on existing CI infrastructure. We've had Travis for Linux and OS X and AppVeyor for Windows. Someone was dissatisfied because of some issues and we got CircleCI. So we've had Travis, AppVeyor and CirecleCI all with their issues. I've added https://build.opensuse.org/project/show/home:niner9:rakudo-git because I wanted coverage for issues that would appear only in packaged versions and also coverage of important modules. This regularily reports failures like t/nqp/111-spawnprocasync.t (Wstat: 6 Tests: 4 Failed: 0) and got ignored completely and instead we got this Azure Pipelines thing.

Really, please stop adding additional systems.

Instead, just make reports worth getting looked at then make it so we don't have to check 5 different websites with different user interfaces to get at the results and then start looking at the reports, point at broken commits and create reduced test cases.

Altai-man · 2020-06-22T14:50:26Z

@lizmat

So far, I have seen waay more false positives from CI than I have seen false negatives. It's the false positives (when CI says there's something wrong, and it's the CI that is wrong)

Then we have to eliminate false positives instead of ignoring the checks. I mean, false positives did happen, but somehow other communities do use CI, likely preventing problems with releases and broken master for weeks as we have. There are flappers in roast but nobody pushes the code without running it saying "There are flappers, so I won't" (or so I hope). :)

@AlexDaniel

I agree with this, but technically it also means running Blin on all of these platforms

Still better than suddenly breaking someone's code in the wild because we don't check. Working on a language as we do is full of Technical Difficulties anyway, this is one of them, I guess.

Did you know we support mipsel?

I had no idea and this is precisely the problem. Tomorrow someone's code on mipsel will break next release and we will be "Hmm, well, it is not stated anywhere, but I guess we kind of support that, let's patch and release again". I am not talking about perfection and I am sure suggested scheme won't eliminate point releases. However, I don't see how catching issues earlier can be seen as something wrong.

Even a wiki page stating explicitly "We support this, that, this and that" will help tremendously.

@rba

AFAIK the overhaul of the build pipelines using Azure CI by @patrickbkr die cover the build of moarvm and rakudo and some testing

Yes, this is an awesome piece of work, because we got testing for JVM and eventually relocability too. It's just that older gcc were not on the plate, which is hopefully fix-able. Then we can have one system to rule them all and given rakudo is reliably enough to not torture us with races, we would have a great tool in our toolbox.

@niner

Really, please stop adding additional systems.

We won't (I hope). More so, migration to Azure eliminated usages of Travis and AppVeyor, quite successfully, making the things easier.

Instead, just make reports worth getting looked at then make it so we don't have to check 5 different websites with different user interfaces to get at the results and then start looking at the reports, point at broken commits and create reduced test cases.

Yes. The intention here is to 1)make CI worth being respected; 2)make people see merits of it..

Saying "Current CI is bad, so one wouldn't want to use it" is odd compared to "Current CI is bad, so we should improve it and use it".

melezhik · 2020-06-22T14:59:18Z

I agree with this, but technically it also means running Blin on all of these platforms
do we run Blin for ALL modules in eco system? So it takes hours for one run, does it?

My idea is to have lightweight smoke tests run ( with average run time no more 5-10 minutes ) for all supported platforms per every commit.
Is it a case now for any of the mentioned CI (Azure Devops/Circle/Travis )?

Altai-man · 2020-06-22T15:01:59Z

TL;DR:
1)I don't want to cover everything in the world, add platforms, add tools. I want us to clarify what we support and what not currently and make our current tools to check if the release is worthy using this checklist.
2)I don't want to spend precious time of developers more than required, so our CI should be healthy and it should be the means for avoiding breakage of master (which is not so uncommon right now as was shown in the examples).

lizmat · 2020-06-22T15:02:13Z

Flappers are acceptable red flags for me. Stupid things like connectivity issues breaking builds, are not. :-) And I've looked at way too many of those.

melezhik · 2020-06-22T15:07:52Z

@Altai-man I understand all that, and I agree with all you've said.
But I still need some clarification here (from you or from others), for example you said "a critical bug which lead to build failures when linking against musl, glibc alternative notably used by Alpine distro popular for CI containers"

So, do we have a test where check a source code compatibility with Alpine? So on so forth ( you can think of some other examples, say some CentOS distros that we claim to support ).

melezhik · 2020-06-22T17:01:12Z

I've found a set of OS supported in Azure Pipelienes CI for Rakudo build

https://github.com/rakudo/rakudo/blob/master/azure-pipelines.yml#L41-L97

Don't see an Alpine/CentOS/Debian here

The same for moarvm - https://github.com/MoarVM/MoarVM/blob/master/azure-pipelines.yml

cc @patrickbkr

I am not picking holes 😄 , it probably works fine for the purpose of testing moar backend / rakudo in general. But probably does not cover some OS dependent issues mentioned here ...

patrickbkr · 2020-06-25T11:26:46Z

lizmat · 2020-06-25T11:31:32Z

MacOS 11 with ARM processor as soon as there is one available?

patrickbkr · 2020-06-25T11:32:04Z

Also I do agree with niner and lizmat that our biggest problem with CI currently is reliability. We need to have a stable CI that people are willing to not ignore.

In that regard I'd like to focus on Azure and get rid of the others. Currently Azure isn't fully reliable (see this comment). I hope we'll manage to iron these failures out. - Soonish.

niner · 2020-06-25T11:42:34Z

- [ ] Some big endian system - The platform we actually want to support here is IBM System z. No chance we get our hands on one of those.

That is simply not true. As I've pointed out repeatedly, the Open Build Service supports a long list of platforms out of the box on its 12000 machine strong build farm. This includes openSUSE Factory zSystems. It was literally 3 mouse clicks to activate the build an System z and if you're interested in the results, they are right here: https://build.opensuse.org/package/live_build_log/home:niner9:rakudo-git/ moarvm/openSUSE_Factory_zSystems/s390x Why this gets ignored is completely beyond me. To make it absolutely crystal clear, this is the full list of _currently_ available buid targets of the Open Build Service: openSUSE Tumbleweed openSUSE Leap 15.2 openSUSE Leap 15.1 openSUSE Leap 15.1 ARM openSUSE Leap 15.1 PowerPC openSUSE Factory ARM openSUSE Factory PowerPC openSUSE Factory zSystems openSUSE Backports for SLE 15 SP1 openSUSE Backports for SLE 15 openSUSE Backports for SLE 12 SP5 openSUSE Backports for SLE 12 SP4 openSUSE Backports for SLE 12 SP3 openSUSE Backports for SLE 12 SP2 openSUSE Backports for SLE 12 SP1 openSUSE Backports for SLE 12 SP0 SUSE SLE-15-SP1 SUSE SLE-15 SUSE SLE-12-SP5 SUSE SLE-12-SP4 SUSE SLE-12-SP3 SUSE SLE-12-SP2 SUSE SLE-12-SP1 SUSE SLE-12 SUSE SLE-11 SP 4 SUSE SLE-10 Arch Extra Arch Community Raspbian 10 Raspbian 9.0 Debian Unstable Debian Testing Debian 10 Debian 9.0 Debian 8.0 Debian 7.0 Fedora Rawhide (unstable) Fedora 32 Fedora 31 Fedora 30 Fedora 29 ScientificLinux 7 ScientificLinux 6 RedHat RHEL-7 RedHat RHEL-6 RedHat RHEL-5 CentOS CentOS-8-Stream CentOS CentOS-8 CentOS CentOS-7 CentOS CentOS-6 Ubuntu 20.04 Ubuntu 19.10 Ubuntu 19.04 Ubuntu 18.04 Ubuntu 16.04 Ubuntu 14.04 Univention UCS 4.4 Univention UCS 4.3 Univention UCS 4.2 Univention UCS 4.1 Univention UCS 4.0 Univention UCS 3.2 Mageia Cauldron (unstable) Mageia 7 Mageia 6 IBM PowerKVM 3.1 AppImage KIWI image build (to be used for appliance and product builds with kiwi)

patrickbkr · 2020-06-25T11:47:09Z

@niner From my understanding OBS is a build service and not a CI service. Did I misunderstand? Is it viable to try to use OBS as a CI?

niner · 2020-06-25T11:50:20Z

On Donnerstag, 25. Juni 2020 13:47:24 CEST Patrick Böker wrote: @niner From my understanding OBS is a build service and not a CI service. Did I misunderstand?

Yes

Is it viable to try to use OBS as a CI?

Yes. I've explicitly cleared this with the OBS folks at FOSDEM and have been using it as CI service since January.

patrickbkr · 2020-06-25T12:55:17Z

Judging by the above list the OBS could be used as a CI and build platform for about everything except MacOS and Windows.

@niner It seems OBS doesn't really market itself as a CI. The user documentation has near to nothing on the topic of using it as such. I suspect one has to bend the system into being a CI a bit. Am I right? Things I didn't find any information about:

Building PRs
Reporting results back to GitHub
Viewing testresults ordered by commit in OBS itself

There is a 2013 talk by Ralf Dannert mentioning a Jenkins integration, but information on that is just as sparse.

Edit: I am interested in looking into this more. I'd really appreciate some more information on the topic though.

niner · 2020-06-25T13:46:17Z

On Donnerstag, 25. Juni 2020 14:55:31 CEST Patrick Böker wrote: @niner It seems OBS doesn't really market itself as a CI. The user documentation has near to nothing on the topic of using it as such. I suspect one has to bend the system into being a CI a bit. Am I right?

I'm using a cron job and a modified version of my packaging scripts to push every commit to MoarVM, nqp and rakudo to the OBS for testing. The OBS will then rebuild those 3 and the 21 modules (most importantly Inline::Perl5 and Cro) I'm most interested in. There's also something where you can point it at a git repo, but since I already had working scripts, I didn't dig into this: https://openbuildservice.org/help/manuals/obs-user-guide/cha.obs.best-practices.scm_integration.html https://openbuildservice.org/help/manuals/obs-user-guide/ cha.obs.source_service.html Advantages I see there are the flexibility of being able to build pretty much whatever I want, including PRs and branches and even patched versions, that the OBS takes care of dependencies, i.e. building stuff in order and that the build itself doesn't have to do any git operations and doesn't access the network at all. That means that builds simply cannot fail due to some git host not answering.

- Building PRs - Reporting results back to GitHub

The OBS has both a pretty decent command line client and a REST API: ``` nine@ns1:~/home:niner9:rakudo-git/moarvm> osc results --vertical openSUSE_Tumbleweed i586 succeeded openSUSE_Tumbleweed x86_64 succeeded openSUSE_Leap_15.2 x86_64 succeeded openSUSE_Leap_15.1 x86_64 succeeded openSUSE_Factory_zSystems s390x failed openSUSE_Factory_PowerPC ppc64 succeeded openSUSE_Factory_PowerPC ppc64le succeeded openSUSE_Factory_ARM armv7l succeeded openSUSE_Factory_ARM aarch64 succeeded nine@ns1:~/home:niner9:rakudo-git/moarvm> osc results --xml <resultlist state="b7680636458e1e15dfa277cb5c133ee5"> <result project="home:niner9:rakudo-git" repository="openSUSE_Tumbleweed" arch="i586" code="published" state="published"> <status package="moarvm" code="succeeded"/> </result> <result project="home:niner9:rakudo-git" repository="openSUSE_Tumbleweed" arch="x86_64" code="published" state="published"> <status package="moarvm" code="succeeded"/> </result> <result project="home:niner9:rakudo-git" repository="openSUSE_Leap_15.2" arch="x86_64" code="published" state="published"> <status package="moarvm" code="succeeded"/> </result> <result project="home:niner9:rakudo-git" repository="openSUSE_Leap_15.1" arch="x86_64" code="published" state="published"> <status package="moarvm" code="succeeded"/> </result> <result project="home:niner9:rakudo-git" repository="openSUSE_Factory_zSystems" arch="s390x" code="published" state="published"> <status package="moarvm" code="failed"/> </result> <result project="home:niner9:rakudo-git" repository="openSUSE_Factory_PowerPC" arch="ppc64" code="published" state="published"> <status package="moarvm" code="succeeded"/> </result> <result project="home:niner9:rakudo-git" repository="openSUSE_Factory_PowerPC" arch="ppc64le" code="published" state="published"> <status package="moarvm" code="succeeded"/> </result> <result project="home:niner9:rakudo-git" repository="openSUSE_Factory_ARM" arch="armv7l" code="published" state="published"> <status package="moarvm" code="succeeded"/> </result> <result project="home:niner9:rakudo-git" repository="openSUSE_Factory_ARM" arch="aarch64" code="published" state="published"> <status package="moarvm" code="succeeded"/> </result> </resultlist> nine@sphinx:~> lwp-request https://api.opensuse.org/build/home:niner9:rakudo-git/openSUSE_Tumbleweed/x86_64/rakudo/_status Enter username for Use your SUSE developer account at api.opensuse.org:443: niner9 Password: <status package="rakudo" code="succeeded"> <details></details> </status> ```

AlexDaniel · 2020-06-25T16:34:31Z

@melezhik it's not out of scope, that's just how OBS works. If we create a new rakudo package for every commit, it can trigger a rebuild of all module packages (on all architectures). That's essentially what Blin does, except that OBS can do it for all supported architectures without requiring us to create our own infrastructure for it. It actually sounds a bit too good to be true, but according to @niner we are allowed to do something like this, so let's try it.

nxadm · 2020-06-25T16:45:25Z

When I was looking on how to create rakudo packages, OBS was the first thing I looked at. Huge platform support, backed by a FOSS company, etc. However, I found it very complicated.

This does not mean I think OBS is a bad idea. In fact, I prefer it to Microsoft Azure. I am just stating the importance of documentation and transfer of knowledge, because I suspect that @niner++ is the only expert of the platform.

melezhik · 2020-06-25T16:46:19Z

@AlexDaniel I see what you're saying and with all respect what @niner has been doing with OSB, just my thoughts:

If we create a new rakudo package for every commit, it can trigger a rebuild of all module packages (on all architectures).

you don't need this to test Moar/Rakudo, it's only makes a sense if one is going to support Raku modules for certain platform

OBS can do it for all supported architectures without requiring us to create our own infrastructure for it.

Let's be real. There is no such a tool that automatically generates all platform specific packages from META specs. Even though there is AFIK progress in that way with rhel/centos presented by @niner , we should understand that it's way too harder the we could expect now, it's even hard to to it for a certain platform, there are too many bumps on road we might be not aware of now. Again do we still need it? If we are going to maintain native packages for different Linux, then it makes a sense. However I personally don't want to build a native CentOS package for Rakudo package just to test it ... But there is somewhat in the middle approach I am currently working on discussed here one might be interested ...

AlexDaniel · 2020-06-25T16:48:27Z

There is no such a tool that automatically generates all platform specific packages from META specs

I'll submit PRs to all modules that need native dependencies. No problem.

niner · 2020-06-25T17:07:33Z

On Donnerstag, 25. Juni 2020 18:46:34 CEST Alexey Melezhik wrote: Let's be real. There is no such a tool that *automatically* generates all platform specific packages from META specs.

There's no such tool yet. But we're closer than ever.

Even though there is AFIK progress in that way with rhel/centos presented by @niner , we should understand that it's way too harder the we could expect now, it's even hard to to it for a certain platform, there are too many bumps on road we might be not aware of now.

I have thought about and worked on this for at least 5 years now. I think by now I know the way.

Again do we still need it? If we are going to maintain native packages for different Linux, then it makes a sense. However I personally don't want to build a native CentOS package for Rakudo package just to test it ...

That's the point. Getting native dependency information into META6.json files gives us the base for all of these: * fully automated checking of dependencies by zef * automated installation of native dependencies by zef * fully automated packaging of all Raku modules for Linux distributions * fully automated CI for all Raku modules Really, the only part that we haven't actually specified and that we need for module CI is actually how to run tests. But there is a sort of quasi standard with t directories and scripts outputting TAP, which is what zef runs, so that's pretty much covered.

AlexDaniel · 2020-06-25T17:16:28Z

@niner++ I love your work.

melezhik · 2020-06-25T17:59:34Z

I guess almost if not all issues the discussion has started with have nothing to do with Raku modules native packages. How having those packages would help us?

AlexDaniel · 2020-06-25T19:19:09Z

@melezhik yes, you're actually right, but you have to consider the big picture. OBS can allow us massive testing of everything on all architectures. It does potentially fix some of the specific points raised in this ticket (alpine issue, stability of CI, old gcc stuff), but for others (relocatability, windows) we will have to do something extra.

niner · 2020-06-26T10:03:50Z

Aaaand we now successfully build on s390x, i.e. IBM System z :) https://build.opensuse.org/project/show/home:niner9:rakudo-git

patrickbkr · 2020-06-28T17:04:27Z

I'm having a hard time diving into OBS with respect to setting up a MoarVM, NQP and Rakudo CI.

@niner: As you have the deepest understanding of OBS: Can you create a write up of how a CI integration with OBS could look like for us? Some guide that gives the large picture of how such a CI should work and roughly what chunks of work need to be tackled. - An actionable plan. I imagine this would help us "OBS outsiders" a lot.

melezhik · 2020-07-11T02:12:08Z

just an idea. abit another dimension in this discussion. we can spin up amazon instances ( on my free tier account this is even free ) on demand using terraform, then run tests and tear instances down. it's cheap and efficient approach. just a thought.

Sparrowdo/Sparky has recently started to support such a dynamic configurations ....

niner · 2020-07-11T21:00:15Z

The Open Build Service was created as the name suggests to build software and that's first and foremost what it still does. As testing can be considered a part of the build process that's already the most important bit covered. The thing that's missing is telling the OBS that there's something to build and what exactly. The OBS supports several target systems and distributions and uses the native tools to build packages. For distributions like openSUSE or CentOS, the native tool is rpmbuild. On Debian based systems it's dpkg or dpkg-buildpackage to be precise. These tools need the software's sources and descriptions of how to build them. For rpmbuild, the sources usually come in the form of tar balls but actually can be any collection of files. The build description is given as SPEC file [1]. This contains meta data like name of the package, version, dependency information and license and the list of source files (usually tar balls and patches). In addition there are sections for the actual build steps, a list of files the result package contains and a changelog. So to build a Raku module on openSUSE on the OBS one needs to create a spec file first with the described information. Luckily, our META6.json files already contain everything one would need for that. A spec file can easily created for that and there's already a meta2rpm [2] tool that does this in a fully automated way. I used this tool to create the spec files for the modules that get built for every rakudo commit in my rakudo-git OBS project [3]. There's also a rather rudimentary script for uploading the resulting tar ball and spec files to the OBS. For a fully automated meta2rpm and equally automated build we still need information about dependencies on native libraries and programs. Me, ugexe and tony-o have been working on this for a couple of years and we're as good as there. The META6.json format supports native dependencies and meta2rpm can translate them to spec file "Requires" lines. The caveat is that currently the native library must already be installed on the machine that meta2rpm runs on. It needs to be extended to query the distro's online repository, which is really just a matter of programming. Note that handling native dependencies is yet an open issue for all possible build systems. There is still a need for tools similar to meta2rpm (or a generalization of this tool) to get build descriptions for other target distributions like Debian or Arch. That's a relatively simple matter of someone finding out (or already knowing) what those descriptions look like and implementing a writer. The final piece of the puzzle is triggering a new build. For the rakudo-git repository I have a simple systemd timer (what used to be cron jobs) that runs an ugly shell script [4] derived from the one I use for packaging. The script checks if there are new commits in the repository and if so, create a tar ball that follows the name-version.tar.xz convention, updates the version in the .spec file and even turns the git log into a changelog (which isn't necessary per se but I already had the code and it's nice to have). A simple `osc commit -m "Update to version $version"` pushes the result to the OBS which starts building everything. The OBS has a thing called source services [5] which may replace the timer and shell script, but I haven't tried it yet. [1] https://rpm-packaging-guide.github.io/ [2] https://github.com/niner/meta2rpm/ [3] https://build.opensuse.org/project/show/home:niner9:rakudo-git [4] https://gist.github.com/niner/98f26b41f9f7a5f79cb8b9b8c4b6048d [5] https://openbuildservice.org/help/manuals/obs-user-guide/ cha.obs.source_service.html

patrickbkr · 2020-07-13T16:24:17Z

@niner Thanks for the overview! Your writeup even covers testing Raku modules!
You didn't mention GitHub integration. From a quick search I came up with a Github guide to build your own CI server integration but didn't find any information of a preexisting hook for OBS. My guess is there isn't any and we need to build our own. Do you know otherwise?

You currently use the pull paradigm (OBS regularly looks for new commits). The GitHub guide suggests using the push paradigm (GitHub pushes change notifications to a CI backend). But I suspect the GitHub API is flexible enough to get it to work in a pull fashion as well. Would you recommend one over the other?

niner · 2020-07-13T16:31:46Z

Do you know otherwise?

Sorry, I don't. No idea how those status thingies get onto GitHub pages.

You currently use the pull paradigm (OBS regularly looks for new commits). The GitHub guide suggests using the push paradigm (GitHub pushes change notifications to a CI backend). But I suspect the GitHub API is flexible enough to get it to work in a pull fashion as well. Would you recommend one over the other?

Maybe one can use GitHub push in combination with the source services. Long term, push is certainly preferrable as it reduces latency and won't cause issues with request limits or something like that. Whipping up a little service that takes GitHub notifications and triggers an OBS update sounds like a nice little exercise as we already have code for handling GitHub notifications in https://github.com/Raku/geth

JJ · 2020-07-14T06:53:34Z

El lun., 13 jul. 2020 a las 18:32, niner (<[email protected]>) escribió:

> Do you know otherwise? Sorry, I don't. No idea how those status thingies get onto GitHub pages. Status thingies can be created as badges; they are also available from the

API. Don't know if they could be integrated.

enough to get it to work in a pull fashion as well. Would you recommend one > over the other?

If you self-host it, you can do whatever you want. The current GitHub actions framework allows periodic actions, or push actions. You can make periodic actions pull stuff, of course.

patrickbkr · 2020-07-20T18:46:48Z

I want to work on creating a CI setup based on AzureCI to cover MacOS and Windows, and OBS for Linux. I have not yet created a detailed concept. From what I understood it's impossible to directly couple OBS and GitHub. Some intermediary that watches for changes in GitHub or is notified by GitHub on changes and pushes those changes to OBS will be necessary. I'm considering using that same intermedary to trigger Azure builds. But I'll decide that later on once I have some progress attaching OBS.

But this ticket is not about fixing up our CI, but about stability with our development process in general. @Altai-man has provided a possible solution in the very first comment of this ticket. I'd like to reignite the discussion with respect to his proposal.

What is the stance of the core committers towards the idea of a only pull requests development model? I've read some positive voices. Are there objections?
I think "Failure on master is an extreme condition that needs to be resolved prio 1." has a very large cultural aspect. Changing that culture won't be done by simply stating so in some ticket. What can we do to support the change?
- One point that has been brought up is stability of our CI. Especially false positives are discouraging respecting the CI.

patrickbkr · 2020-07-22T07:30:14Z

A PR-only-workflow could be made easier by a helper bot.

We introduce a tag merge-on-green or ready-for-merge or some such. The bot will find PRs with that tag, wait for CI to finish and merge if the tests are all green or remove the tag and ping the creator if they turn out red. This way one doesn't have to touch each PR twice, just create the PR and add the tag in one step.

patrickbkr · 2020-07-22T07:35:34Z

The bot could also be integrated into the CI and then be able to do things like retriggering the CI on command.

Prior art: Redhat seems to have a bot supported PR workflow: https://github.com/kubernetes/community/blob/master/contributors/guide/owners.md#the-code-review-process

MasterDuke17 · 2020-07-22T07:44:15Z

I believe Rust also does a lot with bots interacting with PRs.

…

Sent from my iPhone

On Jul 22, 2020, at 8:35 AM, Patrick Böker ***@***.***> wrote: The bot could also be integrated into the CI and then be able to do things like retriggering the CI on command. Prior art: Redhat seems to have a bot supported PR workflow: https://github.com/kubernetes/community/blob/master/contributors/guide/owners.md#the-code-review-process — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

AlexDaniel · 2020-07-22T12:15:22Z

If it matters, I like the idea of PR-based development. That's one of the reasons I added support for branches to whateverable. You can use committable to run code on any branch to show the upcoming behavior, you can use bisectable if needed, etc. Blin also supports branches so you can pretest your commits on the ecosystem to see if anything breaks.

Altai-man · 2020-08-05T08:36:21Z

Establish a complete list of platforms and tools we Officialy Met Expectations for and add a clear CI check for every missing point of this list.

So after a month and a half of discussions we still don't have one and people are asking about it.

patrickbkr · 2020-08-05T10:45:18Z

@Altai-man: I think the problem with writing such a list now (instead of later), is that we in principle want to support as many platforms as possible and the limitting factor is having a good CI and build toolchain for the platforms. So it's currently mostly not a question of what we want to support or what is sensible to support, but what we technically can manage to have a CI for.

So the best we can do at the moment is listing which platforms we currently support well. This is what we have:

Configurations we test via our CI (only looking at Azure, ignoring Travis and CircleCI):

moar, non-reloc|reloc, x86-64, Windows 10,   MSVC,                dyncall
moar, non-reloc|reloc, x86-64, MacOS 10.15,  clang 10.0,          dyncall
moar, non-reloc|reloc, x86-64, Ubuntu 18.04, gcc 7.3.0|clang 6.0, dyncall|libffi, glibc-2.27

Platforms without CI, but for which releases are built:

moar, reloc,           x86-64, CentOS 6,     gcc 4.4.7,           dyncall,        glibc-2.12

I think with the above setup we can claim to support:

MoarVM
on x86-64
Windows 10
- MSVC
MacOS
- clang
Linux
- glibc >= 2.12
- gcc >= 4 and clang

We can improve our coverage for the above systems (e.g. an older MacOS version, CentOS 6), but above is what we have now.

I am working on improving our CI infrastructure. I hope to be able to integrate OBS as a CI system. If I am successful we have the possibility of adding more platforms to the list.

Do we somehow have to give the above list our blessing?
Where could the list live? Somewhere in the Rakudo doc/ folder?

melezhik · 2020-08-05T14:39:38Z

If it helps, Rakudist now runs community modules tests for a variety of Rakudo versions (whateverable) for the following platforms:

Ubuntu
Debian
Centos
Alpine

Again, it's just a matter of spinning up a new docker image into the RakuDist, to get a new OS tested ...

"Current Rakudo (possibly MoarVM as well) development process hinders releasing" Fixes Raku#206

melezhik · 2021-06-22T23:12:55Z

I have recently started a tool aiming, at least partly mitigate the issues mentioned here. If someone is interested - https://github.com/melezhik/r3tool

patrickbkr · 2021-06-23T07:19:41Z

@melezhik Can you elaborate a bit what r3tool does, how it is to be used and how it relates to the already in use ecosystem tester blin?

melezhik · 2021-06-23T20:38:29Z

Hi @patrickbkr ! R3 does not interferes with blin, rather the it is an additional tool.

The main idea is to register fresh rakudo issues into database (test tags) and codify test cases for them as soon as possible, even before an issue is fixed. The approach could be applied to PRs as well, but right now just say for bugs.
Bugs test scenarios are created in a simplest possible way - by just shell/bash code ( using Raku directly is also possible for more complex scenarios). 80-90 % of bugs get golfed as raku -e 'some broken code that exits with none zero', so it's easy to achieve. This would allow for people not familiar with roast, Raku rest eco system to codify their bugs.
Thus bug tests always get mapped 1 to 1 to related Rakudo issues - https://github.com/rakudo/rakudo/issues and bug/ bug test is a first class citizen in this approach
Now a release manager or developer could run test cases using various filters, skipping or exclusively inducing certain groups of bugs. Examples:

Release 2021_06, open issues:

tomty --only=rc_2021_06+open --color --show-failed

Release 2021_06, closed issues but skip slow tests:

tomty --only=rc_2021_06+closed --skip=slow --color --show-failed

Release 2021_05 or 2021_05, issues:

tomty --only=rc_2021_05,rc_2021_06 --color --show-failed

Test for rakudo release 2021_06 against all bugs, but skipping open,example,slow and requiring unicode support in terminal:

tomty --env=2021_06 --skip=slow,example,open,unicode --color --show-failed

This is in essence, other useful features - easy to test against whateverable rakudos, I am not going to go in all details here, checkout r3, please.

Advantages for rakudo development/release process:

issues/bugs oriented approach, we can always say - what/how many issues are closed in this release, and can always run tests for certain sub sets of cases/issues. Grouping tests, so on.
issues codifications as tests, mitigating testneeded tagged issues or cases if people forget / are lazy to add tests to roaster at all. codifying issues via tests should be neither difficult nor time consuming process. with r3 it takes me seconds to codify new issues arriving to rakudo/issues. from code implementation to commit. since that everyone should be able to run this test against their local environment. maximal efficiency
ready TDD for developers - people fixing issues might use r3 for quick tests against their local branches
more confidence before release - regression tests are covered by CI, but we want to run slow tests, run specific tests ( for some reasons bit supported by CI ), run performance tests, run critical issues tests, so on, so forth
automatic statistic for a release - typically would have answers for such questions as - how many issues are closed in this release? how many issues are closed between this and that release? What kind of bug are closed? performance issues? Critical modifications potentially affecting a lot of uses ? and so on, so forth ...

Altai-man added the fallback If no other label fits label Jun 22, 2020

Altai-man assigned jnthn Jun 22, 2020

Altai-man mentioned this issue Jun 22, 2020

Clarify state of circle-ci checks rakudo/rakudo#3764

Closed

lizmat unassigned jnthn Jun 22, 2020

AlexDaniel mentioned this issue Jun 24, 2020

Rakudo 2017.10 fails to build on Debian big endian systems rakudo/rakudo#1257

Open

2 tasks

patrickbkr mentioned this issue Jun 24, 2020

Are CI builds with nqp-master or moar-master needed? rakudo/rakudo#3700

Open

patrickbkr mentioned this issue Jul 22, 2020

Fix release pipeline to not delete the linux build rakudo/rakudo#3814

Merged

patrickbkr added a commit to patrickbkr/problem-solving that referenced this issue Aug 14, 2020

Provide a solution document for Raku#206

885b53e

"Current Rakudo (possibly MoarVM as well) development process hinders releasing" Fixes Raku#206

patrickbkr added a commit to patrickbkr/problem-solving that referenced this issue Aug 14, 2020

Provide a solution document for Raku#206

fedec01

"Current Rakudo (possibly MoarVM as well) development process hinders releasing" Fixes Raku#206

patrickbkr linked a pull request Aug 14, 2020 that will close this issue

Provide a solution document for #206 #219

Draft

Current Rakudo (possibly MoarVM as well) development process hinders releasing #206

Current Rakudo (possibly MoarVM as well) development process hinders releasing #206

Comments

Altai-man commented Jun 22, 2020

Altai-man commented Jun 22, 2020 • edited Loading

Possible solutions

lizmat commented Jun 22, 2020 via email

AlexDaniel commented Jun 22, 2020

AlexDaniel commented Jun 22, 2020

melezhik commented Jun 22, 2020 • edited Loading

melezhik commented Jun 22, 2020 • edited Loading

AlexDaniel commented Jun 22, 2020

melezhik commented Jun 22, 2020

rba commented Jun 22, 2020 • edited Loading

niner commented Jun 22, 2020 • edited Loading

Altai-man commented Jun 22, 2020 • edited Loading

melezhik commented Jun 22, 2020

Altai-man commented Jun 22, 2020

lizmat commented Jun 22, 2020

melezhik commented Jun 22, 2020 • edited Loading

melezhik commented Jun 22, 2020

patrickbkr commented Jun 25, 2020

lizmat commented Jun 25, 2020

patrickbkr commented Jun 25, 2020

niner commented Jun 25, 2020 via email

patrickbkr commented Jun 25, 2020

niner commented Jun 25, 2020 via email

patrickbkr commented Jun 25, 2020 • edited Loading

niner commented Jun 25, 2020 via email

AlexDaniel commented Jun 25, 2020

nxadm commented Jun 25, 2020

melezhik commented Jun 25, 2020

AlexDaniel commented Jun 25, 2020

niner commented Jun 25, 2020 via email

AlexDaniel commented Jun 25, 2020

melezhik commented Jun 25, 2020 • edited Loading

AlexDaniel commented Jun 25, 2020

niner commented Jun 26, 2020

patrickbkr commented Jun 28, 2020

melezhik commented Jul 11, 2020 • edited Loading

niner commented Jul 11, 2020 via email

patrickbkr commented Jul 13, 2020

niner commented Jul 13, 2020 via email

JJ commented Jul 14, 2020 via email

patrickbkr commented Jul 20, 2020

patrickbkr commented Jul 22, 2020

patrickbkr commented Jul 22, 2020

MasterDuke17 commented Jul 22, 2020 via email

AlexDaniel commented Jul 22, 2020

Altai-man commented Aug 5, 2020

patrickbkr commented Aug 5, 2020

melezhik commented Aug 5, 2020 • edited Loading

melezhik commented Jun 22, 2021

patrickbkr commented Jun 23, 2021

melezhik commented Jun 23, 2021

Altai-man commented Jun 22, 2020 •

edited

Loading

melezhik commented Jun 22, 2020 •

edited

Loading

melezhik commented Jun 22, 2020 •

edited

Loading

rba commented Jun 22, 2020 •

edited

Loading

niner commented Jun 22, 2020 •

edited

Loading

Altai-man commented Jun 22, 2020 •

edited

Loading

melezhik commented Jun 22, 2020 •

edited

Loading

patrickbkr commented Jun 25, 2020 •

edited

Loading

melezhik commented Jun 25, 2020 •

edited

Loading

melezhik commented Jul 11, 2020 •

edited

Loading

melezhik commented Aug 5, 2020 •

edited

Loading