-
Notifications
You must be signed in to change notification settings - Fork 855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Try to validate commit headers. #6630
base: master
Are you sure you want to change the base?
Conversation
0770006
to
127107a
Compare
d17b7ab
to
a3f84da
Compare
well late :D will need rebase. |
a3f84da
to
5651995
Compare
rebased, removed string templates and fixed a bug |
LGTM |
5651995
to
1e4d17b
Compare
removed the test commits and updated the error messages a bit |
1e4d17b
to
975a58a
Compare
disabled warnings, kept the errors. Noticed another recent PR (#7776) with invalid mail while testing.
|
7599da2
to
341da28
Compare
# last job depends on everything so that it is forced to run last even if a long job fails early | ||
# cleanup job depends on everything so that it is forced to run last even if a long job fails early. | ||
# 'paperwork' is left out intentionally, since it doesn't run unit tests (hopefully doesn't need restarts) | ||
# and shouldn't prevent cleanup on validation failure - which might be common during dev time | ||
cleanup: | ||
name: Cleanup Workflow Artifacts | ||
needs: | ||
- base-build | ||
- commit-validation | ||
- paperwork |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is new: the paperwork job will no longer prevent CI pipeline artifact cleanup. This could otherwise create a situation where workspaces are never cleaned up (due to failing paperwork job) until shortly before PR integration - which wouldn't be ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I the idea, we still need to keep in mind, that this might produce false positives, but that should be rare.
I verified, that this works by rewriting it into plain java. I will not veto this, but I think it is an error to use experimental features just because. I see no benefit from using JDK 23 features here. The gatherers and the unfamiliar format make this hard to read for me.
FWIW is the JDK 21 Version (this would be JDK 11 compatible if HttpClient would not be explicitly closed):
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpClient.Redirect;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse.BodyHandlers;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
// checks commit headers for valid author, email and commit msg formatting
// java CommitHeaderChecker.java https://github.com/apache/netbeans/pull/${{ github.event.pull_request.number }}
public class scratch {
public static void main(String[] args) throws IOException, InterruptedException {
if (args.length != 1 || !args[0].startsWith("https://github.com/")) {
throw new IllegalArgumentException("PR URL expected");
}
HttpRequest request = HttpRequest.newBuilder()
.uri(URI.create(args[0] + ".patch"))
.timeout(Duration.ofSeconds(10))
.build();
System.out.println("checking PR patch file...");
List<String> headerbuffer = new ArrayList<>(5);
int index = 0;
boolean allHeadersOk = true;
try (HttpClient client = HttpClient.newBuilder()
.followRedirects(Redirect.NORMAL).build()) {
try(InputStream is = client.send(request, BodyHandlers.ofInputStream()).body();
InputStreamReader isr = new InputStreamReader(is, StandardCharsets.UTF_8);
BufferedReader br = new BufferedReader(isr)
) {
for(String line = br.readLine(); line != null; line = br.readLine()) {
headerbuffer.add(line);
if(headerbuffer.size() > 5) {
headerbuffer.remove(0);
}
if (headerbuffer.size() == 5
&& headerbuffer.get(0).matches("From [a-fA-F0-9]+.*") // Commit hint
&& headerbuffer.get(1).startsWith("From: ")
&& headerbuffer.get(2).startsWith("Date: ")
&& headerbuffer.get(3).startsWith("Subject: ")) {
index++;
boolean headerResult = checkCommit(index, headerbuffer.get(1), headerbuffer.get(2), headerbuffer.get(3), headerbuffer.get(4));
allHeadersOk &= headerResult;
}
}
}
}
System.out.println(index + " commit(s) checked");
System.exit(allHeadersOk ? 0 : 1);
}
static boolean checkCommit(int index, String from, String subject, String date, String blank) {
return checkNameAndEmail(index, from)
& checkSubject(index, subject)
& checkBlankLineAfterSubject(index, blank);
}
static boolean checkNameAndEmail(int i, String from) {
// From: Duke <[email protected]>
int start = from.indexOf('<');
int end = from.indexOf('>');
String mail = end > start ? from.substring(start + 1, end) : "";
String author = start > 6 ? from.substring(6, start).strip() : "";
boolean green = true;
if (mail.isBlank() || !mail.contains("@") || mail.contains("noreply") || mail.contains("localhost")) {
System.out.println("::error::invalid email in commit " + i + " '" + from + "'");
green = false;
}
// single word author -> probably the nickname/account name/root etc
// mime encoded is probably fine, since gh account names can't be encoded
if (author.isBlank() || (!author.contains(" ") && !(author.startsWith("=?") && author.endsWith("?=")))) {
System.out.println("::error::invalid author in commit " + i + " '" + author + "' (full name?)");
green = false;
}
return green;
}
// https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-commit.html#_discussion
static boolean checkSubject(int i, String subject) {
// Subject: [PATCH] msg
subject = subject.substring(subject.indexOf(']') + 1).strip();
// single word subjects are likely not intended or should be squashed before merge
if (!subject.contains(" ")) {
System.out.println("::error::invalid subject in commit " + i + " '" + subject + "'");
return false;
}
return true;
}
// there should be a blank line after the subject line, some subjects can overflow though.
static boolean checkBlankLineAfterSubject(int i, String blank) {
// disabled since this would produce too many warnings due to overflowing subject lines
// if (!blank.isBlank()) {
// println("::warning::blank line after subject recommended in commit " + i + " (is subject over 50 char limit?)");
//// return false;
// }
return true;
}
}
the stream-gatherers API is final in JDK 24 - I would be available to update the script in the event if any changes are needed. CI JDK for this script is locked to JDK 23 right now and should continue working no matter what so it doesn't create a must-update deadline before our CI breaks. I think the new single-file-program preview features do make especially sense for CI scripts since CI is very flexible in locking something to specific releases. Risk is close to 0 too. I do also use those CI scripts to actually test (sometimes improve) NetBeans support for them - which is a positive side effect I believe. (e.g #7605 only happened because of this script)
yes, but I am more worried about the opposite case: that it misses cases and committers might start completely relying on it without going through the pre-merge checklist. |
341da28
to
baad0ae
Compare
updates:
|
- simple script to check if PR commits have proper email addresses - looks for empty or single word subject messages - can warn about missing blank line after subject (disabled) - logs errors/warnings to the github actions summary page, fails on error paperwork job: - the paperwork job will now no longer prevent the github workspace from being cleaned on failure
baad0ae
to
e037b87
Compare
added a few examples for regression testing, planning to merge this soon |
Back when I setup the paperwork job, I didn't know that gh can generate patch files yet. The problem was to figure out what commits are actually new in the PR. I gave it a few attempts but got distracted with more important things.
this adds:
paperwork job:
output in workflow summary:
output in job log: