MINOR Move scripts into committer-tools (#17162)
Moving reviewers.py and kafka-merge-pr.py into committer-tools. Also include a new find-unfinished-test.py 
script which can be used for finding hanging tests on Jenkins or Github Actions.

Reviewers: Chia-Ping Tsai <[email protected]>
mumrah authored Sep 11, 2024
1 parent c62c389 commit a1f2857
Showing 5 changed files with 134 additions and 21 deletions.
89 changes: 68 additions & 21 deletions committer-tools/README.md
@@ -1,20 +1,13 @@
# Refresh Collaborators Script
# Committer Tools

The Refresh Collaborators script automates the process of fetching contributor
data from GitHub repositories, filtering top contributors who are not part of
the existing committers, and updating a local configuration file (.asf.yaml) to
include these new contributors.

## Table of Contents

- [Requirements](#requirements)
- [Installation](#installation)
- [Usage](#usage)
This directory contains scripts to help Apache Kafka committers with a few chores.
Some of the scripts require a GitHub API token with write permissions. Only
committers will be able to utilize such scripts.

## Requirements

- Python 3.x and pip
- A valid GitHub token with repository read access
- The GitHub CLI

## Installation

@@ -23,14 +16,14 @@ include these new contributors.
Check if Python and pip are installed in your system.

```bash
python3 --version
pip3 --version
python --version
pip --version
```

### 2. Set up a virtual environment (optional)

```bash
python3 -m venv venv
python -m venv venv

# For Linux/macOS
source venv/bin/activate
@@ -39,15 +32,40 @@ source venv/bin/activate
# .\venv\Scripts\activate
```

3. Install the required dependencies
### 3. Install the required dependencies

```bash
pip3 install -r requirements.txt
pip install -r requirements.txt
```

## Usage
### 4. Install the GitHub CLI

See: https://cli.github.com/

```bash
brew install gh
```

### 1. Set up the environment variable for GitHub Token
## Find Reviewers

The reviewers.py script is used to simplify the process of producing our "Reviewers:"
Git trailer. It parses the Git log to gather a set of "Authors" and "Reviewers".
Some simple string prefix matching is done to find candidates.

Usage:

```bash
python reviewers.py
```
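
The trailer-gathering idea described above can be illustrated with a small sketch. This is not the code in reviewers.py; the sample log text, the function names `collect_people` and `find_candidates`, and the ranking by frequency are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sample of `git log` output; a real run would read the actual log.
SAMPLE_LOG = """\
Author: Jane Doe <jane@example.com>

    KAFKA-1: Fix a thing

    Reviewers: John Smith <john@example.com>, Ann Lee <ann@example.com>

Author: John Smith <john@example.com>

    MINOR: Tidy docs

    Reviewers: Jane Doe <jane@example.com>
"""


def collect_people(log_text: str) -> Counter:
    """Count each person seen on an Author: line or in a Reviewers: trailer."""
    people = Counter()
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Author:"):
            people[line[len("Author:"):].strip()] += 1
        elif line.startswith("Reviewers:"):
            for person in line[len("Reviewers:"):].split(","):
                if person.strip():
                    people[person.strip()] += 1
    return people


def find_candidates(people: Counter, prefix: str) -> list:
    """Return people whose entry starts with the prefix, most frequent first."""
    needle = prefix.lower()
    matches = [(p, n) for p, n in people.items() if p.lower().startswith(needle)]
    # Sort by descending count, then alphabetically for a stable order.
    return [p for p, _ in sorted(matches, key=lambda item: (-item[1], item[0]))]
```

Calling `find_candidates(collect_people(SAMPLE_LOG), "j")` would list both people whose names start with "J", which could then be pasted into a "Reviewers:" trailer.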

## Refresh Collaborators

The Refresh Collaborators script automates the process of fetching contributor
data from GitHub repositories, filtering top contributors who are not part of
the existing committers, and updating a local configuration file (.asf.yaml) to
include these new contributors.

> This script requires the Python dependencies and a GitHub auth token.

You need to set up a valid GitHub token to access the repository. After you
generate it (or authenticate via GitHub CLI), this can be done by setting the
@@ -63,8 +81,37 @@ export GITHUB_TOKEN="$(gh auth token)"
# .\venv\Scripts\activate
```

### 2. Run the script
Usage:

```bash
python refresh_collaborators.py
```
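
The filtering step of Refresh Collaborators (top contributors who are not already committers) might look roughly like the sketch below. It is a simplified illustration, not the code in refresh_collaborators.py: the function name `pick_new_collaborators`, the plain list of author logins as input, and the default limit of 10 are assumptions, and the sketch neither calls the GitHub API nor writes .asf.yaml.

```python
from collections import Counter


def pick_new_collaborators(commit_authors, committers, limit=10):
    """Return up to `limit` most active authors who are not committers.

    commit_authors: iterable of author logins, one entry per commit (assumed input shape).
    committers: set of logins that already have commit access.
    """
    counts = Counter(a for a in commit_authors if a not in committers)
    return [login for login, _ in counts.most_common(limit)]


# Hypothetical data for illustration only.
authors = ["alice", "bob", "alice", "carol", "bob", "alice", "dave", "bob", "carol"]
new_collaborators = pick_new_collaborators(authors, committers={"alice"}, limit=2)
```

Here `alice` is skipped because she is already a committer, and the two most active remaining authors are selected.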

## Approve GitHub Action Workflows

This script allows a committer to approve GitHub Action workflow runs from
non-committers. It fetches the latest 20 workflow runs that are in the
`action_required` state and prompts the user to approve the run.

> This script requires the `gh` tool.

Usage:

```bash
python approve-workflows.py
```

## Find Hanging Tests

This script is used to infer hanging tests from the Gradle output. It looks for
tests that were STARTED but do not have a corresponding FINISHED or FAILED.

Usage:

```bash
python3 refresh_collaborators.py
python find-unfinished-test.py ~/Downloads/logs_28218821016/5_build\ _\ JUnit\ tests\ Java\ 11.txt

Found tests that were started, but not finished:

2024-09-10T20:31:26.6830206Z Gradle Test Run :streams:test > Gradle Test Executor 47 > StreamThreadTest > shouldReturnErrorIfProducerInstanceIdNotInitialized(boolean, boolean) > "shouldReturnErrorIfProducerInstanceIdNotInitialized(boolean, boolean).stateUpdaterEnabled=true, processingThreadsEnabled=true" STARTED
```
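
The pairing of STARTED lines with a later status line can be sketched as a small function over a list of log lines. This is a simplified illustration, not the script itself: the name `find_unfinished` and the sample lines are made up, any status other than STARTED is treated as completion, and a status line with no earlier STARTED is tolerated rather than treated as an error.

```python
def find_unfinished(lines):
    """Map each test that logged STARTED but no later status to its STARTED line."""
    started = {}
    for line in lines:
        if "Gradle Test Run" not in line:
            continue
        toks = line.strip().split(" > ")
        # The last token is "<test name> <STATUS>"; split the status off the end.
        name, status = toks[-1].rsplit(" ", 1)
        # Drop the timestamp/task token and the executor token, keep class and test name.
        test = " > ".join(toks[2:-1] + [name])
        if status == "STARTED":
            started[test] = line
        else:
            # Assumption: ignore a status line that has no matching STARTED
            # (a plain pop(test) would raise KeyError in that case).
            started.pop(test, None)
    return started


# Made-up log lines in the same shape as the Gradle output shown above.
SAMPLE = [
    "2024-01-01T00:00:00Z Gradle Test Run :core:test > Gradle Test Executor 1 > FooTest > testA() STARTED",
    "2024-01-01T00:00:01Z Gradle Test Run :core:test > Gradle Test Executor 1 > FooTest > testA() PASSED",
    "2024-01-01T00:00:02Z Gradle Test Run :core:test > Gradle Test Executor 1 > FooTest > testB() STARTED",
]
unfinished = find_unfinished(SAMPLE)
```

With this sample, only `FooTest > testB()` remains in `unfinished`, since it has a STARTED line and nothing after it.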
File renamed without changes.
66 changes: 66 additions & 0 deletions committer-tools/find-unfinished-test.py
@@ -0,0 +1,66 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse
from datetime import datetime


def pretty_time_duration(seconds: float) -> str:
    time_min, time_sec = divmod(int(seconds), 60)
    time_hour, time_min = divmod(time_min, 60)
    time_fmt = ""
    if time_hour > 0:
        time_fmt += f"{time_hour}h"
    if time_min > 0:
        time_fmt += f"{time_min}m"
    time_fmt += f"{time_sec}s"
    return time_fmt


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Parse Gradle log output to find hanging tests")
    parser.add_argument("file", type=argparse.FileType("r"), help="Text file containing Gradle stdout")
    args = parser.parse_args()

    started = dict()
    last_test_line = None
    for line in args.file.readlines():
        if "Gradle Test Run" not in line:
            continue
        last_test_line = line

        toks = line.strip().split(" > ")
        name, status = toks[-1].rsplit(" ", 1)
        name_toks = toks[2:-1] + [name]
        test = " > ".join(name_toks)
        if status == "STARTED":
            started[test] = line
        else:
            started.pop(test)

    last_timestamp, _ = last_test_line.split(" ", 1)
    last_dt = datetime.fromisoformat(last_timestamp)

    if len(started) > 0:
        print("Found tests that were started, but apparently not finished")

    for started_not_finished, line in started.items():
        print("-"*80)
        timestamp, _ = line.split(" ", 1)
        dt = datetime.fromisoformat(timestamp)
        dur_s = (last_dt - dt).total_seconds()
        print(f"Test: {started_not_finished}")
        print(f"Duration: {pretty_time_duration(dur_s)}")
        print(f"Raw line: {line}")
File renamed without changes.
File renamed without changes.
