-
Notifications
You must be signed in to change notification settings - Fork 467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add generate PR description workflow #3042
Conversation
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughWalkthroughA new GitHub Actions workflow, Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
Doesn't @coderabbitai do this already? |
Tip For best results, initiate chat on the files or code changes. @strickvl, yes, CodeRabbit can automatically generate PR descriptions if it's configured to do so. According to your current configuration, the Additionally, you can use the Let me know if you have any questions or need further assistance! Analysis chain@strickvl, you're correct—CodeRabbit can automatically generate PR descriptions if it's configured to do so. Let me check the current configuration to see if this feature is enabled for this repository. Scripts executedThe following scripts were executed for the analysis:
|
@strickvl can you show me how? Maybe I missed how it does it for sure! |
@coderabbitai review |
@coderabbitai summary |
Actions performedReview triggered.
|
Actions performedSummary regeneration triggered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Outside diff range and nitpick comments (5)
.github/workflows/generate-pr-description.yml (1)
11-18
: Job setup and permissions look good, with a minor suggestion.The job configuration is well-structured:
- Running only on non-draft PRs is efficient.
- The permissions are correctly set for the required operations.
Consider adding a comment explaining why these specific permissions are needed. This can help future maintainers understand the security implications:
permissions: contents: read # Needed to check out the repository pull-requests: write # Needed to update the PR description issues: write # Needed to add comments to the PRscripts/generate_pr_description.py (4)
1-5
: Add a module docstring and consider organizing imports.The imports look good and are necessary for the script's functionality. However, consider adding a module docstring to explain the purpose of this script.
Consider adding a module docstring at the beginning of the file:
""" This module generates pull request descriptions automatically based on file changes in a GitHub repository. It uses the GitHub API to fetch PR information and the OpenAI API to generate descriptions. """Also, consider organizing imports according to PEP 8 guidelines:
import os import openai import requestsTools
Ruff
1-1: Missing docstring in public module
(D100)
7-21
: Approve function with suggestion for improved docstring.The
truncate_changes
function is well-implemented and serves its purpose effectively. It ensures that the changes summary doesn't exceed the maximum character limit while maintaining readability.Consider expanding the docstring to provide more details about the function's behavior:
def truncate_changes(changes_summary): """ Truncates the changes summary to fit within MAX_CHARS. Args: changes_summary (list): A list of strings representing changes. Returns: list: A truncated list of changes, ensuring the total character count doesn't exceed MAX_CHARS. If truncation occurs, the last included change will end with "...". """
90-91
: Approve main block with suggestion for error handling.The main execution block is correct and straightforward. However, it could benefit from basic error handling to catch and log any unexpected exceptions.
Consider adding error handling to the main block:
if __name__ == "__main__": try: result = generate_pr_description() if result: print("PR description updated successfully.") else: print("No changes were made to the PR description.") except Exception as e: print(f"An error occurred: {e}")
1-91
: Overall assessment: Good implementation with room for improvement.The script successfully implements PR description generation using GitHub and OpenAI APIs. It handles file changes, truncates long summaries, and updates PR descriptions effectively. The overall structure and logic are sound.
However, there are several areas for improvement:
- Error handling: Add comprehensive error handling for API calls and potential exceptions.
- Code organization: Break down the
generate_pr_description
function into smaller, more focused functions.- Security: Implement proper validation and error handling for environment variables.
- Documentation: Add more detailed docstrings to functions and the module.
- Type hints: Consider adding type hints to improve code readability and catch potential type-related errors.
To improve the overall architecture and maintainability of the script, consider:
- Creating a
GitHubClient
class to encapsulate GitHub API interactions.- Implementing a
PRDescriptionGenerator
class to handle the main logic of generating and updating PR descriptions.- Using a configuration file or environment variables with proper validation for API keys and other sensitive information.
- Implementing logging instead of using print statements for better debugging and monitoring.
Example structure:
class GitHubClient: # Handle GitHub API interactions class PRDescriptionGenerator: # Handle PR description generation and updating def main(): # Set up clients and run the main logic if __name__ == "__main__": main()These changes would make the script more modular, easier to test, and more maintainable in the long run.
Tools
Ruff
1-1: Missing docstring in public module
(D100)
23-23: Missing docstring in public function
(D103)
25-25: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
26-26: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
27-27: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
28-28: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
28-28: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
29-29: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
33-33: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
33-33: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
40-40: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
46-46: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
47-47: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
49-49: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
51-51: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
53-53: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
54-54: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
57-57: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
57-57: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
61-61: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
69-69: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
79-79: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
82-82: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
84-84: f-string without any placeholders
Remove extraneous
f
prefix(F541)
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (2)
- .github/workflows/generate-pr-description.yml (1 hunks)
- scripts/generate_pr_description.py (1 hunks)
Additional context used
actionlint
.github/workflows/generate-pr-description.yml
29-29: shellcheck reported issue in this script: SC2086:info:2:8: Double quote to prevent globbing and word splitting
(shellcheck)
36-36: shellcheck reported issue in this script: SC2086:info:5:23: Double quote to prevent globbing and word splitting
(shellcheck)
36-36: shellcheck reported issue in this script: SC2086:info:7:24: Double quote to prevent globbing and word splitting
(shellcheck)
Ruff
scripts/generate_pr_description.py
1-1: Missing docstring in public module
(D100)
23-23: Missing docstring in public function
(D103)
25-25: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
26-26: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
27-27: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
28-28: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
28-28: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
29-29: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
33-33: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
33-33: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
40-40: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
46-46: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
47-47: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
49-49: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
51-51: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
53-53: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
54-54: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
57-57: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
57-57: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
61-61: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
69-69: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
79-79: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
82-82: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
84-84: f-string without any placeholders
Remove extraneous
f
prefix(F541)
Additional comments not posted (4)
.github/workflows/generate-pr-description.yml (4)
1-5
: LGTM: Workflow name and trigger are well-defined.The workflow name "Auto PR Description" clearly describes its purpose, and the trigger events (pull request opened or edited) are appropriate for automatically generating PR descriptions.
7-9
: Excellent use of concurrency settings.The concurrency configuration effectively prevents race conditions by ensuring only one instance of the workflow runs per PR. Cancelling in-progress runs is a good practice to always process the latest changes.
20-26
: Setup steps are well-configured.The use of recent versions for both the checkout action (v4.1.1) and Python setup action (v5.0.0) is commendable. Python 3.11 is a good choice for its performance improvements and new features.
1-70
: Overall, this is a well-structured workflow with some areas for improvement.The "Auto PR Description" workflow is a creative solution for automating PR descriptions. Here's a summary of the main points:
- The workflow structure, triggers, and concurrency settings are well-defined.
- Consider standardizing the dependency installation process using a
requirements.txt
file.- Address the shellcheck warnings by using double quotes in shell commands.
- Provide more information about the OpenAI API usage and consider adding error handling for API calls.
- Enhance the error checking step with more detailed troubleshooting information.
These improvements will make the workflow more robust, maintainable, and user-friendly. Great job on implementing this automated solution!
Tools
actionlint
29-29: shellcheck reported issue in this script: SC2086:info:2:8: Double quote to prevent globbing and word splitting
(shellcheck)
36-36: shellcheck reported issue in this script: SC2086:info:5:23: Double quote to prevent globbing and word splitting
(shellcheck)
36-36: shellcheck reported issue in this script: SC2086:info:7:24: Double quote to prevent globbing and word splitting
(shellcheck)
- name: Check for previous successful run | ||
id: check_comment | ||
run: | | ||
PR_NUMBER="${{ github.event.pull_request.number }}" | ||
COMMENT=$(gh api -X GET "/repos/${{ github.repository }}/issues/${PR_NUMBER}/comments" | jq '.[] | select(.body | contains("Auto PR description generated successfully")) | .id') | ||
if [ -n "$COMMENT" ]; then | ||
echo "Workflow has already run successfully for this PR." | ||
echo "skip=true" >> $GITHUB_OUTPUT | ||
else | ||
echo "skip=false" >> $GITHUB_OUTPUT | ||
fi | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
|
||
- name: Wait for potential edits | ||
if: steps.check_comment.outputs.skip == 'false' | ||
run: sleep 300 # Wait for 5 minutes | ||
|
||
- name: Generate PR description | ||
if: steps.check_comment.outputs.skip == 'false' | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }} | ||
run: python scripts/generate_pr_description.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Issues Identified in generate_pr_description.py
:
-
Incorrect OpenAI Model Name:
- The script specifies the model as
"gpt-4o-mini"
, which is likely a typo. It should be updated to a valid model name, such as"gpt-4"
.
- The script specifies the model as
-
Lack of Error Handling:
- The OpenAI API calls do not include error handling. This can lead to unhandled exceptions if the API request fails or returns an error.
- Consider adding try-except blocks to manage potential errors gracefully.
-
Rate Limiting and Retry Mechanism:
- The script does not account for API rate limits or implement a retry mechanism. Implementing exponential backoff strategies can improve reliability.
-
Excessive Truncation Limit:
MAX_CHARS
is set to400000
, which may be unnecessarily high for PR descriptions. A more reasonable limit can ensure concise and relevant descriptions.
Category:
Analysis chain
Good workflow logic, but some improvements needed.
The steps for checking previous runs, waiting, and generating the PR description are well thought out. However, there are a few points to address:
-
In the "Check for previous successful run" step, address the shellcheck warnings by using double quotes:
run: | PR_NUMBER="${{ github.event.pull_request.number }}" COMMENT=$(gh api -X GET "/repos/${{ github.repository }}/issues/${PR_NUMBER}/comments" | jq '.[] | select(.body | contains("Auto PR description generated successfully")) | .id') if [ -n "$COMMENT" ]; then echo "Workflow has already run successfully for this PR." echo "skip=true" >> "$GITHUB_OUTPUT" else echo "skip=false" >> "$GITHUB_OUTPUT" fi
-
The use of OpenAI's API for generating PR descriptions is intriguing.
Could you provide more information about how this is implemented? Here's a script to check the content of the Python script:
- Consider adding error handling and rate limiting considerations when using external APIs like OpenAI.
Would you like me to propose some error handling improvements for the Python script or open a GitHub issue to track this task?
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check the content of the generate_pr_description.py script
cat scripts/generate_pr_description.py
Length of output: 3663
Tools
actionlint
36-36: shellcheck reported issue in this script: SC2086:info:5:23: Double quote to prevent globbing and word splitting
(shellcheck)
36-36: shellcheck reported issue in this script: SC2086:info:7:24: Double quote to prevent globbing and word splitting
(shellcheck)
scripts/generate_pr_description.py
Outdated
token = os.environ['GITHUB_TOKEN'] | ||
repo = os.environ['GITHUB_REPOSITORY'] | ||
pr_number = os.environ['GITHUB_EVENT_NUMBER'] | ||
headers = {'Authorization': f'token {token}'} | ||
api_url = f'https://api.github.com/repos/{repo}/pulls/{pr_number}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve security and error handling for API setup.
The current implementation directly uses environment variables without any validation. This could lead to runtime errors if the variables are not set. Additionally, there's no error handling for the API calls.
Consider adding validation for environment variables and error handling:
import os
from typing import Dict
def get_github_headers() -> Dict[str, str]:
token = os.environ.get('GITHUB_TOKEN')
if not token:
raise ValueError("GITHUB_TOKEN environment variable is not set")
return {'Authorization': f'token {token}'}
def get_pr_api_url() -> str:
repo = os.environ.get('GITHUB_REPOSITORY')
pr_number = os.environ.get('GITHUB_EVENT_NUMBER')
if not repo or not pr_number:
raise ValueError("GITHUB_REPOSITORY or GITHUB_EVENT_NUMBER environment variable is not set")
return f'https://api.github.com/repos/{repo}/pulls/{pr_number}'
# Use these functions in generate_pr_description()
headers = get_github_headers()
api_url = get_pr_api_url()
Tools
Ruff
25-25: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
26-26: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
27-27: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
28-28: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
28-28: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
29-29: Single quotes found but double quotes preferred
Replace single quotes with double quotes
(Q000)
Auto PR description generated successfully |
I hope this doesn't accidentally overwrite actual pr descriptions, but other than that at least it won't do any harm |
@schustmi actually this is my only concern. Im wondering how I can make that foolproof. Currently, it will only write if the default PR description is matched and if it hasn't successfully done it before... but there is an edge case that if somebody opens up a PR and then edits their PR description while this is running, there's a chance it gets overwritten |
Yep exactly. One option would be to just run this when a certain tag is applied? Similar to the slow CI? |
@schustmi hmm but then nobody will do it I think :-( I like the automated feel of this. What I can do is that in the beginning of the script get the PR description, wait 5 mins, and then get it again and see if anything changes. This will mitigate but not eliminate the risk. The other thing I can do is just append the auto generated description rather than replacing anything |
@schustmi added some logic for the latter option. check it out |
Yeah let's see, still seems pretty fragile but only in some pretty rare cases |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice as a concept. I would love to take it for a spin 😄
Since our PRs usually go through significant changes during development, I would love to have a mechanism in place that can also refresh the description.
Proposal for a solution: Perhaps we can change the trigger for the workflow. Instead of triggering it when the PR is opened, maybe we can trigger it with each change (or when it is marked ready for release). This would also solve some of the problem I mentioned in my comments.
scripts/generate_pr_description.py
Outdated
filename = file['filename'] | ||
status = file['status'] | ||
|
||
if status == 'added': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually, if there is a new file, there is some meaningful code in there. That's why it might be nice to add the patch
for the new files as well.
source $HOME/.cargo/env | ||
uv pip install --system requests openai | ||
|
||
- name: Check for previous successful run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This workflow would run only when the PR is opened
. So, it will only be run once. That's why the step here does not make much sense. When it runs for the first and only time, the comment won't be there, and thus the code will execute.
jobs: | ||
auto-describe: | ||
runs-on: ubuntu-latest | ||
if: github.event.pull_request.draft == false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I open the PR as a draft and then later change it to "Ready for Review", this code will not run again.
Describe changes
Pull Request Description
This PR introduces a new GitHub Actions workflow and a script for automating pull request description generation.
.github/workflows/generate-pr-description.yml
: This file sets up a GitHub Actions workflow to run the PR description generation script.scripts/generate_pr_description.py
: This script generates concise and informative pull request descriptions based on the changes made in the repository.These additions aim to streamline the pull request process and improve clarity in PR documentation.
Pre-requisites
Please ensure you have done the following:
develop
and the open PR is targetingdevelop
. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.Types of changes
Summary by CodeRabbit
New Features
Bug Fixes