Create broken link checker cron jobs #1018
Temporarily putting this PR on hold because of the GitHub error below, which is preventing the link checker from being used. We will need to either enable group permissions or rewrite some parts of the code. Error: .github#L1
flatpages=$(curl https://cantusdatabase.org/flatpages-list/ | awk '{ gsub (" ", "\",\"", $0); print}')
articles=$(curl https://cantusdatabase.org/articles-list/ | awk '{ gsub (" ", "\",\"", $0); print}')
Would it help if the lists of links were comma-separated rather than space-separated, so we don't need to use awk and gsub here? I can make this change easily if it would clean up this code.
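For reference, a minimal Python sketch of what the awk call does to the list, and of what a comma-separated response would need instead. The example URLs and both function names are hypothetical, and the assumption that the surrounding workflow adds the outer quotes is not confirmed by this thread.

```python
def spaces_to_quoted_commas(raw: str) -> str:
    # Same effect as the awk gsub call: every space becomes the
    # three characters quote, comma, quote.
    return raw.replace(" ", '","')


def comma_separated_to_quoted(raw: str) -> str:
    # Hypothetical alternative: the endpoint already returns a,b,c,
    # so only the quoting between items has to be added.
    return '","'.join(part.strip() for part in raw.split(",") if part.strip())


# Hypothetical sample responses, not real output from the endpoints.
space_list = "https://example.org/a/ https://example.org/b/"
comma_list = "https://example.org/a/,https://example.org/b/"

print(spaces_to_quoted_commas(space_list))
print(comma_separated_to_quoted(comma_list))
```

Both calls produce the same string here, so switching the endpoints to commas would mainly move the separator handling rather than remove the quoting step, if the quoting is in fact needed downstream.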
This code works in a personal repository. The error appears to be unrelated to this: the problem is that a template is used for the lychee link checker, which is not allowed. I'll investigate and let you know.
Right, I didn't mean to suggest that this was causing the link checker to not run. This just jumped out as a place where we could easily clean up the code with a little bit of coordination.
cc3ef0d to 024d16d
024d16d to 41750d8
# If link checker doesn't have any errors exit gracefully
if not Path(FILE_LOCATION).exists():
    print("# ✅ No Broken Link")
Suggested change:
- print("# ✅ No Broken Link")
+ print("# ✅ No Broken Links")
    print("# ✅ No Broken Link")
    sys.exit(0)
else:
    print("# Broken Link found, parsing needed", file=sys.stderr)
Suggested change:
- print("# Broken Link found, parsing needed", file=sys.stderr)
+ print("# Broken Links found, parsing needed", file=sys.stderr)
listOfFailure = link_checker_result['fail_map']

if not listOfFailure:
    print("# ✅ No Broken Link")
Suggested change:
- print("# ✅ No Broken Link")
+ print("# ✅ No Broken Links")
skipErrors = []

for failureWebSite in listOfFailure:  # looping through tested websites
    for failure in listOfFailure[failureWebSite]:  # looping through broken links
Why not just loop through failureWebSite directly?
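One way to read this question, sketched under an assumption: the two loops suggest that `fail_map` is a dict mapping each tested page to a list of failure records, though the thread does not confirm that. If so, `failureWebSite` is a key (a URL string), and looping over it directly would walk its characters. The nested lookup can still be avoided with `.items()` or `.values()`. The sample data below is hypothetical.

```python
# Hypothetical sample shaped like link_checker_result['fail_map']:
# each tested page maps to a list of failure records.
listOfFailure = {
    "https://example.org/page-1/": [
        {"url": "https://example.org/missing", "status": {"code": 404}},
    ],
    "https://example.org/page-2/": [
        {"url": "https://example.org/gone", "status": {"code": 410}},
        {"url": "https://example.org/busy", "status": {"code": 503}},
    ],
}

# .items() yields the page and its failures together, so no
# listOfFailure[failureWebSite] lookup is needed.
for page, failures in listOfFailure.items():
    for failure in failures:
        print(f"{page} -> {failure['url']}: {failure['status']['code']}")

# .values() is enough when the page URL itself is not needed.
all_failures = [
    failure for failures in listOfFailure.values() for failure in failures
]
```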
        skipErrors.append(failure)

if RealErrors:
    print("# Broken Link")
Suggested change:
- print("# Broken Link")
+ print("# Broken Links")
        print(f"* {error['url']}: {error['status']['code']}")

if skipErrors:
    print("# Skippable error Link")
Suggested change:
- print("# Skippable error Link")
+ print("# Skippable error Links")
Adds a broken link checker cron job that runs at 08:08 every day and checks for broken links on flatpages and created articles, whose URLs are obtained from https://cantusdatabase.org/flatpages-list/ and https://cantusdatabase.org/articles-list/.
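The report step discussed in the review can be pictured roughly as follows. This is a sketch and not the PR's script: the function name `build_report`, the `skippable_codes` parameter, and the exit-code policy are assumptions made for illustration, and how the PR actually separates real errors from skippable ones is not shown in this thread. Only the `fail_map` key, the `url` and `status.code` fields, and the heading texts come from the reviewed code.

```python
import json
from pathlib import Path


def build_report(result_file, skippable_codes=frozenset()):
    """Return (markdown_report, exit_code) for a link checker result file."""
    path = Path(result_file)
    # No result file means the checker reported nothing: exit gracefully.
    if not path.exists():
        return "# ✅ No Broken Links", 0

    data = json.loads(path.read_text(encoding="utf-8"))
    fail_map = data.get("fail_map", {})

    real, skipped = [], []
    for failures in fail_map.values():  # one list per tested page
        for failure in failures:  # one record per failing link
            code = failure.get("status", {}).get("code")
            # Assumption: failures are classified by status code only.
            (skipped if code in skippable_codes else real).append(failure)

    if not real and not skipped:
        return "# ✅ No Broken Links", 0

    def bullet(entry):
        return f"* {entry['url']}: {entry.get('status', {}).get('code')}"

    lines = []
    if real:
        lines.append("# Broken Links")
        lines.extend(bullet(entry) for entry in real)
    if skipped:
        lines.append("# Skippable error Links")
        lines.extend(bullet(entry) for entry in skipped)

    # Assumption: only real errors should make the job fail.
    return "\n".join(lines), (1 if real else 0)
```

A caller would print the report and pass the second value to `sys.exit`, for example `report, code = build_report("out.json", skippable_codes={429})`, where both the file name and the code set are placeholders.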