Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: add spellcheck.dic validation #7062

Merged
merged 2 commits into from
Jan 10, 2025
Merged

Conversation

drink7036290
Copy link
Contributor

Motivation

As per #7061, the spellcheck.dic file had an incorrect word count.

Solution

This PR corrects the word count in spellcheck.dic and adds a new validation step in ci.yml to prevent similar issues in the future. The validation checks that:

  1. The first line is an integer.
  2. The last line is completely empty (no spaces).
  3. The number of lines between the first and last matches the integer specified in the first line.

Fixes: #7061

@Darksonn Darksonn added the A-ci Area: The continuous integration setup label Jan 2, 2025
spellcheck.dic Outdated
@@ -298,3 +298,4 @@ Wakers
wakeup
wakeups
workstealing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file currently ends with two newline characters rather than one. Is that intentional?

$ hexdump spellcheck.dic
0000000 3932 0a39 0a26 0a2b 0a3c 0a3d 0a3e 0a5c
...
0000900 656b 7075 0a73 6f77 6b72 7473 6165 696c
0000910 676e 0a0a
0000914

Note that 0a is a newline character.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's the displaying arrangement by hexdump with or without the "-C" option ?

Though these two "0a" sit at different positions shown below, they all have the same appearance: a last empty line.

$ cat -A spellcheck.dic | tail -3
wakeups$
workstealing$
$

// commit da745ff, the last commit that pass the validations

$ hexdump spellcheck.dic | tail -3
00008c0 6b61 7565 7370 770a 726f 736b 6574 6c61
00008d0 6e69 0a67 000a                         
00008d5
$ hexdump -C spellcheck.dic | tail -3
000008c0  61 6b 65 75 70 73 0a 77  6f 72 6b 73 74 65 61 6c  |akeups.worksteal|
000008d0  69 6e 67 0a 0a                                    |ing..|
000008d5

// current PR

$ hexdump spellcheck.dic | tail -3
0000900 656b 7075 0a73 6f77 6b72 7473 6165 696c
0000910 676e 0a0a                              
0000914
$ hexdump -C spellcheck.dic | tail -3
00000900  6b 65 75 70 73 0a 77 6f  72 6b 73 74 65 61 6c 69  |keups.worksteali|
00000910  6e 67 0a 0a                                       |ng..|
00000914

Thanks for the reminding, I'm also wonder why hexdump will reorder char's position while no "-C" option, take the word "workstealing" as an example:

00008c0 6b61 7565 7370 770a 726f 736b 6574 6c61
                        w\n  r o  s k  e t  l a
00008d0 6e69 0a67 000a
         n i \n g   \n

Comment on lines 1098 to 1103
# Verify the last line is completely empty (no spaces).
last_line=$(tail -n 1 "$FILE")
if [ -n "$last_line" ]; then
echo "Error: The last line of $FILE must be empty (without spaces)."
exit 1
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this check to avoid the double newline?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it’s possible to remove the last-line check, but that would change the original design. Currently:

sed '1d;$d' "$FILE"

removes both the first line (the integer) and the last line (which is empty).

To avoid the double-newline issue from the existing approach, we’d need to:

  • Remove only the first line by changing the command to:

    sed '1d' "$FILE"

  • Eliminate the empty line at the end of spellcheck.dic.

  • Remove the check that enforces a trailing empty line.

From a user standpoint, having no trailing empty line can be simpler and less prone to breaking the rules (and thus reduce the need for an extra CI check).

I can add a new commit to this PR with those changes if you agree. Let me know if there’s anything else you’d like to adjust!

Thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that sounds good. Let's do that.

Remove only the first line only.
Eliminate the empty line at the end of spellcheck.dic.
Remove the check that enforces a trailing empty line.
Copy link
Contributor

@Darksonn Darksonn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

@Darksonn Darksonn merged commit dabae57 into tokio-rs:master Jan 10, 2025
84 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ci Area: The continuous integration setup
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unmatched Number of Words vs. Actual Entries in spellcheck.dic
2 participants