Test major functions with the Big List of Naughty Strings #726
Comments
@bact I think making the library robust enough to handle a variety of character combinations is a great idea! Do you mean adding strings from the Big List of Naughty Strings as additional test cases for all related functions inside |
@pavaris-pm Exactly. We can start with |
Cool! I will try that first. Do we need to test with all the naughty strings in that repo, or is a sample of them OK? Since the Big List of Naughty Strings repo itself has a |
But if running the full list takes too long in the tests (which may affect productivity), we can focus on the relevant categories. I would say these categories are more relevant: Group 1: (non-)whitespace and control characters, as they occur a lot and our regular expressions may sometimes not cover them well:
Group 2: string-length related: careless string manipulation may break on some of these strings. For this group, I think the expected behavior for the testing is for any
|
Detailed description
Add tests with strings from the Big List of Naughty Strings to test the robustness of the library.
Context
The Big List of Naughty Strings is "an evolving list of strings which have a high probability of causing issues when used as user-input data." For example, a string with a zero-width space (U+200B).
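To illustrate why the zero-width space is a useful test input, here is a minimal Python sketch using only the standard library. It does not call any function of this library; the variable names are made up for the illustration. It shows that U+200B is invisible when rendered but is not treated as whitespace by `str.isspace()`, `str.split()`, or the `\s` regex class, so whitespace-based cleanup tends to leave it in place.

```python
import re
import unicodedata

ZWSP = "\u200b"  # zero-width space, the example character named above
text = f"naughty{ZWSP}string"  # illustrative input, not taken from the list

# The character renders as nothing but still counts toward the length.
visible_length = len("naughtystring")
actual_length = len(text)

# U+200B is in Unicode general category Cf (format), not a space category.
category = unicodedata.category(ZWSP)

# Python therefore does not treat it as whitespace.
is_space = ZWSP.isspace()
has_ws_match = re.search(r"\s", text) is not None

# Whitespace splitting keeps the string in one piece, ZWSP included.
parts = text.split()

print(visible_length, actual_length, category, is_space, has_ws_match, parts)
```

A function that assumes "no whitespace match" means "one clean word" would pass this string through unchanged, which is the kind of case the proposed tests are meant to surface.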
As a text processing library that has to deal with strings of all sorts, both from user input and from data archives, the library is expected to be robust enough to handle a variety of character combinations.
Possible implementation
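One possible shape for such a test, as a rough sketch only: loop over naughty strings with `unittest`'s `subTest` so that each failing string is reported separately, and assert only basic robustness properties (no exception, sensible return type). Everything named here is an assumption for illustration: `split_on_whitespace` is a hypothetical stand-in for whichever library function is under test, and `SAMPLE_NAUGHTY_STRINGS` is a small hand-picked sample, not the actual contents of the Big List of Naughty Strings.

```python
import unittest

# Hand-picked sample in the spirit of the list (an assumption for this sketch);
# a real test would load the strings from the upstream repository instead.
SAMPLE_NAUGHTY_STRINGS = [
    "",               # empty string
    " \t\n",          # ASCII whitespace only
    "\u200b",         # zero-width space on its own
    "a\u200bb",       # zero-width space between letters
    "\u00a0\u2003",   # no-break space and em space
    "\x00\x1b",       # control characters
    "x" * 10_000,     # long input
]


def split_on_whitespace(text: str) -> list[str]:
    """Hypothetical stand-in for a library function under test."""
    return text.split()


class NaughtyStringTest(unittest.TestCase):
    def test_returns_list_of_str_without_raising(self):
        for s in SAMPLE_NAUGHTY_STRINGS:
            # subTest reports each failing input separately instead of
            # stopping at the first one.
            with self.subTest(text=repr(s[:20])):
                result = split_on_whitespace(s)
                self.assertIsInstance(result, list)
                self.assertTrue(all(isinstance(tok, str) for tok in result))


def run_sample_tests() -> unittest.TestResult:
    """Run the sample test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NaughtyStringTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_sample_tests()` runs the case directly; in a real test suite the class would simply be collected by the existing test runner. If the full list turns out to be too slow, the same loop can be pointed at a subset of categories.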