Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow periods at the start and end of new usernames #9336

Merged
merged 3 commits into from
Feb 11, 2025

Conversation

robertknight
Copy link
Member

@robertknight robertknight commented Feb 10, 2025

Work towards resolving an ambiguity when parsing mentions out of text like "I like this @bob." by disallowing periods at the start and end of usernames.

There are a small number (~2000 or <0.1%) of existing accounts which do contain periods at the start or end of usernames. We need to continue to allow those accounts to be used, but we can get away with not supporting extracting mentions for those users from text.

Fixes #9335.

Testing:

  1. Try to create a new user account that starts or ends with a period. This should be rejected, with a helpful validation error.
  2. Try to log in to an existing user account that starts or ends with a period. This should work as before.
  3. Try to rename a user account to start or end with a period. This should be rejected.

Work towards resolving an ambiguity when parsing mentions out of text like "I
like this @bob." by disallowing periods at the start and end of usernames.

There are a small number (~2000 or <0.1%) of existing accounts which do contain
periods at the start or end of usernames. We need to continue to allow those
accounts to be used, but we can get away with not supporting extracting mentions
for those users from text.

Fixes #9335
schema.deserialize({"username": "a"})
assert exc.value.asdict()["username"] == ("Must be 3 characters or more.")
schema.deserialize({"username": "ab"})
assert "Must be 3 characters or more." in exc.value.asdict()["username"]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The USERNAME_PATTERN regex now implicitly sets a minimum length of 3 (at least one start, middle and end char) so it also fails. I think that's fine, but for this test only check for the length message specifically.

@robertknight robertknight requested a review from seanh February 10, 2025 13:37
@robertknight
Copy link
Member Author

Requested a review from @seanh in case there are any issues with changing the username pattern that I've missed.

Copy link
Contributor

@seanh seanh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

h/models/user.py Outdated
Comment on lines 23 to 25
# nb. This pattern is used in Python code, JSON schemas and HTML forms, so it
# needs to use portable syntax.
USERNAME_PATTERN = "^[A-Za-z0-9_][A-Za-z0-9._]+[A-Za-z0-9_]$"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this affect the user-update API, for those users whose existing usernames start or end with a .?

Copy link
Member Author

@robertknight robertknight Feb 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user update API does not validate the format of the username that I can see, which makes it compatible with old usernames. I have added a note about the change in the comment for USERNAME_PATTERN to make future readers aware.

@@ -35,6 +36,7 @@ def test_users_index(pyramid_request):
"username": None,
"authority": None,
"user": None,
"username_pattern": Any.string(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI we're trying to stop using h-matchers nowadays as there's various problems with it (it's implemented in a very clever way that makes it impossible to maintain, it has a lot of features that we never use, and it ruins pytest's assertion failure messages). I'm not sure what the plain Python / pytest alternative to usages like this would be, however. You could use mock.ANY but that doesn't assert that it's a string. Alternatively a simple classic-style matcher could be implemented.

(mock.ANY is implemented using the classic-style matchers approach. The original request for h-matchers was just to cut-paste existing classic-style matchers from app tests into a shared library, but we got h-matchers instead.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we used to have an instance_of matcher for just this purpose:

class instance_of(Matcher): # noqa: N801
"""An object __eq__ to any object which is an instance of `type_`."""
def __init__(self, type_):
self.type = type_
def __eq__(self, other):
return isinstance(other, self.type)
def __repr__(self):
return '<instance of {!r}>'.format(self.type)
. Could build some alias matchers on top of that, like any_string() etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI we're trying to stop using h-matchers nowadays as there's various problems with it (it's implemented in a very clever way that makes it impossible to maintain, it has a lot of features that we never use, and it ruins pytest's assertion failure messages).

Could we perhaps distill h-matchers into a more limited subset of the functionality that covers the main use cases? Preferably in such a way that the API migration is easy. The "legacy" API could be moved into a submodule in h-matchers to facilitate incremental migration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have swapped Any.string() for mock.ANY here, since that is fine for the current use case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we perhaps distill h-matchers into a more limited subset of the functionality that covers the main use cases? Preferably in such a way that the API migration is easy. The "legacy" API could be moved into a submodule in h-matchers to facilitate incremental migration.

I think unfortunately not easily, given the way h-matchers is implemented.

There's only a handful of types of assertion that h-matchers tends to be used for across our tests, and each of those types of assertion can be replaced with a simpler approach, e.g. just a plain assertion, or mock.ANY, or a simple matcher class, etc. In that way I think we can phase out h-matchers.

@robertknight robertknight merged commit 89c82e5 into main Feb 11, 2025
11 checks passed
@robertknight robertknight deleted the disallow-username-start-end-period branch February 11, 2025 11:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Disallow periods at the start and end of new usernames
3 participants