Skip to content
This repository has been archived by the owner on Sep 30, 2022. It is now read-only.

Add whitelist option to URL lock #27

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Add whitelist option to URL lock #27

wants to merge 3 commits into from

Conversation

GaryBeez
Copy link

Also adds missing filters to lock text links.

@SphericalKat
Copy link
Contributor

@garciaW Oo pretty good one

Copy link
Owner

@PaulSonOfLars PaulSonOfLars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so far so good, but a bunch of issues i can see arising.

@@ -80,6 +83,59 @@ def locktypes(bot: Bot, update: Update):
update.effective_message.reply_text("\n - ".join(["Locks: "] + list(LOCK_TYPES) + list(RESTRICTION_TYPES)))


@user_admin
def add_whitelist(bot: Bot, update: Update):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be @run_async to ensure as few blocks as possible
Also needs to be @loggable to ensure log channels get info -> which means returning a log at the end as well

if sql.add_whitelist(chat.id, url):
added.append(url)
if added:
message.reply_text("Added {} to whitelist.".format(', '.join(w for w in added)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary list comprehension.
Also, consider using newlines starting with - to delimit the different urls; putting everything on one line will be messy

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, fixed.

def add_whitelist(bot: Bot, update: Update):
chat = update.effective_chat # type: Optional[Chat]
message = update.effective_message # type: Optional[Message]
entities = message.parse_entities(MessageEntity.URL)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

message.reply_text("No URLs were added to the whitelist")


@user_admin
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as before; @run_async and @loggable. applies for all funcs

removed.append(url)
if removed:
message.reply_text("Removed `{}` from whitelist.".format('`, `'.join(escape_markdown(w) for w in removed)),
parse_mode=ParseMode.MARKDOWN)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if possible, use HTML formatting to avoid message parsing issues. escape_markdown is more fragile than escape_html.
You can also get away with removing the list comp and calling escape_{whichever}() on the joined list instead of each item.

@@ -279,11 +345,14 @@ def __chat_settings__(chat_id, user_id):

__help__ = """
- /locktypes: a list of possible locktypes
- /whitelisted: lists urls in this chat's whitelist
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indent this properly (one more space!)

def add_whitelist(chat_id, url):
global CHAT_WHITELIST
with WHITELIST_LOCK:
url = re.search(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?(\S*)', url, flags=re.I).group(3).lower()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given this regex pattern doesnt change, compile it and use it as a global.
Also, what happens if group(3) is None? lower() will die

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not convinced this should be here, given it isnt sql logic (and youre wasting CPU cycles given this bit doesnt need the lock yet)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only called for strings Telegram classified as a URL entity, and group(3) matches with \S* which is "any non-whitespace character zero or more times", so it really never should be None.

Will compile and move the pattern out to a global variable. I decided to have all regexp patterns in a single file so it's easy to change them all, if needed.

whitelisted = URLWhitelist(str(chat_id), url)
SESSION.add(whitelisted)
SESSION.commit()
chat_whitelist = CHAT_WHITELIST.setdefault(str(chat_id), {})
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldnt this be indented too? since if not prev, this is already loaded?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more a case of "double bagging" :-) If for whatever reason the URL was in the DB but not in the dictionary, now it sure is.

At least the return True statement I would leave unindented, so the bot's confirmation message will include the URL even if it was already previously added.

row.chat_id = str(new_chat_id)
SESSION.commit()

__load_chat_whitelist()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a newline at EOF for pep8

CHAT_WHITELIST[str(row.chat_id)].update(
{row.url: re.compile(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?'+re.escape(row.url)+'($|\W)',
flags=re.I
)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: does this bracket really need its own line?

chat = update.effective_chat # type: Optional[Chat]
user = update.effective_user # type: Optional[User]
message = update.effective_message # type: Optional[Message]
entities = message.parse_entities([MessageEntity.URL])
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline this variable; it isn't used anywhere other than the for loop.

entities = message.parse_entities([MessageEntity.URL])
added = []
for url in entities.values():
if sql.add_whitelist(chat.id, url):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be turned into a list comprehension.
added = [url for url in message.parse_entities([MessageEntity.URL]).values() if sql.add_whitelist(chat.id, url)]

if sql.add_whitelist(chat.id, url):
added.append(url)
if added:
message.reply_text("Added to whitelist:\n- "+'\n- '.join(added))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep string characters consistent; use "" for the join

"\n<b>Admin:</b> {}" \
"\nWhitelisted:\n<pre>- {}</pre>".format(html.escape(chat.title),
mention_html(user.id, user.first_name),
html.escape('\n- '.join(added)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since youre doing this join twice, make it a variable

return "<b>{}:</b>" \
"\n#WHITELIST" \
"\n<b>Admin:</b> {}" \
"\nWhitelisted:\n<pre>- {}</pre>".format(html.escape(chat.title),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont add the - to the code block; the code is there so you can copy paste it just by tapping.

# url. So I must add all entities that have a 'url' field separately
entities = entities | set(entity.url for entity in message.entities if entity.url)
#if all URLs are any of the whitelisted ones, accept the message
if all( any(regexp.search(text) for regexp in sql.get_whitelist(chat.id).values())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you handle invalid regex expressions? if any are broken, this will blow up and entirely break whitelisting for that chat. I would personally not allow regex for this, as most users have very limited knowledge of it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users don't get to interact with any of this as regex. All they do is send URLs. Only text that Telegram considers a URL is added (via entities). Also, the URLs themselves are escaped with re.escape before the regex is compiled.



PERM_LOCK = threading.RLock()
RESTR_LOCK = threading.RLock()
WHITELIST_LOCK = threading.RLock()
CHAT_WHITELIST = {}
URL_REGEXP = re.compile(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?(\S*)', flags=re.I)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this logic should be a part of the module, not the sql

chat_whitelist.update(
{url: re.compile(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?'+re.escape(url)+'($|\W)',
flags=re.I)})
return True
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

session.close() before returning



def add_whitelist(chat_id, url):
url = URL_REGEXP.search(url).group(3).lower()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple chaining will blow up in case of None return from the search, or the group.



def __load_chat_whitelist():
#whitelist for each group is a dict(url: compiled_regexp for url in group)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: pep8 comments have a space after the #

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants