Add whitelist option to URL lock #27

GaryBeez · 2018-05-25T04:30:51Z

Also adds missing filters to lock text links.

SphericalKat · 2018-05-25T04:35:13Z

@garciaW Oo pretty good one

PaulSonOfLars

So far so good, but I can see a bunch of issues arising.

PaulSonOfLars · 2018-05-25T14:21:41Z

tg_bot/modules/locks.py

@@ -80,6 +83,59 @@ def locktypes(bot: Bot, update: Update):
    update.effective_message.reply_text("\n - ".join(["Locks: "] + list(LOCK_TYPES) + list(RESTRICTION_TYPES)))


+@user_admin
+def add_whitelist(bot: Bot, update: Update):


This should be @run_async so the handler blocks as little as possible.
It also needs to be @loggable so log channels get the info, which means returning a log string at the end as well.
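To illustrate the "return a log at the end" contract mentioned above, here is a minimal sketch. It is a simplified stand-in, not the project's actual @loggable decorator: `loggable_sketch`, `LOG_SINK`, and `add_whitelist_handler` are hypothetical names, and a plain list takes the place of a log channel.

```python
import functools
from typing import Callable, List

LOG_SINK: List[str] = []  # hypothetical stand-in for a chat's log channel


def loggable_sketch(func: Callable[..., str]) -> Callable[..., str]:
    """Forward whatever the handler returns to the log sink."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs) -> str:
        result = func(*args, **kwargs)
        if result:  # an empty string means there is nothing to log
            LOG_SINK.append(result)
        return result
    return wrapper


@loggable_sketch
def add_whitelist_handler(chat_title: str, urls: List[str]) -> str:
    """Hypothetical handler: returns a log entry only if something was added."""
    if not urls:
        return ""
    return "{}:\n#WHITELIST\n{}".format(chat_title, "\n".join(urls))
```

The point is only that the handler must return a string on every path, so the decorator has something to forward.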

PaulSonOfLars · 2018-05-25T14:23:50Z

tg_bot/modules/locks.py

+        if sql.add_whitelist(chat.id, url):
+            added.append(url)
+    if added:
+        message.reply_text("Added {} to whitelist.".format(', '.join(w for w in added)))


Unnecessary generator expression; ', '.join(added) gives the same result.
Also, consider using newlines starting with - to delimit the different URLs; putting everything on one line will be messy.

GaryBeez · 2018-05-27

Agree, fixed.

PaulSonOfLars · 2018-05-25T14:25:16Z

tg_bot/modules/locks.py

+def add_whitelist(bot: Bot, update: Update):
+    chat = update.effective_chat  # type: Optional[Chat]
+    message = update.effective_message  # type: Optional[Message]
+    entities = message.parse_entities(MessageEntity.URL)


parse_entities takes a list of entity types, e.g. [MessageEntity.URL].
https://python-telegram-bot.readthedocs.io/en/stable/telegram.message.html?highlight=parse_entities

PaulSonOfLars · 2018-05-25T14:26:04Z

tg_bot/modules/locks.py

+        message.reply_text("No URLs were added to the whitelist")
+
+
+@user_admin


Same as before: @run_async and @loggable. This applies to all the handler functions.

PaulSonOfLars · 2018-05-25T14:27:40Z

tg_bot/modules/locks.py

+            removed.append(url)
+    if removed:
+        message.reply_text("Removed `{}` from whitelist.".format('`, `'.join(escape_markdown(w) for w in removed)),
+            parse_mode=ParseMode.MARKDOWN)


If possible, use HTML formatting to avoid message parsing issues; escape_markdown is more fragile than escape_html.
You can also drop the generator expression and call escape_{whichever}() once on the joined string instead of on each item (as long as the separator itself contains no markup).
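A minimal sketch of the "escape once, after joining" suggestion, using only the standard library's `html.escape`. The function name `format_removed` and the exact reply wording are assumptions for illustration. Escaping after the join is only safe here because the separator is a plain newline; a separator containing markup would get escaped too.

```python
import html
from typing import List


def format_removed(urls: List[str]) -> str:
    """Build an HTML-formatted reply listing the removed URLs."""
    # Escape the joined text once; the newline separator is unaffected.
    body = html.escape("\n".join(urls))
    return "Removed from whitelist:\n<pre>{}</pre>".format(body)
```

The resulting string would then be sent with HTML parse mode rather than Markdown.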

PaulSonOfLars · 2018-05-25T14:36:40Z

tg_bot/modules/locks.py

@@ -279,11 +345,14 @@ def __chat_settings__(chat_id, user_id):

 __help__ = """
 - /locktypes: a list of possible locktypes
+- /whitelisted: lists urls in this chat's whitelist


nit: indent this properly (one more space!)

PaulSonOfLars · 2018-05-25T14:39:24Z

tg_bot/modules/sql/locks_sql.py

+def add_whitelist(chat_id, url):
+    global CHAT_WHITELIST
+    with WHITELIST_LOCK:
+        url = re.search(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?(\S*)', url, flags=re.I).group(3).lower()


Given this regex pattern doesn't change, compile it once and use it as a global.
Also, what happens if group(3) is None? lower() will die.

I'm also not convinced this should be here, given it isn't SQL logic (and you're wasting CPU cycles, since this bit doesn't need the lock yet).

GaryBeez · 2018-05-26

This is only called for strings Telegram classified as a URL entity, and group(3) is \S*, which means "any non-whitespace character, zero or more times", so it should never be None.

Will compile and move the pattern out to a global variable. I decided to have all regexp patterns in a single file so it's easy to change them all, if needed.
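As a sketch of both points raised in this thread (compile once at module level, and avoid chaining onto a possibly missing match), something like the following would work. `normalize_url` is a hypothetical helper name; the pattern is the one from the diff with the redundant `\/` escapes dropped. With this particular pattern the search always matches, so the guard is purely defensive.

```python
import re
from typing import Optional

# Compiled once at import time instead of on every call.
URL_REGEXP = re.compile(r'(^http://|^https://|^ftp://|^)(www\.)?(\S*)', flags=re.I)


def normalize_url(url: str) -> Optional[str]:
    """Strip the scheme and a leading 'www.', then lowercase the rest."""
    match = URL_REGEXP.search(url)
    if match is None:  # defensive: do not chain .group() onto None
        return None
    rest = match.group(3)
    # Treat an empty remainder as "nothing usable" rather than storing "".
    return rest.lower() if rest else None
```

Keeping the normalization in its own function also means it can run before the lock is taken, which addresses the concern about holding the lock longer than needed.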

PaulSonOfLars · 2018-05-25T14:41:32Z

tg_bot/modules/sql/locks_sql.py

+            whitelisted = URLWhitelist(str(chat_id), url)
+            SESSION.add(whitelisted)
+            SESSION.commit()
+        chat_whitelist = CHAT_WHITELIST.setdefault(str(chat_id), {})


Shouldn't this be indented too (inside the `if not prev` block)? If the row already exists, it has already been loaded.

GaryBeez · 2018-05-26

This is more a case of "double bagging" :-) If for whatever reason the URL was in the DB but not in the dictionary, now it definitely is.

I would at least leave the return True statement unindented, so the bot's confirmation message includes the URL even if it had already been added.
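The "double bagging" argument can be sketched without a database. In the snippet below, `DB_ROWS` is a hypothetical stand-in for the table, the cache holds a plain set of URLs rather than compiled patterns, and `add_whitelist_cached` is an illustrative name; none of this is the PR's actual code.

```python
from typing import Dict, Set, Tuple

DB_ROWS: Set[Tuple[str, str]] = set()     # hypothetical stand-in for the SQL table
CHAT_WHITELIST: Dict[str, Set[str]] = {}  # in-memory cache, keyed by chat id


def add_whitelist_cached(chat_id: int, url: str) -> bool:
    key = (str(chat_id), url)
    if key not in DB_ROWS:
        DB_ROWS.add(key)  # only write when the row is new
    # Deliberately outside the `if`: the cache is refreshed even when the
    # row already existed, so the two stores cannot drift apart.
    CHAT_WHITELIST.setdefault(str(chat_id), set()).add(url)
    return True
```

If the row is present in the table but missing from the cache, a call still repairs the cache and reports success.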

PaulSonOfLars · 2018-05-25T14:44:12Z

tg_bot/modules/sql/locks_sql.py

+            row.chat_id = str(new_chat_id)
+        SESSION.commit()
+
+__load_chat_whitelist()


add a newline at EOF for pep8

PaulSonOfLars · 2018-05-25T14:46:12Z

tg_bot/modules/sql/locks_sql.py

+            CHAT_WHITELIST[str(row.chat_id)].update(
+                    {row.url: re.compile(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?'+re.escape(row.url)+'($|\W)',
+                                         flags=re.I
+                                         )


nit: does this bracket really need its own line?

PaulSonOfLars · 2018-07-16T08:54:58Z

tg_bot/modules/locks.py

+    chat = update.effective_chat  # type: Optional[Chat]
+    user = update.effective_user  # type: Optional[User]
+    message = update.effective_message  # type: Optional[Message]
+    entities = message.parse_entities([MessageEntity.URL])


Inline this variable; it isn't used anywhere other than the for loop.

PaulSonOfLars · 2018-07-16T08:57:07Z

tg_bot/modules/locks.py

+    entities = message.parse_entities([MessageEntity.URL])
+    added = []
+    for url in entities.values():
+        if sql.add_whitelist(chat.id, url):


This can be turned into a list comprehension.
added = [url for url in message.parse_entities([MessageEntity.URL]).values() if sql.add_whitelist(chat.id, url)]

PaulSonOfLars · 2018-07-16T08:58:04Z

tg_bot/modules/locks.py

+        if sql.add_whitelist(chat.id, url):
+            added.append(url)
+    if added:
+        message.reply_text("Added to whitelist:\n- "+'\n- '.join(added))


Keep quote characters consistent; use "" for the join.

PaulSonOfLars · 2018-07-16T08:58:35Z

tg_bot/modules/locks.py

+               "\n<b>Admin:</b> {}" \
+               "\nWhitelisted:\n<pre>- {}</pre>".format(html.escape(chat.title),
+                                                         mention_html(user.id, user.first_name),
+                                                         html.escape('\n- '.join(added)))


Since you're doing this join twice, make it a variable.

PaulSonOfLars · 2018-07-16T08:59:32Z

tg_bot/modules/locks.py

+        return "<b>{}:</b>" \
+               "\n#WHITELIST" \
+               "\n<b>Admin:</b> {}" \
+               "\nWhitelisted:\n<pre>- {}</pre>".format(html.escape(chat.title),


Don't add the - to the code block; the code block is there so you can copy-paste it just by tapping.

PaulSonOfLars · 2018-07-16T09:11:30Z

tg_bot/modules/locks.py

+                    # url. So I must add all entities that have a 'url' field separately
+                    entities = entities | set(entity.url for entity in message.entities if entity.url)
+                    #if all URLs are any of the whitelisted ones, accept the message
+                    if all( any(regexp.search(text) for regexp in sql.get_whitelist(chat.id).values())


How do you handle invalid regular expressions? If any are broken, this will blow up and entirely break whitelisting for that chat. I would personally not allow regex for this, as most users have very limited knowledge of it.

GaryBeez · 2018-07-16

Users don't get to interact with any of this as regex. All they do is send URLs. Only text that Telegram considers a URL is added (via entities). Also, the URLs themselves are escaped with re.escape before the regex is compiled.
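A short sketch of why `re.escape` addresses the concern: any URL text, including regex metacharacters, becomes a literal inside the compiled pattern. `build_pattern` and `all_whitelisted` are hypothetical helper names; the pattern mirrors the one in the diff (with `/` left unescaped), and the check follows the `all(any(...))` shape of the quoted code.

```python
import re
from typing import Dict, Iterable


def build_pattern(url: str) -> re.Pattern:
    """Compile a matcher for one whitelisted URL; the URL is treated literally."""
    return re.compile(
        r'(^http://|^https://|^ftp://|^)(www\.)?' + re.escape(url) + r'($|\W)',
        flags=re.I,
    )


def all_whitelisted(urls: Iterable[str], whitelist: Dict[str, re.Pattern]) -> bool:
    """True if every URL matches at least one whitelisted pattern."""
    return all(
        any(pattern.search(url) for pattern in whitelist.values())
        for url in urls
    )
```

Because the stored URL never reaches the regex engine unescaped, a URL containing `(` or `?` compiles cleanly instead of raising `re.error`.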

PaulSonOfLars · 2018-07-16T09:12:56Z

tg_bot/modules/sql/locks_sql.py



 PERM_LOCK = threading.RLock()
 RESTR_LOCK = threading.RLock()
+WHITELIST_LOCK = threading.RLock()
+CHAT_WHITELIST = {}
+URL_REGEXP = re.compile(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?(\S*)', flags=re.I)


this logic should be a part of the module, not the sql

PaulSonOfLars · 2018-07-16T09:13:52Z

tg_bot/modules/sql/locks_sql.py

+            chat_whitelist.update(
+                    {url: re.compile(r'(^http:\/\/|^https:\/\/|^ftp:\/\/|^)(www\.)?'+re.escape(url)+'($|\W)',
+                                     flags=re.I)})
+        return True


Call SESSION.close() before returning.

PaulSonOfLars · 2018-07-16T09:14:49Z

tg_bot/modules/sql/locks_sql.py

+
+
+def add_whitelist(chat_id, url):
+    url = URL_REGEXP.search(url).group(3).lower()


Chaining these calls will blow up if search() returns None, or if the group is None.

PaulSonOfLars · 2018-07-16T09:15:36Z

tg_bot/modules/sql/locks_sql.py

+
+
+def __load_chat_whitelist():
+    #whitelist for each group is a dict(url: compiled_regexp for url in group)


nit: pep8 comments have a space after the #

Add whitelist option to URL lock

0d9f2b8

PaulSonOfLars suggested changes May 25, 2018


shibotto added 2 commits May 26, 2018 11:59

Improve lock URL whitelist

6fb9b9a

Merge branch 'master' into white

10cbadf

PaulSonOfLars suggested changes Jul 16, 2018

