string distance / fuzzy matching instead of hard substring keyword searching #9

bkil · 2022-09-09T11:12:22Z

It could be worthwhile to also implement simple edit-distance-based typo tolerance, so that keywords are matched fuzzily rather than by exact substring. Also, if a message contains too many characters that are not part of valid words of the sentence, that would be a red flag.
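To make the edit-distance idea concrete, here is a minimal sketch in Python. It is not taken from this project: the function names `levenshtein` and `fuzzy_keyword_hit`, the `max_distance` parameter, and the word-splitting regex are all assumptions chosen for illustration. It compares each word of a message against each keyword using plain Levenshtein distance.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb
            ))
        previous = current
    return previous[-1]


def fuzzy_keyword_hit(message: str, keywords: list[str], max_distance: int = 1) -> bool:
    """Return True if any word is within max_distance edits of a keyword.

    max_distance is a hypothetical tuning knob, not a project setting.
    """
    words = re.findall(r"\w+", message.lower())
    for keyword in keywords:
        target = keyword.lower()
        for word in words:
            if levenshtein(word, target) <= max_distance:
                return True
    return False
```

A call such as `fuzzy_keyword_hit("free bitc0in here", ["bitcoin"])` would flag the message even though the exact substring is absent. Plain Levenshtein counts a swapped pair of letters as two edits, so short keywords probably need a small `max_distance` to avoid false positives on ordinary words.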

Each room is limited to a single language in 99% of the cases, so posting spam in a foreign language is already a red flag. This matters in the dozens of local-language rooms that the indiscriminate English spammer sometimes joins as well. Dictionaries also exist (see your package manager, Wiktionary, Wikipedia, etc.). Or you could just go through the chat log to collect words and sentences used by non-troll members in the past (= ham) to help discriminate them from unusual content (spam).
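The chat-log idea can be sketched roughly as follows, again in Python and again with invented names: `build_vocabulary` and `unknown_char_ratio` are hypothetical helpers, and the ham messages are assumed to be already available as a list of strings. The sketch collects the words seen in past ham and then measures what share of a new message's characters fall outside those words.

```python
import re


def build_vocabulary(ham_messages: list[str]) -> set[str]:
    """Collect lowercase words from past non-spam (ham) messages."""
    vocabulary: set[str] = set()
    for message in ham_messages:
        vocabulary.update(re.findall(r"\w+", message.lower()))
    return vocabulary


def unknown_char_ratio(message: str, vocabulary: set[str]) -> float:
    """Fraction of non-whitespace characters not inside a known word."""
    text = message.lower()
    total = sum(1 for ch in text if not ch.isspace())
    if total == 0:
        return 0.0  # empty message: nothing to flag
    known = sum(len(word) for word in re.findall(r"\w+", text)
                if word in vocabulary)
    return 1.0 - known / total
```

A room could then treat a message as suspicious when the ratio exceeds some threshold. Any such cutoff is an assumption that would need tuning per room, since new members legitimately use words the log has not seen yet.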

jjj333-p · 2022-09-09T11:13:52Z

This is an interesting issue, but it is far beyond my skill set. I would love to see something like this come through, though, and if someone else knows how to do this, I would love for them to contribute.

jjj333-p · 2024-05-31T08:23:45Z

Update: this might be doable in some manner, perhaps using string distance. It is still on the back burner, but this might be the solution to something.

jjj333-p added the enhancement, help wanted, and later labels Sep 9, 2022

jjj333-p added the wontfix label and removed the help wanted and later labels Jan 7, 2024

jjj333-p changed the title ~~Consider how we could protect against homograph attacks~~ string distance / fuzzy matching instead of hard substring keyword searching May 31, 2024

jjj333-p added the later and wontfix labels and removed the wontfix label May 31, 2024
