Filter or remove rules to filter/remove by regexp/wildcard

Can we have filter or remove rules that match via regexp or wildcard?

E.g.:

## 1.
Zero-width space and/or non-breaking space:
`<a href="https://bla-bla-bla">&ZeroWidthSpace;&ZeroWidthSpace;</a>text-text-text` produces:
```
[​​](https://bla-bla-bla)text-text-text
```
**Is there any way to filter out (remove) HTML with zero visual content?**
Something like:
```
turndownService.addRule('al_spaces', {
  regexFilter: '<[^<>]+?>[[:space:]]<\/[^<>]+?>',
  replacement: function (content) {
    return ''
  }
})
```
List of spaces for reference:
| Number | Character name |
| --- | --- |
| \\u0020 | space |
| \\u00A0 | no-break space |
| \\u1680 | Ogham space mark |
| \\u180E | Mongolian vowel separator |
| \\u2000 | en quad |
| \\u2001 | em quad |
| \\u2002 | en space (nut) |
| \\u2003 | em space (mutton) |
| \\u2004 | three-per-em space (thick space) |
| \\u2005 | four-per-em space (mid space) |
| \\u2006 | six-per-em space |
| \\u2007 | figure space |
| \\u2008 | punctuation space |
| \\u2009 | thin space |
| \\u200A | hair space |
| \\u200B | zero width space |
| \\u202F | narrow no-break space |
| \\u205F | medium mathematical space |
| \\u3000 | ideographic space |
| \\uFEFF | zero width no-break space |
| \\uFFFC | object replacement character |
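
Until something like `regexFilter` exists, a function filter might cover this case. The sketch below assumes Turndown's existing `addRule(name, { filter, replacement })` form, where `filter` may be a function that receives the DOM node. The names `INVISIBLE`, `isVisuallyEmpty`, and `emptyAnchorRule` are made up for illustration. It only looks at `<a>` elements and treats an anchor as empty when its text consists solely of the characters listed above (plus ordinary whitespace) and it contains no image.

```javascript
// Characters from the table above, plus ordinary whitespace (\s).
const INVISIBLE = /[\s\u0020\u00A0\u1680\u180E\u2000-\u200B\u202F\u205F\u3000\uFEFF\uFFFC]/g;

// True when nothing is left after stripping invisible characters.
function isVisuallyEmpty(text) {
  return text.replace(INVISIBLE, '') === '';
}

// Hypothetical rule object: drops anchors that render as nothing.
const emptyAnchorRule = {
  filter: function (node) {
    if (node.nodeName !== 'A') return false;
    // Keep anchors that wrap an image even if they carry no text.
    const hasMedia = typeof node.querySelector === 'function' &&
      node.querySelector('img, svg') !== null;
    return !hasMedia && isVisuallyEmpty(node.textContent || '');
  },
  replacement: function () {
    return '';
  }
};

// Usage, assuming `turndownService` is an existing TurndownService instance:
// turndownService.addRule('emptyAnchors', emptyAnchorRule);
```

Note that JavaScript's `\s` does not match U+200B (zero width space), which is why the character class spells it out explicitly.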

## 2.
A line break that breaks the Markdown markup:
`bla-bla-bla &nbsp; text-text-text` produces:
```
**bla-bla-bla
** 
text-text-text
```
**Is there any way to filter out (remove) all line breaks that precede the closing tag?**
Something like:
```
turndownService.removeAllBefore(' ', '</*>')
```

### Here are regex examples:
Remove the anchor with zero-width spaces (you can't see them until you paste it in dev console):
```
selectedHTML='bla<a href="https://bla-bla-bla">​​​​​​​</a>text-text-textbla'
selectedHTML.replace(/<[^<>]+?>[\u00A0\u1680\u180E\u2000-\u200B\u202F\u205F\u3000\uFEFF\u0020\uFFFC]+<\/[^<>]+?>/gm, '')
```

Remove the line break that precedes the closing tag:
```
selectedHTML='blabla-bla-bla &nbsp; text-text-textbla'
selectedHTML.replace(/( )+(<\/[^<>]+?>)/gi, '$2')
```

Swap the line break that precedes the closing tag with the closing tag:
```
selectedHTML='blabla-bla-bla &nbsp; text-text-textbla'
selectedHTML.replace(/(( )+)(<\/[^<>]+?>)/gi, '$3$1')
```

It would be nice if the regex filter skipped the content of `code` and `pre` tags.
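
As a stopgap, the skip behaviour could be approximated by preprocessing the HTML string before conversion. This is a minimal sketch, not a library feature: `replaceOutsideCode` is a hypothetical helper, and it assumes `pre` and `code` elements are well-formed and that elements of the same name are not nested inside each other. Regex-based HTML handling is fragile in general, so treat it as an illustration only.

```javascript
// Apply `pattern` -> `replacement` everywhere except inside <pre> and <code>.
// `pattern` should carry the `g` flag if all matches are to be replaced.
function replaceOutsideCode(html, pattern, replacement) {
  // Matches a whole <pre>...</pre> or <code>...</code> element (lazy, non-nested).
  const protectedBlock = /<(pre|code)\b[^>]*>[\s\S]*?<\/\1\s*>/gi;
  let result = '';
  let last = 0;
  for (const match of html.matchAll(protectedBlock)) {
    // Rewrite the unprotected text before this block, then copy the block verbatim.
    result += html.slice(last, match.index).replace(pattern, replacement);
    result += match[0];
    last = match.index + match[0].length;
  }
  // Rewrite whatever follows the last protected block.
  return result + html.slice(last).replace(pattern, replacement);
}

// Example: drop zero-width-only anchors, but leave code samples untouched.
const cleaned = replaceOutsideCode(
  'bla<a href="https://bla-bla-bla">\u200B\u200B</a>text<code><a href="#">\u200B</a></code>',
  /<[^<>]+?>[\u00A0\u1680\u180E\u2000-\u200B\u202F\u205F\u3000\uFEFF\u0020\uFFFC]+<\/[^<>]+?>/g,
  ''
);
```

The cleaned string can then be passed to the converter as usual.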

P.S. 
And also:

```
// Drop anchor HTML tags that contain only dots and commas
selectedHTML = '<a href="#">,</a>'
selectedHTML.replace(/<a [^<>]+?>[.,]+<\/a>/gim, '')
```

And

```
// Drop emoji images, keep emoji unicode (from alt attr)
selectedHTML = '<img src="img-apple-64/1f914.png" class="emoji" alt="🤔">'
selectedHTML.replace(/<img [^<>]+?alt=['"]([\p{Emoji}\u200d]+)['"][^<>]*?\/?>/gimu, '$1')
```
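
The emoji case might also be expressible as a rule today. The sketch below assumes Turndown's `replacement(content, node)` signature and a function `filter`; `EMOJI_ONLY` and `emojiImageRule` are illustrative names. It uses `\p{Extended_Pictographic}` instead of `\p{Emoji}`, because `\p{Emoji}` also matches ASCII digits, `#`, and `*`. Flag sequences built from regional indicators are not covered.

```javascript
// Alt text made up only of pictographic code points, skin-tone modifiers,
// zero width joiners (U+200D), and variation selector-16 (U+FE0F).
const EMOJI_ONLY = /^[\p{Extended_Pictographic}\p{Emoji_Modifier}\u200D\uFE0F]+$/u;

// Hypothetical rule object: replaces emoji images with their alt text.
const emojiImageRule = {
  filter: function (node) {
    if (node.nodeName !== 'IMG') return false;
    const alt = node.getAttribute('alt') || '';
    return EMOJI_ONLY.test(alt);
  },
  replacement: function (content, node) {
    // Emit the Unicode emoji itself instead of Markdown image syntax.
    return node.getAttribute('alt');
  }
};

// Usage, assuming `turndownService` is an existing TurndownService instance:
// turndownService.addRule('emojiImages', emojiImageRule);
```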
