remove multi-line block/content #17

vrody · 2015-04-27T16:18:20Z

Hi,
Can subs_filter be used to remove a block that contains a specified text?

any content

any content any content, "Hello world", other any content

E.g., remove the DIV block that contains the text "Hello world"?

If the block is on a single line, it is easy to remove:

any content, Hello World, other any content

subs_filter '

(.*)Hello World(.*)

' '' r;

But if the block spans several lines, with line breaks inside it, I cannot remove the entire block.
Please help.

siochs · 2015-09-15T10:30:44Z

I have the exact same problem. So far I have tried <form.+<\/form>, (*CRLF)<form.*<\/form>, <form[\s\S]+<\/form>, <form(\n|\r|\r\n|\R|\v|\s|\pZs|.)+\/form>, (?s)<form.*?<\/form>, ...
Nothing matches.
Do you have any ideas?
Thanks!

kevinquinnyo · 2015-12-23T02:14:02Z

I just discovered this issue as well. I think that it works on a line-by-line basis, much like Apache's mod_substitute. I don't know much about the internals of nginx, but I'm guessing its definition of a "line" in an HTML response body is similar to a "line" in Unix, in that it's terminated by a '\n' newline character or something similar. A pattern that has to span a newline would then never see both ends of the block at once.

I think the real answer is: if you need something as complicated as parsing HTML via complex regex (always a bad idea, to be honest) or multi-line substitutions, it should probably be handled upstream (in the application), unfortunately.

Correct me if I'm wrong on this @yaoweibin
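
The line-by-line behavior guessed at above can be modeled outside nginx. The following Python sketch is only an illustration, not the module's actual implementation: `filter_line_by_line` is a hypothetical helper that splits the body on newlines and applies the pattern to each line separately, the sample body is made up, and Python's `re` stands in for the regex engine nginx uses.

```python
import re


def filter_line_by_line(body: str, pattern: str, repl: str) -> str:
    """Apply the substitution to each line separately (hypothetical model)."""
    lines = body.splitlines(keepends=True)
    return "".join(re.sub(pattern, repl, line) for line in lines)


# Made-up response body with a block that spans several lines.
body = "<p>before</p>\n<form>\n  <input name='q'>\n</form>\n<p>after</p>\n"

# One of the patterns tried earlier in this thread.
pattern = r"<form[\s\S]+</form>"

# Applied to the whole body, the pattern removes the block.
whole = re.sub(pattern, "", body)

# Applied per line, no single line contains both "<form" and "</form>",
# so nothing is replaced and the body comes back unchanged.
per_line = filter_line_by_line(body, pattern, "")
```

Under this model the regex itself is fine; it fails only because it never sees the opening and closing tags in the same piece of input.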

jochenwezel · 2017-09-29T08:55:24Z

It would be great if a multi-line expression could be supported,
e.g.

subs_filter 'content in firstline.*content in a following line' 'replacement' rm

where "rm" (or another value, such as "m") would indicate that the regular expression should be matched with multi-line support instead of with the standard line-by-line matching.

simeonackermann · 2020-01-09T11:43:38Z

I successfully removed an HTML tag, together with all of its content, with:

subs_filter '<div(.|\n)*</div>' '' rg;
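
One caveat with this pattern, sketched here in Python as an illustration (Python's `re` is an assumption standing in for the regex engine nginx uses, and the sample markup is made up): `(.|\n)*` is greedy, so if the matched text contains several div blocks, everything from the first `<div` to the last `</div>` is removed. A lazy quantifier stops at the first closing tag instead.

```python
import re

# Made-up markup: two div blocks with a paragraph between them.
html = "<div>ad</div>\n<p>keep me</p>\n<div>\nfooter\n</div>\n"

# Greedy: one match from the first "<div" to the last "</div>",
# so the paragraph in between is removed as well.
greedy = re.sub(r"<div(.|\n)*</div>", "", html)

# Lazy: each match ends at the nearest "</div>", so the paragraph survives.
lazy = re.sub(r"<div(.|\n)*?</div>", "", html)
```

The lazy form is not a general fix either: with nested div elements it ends at the first inner `</div>`, which is one more reason the upstream approach suggested earlier in the thread may be safer.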
