With HTML::Parser v3.76.
Consider the following chunk of data:
Hello world! <span class="highlight">Isn't this wonderful</span> really?
Creating an object, such as:
# using curry and assuming this runs within a module that acts as a simple wrapper, nothing fancy.
my $p = HTML::Parser->new(
    api_version        => 3,
    start_h            => [ $self->curry::add_start, 'self, tagname, attr, attrseq, text, column, line, offset, offset_end' ],
    end_h              => [ $self->curry::add_end, 'self, tagname, attr, attrseq, text, column, line, offset, offset_end' ],
    marked_sections    => 1,
    comment_h          => [ $self->curry::add_comment, 'self, text, column, line, offset, offset_end' ],
    declaration_h      => [ $self->curry::add_declaration, 'self, text, column, line, offset, offset_end' ],
    default_h          => [ $self->curry::add_default, 'self, tagname, attr, attrseq, text, column, line, offset, offset_end' ],
    text_h             => [ $self->curry::add_text, 'self, text, column, line, offset, offset_end' ],
    empty_element_tags => 1,
    end_document_h     => [ $self->curry::end_document, 'self, skipped_text' ],
);
$p->parse( $html );
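sub add_text
{
    # The curried invocant (the wrapper object) comes first; after the shift,
    # $_[0] is the parser ('self' in the argspec) and $_[1] is the text.
    my $self = shift( @_ );
    print( "got '", $_[1], "'\n" );
}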
And this would yield:
got 'Hello world! '
got 'Isn't this wonderful'
However, the trailing text ' really?' is not reported.
One has to call $p->eof explicitly to have it reported.
If this is intended behavior, it ought to be made clear in the documentation. However, I think one should not have to call eof to get that last trailing text.
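For reference, here is a minimal sketch of the workaround described above, reduced to a single text handler. The plain anonymous sub and the @chunks and $before_eof variables are assumptions made for illustration; they stand in for the curried add_text method of the wrapper module. The exact way the text gets split into chunks may vary, so no output is shown.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Collect every text event; with the 'text' argspec the handler
# receives the raw source text as its only argument.
my @chunks;
my $p = HTML::Parser->new(
    api_version => 3,
    text_h      => [ sub { push( @chunks, $_[0] ) }, 'text' ],
);

my $html = q{Hello world! <span class="highlight">Isn't this wonderful</span> really?};

# parse() alone may keep trailing text buffered, since more input could follow.
$p->parse( $html );
my $before_eof = scalar( @chunks );

# eof() signals the end of the document and flushes whatever is still buffered.
$p->eof;

print( "text events before eof: $before_eof, after eof: ", scalar( @chunks ), "\n" );
print( "got '$_'\n" ) for( @chunks );
```

Calling eof once after the last parse call is enough; when the input arrives in several pieces, each piece goes through parse and eof is called only at the very end.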