Fix(html): Handle <br> elements to insert line breaks in text
#1950
PR Health
Breaking changes
| Package | Change | Current Version | New Version | Needed Version | Looking good? |
|---|---|---|---|---|---|
| html | Breaking | 0.15.6 | 0.15.7-wip | 0.16.0 | Got "0.15.7-wip" expected >= "0.16.0" (breaking changes) |
This check can be disabled by tagging the PR with skip-breaking-check.
Changelog Entry ✔️
| Package | Changed Files |
|---|---|
Changes to files need to be accounted for in their respective changelogs.
This check can be disabled by tagging the PR with skip-changelog-check.
Coverage ✔️
| File | Coverage |
|---|---|
| pkgs/html/lib/dom.dart | 💚 65 % ⬆️ 1 % |
This check for test coverage is informational (issues shown here will not fail the PR).
This check can be disabled by tagging the PR with skip-coverage-check.
API leaks ⚠️
The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
| Package | Leaked API symbol | Leaking sources |
|---|---|---|
| html | HtmlTokenizer | html/parser.dart::HtmlParser::tokenizer |
| html | Token | tokenizer.dart::HtmlTokenizer tokenizer.dart::HtmlTokenizer::tokenQueue tokenizer.dart::HtmlTokenizer::currentToken tokenizer.dart::HtmlTokenizer::currentToken token.dart::TagToken token.dart::DoctypeToken token.dart::StringToken tokenizer.dart::HtmlTokenizer::current token.dart::StartTagToken token.dart::CommentToken html/parser.dart::Phase::processComment html/parser.dart::Phase::processDoctype token.dart::CharactersToken html/parser.dart::Phase::processCharacters token.dart::SpaceCharactersToken html/parser.dart::Phase::processSpaceCharacters html/parser.dart::Phase::processStartTag html/parser.dart::Phase::startTagHtml token.dart::EndTagToken html/parser.dart::Phase::processEndTag html/parser.dart::HtmlParser::inForeignContent::token html/parser.dart::HtmlParser::parseRCDataRawtext::token html/parser.dart::BeforeHeadPhase::startTagOther html/parser.dart::BeforeHeadPhase::endTagImplyHead html/parser.dart::InHeadPhase::startTagOther html/parser.dart::InHeadPhase::endTagHtmlBodyBr html/parser.dart::AfterHeadPhase::startTagOther html/parser.dart::AfterHeadPhase::endTagHtmlBodyBr html/parser.dart::InBodyPhase::startTagProcessInHead html/parser.dart::InBodyPhase::startTagButton html/parser.dart::InBodyPhase::startTagOther html/parser.dart::InBodyPhase::endTagHtml html/parser.dart::InTablePhase::startTagCol html/parser.dart::InTablePhase::startTagImplyTbody html/parser.dart::InTablePhase::startTagTable html/parser.dart::InTablePhase::startTagStyleScript html/parser.dart::InCaptionPhase::startTagTableElement html/parser.dart::InCaptionPhase::startTagOther html/parser.dart::InCaptionPhase::endTagTable html/parser.dart::InCaptionPhase::endTagOther html/parser.dart::InColumnGroupPhase::startTagOther html/parser.dart::InColumnGroupPhase::endTagOther html/parser.dart::InTableBodyPhase::startTagTableCell html/parser.dart::InTableBodyPhase::startTagTableOther html/parser.dart::InTableBodyPhase::startTagOther html/parser.dart::InTableBodyPhase::endTagTable 
html/parser.dart::InTableBodyPhase::endTagOther html/parser.dart::InRowPhase::startTagTableOther html/parser.dart::InRowPhase::startTagOther html/parser.dart::InRowPhase::endTagTable html/parser.dart::InRowPhase::endTagTableRowGroup html/parser.dart::InRowPhase::endTagOther html/parser.dart::InCellPhase::startTagTableOther html/parser.dart::InCellPhase::startTagOther html/parser.dart::InCellPhase::endTagImply html/parser.dart::InCellPhase::endTagOther html/parser.dart::InSelectPhase::startTagInput html/parser.dart::InSelectPhase::startTagScript html/parser.dart::InSelectPhase::startTagOther html/parser.dart::InSelectInTablePhase::startTagTable html/parser.dart::InSelectInTablePhase::startTagOther html/parser.dart::InSelectInTablePhase::endTagTable html/parser.dart::InSelectInTablePhase::endTagOther html/parser.dart::AfterBodyPhase::startTagOther html/parser.dart::AfterBodyPhase::endTagHtml::token html/parser.dart::AfterBodyPhase::endTagOther html/parser.dart::InFramesetPhase::startTagNoframes html/parser.dart::InFramesetPhase::startTagOther html/parser.dart::AfterFramesetPhase::startTagNoframes html/parser.dart::AfterAfterBodyPhase::startTagOther html/parser.dart::AfterAfterFramesetPhase::startTagNoFrames |
| html | HtmlInputStream | tokenizer.dart::HtmlTokenizer::stream |
| html | TagToken | tokenizer.dart::HtmlTokenizer::currentTagToken token.dart::StartTagToken token.dart::EndTagToken html/parser.dart::InTableBodyPhase::startTagTableOther::token html/parser.dart::InTableBodyPhase::endTagTable::token |
| html | DoctypeToken | tokenizer.dart::HtmlTokenizer::currentDoctypeToken treebuilder.dart::TreeBuilder::insertDoctype::token html/parser.dart::Phase::processDoctype::token |
| html | StringToken | tokenizer.dart::HtmlTokenizer::currentStringToken token.dart::StringToken::add treebuilder.dart::TreeBuilder::insertComment::token token.dart::CommentToken token.dart::CharactersToken token.dart::SpaceCharactersToken html/parser.dart::InBodyPhase::processSpaceCharactersDropNewline::token html/parser.dart::InTableTextPhase::characterTokens |
| html | TreeBuilder | html/parser.dart::HtmlParser::tree html/parser.dart::Phase::tree html/parser.dart::HtmlParser::new::tree |
| html | ActiveFormattingElements | treebuilder.dart::TreeBuilder::activeFormattingElements |
| html | StartTagToken | treebuilder.dart::TreeBuilder::insertRoot::token treebuilder.dart::TreeBuilder::createElement::token treebuilder.dart::TreeBuilder::insertElement::token treebuilder.dart::TreeBuilder::insertElementNormal::token treebuilder.dart::TreeBuilder::insertElementTable::token html/parser.dart::Phase::processStartTag::token html/parser.dart::Phase::startTagHtml::token html/parser.dart::HtmlParser::adjustMathMLAttributes::token html/parser.dart::HtmlParser::adjustSVGAttributes::token html/parser.dart::HtmlParser::adjustForeignAttributes::token html/parser.dart::BeforeHeadPhase::startTagHead::token html/parser.dart::BeforeHeadPhase::startTagOther::token html/parser.dart::InHeadPhase::startTagHead::token html/parser.dart::InHeadPhase::startTagBaseLinkCommand::token html/parser.dart::InHeadPhase::startTagMeta::token html/parser.dart::InHeadPhase::startTagTitle::token html/parser.dart::InHeadPhase::startTagNoScriptNoFramesStyle::token html/parser.dart::InHeadPhase::startTagScript::token html/parser.dart::InHeadPhase::startTagOther::token html/parser.dart::AfterHeadPhase::startTagBody::token html/parser.dart::AfterHeadPhase::startTagFrameset::token html/parser.dart::AfterHeadPhase::startTagFromHead::token html/parser.dart::AfterHeadPhase::startTagHead::token html/parser.dart::AfterHeadPhase::startTagOther::token html/parser.dart::InBodyPhase::addFormattingElement::token html/parser.dart::InBodyPhase::startTagProcessInHead::token html/parser.dart::InBodyPhase::startTagBody::token html/parser.dart::InBodyPhase::startTagFrameset::token html/parser.dart::InBodyPhase::startTagCloseP::token html/parser.dart::InBodyPhase::startTagPreListing::token html/parser.dart::InBodyPhase::startTagForm::token html/parser.dart::InBodyPhase::startTagListItem::token html/parser.dart::InBodyPhase::startTagPlaintext::token html/parser.dart::InBodyPhase::startTagHeading::token html/parser.dart::InBodyPhase::startTagA::token html/parser.dart::InBodyPhase::startTagFormatting::token 
html/parser.dart::InBodyPhase::startTagNobr::token html/parser.dart::InBodyPhase::startTagButton::token html/parser.dart::InBodyPhase::startTagAppletMarqueeObject::token html/parser.dart::InBodyPhase::startTagXmp::token html/parser.dart::InBodyPhase::startTagTable::token html/parser.dart::InBodyPhase::startTagVoidFormatting::token html/parser.dart::InBodyPhase::startTagInput::token html/parser.dart::InBodyPhase::startTagParamSource::token html/parser.dart::InBodyPhase::startTagHr::token html/parser.dart::InBodyPhase::startTagImage::token html/parser.dart::InBodyPhase::startTagIsIndex::token html/parser.dart::InBodyPhase::startTagTextarea::token html/parser.dart::InBodyPhase::startTagIFrame::token html/parser.dart::InBodyPhase::startTagRawtext::token html/parser.dart::InBodyPhase::startTagOpt::token html/parser.dart::InBodyPhase::startTagSelect::token html/parser.dart::InBodyPhase::startTagRpRt::token html/parser.dart::InBodyPhase::startTagMath::token html/parser.dart::InBodyPhase::startTagSvg::token html/parser.dart::InBodyPhase::startTagMisplaced::token html/parser.dart::InBodyPhase::startTagOther::token html/parser.dart::InTablePhase::startTagCaption::token html/parser.dart::InTablePhase::startTagColgroup::token html/parser.dart::InTablePhase::startTagCol::token html/parser.dart::InTablePhase::startTagRowGroup::token html/parser.dart::InTablePhase::startTagImplyTbody::token html/parser.dart::InTablePhase::startTagTable::token html/parser.dart::InTablePhase::startTagStyleScript::token html/parser.dart::InTablePhase::startTagInput::token html/parser.dart::InTablePhase::startTagForm::token html/parser.dart::InTablePhase::startTagOther::token html/parser.dart::InCaptionPhase::startTagTableElement::token html/parser.dart::InCaptionPhase::startTagOther::token html/parser.dart::InColumnGroupPhase::startTagCol::token html/parser.dart::InColumnGroupPhase::startTagOther::token html/parser.dart::InTableBodyPhase::startTagTr::token 
html/parser.dart::InTableBodyPhase::startTagTableCell::token html/parser.dart::InTableBodyPhase::startTagOther::token html/parser.dart::InRowPhase::startTagTableCell::token html/parser.dart::InRowPhase::startTagTableOther::token html/parser.dart::InRowPhase::startTagOther::token html/parser.dart::InCellPhase::startTagTableOther::token html/parser.dart::InCellPhase::startTagOther::token html/parser.dart::InSelectPhase::startTagOption::token html/parser.dart::InSelectPhase::startTagOptgroup::token html/parser.dart::InSelectPhase::startTagSelect::token html/parser.dart::InSelectPhase::startTagInput::token html/parser.dart::InSelectPhase::startTagScript::token html/parser.dart::InSelectPhase::startTagOther::token html/parser.dart::InSelectInTablePhase::startTagTable::token html/parser.dart::InSelectInTablePhase::startTagOther::token html/parser.dart::InForeignContentPhase::adjustSVGTagNames::token html/parser.dart::AfterBodyPhase::startTagOther::token html/parser.dart::InFramesetPhase::startTagFrameset::token html/parser.dart::InFramesetPhase::startTagFrame::token html/parser.dart::InFramesetPhase::startTagNoframes::token html/parser.dart::InFramesetPhase::startTagOther::token html/parser.dart::AfterFramesetPhase::startTagNoframes::token html/parser.dart::AfterFramesetPhase::startTagOther::token html/parser.dart::AfterAfterBodyPhase::startTagOther::token html/parser.dart::AfterAfterFramesetPhase::startTagNoFrames::token html/parser.dart::AfterAfterFramesetPhase::startTagOther::token |
| html | TagAttribute | token.dart::StartTagToken::attributeSpans |
| html | CommentToken | html/parser.dart::Phase::processComment::token |
| html | CharactersToken | html/parser.dart::Phase::processCharacters::token html/parser.dart::InTablePhase::insertText::token |
| html | SpaceCharactersToken | html/parser.dart::Phase::processSpaceCharacters::token |
| html | EndTagToken | html/parser.dart::Phase::processEndTag::token html/parser.dart::Phase::popOpenElementsUntil::token html/parser.dart::BeforeHeadPhase::endTagImplyHead::token html/parser.dart::BeforeHeadPhase::endTagOther::token html/parser.dart::InHeadPhase::endTagHead::token html/parser.dart::InHeadPhase::endTagHtmlBodyBr::token html/parser.dart::InHeadPhase::endTagOther::token html/parser.dart::AfterHeadPhase::endTagHtmlBodyBr::token html/parser.dart::AfterHeadPhase::endTagOther::token html/parser.dart::InBodyPhase::endTagP::token html/parser.dart::InBodyPhase::endTagBody::token html/parser.dart::InBodyPhase::endTagHtml::token html/parser.dart::InBodyPhase::endTagBlock::token html/parser.dart::InBodyPhase::endTagForm::token html/parser.dart::InBodyPhase::endTagListItem::token html/parser.dart::InBodyPhase::endTagHeading::token html/parser.dart::InBodyPhase::endTagFormatting::token html/parser.dart::InBodyPhase::endTagAppletMarqueeObject::token html/parser.dart::InBodyPhase::endTagBr::token html/parser.dart::InBodyPhase::endTagOther::token html/parser.dart::TextPhase::endTagScript::token html/parser.dart::TextPhase::endTagOther::token html/parser.dart::InTablePhase::endTagTable::token html/parser.dart::InTablePhase::endTagIgnore::token html/parser.dart::InTablePhase::endTagOther::token html/parser.dart::InCaptionPhase::endTagCaption::token html/parser.dart::InCaptionPhase::endTagTable::token html/parser.dart::InCaptionPhase::endTagIgnore::token html/parser.dart::InCaptionPhase::endTagOther::token html/parser.dart::InColumnGroupPhase::endTagColgroup::token html/parser.dart::InColumnGroupPhase::endTagCol::token html/parser.dart::InColumnGroupPhase::endTagOther::token html/parser.dart::InTableBodyPhase::endTagTableRowGroup::token html/parser.dart::InTableBodyPhase::endTagIgnore::token html/parser.dart::InTableBodyPhase::endTagOther::token html/parser.dart::InRowPhase::endTagTr::token html/parser.dart::InRowPhase::endTagTable::token 
html/parser.dart::InRowPhase::endTagTableRowGroup::token html/parser.dart::InRowPhase::endTagIgnore::token html/parser.dart::InRowPhase::endTagOther::token html/parser.dart::InCellPhase::endTagTableCell::token html/parser.dart::InCellPhase::endTagIgnore::token html/parser.dart::InCellPhase::endTagImply::token html/parser.dart::InCellPhase::endTagOther::token html/parser.dart::InSelectPhase::endTagOption::token html/parser.dart::InSelectPhase::endTagOptgroup::token html/parser.dart::InSelectPhase::endTagSelect::token html/parser.dart::InSelectPhase::endTagOther::token html/parser.dart::InSelectInTablePhase::endTagTable::token html/parser.dart::InSelectInTablePhase::endTagOther::token html/parser.dart::AfterBodyPhase::endTagOther::token html/parser.dart::InFramesetPhase::endTagFrameset::token html/parser.dart::InFramesetPhase::endTagOther::token html/parser.dart::AfterFramesetPhase::endTagHtml::token html/parser.dart::AfterFramesetPhase::endTagOther::token |
This check can be disabled by tagging the PR with skip-leaking-check.
License Headers ⚠️
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
| Files |
|---|
| pkgs/html/lib/dom.dart |
| pkgs/html/test/parser_feature_test.dart |
All source files should start with a license header.
Unrelated files missing license headers
| Files |
|---|
| pkgs/bazel_worker/benchmark/benchmark.dart |
| pkgs/benchmark_harness/integration_test/perf_benchmark_test.dart |
| pkgs/boolean_selector/example/example.dart |
| pkgs/clock/lib/clock.dart |
| pkgs/clock/lib/src/clock.dart |
| pkgs/clock/lib/src/default.dart |
| pkgs/clock/lib/src/stopwatch.dart |
| pkgs/clock/lib/src/utils.dart |
| pkgs/clock/test/clock_test.dart |
| pkgs/clock/test/default_test.dart |
| pkgs/clock/test/stopwatch_test.dart |
| pkgs/clock/test/utils.dart |
| pkgs/coverage/lib/src/coverage_options.dart |
| pkgs/html/example/main.dart |
| pkgs/html/lib/dom_parsing.dart |
| pkgs/html/lib/html_escape.dart |
| pkgs/html/lib/parser.dart |
| pkgs/html/lib/src/constants.dart |
| pkgs/html/lib/src/encoding_parser.dart |
| pkgs/html/lib/src/html_input_stream.dart |
| pkgs/html/lib/src/list_proxy.dart |
| pkgs/html/lib/src/query_selector.dart |
| pkgs/html/lib/src/token.dart |
| pkgs/html/lib/src/tokenizer.dart |
| pkgs/html/lib/src/treebuilder.dart |
| pkgs/html/lib/src/utils.dart |
| pkgs/html/test/dom_test.dart |
| pkgs/html/test/parser_test.dart |
| pkgs/html/test/query_selector_test.dart |
| pkgs/html/test/selectors/level1_baseline_test.dart |
| pkgs/html/test/selectors/level1_lib.dart |
| pkgs/html/test/selectors/selectors.dart |
| pkgs/html/test/support.dart |
| pkgs/html/test/tokenizer_test.dart |
| pkgs/html/test/trie_test.dart |
| pkgs/html/tool/generate_trie.dart |
| pkgs/pubspec_parse/test/git_uri_test.dart |
| pkgs/stack_trace/example/example.dart |
| pkgs/watcher/test/custom_watcher_factory_test.dart |
| pkgs/yaml_edit/example/example.dart |
This check can be disabled by tagging the PR with skip-license-check.
Hey, thanks for reviewing this! 🙌
@Dhruv-Maradiya Just a friendly ping as I am looking through PRs - is there intention to land this?
Hey @mosuem, sorry for the delay! I’ll try to wrap this up ASAP, most likely today.
Friendly ping :) (No pressure, just happened to walk by this tab in my browser)
Implements DOM spec textContent algorithm with optional convertBRsToNewlines parameter. Adds isElementBr() helper for namespace-aware BR detection. Maintains backward compatibility with existing .text getter.
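To make the commit description above concrete, here is a minimal sketch of the idea. It uses a tiny stand-in node model (`SketchNode`, `SketchText`, `SketchElement` are hypothetical and are not the package:html classes), and the signatures of `textContent` and `isElementBr` shown here are assumptions based only on the names in the commit message, not the actual code in this PR.

```dart
// Hypothetical stand-in node model for illustration; NOT package:html.
const xhtmlNamespace = 'http://www.w3.org/1999/xhtml';

abstract class SketchNode {
  final List<SketchNode> children = [];
}

class SketchText extends SketchNode {
  final String data;
  SketchText(this.data);
}

class SketchElement extends SketchNode {
  final String localName;
  final String namespaceUri;

  SketchElement(
    this.localName, {
    this.namespaceUri = xhtmlNamespace,
    List<SketchNode> children = const [],
  }) {
    this.children.addAll(children);
  }
}

/// Assumed shape of the helper: true only for a `br` element in the HTML
/// namespace, so a foreign (e.g. SVG) element named "br" is not matched.
bool isElementBr(SketchNode node) =>
    node is SketchElement &&
    node.localName == 'br' &&
    node.namespaceUri == xhtmlNamespace;

/// Concatenates the data of all descendant text nodes in tree order.
/// With [convertBRsToNewlines] set, each HTML <br> contributes '\n'.
/// The default (false) keeps the plain concatenation behavior.
String textContent(SketchNode node, {bool convertBRsToNewlines = false}) {
  final buffer = StringBuffer();

  void visit(SketchNode n) {
    if (n is SketchText) {
      buffer.write(n.data);
    } else if (convertBRsToNewlines && isElementBr(n)) {
      buffer.write('\n');
    } else {
      for (final child in n.children) {
        visit(child);
      }
    }
  }

  visit(node);
  return buffer.toString();
}

void main() {
  final p = SketchElement('p', children: [
    SketchText('first'),
    SketchElement('br'),
    SketchText('second'),
  ]);

  print(textContent(p)); // prints "firstsecond"
  print(textContent(p, convertBRsToNewlines: true)); // prints "first" and "second" on separate lines
}
```

Keeping the flag off by default is what would let an existing getter keep returning the same string as before, which matches the backward-compatibility claim in the commit message.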
Package publishing
Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation.
Fixes #1090 by updating the DOM parser to handle <br> elements and insert line breaks (\n) when converting HTML content to plain text.
Initially, I thought adding a simple condition might not be a reliable solution, so I checked how HTML-to-text conversion is handled in Chromium and found a similar approach. Here's the link.