@renovate renovate bot commented Jun 29, 2025

This PR contains the following updates:

Package | Type | Update | Change | OpenSSF
org.jsoup:jsoup (source) | compile | minor | 1.20.1 -> 1.21.2 | OpenSSF Scorecard

Release Notes

jhy/jsoup (org.jsoup:jsoup)

v1.21.2

Changes
  • Deprecated internal (yet visible) methods Normalizer#normalize(String, boolean) and Attribute#shouldCollapseAttribute(Document.OutputSettings). These will be removed in a future version.
  • Deprecated Connection#sslSocketFactory(SSLSocketFactory) in favor of the new Connection#sslContext(SSLContext). Using sslSocketFactory will force the use of the legacy HttpURLConnection implementation, which does not support HTTP/2. #2370
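
For call sites affected by the sslSocketFactory deprecation, the migration could look roughly like the sketch below. It assumes the existing code already has (or can obtain) an SSLContext; the class name, method name, and URL are hypothetical, and only Connection#sslContext(SSLContext) is taken from the release notes. No request is sent.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import org.jsoup.Connection;
import org.jsoup.Jsoup;

public final class SslContextMigration {
    // Configures a request without executing it. The caller supplies the URL.
    static Connection configure(String url) throws NoSuchAlgorithmException {
        // Assumption: the JVM default context stands in for a custom one here.
        SSLContext context = SSLContext.getDefault();

        // Before (deprecated in 1.21.2; forces the legacy HttpURLConnection path):
        //   Jsoup.connect(url).sslSocketFactory(context.getSocketFactory());

        // After: pass the context itself, so the HttpClient path stays available.
        return Jsoup.connect(url).sslContext(context);
    }
}
```

A project that builds its own SSLContext (custom trust store, client certificates) would pass that instance in place of the default one.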
Improvements
  • When pretty-printing, if there are consecutive text nodes (via DOM manipulation), the non-significant whitespace between them will be collapsed. #​2349.
  • Updated Connection.Response#statusMessage() to return a simple loggable string message (e.g. "OK") when using the HttpClient implementation, which doesn't otherwise return any server-set status message. #​2356
  • Attributes#size() and Attributes#isEmpty() now exclude any internal attributes (such as user data) from their count. This aligns with the attributes' serialized output and iterator. #​2369
  • Added Connection#sslContext(SSLContext) to provide a custom SSL (TLS) context to requests, supporting both the HttpClient and the legacy HttpURLConnection implementations. #2370
  • Performance optimizations for DOM manipulation methods, including when repeatedly removing an element's first child (element.child(0).remove()), and when using Parser#parseBodyFragment() to parse a large number of direct children. #2373.
Bug Fixes
  • When parsing from an InputStream and a multibyte character happened to straddle a buffer boundary, the stream would not be completely read. #​2353.
  • In NodeTraversor, if a last child element was removed during the head() call, the parent would be visited twice. #​2355.
  • Cloning an Element that has an Attributes object would add an empty internal user-data attribute to that clone, which would cause unexpected results for Attributes#size() and Attributes#isEmpty(). #​2356
  • In a multithreaded application where multiple threads are calling Element#children() on the same element concurrently, a race condition could happen when the method was generating the internal child element cache (a filtered view of its child nodes). Since concurrent reads of DOM objects should be threadsafe without external synchronization, this method has been updated to execute atomically. #​2366
  • When parsing HTML with svg:script elements in SVG elements, don't enter the Text insertion mode, but continue to parse as foreign content. Otherwise, misnested HTML could then cause an IndexOutOfBoundsException. #​2374
  • Malformed HTML could throw an IndexOutOfBoundsException during the adoption agency. #​2377.
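
If this project parses from streams, a small sanity check along the following lines may help confirm the InputStream fix after the bump. This is only a sketch: the class and method names are hypothetical, the input size is an arbitrary choice meant to be long enough that some multibyte characters fall across internal buffer boundaries, and it makes no claim about jsoup's actual buffer sizes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public final class MultibyteStreamCheck {
    // Builds "a" followed by `count` copies of a character that takes
    // three bytes in UTF-8, so character starts are not byte-aligned
    // to typical power-of-two buffer sizes.
    static String sample(int count) {
        StringBuilder sb = new StringBuilder("a");
        for (int i = 0; i < count; i++) {
            sb.append('\u4e2d');
        }
        return sb.toString();
    }

    // Parses the text from an InputStream and returns what jsoup read back.
    static String roundTrip(String text) throws IOException {
        byte[] bytes = ("<p>" + text + "</p>").getBytes(StandardCharsets.UTF_8);
        Document doc = Jsoup.parse(new ByteArrayInputStream(bytes), "UTF-8", "");
        return doc.body().text();
    }
}
```

Comparing roundTrip(sample(n)) with sample(n) for a reasonably large n checks that the whole stream was consumed.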

v1.21.1

Changes
  • Removed previously deprecated methods. #​2317
  • Deprecated the :matchText pseudo-selector due to its side effects on the DOM; use the new ::textnode selector and the Element#selectNodes(String css, Class type) method instead. #2343
  • Deprecated Connection.Response#bufferUp() in lieu of Connection.Response#readFully() which can throw a checked IOException.
  • Deprecated internal methods: Validate#ensureNotNull (replaced by the typed Validate#expectNotNull), and the protected HTML appenders in Attribute and Node.
  • If you happen to be using any of the deprecated methods, please take the opportunity now to migrate away from them, as they will be removed in a future release.
Improvements
  • Enhanced the Selector to support direct matching against nodes such as comments and text nodes. For example, you can now find an element that follows a specific comment: ::comment:contains(prices) + p will select p elements immediately after a <!-- prices: --> comment. Supported types include ::node, ::leafnode, ::comment, ::text, ::data, and ::cdata. Node contextual selectors like ::node:contains(text), :matches(regex), and :blank are also supported. Introduced Element#selectNodes(String css) and Element#selectNodes(String css, Class nodeType) for direct node selection. #​2324
  • Added TagSet#onNewTag(Consumer<Tag> customizer): register a callback that’s invoked for each new or cloned Tag when it’s inserted into the set. Enables dynamic tweaks of tag options (for example, marking all custom tags as self-closing, or everything in a given namespace as preserving whitespace).
  • Made TokenQueue and CharacterReader autocloseable, to ensure that they will release their buffers back to the buffer pool, for later reuse.
  • Added Selector#evaluatorOf(String css), as a clearer way to obtain an Evaluator from a CSS query. An alias of QueryParser.parse(String css).
  • Custom tags (defined via the TagSet) in a foreign namespace (e.g. SVG) can be configured to parse as data tags.
  • Added NodeVisitor#traverse(Node) to simplify node traversal calls (vs. importing NodeTraversor).
  • Updated the default user-agent string to improve compatibility. #​2341
  • The HTML parser now allows the specific text-data type (Data, RcData) to be customized for known tags. (Previously, that was only supported on custom tags.) #​2326.
  • Added Connection.Response#readFully() as a replacement for Connection.Response#bufferUp() with an explicit IOException. Similarly, added Connection.Response#readBody() over Connection.Response#body(). Deprecated Connection.Response#bufferUp(). #2327
  • When serializing HTML, the < and > characters are now escaped in attributes. This helps prevent a class of mutation XSS attacks. #​2337
  • Changed Connection to prefer using the JDK's HttpClient over HttpURLConnection, if available, to enable HTTP/2 support by default. Users can disable via -Djsoup.useHttpClient=false. #2340
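
The node selectors introduced in 1.21.1 can be tried out with a sketch like the one below. It reuses the ::comment:contains(prices) + p query from the notes above and the Element#selectNodes(String, Class) overload; the class name, method names, and sample HTML are made up for illustration. The markup deliberately has no whitespace between the comment and the first paragraph, so the adjacency combinator is not affected by intervening text nodes.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Comment;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public final class NodeSelectorSketch {
    // Hypothetical sample markup: one comment followed by two paragraphs.
    static final String HTML =
        "<div><!-- prices: --><p>10 EUR</p><p>not a price</p></div>";

    // Selects the p element that immediately follows the "prices" comment.
    static String paragraphAfterPricesComment() {
        Document doc = Jsoup.parse(HTML);
        Elements hits = doc.select("::comment:contains(prices) + p");
        return hits.text();
    }

    // Collects the data of every comment node in the document.
    static List<String> commentData() {
        Document doc = Jsoup.parse(HTML);
        List<String> data = new ArrayList<>();
        for (Comment comment : doc.selectNodes("::comment", Comment.class)) {
            data.add(comment.getData());
        }
        return data;
    }
}
```

Code that currently relies on :matchText would be the most likely candidate for this style of query, since that selector is deprecated in this release.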
Bug Fixes
  • The contents of a script in an SVG foreign context should be parsed as script data, not text. #2320
  • Tag#isFormSubmittable() was updating the Tag's options. #​2323
  • The HTML pretty-printer would incorrectly trim whitespace when text followed an inline element in a block element. #​2325
  • Custom tags with hyphens or other non-letter characters in their names now work correctly as Data or RcData tags. Their closing tags are now tokenized properly. #​2332
  • When cloning an Element, the clone would retain the source's cached child Element list (if any), which could lead to incorrect results when modifying the clone's child elements. #​2334
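
The clone fix in #2334 is easy to check in isolation. The following is a minimal sketch with hypothetical class and method names: it populates the source element's child cache, clones it, edits only the clone, and reports both child counts.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public final class CloneChildCacheCheck {
    // Returns { source child count, clone child count } after editing the clone.
    static int[] childCountsAfterCloneEdit() {
        Element source = Jsoup.parse("<div><p>One</p><p>Two</p></div>")
            .expectFirst("div");

        // Calling children() first populates the cached child element list.
        source.children();

        Element copy = source.clone();
        copy.appendElement("p").text("Three");

        return new int[] { source.children().size(), copy.children().size() };
    }
}
```

With the fix in place the source should still report two children and the clone three.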

Configuration

📅 Schedule: Branch creation - Between 12:00 AM and 03:59 AM, only on Monday ( * 0-3 * * 1 ) in timezone Europe/Berlin, Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate bot force-pushed the renovate/org.jsoup.version branch 3 times, most recently from e388797 to 9fa5fe2 Compare July 1, 2025 12:59
@renovate renovate bot force-pushed the renovate/org.jsoup.version branch 3 times, most recently from 93e3294 to 47a5f4d Compare July 11, 2025 12:30
@renovate renovate bot force-pushed the renovate/org.jsoup.version branch 2 times, most recently from 6c07329 to c09e96c Compare July 28, 2025 08:06
@renovate renovate bot force-pushed the renovate/org.jsoup.version branch 4 times, most recently from c364bc5 to 496711a Compare August 4, 2025 14:59
@renovate renovate bot changed the title fix(deps): update dependency org.jsoup:jsoup to v1.21.1 fix(deps): update dependency org.jsoup:jsoup to v1.21.2 Aug 25, 2025
@renovate renovate bot force-pushed the renovate/org.jsoup.version branch from 496711a to 47042f2 Compare August 25, 2025 03:22
@renovate renovate bot force-pushed the renovate/org.jsoup.version branch 3 times, most recently from 0eac753 to 2d97b90 Compare September 22, 2025 09:21
@renovate renovate bot force-pushed the renovate/org.jsoup.version branch 4 times, most recently from ce0b2b4 to 9d2111b Compare October 27, 2025 11:16
@renovate renovate bot force-pushed the renovate/org.jsoup.version branch from 9d2111b to 51f68fe Compare October 30, 2025 12:53
@renovate renovate bot force-pushed the renovate/org.jsoup.version branch from 51f68fe to 1a73ec3 Compare November 19, 2025 10:36