@Aman071106
Contributor

refactor(core): improve docstrings for HTML link extraction utilities

Description

This PR updates and clarifies the docstrings for find_all_links and extract_sub_links in libs/core/langchain_core/utils/html.py.

The previous return-value descriptions were vague (e.g., "all links", "sub links"). They have now been revised to clearly describe the behavior and output of each function:

  • find_all_links → “A list of all links found in the HTML.”
  • extract_sub_links → “A list of absolute paths to sub links.”

These changes are documentation-only: they make the return values of both utilities easier to understand and do not alter functionality.
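
To illustrate the two return-value descriptions above, here is a minimal standard-library sketch. It is not the code in libs/core/langchain_core/utils/html.py: the function names sketch_find_all_links and sketch_extract_sub_links, the regular expression, and the same-host filter are all assumptions made for this example, and "absolute paths" is read here as absolute URLs.

```python
import re
from urllib.parse import urljoin, urlparse

# Hypothetical, simplified stand-ins for illustration only;
# this is NOT the langchain_core implementation.
_HREF_PATTERN = re.compile(r'href=["\']([^"\'#]+)["\']', re.IGNORECASE)


def sketch_find_all_links(raw_html: str) -> list[str]:
    """Return a list of all links found in the HTML (deduplicated, order kept)."""
    return list(dict.fromkeys(_HREF_PATTERN.findall(raw_html)))


def sketch_extract_sub_links(raw_html: str, url: str) -> list[str]:
    """Return a list of absolute URLs for links on the same host as `url`."""
    base_host = urlparse(url).netloc
    results: list[str] = []
    for link in sketch_find_all_links(raw_html):
        # Resolve relative links against the page URL.
        absolute = urljoin(url, link)
        parsed = urlparse(absolute)
        same_site = parsed.scheme in ("http", "https") and parsed.netloc == base_host
        if same_site and absolute not in results:
            results.append(absolute)
    return results


html = (
    '<a href="/docs">Docs</a> '
    '<a href="guide.html">Guide</a> '
    '<a href="https://other.example/x">External</a>'
)
links = sketch_find_all_links(html)
sub_links = sketch_extract_sub_links(html, "https://example.com/start/")
print(links)
print(sub_links)
```

In this sketch the first function returns the links exactly as written in the markup, while the second resolves them against the page URL and drops the one that points to another host.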

Verification

  • ruff check libs/core/langchain_core/utils/html.py: Passed
  • pytest libs/core/tests/unit_tests/utils/test_html.py: Passed

Checklists

  • PR title follows the required format: TYPE(SCOPE): DESCRIPTION
  • Changes are limited to the langchain-core package
  • make format, make lint, and make test pass

@Aman071106 Aman071106 requested a review from eyurtsev as a code owner December 31, 2025 12:50
@github-actions github-actions bot added the core (`langchain-core` package issues & PRs) and refactor (PRs that include a refactor) labels Dec 31, 2025
@codspeed-hq

codspeed-hq bot commented Dec 31, 2025

CodSpeed Performance Report

Merging #34550 will improve performance by 29.29%

Comparing Aman071106:docs/improve-html-utils-docs (f208999) with master (5517ef3)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

⚡ 10 improvements
✅ 3 untouched
⏩ 21 skipped [1]

Benchmarks breakdown

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| WallTime | test_import_time[HumanMessage] | 312.3 ms | 263.5 ms | +18.52% |
| WallTime | test_import_time[ChatPromptTemplate] | 741.8 ms | 590 ms | +25.73% |
| WallTime | test_import_time[tool] | 666.8 ms | 517.3 ms | +28.88% |
| WallTime | test_import_time[BaseChatModel] | 661.8 ms | 531.5 ms | +24.51% |
| WallTime | test_import_time[CallbackManager] | 588.6 ms | 460 ms | +27.94% |
| WallTime | test_import_time[Runnable] | 636.1 ms | 492.2 ms | +29.25% |
| WallTime | test_import_time[RunnableLambda] | 625.8 ms | 489.7 ms | +27.79% |
| WallTime | test_import_time[InMemoryVectorStore] | 790.7 ms | 611.6 ms | +29.29% |
| WallTime | test_import_time[LangChainTracer] | 563.9 ms | 440.9 ms | +27.91% |
| WallTime | test_import_time[Document] | 217.7 ms | 186.3 ms | +16.82% |
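
The Efficiency column appears consistent with BASE / HEAD - 1, expressed as a percentage. That relationship is inferred from the numbers in the breakdown, not taken from CodSpeed documentation, and the helper name efficiency_percent is hypothetical. A small sketch that recomputes a few rows from the rounded values shown:

```python
def efficiency_percent(base_ms: float, head_ms: float) -> float:
    """Relative speedup of HEAD over BASE, assuming efficiency = BASE / HEAD - 1."""
    return (base_ms / head_ms - 1.0) * 100.0


# (BASE ms, HEAD ms, reported efficiency %) copied from the breakdown.
rows = {
    "HumanMessage": (312.3, 263.5, 18.52),
    "ChatPromptTemplate": (741.8, 590.0, 25.73),
    "Document": (217.7, 186.3, 16.82),
}

for name, (base, head, reported) in rows.items():
    computed = efficiency_percent(base, head)
    print(f"{name}: computed {computed:.2f}%, reported {reported:.2f}%")
```

Because the displayed timings are rounded, the recomputed values can differ from the reported ones by a few hundredths of a percentage point.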

Footnotes

  1. 21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@mdrxy mdrxy merged commit 50c5bb5 into langchain-ai:master Jan 8, 2026
89 checks passed
@Aman071106 Aman071106 deleted the docs/improve-html-utils-docs branch January 8, 2026 17:29
