Harden integrations outbound egress policy by rmusser01 · Pull Request #2604 · rmusser01/tldw_server

rmusser01 · 2026-07-04T01:46:58Z

Addresses AUDIT-2026-06-27-INTEGRATIONS-001 and AUDIT-2026-06-27-INTEGRATIONS-002 / TASK-12146.

Change summary:

Routes workflow research adapter direct HTTP calls and tokenizer resolver provider/token-count requests through central HTTP helpers instead of raw clients.
Preserves explicit local-provider behavior where supported.
Restores PubMed search/summary connection reuse by sharing one central-policy async client across both requests.
Renames the Backlog task from duplicate TASK-12138 to TASK-12146 to avoid collisions with current dev tasks.

Latest-dev validation:

Fetched origin/dev and rebased this branch onto fd5c152. Merge-base matched origin/dev before validation.
Passed: .venv/bin/python -m pytest tldw_Server_API/tests/Workflows/adapters/test_research_adapters.py::test_pubmed_search_adapter_reuses_central_http_client -q (1 passed).
Passed: .venv/bin/python -m pytest tldw_Server_API/tests/Workflows/adapters/test_research_adapters.py tldw_Server_API/tests/Writing/test_tokenizer_resolver_unit.py -q (117 passed).
Passed: .venv/bin/python -m bandit -r tldw_Server_API/app/core/Workflows/adapters/research/search.py tldw_Server_API/app/core/Workflows/adapters/research/bibliography.py tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py -f json -o /tmp/bandit_integrations_egress_latest_dev.json with 0 findings.
Passed: git diff --check.
Raw-client scan over touched production files for httpx.AsyncClient, httpx.Client, requests., urllib.request, and urlopen( returned no matches.
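
A scan like the one in the last item can be reproduced with a few lines of Python. This is a minimal sketch, not the command actually run for this PR: the function name `find_raw_clients` is hypothetical, and only the five patterns come from the validation notes above.

```python
from __future__ import annotations

# Patterns listed in the validation notes; matched as plain substrings.
RAW_CLIENT_PATTERNS = (
    "httpx.AsyncClient",
    "httpx.Client",
    "requests.",
    "urllib.request",
    "urlopen(",
)


def find_raw_clients(source: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern) pairs for every raw-client hit in source."""
    hits: list[tuple[int, str]] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern in RAW_CLIENT_PATTERNS:
            if pattern in line:
                hits.append((lineno, pattern))
    return hits
```

Substring matching is deliberately crude: an identifier such as `pending_requests.append` would also match `requests.`, so any hit still needs a manual look. To scan a file, pass `Path(name).read_text()` as `source`.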

Review notes:

Addressed both client-reuse comments by adding optional-client support to _managed_afetch, wrapping PubMed requests in one create_async_client(timeout=30) context, and adding an adapter-level regression proving both afetch calls receive the same client.

Draft until the required human-authored Change summary is added.

coderabbitai · 2026-07-04T01:47:04Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 852afff9-a779-42ed-b18c-a5b97a17a9c9

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


gemini-code-assist

Code Review

This pull request hardens the outbound HTTP policy for audit integrations by routing workflow research adapters and tokenizer resolver requests through the central HTTP client helpers (fetch and afetch) instead of raw clients. This ensures compliance with egress policies, such as denying private or loopback targets. Unit and regression tests have been updated accordingly. The reviewer feedback recommends enhancing the _managed_afetch helper to optionally accept an existing client, which would allow reusing a single client across sequential requests (e.g., in PubMed search) to preserve connection pooling and reduce connection overhead.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-07-04T01:48:20Z

+async def _managed_afetch(**kwargs: Any) -> Any:
+    timeout = kwargs.get("timeout")
+    async with create_async_client(timeout=timeout) as client:
+        return await afetch(client=client, **kwargs)


To support reusing a single httpx.AsyncClient across multiple requests (which is highly recommended for connection pooling and performance), let's update the _managed_afetch helper to accept an optional pre-existing client parameter.

Suggested change

-async def _managed_afetch(**kwargs: Any) -> Any:
-    timeout = kwargs.get("timeout")
-    async with create_async_client(timeout=timeout) as client:
-        return await afetch(client=client, **kwargs)
+async def _managed_afetch(client: Any = None, **kwargs: Any) -> Any:
+    if client is not None:
+        return await afetch(client=client, **kwargs)
+    timeout = kwargs.get("timeout")
+    async with create_async_client(timeout=timeout) as client:
+        return await afetch(client=client, **kwargs)

Addressed in 13010ef. _managed_afetch now accepts an optional pre-existing async client and delegates to central afetch with that client instead of always creating its own context. Added adapter-level coverage proving PubMed passes the same client into both central fetch calls.

gemini-code-assist · 2026-07-04T01:48:20Z

+        # Search for IDs
+        search_url = f"{base_url}/esearch.fcgi"
+        search_params = {
+            "db": "pubmed",
+            "term": query,
+            "retmax": max_results,
+            "retmode": "json",
+        }
+        search_response = await _managed_afetch(method="GET", url=search_url, params=search_params, timeout=30)
+        search_data = search_response.json()
+
+        id_list = search_data.get("esearchresult", {}).get("idlist", [])
+        if not id_list:
+            return {"papers": [], "total_results": 0, "query": query}
+
+        # Fetch details
+        fetch_url = f"{base_url}/esummary.fcgi"
+        fetch_params = {
+            "db": "pubmed",
+            "id": ",".join(id_list),
+            "retmode": "json",
+        }
+        fetch_response = await _managed_afetch(method="GET", url=fetch_url, params=fetch_params, timeout=30)
+        fetch_data = fetch_response.json()


In the previous implementation, a single httpx.AsyncClient was created and reused for both the search and fetch requests. Reusing the client is highly recommended as it leverages connection pooling, avoiding the overhead of establishing a new TCP and SSL connection for each individual request.

Using _managed_afetch twice here creates and closes two separate clients sequentially, which introduces unnecessary latency and performance overhead.

We can restore the connection pooling behavior while still adhering to the central outbound HTTP policy by wrapping both requests in a single async with create_async_client(timeout=30) as client: block and passing the client directly to _managed_afetch.

async with create_async_client(timeout=30) as client:
    # Search for IDs
    search_url = f"{base_url}/esearch.fcgi"
    search_params = {
        "db": "pubmed",
        "term": query,
        "retmax": max_results,
        "retmode": "json",
    }
    search_response = await _managed_afetch(client=client, method="GET", url=search_url, params=search_params, timeout=30)
    search_data = search_response.json()

    id_list = search_data.get("esearchresult", {}).get("idlist", [])
    if not id_list:
        return {"papers": [], "total_results": 0, "query": query}

    # Fetch details
    fetch_url = f"{base_url}/esummary.fcgi"
    fetch_params = {
        "db": "pubmed",
        "id": ",".join(id_list),
        "retmode": "json",
    }
    fetch_response = await _managed_afetch(client=client, method="GET", url=fetch_url, params=fetch_params, timeout=30)
    fetch_data = fetch_response.json()

Addressed in 13010ef. PubMed search now wraps the ESearch and ESummary calls in one create_async_client(timeout=30) context while still routing both requests through _managed_afetch/central afetch. The new test_pubmed_search_adapter_reuses_central_http_client regression verifies one client context and two fetch calls sharing the same client.
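
The shape of such a client-reuse regression can be sketched in isolation. The block below is an illustration only, not the test added in 13010ef: `create_async_client` and `afetch` are hypothetical stand-ins that record what they receive, `two_step_lookup` is a made-up name mirroring the two-request PubMed flow, and the URLs are placeholders.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Any

created_clients: list = []  # every client the fake factory hands out
seen_clients: list = []     # the client passed to each fake afetch call


@asynccontextmanager
async def create_async_client(timeout: Any = None):
    """Hypothetical stand-in for the central client factory."""
    client = object()
    created_clients.append(client)
    yield client


async def afetch(*, client: Any, **kwargs: Any) -> dict:
    """Hypothetical stand-in for central afetch; records the client it was given."""
    seen_clients.append(client)
    return {"url": kwargs.get("url")}


async def _managed_afetch(client: Any = None, **kwargs: Any) -> Any:
    # Same shape as the optional-client helper suggested in the review.
    if client is not None:
        return await afetch(client=client, **kwargs)
    async with create_async_client(timeout=kwargs.get("timeout")) as owned:
        return await afetch(client=owned, **kwargs)


async def two_step_lookup() -> None:
    # One client context, two requests, as in the search + summary flow.
    async with create_async_client(timeout=30) as client:
        await _managed_afetch(client=client, method="GET", url="https://example.invalid/esearch", timeout=30)
        await _managed_afetch(client=client, method="GET", url="https://example.invalid/esummary", timeout=30)


asyncio.run(two_step_lookup())
```

After the run, `created_clients` holds exactly one object and both entries in `seen_clients` are that same object, which is the property the regression is meant to pin down.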

Remediates TASK-12138 and audit findings AUDIT-2026-06-27-INTEGRATIONS-001/002 by routing workflow research and tokenizer HTTP requests through central HTTP policy helpers with regression coverage.

qodo-code-review · 2026-07-04T21:47:52Z

PR Summary by Qodo

Harden integrations outbound egress by routing adapter/provider HTTP via central helpers

🐞 Bug fix 🧪 Tests 📝 Documentation 🕐 40+ Minutes

AI Description

• Route research adapters and tokenizer provider calls through central HTTP egress policy helpers.
• Preserve PubMed connection reuse by sharing one managed async client across both requests.
• Add regression tests covering central-client usage and private/loopback URL denial.
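
For readers unfamiliar with private/loopback denial, the core check can be sketched with the standard library. This is an assumption-laden illustration, not the project's central policy: `EgressDenied`, `check_literal_ip_target`, and `is_denied` are hypothetical names, and the sketch only inspects literal IP hosts.

```python
import ipaddress
from urllib.parse import urlsplit


class EgressDenied(Exception):
    """Hypothetical error type for this sketch (not the project's EgressPolicyError)."""


def check_literal_ip_target(url: str) -> None:
    """Reject URLs whose host is a literal private, loopback, or link-local IP."""
    host = urlsplit(url).hostname
    if host is None:
        raise EgressDenied(f"no host in URL: {url!r}")
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. A real policy would also resolve DNS names and
        # check every resolved address; that step is omitted here.
        return
    if addr.is_private or addr.is_loopback or addr.is_link_local:
        raise EgressDenied(f"blocked egress target: {host}")


def is_denied(url: str) -> bool:
    """Convenience wrapper that turns the exception into a boolean."""
    try:
        check_literal_ip_target(url)
    except EgressDenied:
        return True
    return False
```

Because hostnames such as `localhost` are not IP literals, this sketch lets them through. A production policy has to resolve names before deciding, and re-check after every redirect hop.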

Diagram

graph TD
  DOC["Audit plan/task"] --> RA["Research adapters"] --> HC["Central HTTP helpers"] --> EXT{{"External APIs"}}
  TR["Tokenizer resolver"] --> HC
  TST["Regression tests"] --> RA
  TST --> TR

High-Level Assessment

The following are alternative approaches to this PR:

1. Enforce egress only at network boundary (proxy/firewall)

➕ Centralized enforcement across all languages/processes
➕ Reduces risk of future code-level bypasses
➖ Harder to unit-test; failures become environment-dependent
➖ Still leaves in-process SSRF-style validation gaps (e.g., redirect/URL parsing) unless duplicated elsewhere

2. Dependency-inject a shared policy client into all adapters

➕ Makes connection reuse explicit (especially for multi-call flows like PubMed)
➕ Reduces per-call client creation overhead
➖ Wider signature churn across adapter interfaces
➖ More plumbing in workflow execution to pass clients around safely

3. Create a dedicated integration SDK wrapper per provider

➕ Encapsulates provider-specific logic + policy in one place
➕ Can standardize retries/timeouts/response validation per provider
➖ More code/abstraction than needed for targeted remediation
➖ May duplicate capabilities already present in http_client helpers

Recommendation: Current approach (migrating call sites to the existing central http_client helpers and adding regression tests) is the best fit for the audit remediation: it minimizes interface churn while ensuring trust_env/proxy/redirect/egress checks are consistently applied. The added optional-client support enables PubMed connection reuse without undermining policy enforcement.

Files changed (7) +454 / -177

Bug fix (3) +135 / -112

tokenizer_resolver.py (+11/-5)
Route tokenizer provider POSTs through central sync HTTP fetch helper
• Replaces direct requests.post usage in _http_post with the central http_client.fetch helper so calls inherit egress/proxy/trust_env defaults and redirect handling. Minor cleanup uses contextlib.suppress for safe attribute restoration during runtime probing.
tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py

bibliography.py (+33/-28)
Move DOI resolution HTTP calls onto central async HTTP helpers
• Removes raw httpx.AsyncClient usage and routes DOI resolution through a managed afetch helper built on create_async_client. Preserves response parsing and metadata shaping while ensuring central egress policy applies.
tldw_Server_API/app/core/Workflows/adapters/research/bibliography.py

search.py (+91/-79)
Route research adapter outbound HTTP via managed central afetch/client
• Replaces raw httpx clients for arXiv direct PDF downloads, PubMed search/summary, Semantic Scholar search, and Google Patents queries with central afetch via create_async_client. Adds optional client passthrough to ensure PubMed search and summary share the same client context (connection reuse) while still applying central policy.
tldw_Server_API/app/core/Workflows/adapters/research/search.py

Tests (2) +222 / -65

test_research_adapters.py (+164/-65)
Add regression coverage for central HTTP helper usage and PubMed client reuse
• Adds tests asserting arXiv direct pdf_url downloads go through central afetch and that private/loopback URLs are denied by the central policy path. Adds a regression test verifying PubMed search and summary calls reuse one managed async client and updates existing sanitization tests to stub _managed_afetch instead of patching httpx imports.
tldw_Server_API/tests/Workflows/adapters/test_research_adapters.py

test_tokenizer_resolver_unit.py (+58/-0)
Add tests asserting tokenizer POST uses central fetch and enforces egress denial
• Introduces unit tests that monkeypatch http_client.fetch to verify tokenizer_resolver._http_post calls the central helper with expected parameters. Adds a denial-path test ensuring EgressPolicyError propagates for private/loopback targets.
tldw_Server_API/tests/Writing/test_tokenizer_resolver_unit.py

Documentation (2) +97 / -0

2026-07-04-audit-integrations-egress-remediation-plan.md (+23/-0)
Add staged remediation plan for integrations egress hardening
• Introduces a four-stage plan covering regression tests, research adapter migration, tokenizer resolver migration, and verification steps. Documents success criteria and required validation commands for repeatable audit follow-up.
Docs/superpowers/plans/2026-07-04-audit-integrations-egress-remediation-plan.md

task-12146 - Harden-audit-Integrations-outbound-HTTP-policy.md (+74/-0)
Create TASK-12146 audit remediation record with verification evidence
• Adds a completed backlog task documenting the audit findings, acceptance criteria, touched files, and validation evidence (pytest, Bandit, diff checks, raw-client scan). Replaces the previously referenced duplicate task id to avoid collisions.
backlog/tasks/task-12146 - Harden-audit-Integrations-outbound-HTTP-policy.md

qodo-code-review · 2026-07-04T21:50:53Z

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (2) 📜 Skill insights (0)

Context used

✅ Compliance rules (platform): 74 rules

1. Blocking Path.write_bytes() in async 📘 Rule violation ➹ Performance

Description

run_arxiv_download_adapter() performs a synchronous file write (Path(...).write_bytes(...))
inside an async def, which can block the event loop and degrade concurrency/latency under load.

Code

tldw_Server_API/app/core/Workflows/adapters/research/search.py[207]
+            Path(output_path).write_bytes(response.content)

Evidence
PR Compliance ID 224212 requires avoiding synchronous I/O inside async code paths. The updated
adapter writes the PDF using Path(...).write_bytes(...) within an async function, which is a
blocking filesystem call.
Rule 224212: Prefer asynchronous I/O over blocking synchronous calls
tldw_Server_API/app/core/Workflows/adapters/research/search.py[203-207]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`Path(output_path).write_bytes(response.content)` is synchronous I/O executed inside an async adapter (`run_arxiv_download_adapter`). This can block the event loop.

## Issue Context
The compliance rule requires avoiding blocking synchronous I/O in async-capable code paths. Use an async file API or offload the blocking write to a thread.

## Fix Focus Areas
- tldw_Server_API/app/core/Workflows/adapters/research/search.py[203-207]
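
One way to offload the write, sketched under the assumption that Python 3.9+ is available for `asyncio.to_thread`. This is not necessarily how the follow-up PR fixed it; `write_bytes_off_loop` and `_demo` are hypothetical names, and the payload is a placeholder.

```python
import asyncio
import tempfile
from pathlib import Path


async def write_bytes_off_loop(output_path: Path, data: bytes) -> int:
    """Write data in a worker thread so the event loop stays responsive."""
    # Path.write_bytes returns the number of bytes written.
    return await asyncio.to_thread(output_path.write_bytes, data)


async def _demo() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "paper.pdf"
        await write_bytes_off_loop(target, b"%PDF-1.4 placeholder")
        # Synchronous read is fine here; it is only for the demonstration.
        return target.read_bytes()


demo_bytes = asyncio.run(_demo())
```

The thread hop keeps the change local to the write call. An async file library would be the other option the rule mentions, at the cost of a new dependency.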


2. Redirect leaks auth headers 🐞 Bug ⛨ Security

Description

tokenizer_resolver._http_post sends provider Authorization/x-api-key headers through
http_client.fetch with allow_redirects=True, and the central redirect loop follows Location hops
without stripping those headers. If a provider responds with a cross-host redirect, the request will
resend secrets to the redirect target.

Code

tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py[R652-664]

def _http_post(*, url: str, payload: dict[str, Any], headers: dict[str, str], timeout: float) -> Any:
    try:
-        import requests  # type: ignore
+        from tldw_Server_API.app.core.http_client import fetch
    except Exception as exc:
        raise TokenizerUnavailable("Provider tokenizer HTTP client unavailable") from exc
-    return requests.post(url, json=payload, headers=headers, timeout=timeout)
+    return fetch(
+        method="POST",
+        url=url,
+        json=payload,
+        headers=headers,
+        timeout=timeout,
+        allow_redirects=True,
+    )

Evidence
Tokenizer resolver builds credential-bearing headers and passes them into _http_post, which calls
fetch(... allow_redirects=True). In http_client._fetch_httpx_response, redirect handling updates
cur_url and continues the loop, while the request path (_do_once) always uses the original
headers to build req_headers, so secrets will be resent on the redirected request.
tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py[205-215]
tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py[652-664]
tldw_Server_API/app/core/http_client.py[2814-2838]
tldw_Server_API/app/core/http_client.py[2941-2964]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`tokenizer_resolver._http_post` now uses the central `http_client.fetch(..., allow_redirects=True)` while passing headers that often contain secrets (e.g., `Authorization: Bearer ...`, `x-api-key`). The central sync redirect implementation follows redirects by updating `cur_url` and looping, but it does **not** clear/partition sensitive headers when the redirect target changes origin, so secrets can be resent to the redirected host.

### Issue Context
- `_http_post` is used by multiple tokenizer/count adapters and is invoked with credential-bearing headers.
- `http_client.fetch` (response-based API) implements manual redirect handling and reuses the original `headers` across hops.

### Fix Focus Areas
- Option A (local fix): disable redirects for tokenizer provider POSTs (set `allow_redirects=False`) unless explicitly required.
- Option B (central fix, preferred): in `http_client` redirect handling for response-based `fetch`/`afetch`, detect cross-origin redirects and either (1) strip sensitive headers (`Authorization`, cookies, API keys, etc.) and clear cookies, or (2) fail closed unless the caller explicitly opts in.

- tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py[652-664]
- tldw_Server_API/app/core/http_client.py[2814-2843]
- tldw_Server_API/app/core/http_client.py[2941-2964]
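
Option B's header stripping can be sketched as a pure function. This is an illustration under stated assumptions, not the code in `http_client`: `headers_for_hop`, `_origin`, and the contents of `SENSITIVE_HEADERS` are hypothetical, and `next_url` is assumed to be absolute already (a relative `Location` would first need resolving against the current URL).

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Header names treated as secrets in this sketch; the real list is a policy decision.
SENSITIVE_HEADERS = frozenset({"authorization", "proxy-authorization", "cookie", "x-api-key"})

_DEFAULT_PORTS = {"http": 80, "https": 443}


def _origin(url: str) -> tuple[str, str, int | None]:
    """Return (scheme, host, port) with default ports filled in."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or _DEFAULT_PORTS.get(scheme)
    return scheme, (parts.hostname or "").lower(), port


def headers_for_hop(current_url: str, next_url: str, headers: dict[str, str]) -> dict[str, str]:
    """Return headers for the next hop, dropping secrets when the origin changes."""
    if _origin(current_url) == _origin(next_url):
        return dict(headers)
    return {k: v for k, v in headers.items() if k.lower() not in SENSITIVE_HEADERS}
```

Comparing full origins means an https-to-http downgrade on the same host also drops credentials. The fail-closed variant from the prompt would raise instead of filtering when the origins differ.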


3. Silent exception suppression in finally 📘 Rule violation ☼ Reliability

Description

with suppress(Exception) silently ignores failures when restoring encoding.timeout_seconds,
which can hide real errors and leave runtime state inconsistent without any contextual message.

Code

tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py[R894-895]

+            with suppress(Exception):
                setattr(encoding, "timeout_seconds", original_timeout)
-            except Exception:
-                pass

Evidence
PR Compliance ID 380655 requires failing fast (or at minimum not silently swallowing exceptions) and
providing contextual error messages. The new with suppress(Exception) explicitly suppresses any
error during cleanup with no message or context.
Rule 380655: Fail fast with contextual error messages
tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py[892-895]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `finally` block uses `with suppress(Exception):` when restoring `encoding.timeout_seconds`, swallowing any exception without emitting a contextual error message.

## Issue Context
The compliance rule flags swallowing exceptions with empty handlers/suppression as a red flag. If restoration fails, either raise a contextual exception or record a contextual message (e.g., warning/log) so failures are observable.

## Fix Focus Areas
- tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py[892-895]
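
A logged restore, as opposed to a suppressed one, might look like the sketch below. The helper `restore_timeout` and the logger name are hypothetical; only `timeout_seconds` and `original_timeout` come from the hunk above.

```python
from __future__ import annotations

import logging

logger = logging.getLogger("tokenizer_resolver_sketch")  # hypothetical logger name


def restore_timeout(encoding: object, original_timeout: float | None) -> bool:
    """Restore encoding.timeout_seconds; log instead of silently swallowing failures."""
    try:
        setattr(encoding, "timeout_seconds", original_timeout)
    except Exception as exc:
        # Cleanup still must not raise out of a finally block, but the
        # failure is now observable in the logs.
        logger.warning("Failed to restore timeout_seconds on %s: %s", type(encoding).__name__, exc)
        return False
    return True
```

Returning a boolean is a choice made for this sketch so callers and tests can observe the outcome. Raising a contextual exception is the stricter alternative the rule allows.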


4. Wrong task id in plan 🐞 Bug ⚙ Maintainability

Description

The remediation plan’s Stage 4 success criteria still references TASK-12138, but this PR adds
TASK-12146; this breaks audit traceability for the recorded verification step.

Code

Docs/superpowers/plans/2026-07-04-audit-integrations-egress-remediation-plan.md[21]
+**Success Criteria**: Focused pytest passes, `git diff --check` passes, Bandit runs on touched production files, and `TASK-12138` records final verification.

Evidence
The plan explicitly names TASK-12138 for verification recording, while the added backlog task
frontmatter clearly sets the id to TASK-12146.
Docs/superpowers/plans/2026-07-04-audit-integrations-egress-remediation-plan.md[19-22]
backlog/tasks/task-12146 - Harden-audit-Integrations-outbound-HTTP-policy.md[1-4]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The plan document references the old task id `TASK-12138`, but the actual remediation task added in this PR is `TASK-12146`. This inconsistency makes it harder to trace verification artifacts back to the correct task.

### Issue Context
A new backlog task file exists with id `TASK-12146`.

### Fix Focus Areas
- Docs/superpowers/plans/2026-07-04-audit-integrations-egress-remediation-plan.md[19-22]
- backlog/tasks/task-12146 - Harden-audit-Integrations-outbound-HTTP-policy.md[1-4]


qodo-code-review · 2026-07-04T21:50:53Z

+            response.raise_for_status()
+            filename = pdf_url.split("/")[-1] or "paper.pdf"
+            output_path = str(art_dir / filename)
+            Path(output_path).write_bytes(response.content)


1. Blocking Path.write_bytes() in async 📘 Rule violation ➹ Performance

run_arxiv_download_adapter() performs a synchronous file write (Path(...).write_bytes(...)) inside an async def, which can block the event loop and degrade concurrency/latency under load.

Agent Prompt

## Issue description
`Path(output_path).write_bytes(response.content)` is synchronous I/O executed inside an async adapter (`run_arxiv_download_adapter`). This can block the event loop.

## Issue Context
The compliance rule requires avoiding blocking synchronous I/O in async-capable code paths. Use an async file API or offload the blocking write to a thread.

## Fix Focus Areas
- tldw_Server_API/app/core/Workflows/adapters/research/search.py[203-207]


qodo-code-review · 2026-07-04T21:50:53Z

+            with suppress(Exception):
                setattr(encoding, "timeout_seconds", original_timeout)


2. Silent exception suppression in finally 📘 Rule violation ☼ Reliability

with suppress(Exception) silently ignores failures when restoring encoding.timeout_seconds, which can hide real errors and leave runtime state inconsistent without any contextual message.

Agent Prompt

## Issue description
The `finally` block uses `with suppress(Exception):` when restoring `encoding.timeout_seconds`, swallowing any exception without emitting a contextual error message.

## Issue Context
The compliance rule flags swallowing exceptions with empty handlers/suppression as a red flag. If restoration fails, either raise a contextual exception or record a contextual message (e.g., warning/log) so failures are observable.

## Fix Focus Areas
- tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py[892-895]


qodo-code-review · 2026-07-04T21:50:53Z

 def _http_post(*, url: str, payload: dict[str, Any], headers: dict[str, str], timeout: float) -> Any:
    try:
-        import requests  # type: ignore
+        from tldw_Server_API.app.core.http_client import fetch
    except Exception as exc:
        raise TokenizerUnavailable("Provider tokenizer HTTP client unavailable") from exc
-    return requests.post(url, json=payload, headers=headers, timeout=timeout)
+    return fetch(
+        method="POST",
+        url=url,
+        json=payload,
+        headers=headers,
+        timeout=timeout,
+        allow_redirects=True,
+    )


3. Redirect leaks auth headers 🐞 Bug ⛨ Security

tokenizer_resolver._http_post sends provider Authorization/x-api-key headers through http_client.fetch with allow_redirects=True, and the central redirect loop follows Location hops without stripping those headers. If a provider responds with a cross-host redirect, the request will resend secrets to the redirect target.

Agent Prompt

### Issue description
`tokenizer_resolver._http_post` now uses the central `http_client.fetch(..., allow_redirects=True)` while passing headers that often contain secrets (e.g., `Authorization: Bearer ...`, `x-api-key`). The central sync redirect implementation follows redirects by updating `cur_url` and looping, but it does **not** clear/partition sensitive headers when the redirect target changes origin, so secrets can be resent to the redirected host.

### Issue Context
- `_http_post` is used by multiple tokenizer/count adapters and is invoked with credential-bearing headers.
- `http_client.fetch` (response-based API) implements manual redirect handling and reuses the original `headers` across hops.

### Fix Focus Areas
- Option A (local fix): disable redirects for tokenizer provider POSTs (set `allow_redirects=False`) unless explicitly required.
- Option B (central fix, preferred): in `http_client` redirect handling for response-based `fetch`/`afetch`, detect cross-origin redirects and either (1) strip sensitive headers (`Authorization`, cookies, API keys, etc.) and clear cookies, or (2) fail closed unless the caller explicitly opts in.

- tldw_Server_API/app/core/LLM_Calls/tokenizer_resolver.py[652-664]
- tldw_Server_API/app/core/http_client.py[2814-2843]
- tldw_Server_API/app/core/http_client.py[2941-2964]


qodo-code-review · 2026-07-04T21:50:53Z

+
+## Stage 4: Verification And Task Record
+**Goal**: Prove the remediation and record it for repeatable audit follow-up.
+**Success Criteria**: Focused pytest passes, `git diff --check` passes, Bandit runs on touched production files, and `TASK-12138` records final verification.


4. Wrong task id in plan 🐞 Bug ⚙ Maintainability

The remediation plan’s Stage 4 success criteria still references TASK-12138, but this PR adds TASK-12146; this breaks audit traceability for the recorded verification step.

Agent Prompt

### Issue description
The plan document references the old task id `TASK-12138`, but the actual remediation task added in this PR is `TASK-12146`. This inconsistency makes it harder to trace verification artifacts back to the correct task.

### Issue Context
A new backlog task file exists with id `TASK-12146`.

### Fix Focus Areas
- Docs/superpowers/plans/2026-07-04-audit-integrations-egress-remediation-plan.md[19-22]
- backlog/tasks/task-12146 - Harden-audit-Integrations-outbound-HTTP-policy.md[1-4]


Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…s; sync docs

- Add tests/Chat_Macros/property/test_macro_parser_properties.py: 10 hypothesis properties for parse_macro_args/_coerce_bool/load_macro_definition, importorskip-guarded so they skip on dev today and activate when codex/chat-macros-v1 (PR #2618) merges. Validated against that branch: 10/10 pass.
- ci.yml: add tests/RPG (11 files, silently unsharded on dev; shard guard was red repo-wide; 177/177 pass locally) and tests/Chat_Macros (covers the prep file now and the macro branch's tests on merge) to the gap-verified-7 shard.
- Audit/plan: egress (#2604) and ACP (#2619) fixes merged to dev today: update live-unfixed wording; chat-macros re-added as prep-done pending PR #2618.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

rmusser01 · 2026-07-04T22:26:48Z

Follow-up PR #2634 addresses the remaining review comments from this PR: blocking write_bytes in run_arxiv_download_adapter, allow_redirects=False for credential-bearing tokenizer POSTs (redirect header-leak), logged (not suppressed) timeout restore, and the TASK-12138→TASK-12146 plan traceability fix.

fix: address review comments from PRs #2604 and #2619

- Task 4 nightly shuffle: use -p pytest_randomly (full module name), not the entry-point alias, per review; more reliable under PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 (gemini review)
- Add tracked follow-up: central http_client cross-origin redirect header stripping (qodo Option B from PR #2604), superseding the per-caller allow_redirects=False mitigation from PR #2634
- fake-indexeddb package.json comment was already addressed in a prior revision

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

gemini-code-assist Bot reviewed Jul 4, 2026

rmusser01 added 2 commits July 4, 2026 12:04

Harden integrations outbound egress policy

b3873b9

Remediates TASK-12138 and audit findings AUDIT-2026-06-27-INTEGRATIONS-001/002 by routing workflow research and tokenizer HTTP requests through central HTTP policy helpers with regression coverage.

Preserve PubMed client reuse in egress audit

13010ef

rmusser01 force-pushed the codex/audit-integrations-egress-2026-07-04 branch from 883b6c4 to 13010ef Compare July 4, 2026 19:09

rmusser01 marked this pull request as ready for review July 4, 2026 21:46

rmusser01 merged commit 27df9f6 into dev Jul 4, 2026
5 of 58 checks passed

qodo-code-review Bot reviewed Jul 4, 2026

rmusser01 added a commit that referenced this pull request Jul 4, 2026

Merge origin/dev (PRs #2604, #2619) into test-suite-audit branch

3113b84

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

rmusser01 mentioned this pull request Jul 4, 2026

fix: address review comments from PRs #2604 and #2619 #2634

Merged

rmusser01 added a commit that referenced this pull request Jul 4, 2026

Merge pull request #2634 from rmusser01/fix/pr2604-2619-review-followup

f69c130

fix: address review comments from PRs #2604 and #2619

rmusser01 mentioned this pull request Jul 4, 2026

fix(http_client): strip credentials on cross-origin redirect hops #2635

Merged

