feat(platform): add shared investigation flows and harden transparency tools#59
feat(platform): add shared investigation flows and harden transparency tools#59lspassos1 wants to merge 3 commits intoenioxt:mainfrom
Conversation
…y tools Add JSON investigation import, public shared gallery, complete shared bundles, and fork flows across the API and frontend. Also harden Brave/DDG/PNCP fallbacks and expand chat, patterns, and investigation coverage. Validated with: - cd api && uv run pytest tests/unit/test_chat.py tests/unit/test_patterns.py tests/unit/test_investigation.py tests/unit/test_transparency_tools.py -q - cd api && uv run pytest -m integration tests/integration/test_entity_integration.py tests/integration/test_search_integration.py tests/integration/test_graph_integration.py -q - cd frontend && npm test -- --run src/components/investigation/InvestigationPanel.test.tsx src/pages/SharedInvestigation.test.tsx src/pages/SharedInvestigations.test.tsx - cd frontend && npm run type-check -- --pretty false
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 5 minutes and 23 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (16)
📝 WalkthroughWalkthroughAdds public sharing for investigations (export/import bundles, shared gallery, fork-to-continue), implements import endpoint and service logic, introduces Cypher queries for shared listings/tags/annotations, refactors web-search and PNCP tooling with retries/fallbacks, and expands backend/frontend tests and UI for import/gallery/fork flows. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client (Browser)
participant Router as Investigation Router
participant Service as Investigation Service
participant DB as Neo4j
participant Storage as File System
Client->>Router: POST /api/v1/investigations/import (multipart JSON file)
Router->>Router: validate content-type, parse JSON -> InvestigationExportBundle
Router->>Service: import_investigation_bundle(bundle, user_id)
Service->>DB: create_investigation(title, user_id)
DB-->>Service: investigation_id
Service->>Service: iterate entities (dedupe/trim)
Service->>DB: add entity -> relation
DB-->>Service: success/skip
Service->>Service: iterate annotations (trim)
Service->>DB: create_annotation(...)
DB-->>Service: annotation_id
Service->>Service: iterate tags (dedupe by name/color)
Service->>DB: create_tag(...)
DB-->>Service: tag_id
Service->>DB: get_investigation(investigation_id)
DB-->>Service: populated investigation
Service-->>Router: InvestigationImportResponse (counts, skipped_ids)
Router-->>Client: 201 Created + payload
sequenceDiagram
participant Caller as Service (tool_web_search)
participant Brave as Brave API
participant DDG as DuckDuckGo
Caller->>Brave: POST /search (attempt 1)
alt Brave 429 or 5xx
Brave-->>Caller: 429/5xx
Caller->>Caller: await sleep(backoff)
Caller->>Brave: POST /search (attempt N)
alt Brave succeeds
Brave-->>Caller: 200 + results
Caller-->>Caller: return results (no/with note)
else Brave fails after retries
Brave-->>Caller: final error
Caller->>DDG: GET /search
alt DDG succeeds
DDG-->>Caller: 200 + results
Caller-->>Client: results + degradation note (Brave fail)
else DDG fails
DDG-->>Caller: error
Caller-->>Client: empty results + combined degradation note
end
end
else Brave succeeds immediately
Brave-->>Caller: 200 + results
Caller-->>Client: results
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Concrete blocker:
Recommended resolution for a maintainer with write access:
Once the label is added, the workflow should rerun automatically because |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 562c973fb8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| <Route index element={<Landing />} /> | ||
| {!IS_PUBLIC_MODE && <Route path="login" element={<Login />} />} | ||
| {!IS_PUBLIC_MODE && <Route path="register" element={<Register />} />} | ||
| <Route path="shared" element={<SharedInvestigations />} /> |
There was a problem hiding this comment.
Expose shared gallery outside auth-redirect shell
This route is mounted inside the RedirectIfAuth wrapper (non-public mode), so any authenticated user requesting /shared is immediately redirected to /app and cannot access the gallery. That also makes the /shared link rendered in the shared investigation page unusable for signed-in users. Mount /shared outside the redirect-only group (or explicitly exempt it) so the public gallery is reachable regardless of auth state.
Useful? React with 👍 / 👎.
| if not records: | ||
| return [], 0 |
There was a problem hiding this comment.
Preserve total count when shared page has no rows
Returning 0 when records is empty causes incorrect pagination metadata for out-of-range pages (e.g., page 2 after deletions), even when shared investigations still exist. Because the current query returns total only with paged rows, an empty page drops the real count and can break UI pagination logic. Use a separate count path (or query shape) so total remains accurate even when no items are returned.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
frontend/src/App.tsx (1)
65-76:⚠️ Potential issue | 🟠 Major
/sharedgets redirected away for authenticated users.Because this route lives under
RedirectIfAuth, a logged-in user in normal mode is sent to/appinstead of the shared gallery. Move the gallery route outside that wrapper, or add an authenticated mirror.🛠️ Possible route split
+ <Route element={<PublicShell />}> + <Route path="shared" element={<SharedInvestigations />} /> + </Route> + <Route element={IS_PUBLIC_MODE ? <PublicShell /> : ( <RedirectIfAuth> <PublicShell /> </RedirectIfAuth> )} > <Route index element={<Landing />} /> {!IS_PUBLIC_MODE && <Route path="login" element={<Login />} />} {!IS_PUBLIC_MODE && <Route path="register" element={<Register />} />} - <Route path="shared" element={<SharedInvestigations />} /> </Route>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/App.tsx` around lines 65 - 76, The /shared route is nested inside the RedirectIfAuth wrapper so authenticated users get redirected to /app; move the Route with path="shared" (element={<SharedInvestigations />}) out of the RedirectIfAuth-protected block (or add an authenticated mirror route) so it is registered at the top-level alongside the parent Route/element={<PublicShell />} rather than as a child of <RedirectIfAuth>, keeping the existing conditional checks on IS_PUBLIC_MODE and preserving PublicShell as the container for that route.frontend/src/pages/SharedInvestigation.tsx (1)
90-98:⚠️ Potential issue | 🟠 MajorEntity IDs may contain unmasked CPFs.
The
entity_idsarray can contain CPF values (based on the Cypher query'scoalesce(e.cpf, ...)chain). Since this is a public shared investigation view, CPFs should be masked as***.***.***.XXper coding guidelines.🛡️ Proposed fix to mask CPFs in entity display
+function maskCpf(value: string): string { + // CPF format: 11 digits, optionally formatted as XXX.XXX.XXX-XX + const cpfPattern = /^\d{3}\.?\d{3}\.?\d{3}-?\d{2}$/; + if (cpfPattern.test(value.replace(/\D/g, '').length === 11 ? value : '')) { + const digits = value.replace(/\D/g, ''); + if (digits.length === 11) { + return `***.***.***.${digits.slice(-2)}`; + } + } + return value; +} {(investigation.entity_ids ?? []).map((eid) => ( - <span key={eid} className={styles.entityChip}>{eid}</span> + <span key={eid} className={styles.entityChip}>{maskCpf(eid)}</span> ))}As per coding guidelines: "Mask CPFs in public output as ..***.XX format"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/pages/SharedInvestigation.tsx` around lines 90 - 98, The entity IDs rendered from investigation.entity_ids in the SharedInvestigation component may include raw CPF values and must be masked before display; implement a small helper (e.g., maskCpf) and call it when mapping over investigation.entity_ids (the span with className={styles.entityChip}) to transform CPFs into the ***.***.***.XX format: detect CPFs by stripping non-digits and checking for 11 digits, then replace the first 9 digits with asterisks and format with dots keeping only the last two digits visible; leave non-CPF IDs unchanged.
🧹 Nitpick comments (5)
api/tests/unit/test_patterns.py (1)
133-138: Strengthen detector payload assertions to match test intent.On Line 133–Line 138, the test only checks
pattern_id. Since this test targets detector payload behavior, please also assert key serialized fields (data,entity_ids,sources) so regressions are caught earlier.Suggested assertion extension
assert payload["entity_id"] == "test-id" assert payload["total"] == 1 assert payload["patterns"][0]["pattern_id"] == pattern_name + assert payload["patterns"][0]["entity_ids"] == ["test-id"] + assert payload["patterns"][0]["data"]["risk_signal"] == 7.5 + assert payload["patterns"][0]["sources"][0]["database"] == "neo4j_public" mock_run_one.assert_awaited_once()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/tests/unit/test_patterns.py` around lines 133 - 138, The test currently only asserts pattern_id on the returned detector payload; expand assertions in the test (the block using response, payload and pattern_name and mock_run_one) to also verify the serialized detector fields by asserting payload["patterns"][0]["data"] equals the expected data blob, payload["patterns"][0]["entity_ids"] equals the expected entity list, and payload["patterns"][0]["sources"] equals the expected sources set/list so the detector serialization is validated alongside pattern_id; keep existing status_code, entity_id, total and mock_run_one assertions intact.frontend/src/pages/SharedInvestigations.tsx (1)
46-65: Consider adding pagination controls for larger datasets.The component uses
listSharedInvestigations()which supports pagination (page,sizeparameters), but the UI doesn't expose pagination controls. For now this works, but as the shared gallery grows, you may want to add pagination or infinite scroll.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/pages/SharedInvestigations.tsx` around lines 46 - 65, The UI for SharedInvestigations.tsx renders investigations from investigations.map but does not expose pagination even though the backend API listSharedInvestigations(page, size) supports it; add pagination state (page, size), update the data fetch that calls listSharedInvestigations to pass page and size and re-fetch when they change, render simple pagination controls (prev/next and page number or page size selector) near the grid and disable prev/next appropriately, and ensure the Link/card rendering logic (keying by investigation.id and using investigation.share_token) continues to work unchanged while showing a loading/empty state when switching pages.api/src/bracc/routers/investigation.py (2)
322-340: Note:response_modelis bypassed when returningJSONResponsedirectly.The
response_model=InvestigationExportBundledeclaration serves only as OpenAPI documentation since FastAPI skips model validation when you return aResponsesubclass directly. The current implementation is structurally correct, but consider returning the model instance directly if you want runtime validation:return InvestigationExportBundle( investigation=investigation, annotations=annotations, tags=tags, )This is minor and the current approach works correctly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/bracc/routers/investigation.py` around lines 322 - 340, The endpoint export_investigation currently returns a JSONResponse which bypasses FastAPI's response_model validation; change the return to construct and return an InvestigationExportBundle instance (using the investigation, annotations, tags from svc.get_investigation, svc.list_annotations, svc.list_tags) instead of JSONResponse so FastAPI will validate/serialize the payload at runtime and keep response_model in sync with the returned data.
1-474: File exceeds the 300-line limit for API routes.This file has 474 lines, which exceeds the 300-line limit specified in coding guidelines. Consider extracting shared investigation endpoints (
list_shared_investigations,fork_shared_investigation,get_shared_investigation) and export endpoints into separate router modules.As per coding guidelines: "API routes must not exceed 300 lines".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/bracc/routers/investigation.py` around lines 1 - 474, This file is over the 300-line limit; extract related endpoints into new router modules to reduce size: move the shared investigation endpoints (list_shared_investigations, fork_shared_investigation, get_shared_investigation) into a new shared router module and move the export endpoints (export_investigation, export_investigation_pdf, export_investigation_md, export_investigation_html) plus the helper _resolve_entities into a new export/router module; keep existing dependencies like Depends(ensure_investigations_enabled) and reuse services (svc, render_investigation_pdf/md/html, execute_query_single, mask_formatted_cpf, mask_raw_cpf, PEP_ROLES) by importing them; update imports in the original file to remove the moved functions and register the new routers where the app includes routers (or export them for inclusion), ensuring route prefixes, response models, and docstrings remain unchanged.api/src/bracc/services/investigation_service.py (1)
1-448: File exceeds the 400-line limit for Python modules.This file has 448 lines, which exceeds the 400-line limit specified in coding guidelines. Consider extracting shared investigation and import/fork functions into a separate module (e.g.,
shared_investigation_service.py).As per coding guidelines: "Python modules must not exceed 400 lines".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/bracc/services/investigation_service.py` around lines 1 - 448, This module exceeds the 400-line limit; extract the shared-investigation logic into a new module (e.g., shared_investigation_service) by moving the functions get_shared_investigation, list_annotations_by_share_token, list_tags_by_share_token, import_investigation_bundle, and fork_shared_investigation (and any small helpers they rely on) into that new file, update imports here to import those symbols, and ensure all references (e.g., calls to get_shared_investigation, import_investigation_bundle, fork_shared_investigation, list_annotations_by_share_token, list_tags_by_share_token) are adjusted to the new module; keep function signatures unchanged and run tests to confirm behavior remains identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/src/bracc/queries/annotation_list_by_token.cypher`:
- Around line 1-7: The Cypher query that matches (i:Investigation {share_token:
$token})-[:HAS_ANNOTATION]->(a:Annotation) and returns/ORDERs annotations is
unbounded; add a LIMIT of 1000 to the query (e.g., append LIMIT 1000 after the
ORDER BY) so the Annotation list lookup by share token cannot return arbitrarily
many rows and force a full sort; update the query in
annotation_list_by_token.cypher to include this LIMIT.
In `@api/src/bracc/queries/tag_list_by_token.cypher`:
- Around line 1-6: Add a LIMIT 1000 to the Cypher query that finds tags by share
token to prevent unbounded results: locate the MATCH on (i:Investigation
{share_token: $token})-[:HAS_TAG]->(t:Tag) and append "LIMIT 1000" after the
ORDER BY t.name so the RETURN (t.id, i.id AS investigation_id, t.name, t.color)
is capped at 1000 rows.
In `@api/src/bracc/routers/investigation.py`:
- Around line 67-99: The import_investigation endpoint currently reads the
entire upload into memory via await file.read() (raw) which can exhaust memory;
enforce a maximum allowed upload size (e.g., MAX_IMPORT_SIZE) by reading only up
to that limit (or streaming in chunks and summing bytes) and if the size exceeds
the limit raise an HTTPException (413) before attempting json.loads; update the
import_investigation function to validate file.size or the length of raw after
read and reject oversized files before calling
InvestigationExportBundle.model_validate or svc.import_investigation_bundle.
In `@api/src/bracc/services/investigation_service.py`:
- Around line 385-396: The annotation loop in investigation_service.py calls
create_annotation for every annotation in bundle.annotations without checking
whether its entity was actually imported; update the loop in the import flow
(where imported_annotations is incremented) to first normalize the annotation
entity id (e.g., entity_id = annotation.entity_id.strip()) and skip creation if
that id is in skipped_entity_ids (or not present in the set of successfully
imported entity ids), only calling create_annotation(session, created.id,
entity_id, annotation.text, user_id) and incrementing imported_annotations when
the entity was successfully imported.
In `@api/src/bracc/services/transparency_tools.py`:
- Around line 1312-1315: The loop currently breaks on the first non-empty
`results` due to `if results: break`, which prevents accumulating up to 10
items; update both occurrences (the `if results: break` around the block that
collects modalidade/base URLs and the similar one at lines ~1334-1335) to only
break when `len(results) >= 10` so the helper continues scanning until it has up
to 10 results, keeping the existing inner `if len(results) >= 10: break` logic
intact.
- Around line 1258-1335: The PNCP probing loop using httpx.AsyncClient can
perform up to 42 sequential requests and currently only relies on per-request
timeout=20.0; add a global deadline so the whole loop cannot hang for many
minutes. Wrap the outer loop (the async with httpx.AsyncClient(...) as client
and the nested for base_url / for modalidade loops) with a global timeout (e.g.,
using asyncio.timeout(PNCP_GLOBAL_DEADLINE) or by checking time.monotonic() at
each iteration) and stop further requests when the deadline is exceeded; ensure
you abort/return from the function (or break both loops) when the deadline hits
so no more client.get calls (refer to the AsyncClient creation and the loops
over _PNCP_URLS and _PNCP_MODALIDADE_CODES and the client.get call) continue
running.
- Around line 105-109: The title and snippet values inserted in results.append
(the "title" and "snippet" fields in transparency_tools.py) are taken verbatim
from external search results and must be sanitized/marked as untrusted before
returning to the chat loop; update the code that builds results (both the block
around results.append and the similar code at lines 187-190) to: 1)
strip/normalize/control whitespace and newlines, 2) escape or remove HTML/JS and
control characters, 3) truncate to a safe length (already 300 for snippet but
re-apply after cleaning), 4) remove or neutralize prompt‑like tokens/phrases
(e.g., "ignore previous", "you are", "system:"), and 5) prepend or wrap the text
with an explicit provenance tag such as "[source snippet - untrusted]" so
downstream code treats it as external data rather than instructions.
- Around line 101-111: The call to resp.json() inside the Brave response
handling can raise a JSON decode error and skip the DuckDuckGo fallback; wrap
the resp.json() call in a try/except (catch json.JSONDecodeError/ValueError)
around the block that builds results (using resp, data, results, max_results)
and on exception log or attach the decode error and proceed to the existing
fallback path (so the DDG fallback and the degraded-note response still run) or
return the degraded-note result instead of letting the exception propagate.
In `@frontend/src/api/client.ts`:
- Around line 30-35: The current logic always sets headers.set("Content-Type",
"application/json") for non-FormData requests, which also applies to bodiless
methods and triggers CORS preflights; update the condition in
frontend/src/api/client.ts to only set Content-Type when a body is present and
not a FormData (i.e., ensure init?.body is truthy and !(init?.body instanceof
FormData)) and/or the HTTP method permits a body (check init?.method !== "GET"
&& init?.method !== "HEAD"), then call headers.set("Content-Type",
"application/json"); leave headers untouched for bodiless requests to avoid
unnecessary preflight.
In `@frontend/src/components/investigation/InvestigationPanel.tsx`:
- Around line 50-53: The file import handler handleImportFile currently accepts
any file and relies on server-side rejection; update handleImportFile to
validate client-side by first checking the file extension (file.name
endsWith('.json')) and MIME (file.type === 'application/json' or allow common
fallbacks) and reject with a user-facing error if those checks fail, then read
the file as text (e.g., FileReader or file.text()) and run JSON.parse inside a
try/catch to detect malformed JSON before uploading, aborting the upload and
invoking the component's existing UI error feedback (e.g., set state or toast)
when validation fails.
In `@TASKS.md`:
- Line 3: TASKS.md has grown past the SSOT backlog limit (max 500 lines); split
it by moving older/closed task entries out of TASKS.md into one or more archive
files (e.g., TASKS_ARCHIVE_YYYY-MM.md) and leave only the active/backlog items
so TASKS.md is <=500 lines, update any top-of-file summary line and internal
links in TASKS.md to point to the new archive files, and commit with a clear
message noting the archive split; search for TASKS.md and the top summary line
(the "**Updated:**" header) to locate where to perform the truncate-and-archive
changes.
---
Outside diff comments:
In `@frontend/src/App.tsx`:
- Around line 65-76: The /shared route is nested inside the RedirectIfAuth
wrapper so authenticated users get redirected to /app; move the Route with
path="shared" (element={<SharedInvestigations />}) out of the
RedirectIfAuth-protected block (or add an authenticated mirror route) so it is
registered at the top-level alongside the parent Route/element={<PublicShell />}
rather than as a child of <RedirectIfAuth>, keeping the existing conditional
checks on IS_PUBLIC_MODE and preserving PublicShell as the container for that
route.
In `@frontend/src/pages/SharedInvestigation.tsx`:
- Around line 90-98: The entity IDs rendered from investigation.entity_ids in
the SharedInvestigation component may include raw CPF values and must be masked
before display; implement a small helper (e.g., maskCpf) and call it when
mapping over investigation.entity_ids (the span with
className={styles.entityChip}) to transform CPFs into the ***.***.***.XX format:
detect CPFs by stripping non-digits and checking for 11 digits, then replace the
first 9 digits with asterisks and format with dots keeping only the last two
digits visible; leave non-CPF IDs unchanged.
---
Nitpick comments:
In `@api/src/bracc/routers/investigation.py`:
- Around line 322-340: The endpoint export_investigation currently returns a
JSONResponse which bypasses FastAPI's response_model validation; change the
return to construct and return an InvestigationExportBundle instance (using the
investigation, annotations, tags from svc.get_investigation,
svc.list_annotations, svc.list_tags) instead of JSONResponse so FastAPI will
validate/serialize the payload at runtime and keep response_model in sync with
the returned data.
- Around line 1-474: This file is over the 300-line limit; extract related
endpoints into new router modules to reduce size: move the shared investigation
endpoints (list_shared_investigations, fork_shared_investigation,
get_shared_investigation) into a new shared router module and move the export
endpoints (export_investigation, export_investigation_pdf,
export_investigation_md, export_investigation_html) plus the helper
_resolve_entities into a new export/router module; keep existing dependencies
like Depends(ensure_investigations_enabled) and reuse services (svc,
render_investigation_pdf/md/html, execute_query_single, mask_formatted_cpf,
mask_raw_cpf, PEP_ROLES) by importing them; update imports in the original file
to remove the moved functions and register the new routers where the app
includes routers (or export them for inclusion), ensuring route prefixes,
response models, and docstrings remain unchanged.
In `@api/src/bracc/services/investigation_service.py`:
- Around line 1-448: This module exceeds the 400-line limit; extract the
shared-investigation logic into a new module (e.g.,
shared_investigation_service) by moving the functions get_shared_investigation,
list_annotations_by_share_token, list_tags_by_share_token,
import_investigation_bundle, and fork_shared_investigation (and any small
helpers they rely on) into that new file, update imports here to import those
symbols, and ensure all references (e.g., calls to get_shared_investigation,
import_investigation_bundle, fork_shared_investigation,
list_annotations_by_share_token, list_tags_by_share_token) are adjusted to the
new module; keep function signatures unchanged and run tests to confirm behavior
remains identical.
In `@api/tests/unit/test_patterns.py`:
- Around line 133-138: The test currently only asserts pattern_id on the
returned detector payload; expand assertions in the test (the block using
response, payload and pattern_name and mock_run_one) to also verify the
serialized detector fields by asserting payload["patterns"][0]["data"] equals
the expected data blob, payload["patterns"][0]["entity_ids"] equals the expected
entity list, and payload["patterns"][0]["sources"] equals the expected sources
set/list so the detector serialization is validated alongside pattern_id; keep
existing status_code, entity_id, total and mock_run_one assertions intact.
In `@frontend/src/pages/SharedInvestigations.tsx`:
- Around line 46-65: The UI for SharedInvestigations.tsx renders investigations
from investigations.map but does not expose pagination even though the backend
API listSharedInvestigations(page, size) supports it; add pagination state
(page, size), update the data fetch that calls listSharedInvestigations to pass
page and size and re-fetch when they change, render simple pagination controls
(prev/next and page number or page size selector) near the grid and disable
prev/next appropriately, and ensure the Link/card rendering logic (keying by
investigation.id and using investigation.share_token) continues to work
unchanged while showing a loading/empty state when switching pages.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f4fca3e2-760a-4f7c-899c-43989874843f
📒 Files selected for processing (27)
TASKS.mdapi/src/bracc/models/investigation.pyapi/src/bracc/queries/annotation_list_by_token.cypherapi/src/bracc/queries/investigation_shared_list.cypherapi/src/bracc/queries/tag_list_by_token.cypherapi/src/bracc/routers/investigation.pyapi/src/bracc/services/investigation_service.pyapi/src/bracc/services/transparency_tools.pyapi/tests/integration/conftest.pyapi/tests/unit/test_chat.pyapi/tests/unit/test_investigation.pyapi/tests/unit/test_patterns.pyapi/tests/unit/test_transparency_tools.pyfrontend/src/App.tsxfrontend/src/api/client.tsfrontend/src/components/investigation/InvestigationDetail.tsxfrontend/src/components/investigation/InvestigationPanel.module.cssfrontend/src/components/investigation/InvestigationPanel.test.tsxfrontend/src/components/investigation/InvestigationPanel.tsxfrontend/src/i18n.tsfrontend/src/pages/SharedInvestigation.module.cssfrontend/src/pages/SharedInvestigation.test.tsxfrontend/src/pages/SharedInvestigation.tsxfrontend/src/pages/SharedInvestigations.module.cssfrontend/src/pages/SharedInvestigations.test.tsxfrontend/src/pages/SharedInvestigations.tsxfrontend/src/stores/investigation.ts
| # TASKS.md — EGOS Inteligência (SSOT) | ||
|
|
||
| > **Updated:** 2026-03-18 | **Patterns:** 10 | **Nodes:** 77.0M | **Rels:** 25.1M | **Tools:** 27 | **Tasks:** 103/141 ✅ | **GitHub Issues:** https://github.com/enioxt/EGOS-Inteligencia/issues | ||
| > **Updated:** 2026-03-26 | **Patterns:** 10 | **Nodes:** 77.0M | **Rels:** 25.1M | **Tools:** 27 | **Tasks:** 105/141 ✅ | **GitHub Issues:** https://github.com/enioxt/EGOS-Inteligencia/issues |
There was a problem hiding this comment.
TASKS.md exceeds SSOT size limit and should be split/archived.
The file is now far beyond the allowed backlog size, which makes SSOT maintenance harder and conflicts with the repo rule.
As per coding guidelines "Maintain SSOT registry files: ... TASKS.md (backlog, max 500 lines)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TASKS.md` at line 3, TASKS.md has grown past the SSOT backlog limit (max 500
lines); split it by moving older/closed task entries out of TASKS.md into one or
more archive files (e.g., TASKS_ARCHIVE_YYYY-MM.md) and leave only the
active/backlog items so TASKS.md is <=500 lines, update any top-of-file summary
line and internal links in TASKS.md to point to the new archive files, and
commit with a clear message noting the archive split; search for TASKS.md and
the top summary line (the "**Updated:**" header) to locate where to perform the
truncate-and-archive changes.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 72659ff04d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| if resp.status_code == 200: | ||
| results: list[dict[str, str]] = [] | ||
| data = resp.json() |
There was a problem hiding this comment.
Handle malformed Brave payloads before fallback
_search_brave only wraps client.get(...) in try/except; when Brave returns HTTP 200 with a non-JSON body (or otherwise invalid JSON), resp.json() raises and escapes the function, so tool_web_search never reaches the DuckDuckGo fallback. This turns a recoverable provider hiccup into a hard failure for chat/tool calls that depend on web search.
Useful? React with 👍 / 👎.
| await create_annotation( | ||
| session, | ||
| created.id, | ||
| annotation.entity_id.strip(), | ||
| annotation.text, |
There was a problem hiding this comment.
Skip annotations for entities that failed to import
The import flow records entity IDs that could not be added in skipped_entity_ids, but still creates annotations for every bundle item without checking whether that entity was successfully imported. In cases where entity ingestion fails (e.g., stale IDs in a shared bundle), this produces orphan annotations tied to IDs not present in the new investigation, which corrupts investigation consistency and inflates imported_annotations.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
🧹 Nitpick comments (3)
frontend/src/App.test.tsx (1)
32-43: Consider usingafterEachfor test state cleanup.Manually resetting
authState.token = nullat the end of the test (line 42) is fragile—if the test fails before that line, the state won't reset, potentially polluting subsequent tests. As the test suite grows, this can lead to flaky failures.♻️ Suggested refactor using afterEach
+import { afterEach, describe, expect, it, vi } from "vitest"; -import { describe, expect, it, vi } from "vitest"; // ... mocks ... describe("App", () => { + afterEach(() => { + authState.token = null; + }); + it("keeps /shared reachable for authenticated users", () => { authState.token = "token"; render( <MemoryRouter initialEntries={["/shared"]}> <App /> </MemoryRouter>, ); expect(screen.getByText("Shared gallery stub")).toBeInTheDocument(); - authState.token = null; });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/App.test.tsx` around lines 32 - 43, The test manually resets authState.token at the end of the "keeps /shared reachable for authenticated users" test which is fragile; instead add a shared cleanup using afterEach that clears authState.token so it always runs regardless of test outcome. Locate the test referencing authState and the App component (MemoryRouter + initialEntries "/shared") and add an afterEach hook that sets authState.token = null (or restores a known initial state) to ensure no cross-test pollution.api/tests/unit/test_investigation.py (2)
390-456: Consider verifying the parsed bundle passed to the service.The test correctly validates the HTTP contract (201 status, response structure) and confirms the service was invoked. However,
import_mock.assert_awaited_once()doesn't verify that the file content was correctly parsed and passed to the service.🔧 Strengthened assertion (optional)
assert response.status_code == 201 assert response.json()["investigation"]["id"] == "imported-inv" - import_mock.assert_awaited_once() + import_mock.assert_awaited_once() + call_args = import_mock.call_args + # Verify bundle content was passed (second positional arg after session) + assert call_args is not None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/tests/unit/test_investigation.py` around lines 390 - 456, The test should verify the actual parsed bundle passed into bracc.routers.investigation.svc.import_investigation_bundle, not just that the service was called; after the POST and import_mock.assert_awaited_once(), retrieve the awaited call args from import_mock (e.g. import_mock.await_args or import_mock.call_args) and assert the first positional arg equals the expected parsed payload dictionary (the same structure as payload/investigation/annotations/tags) so the test validates the router correctly parsed the uploaded file and forwarded the expected bundle to import_investigation_bundle.
1-33: File exceeds 400-line limit; consider splitting.This test file is 914 lines, well beyond the 400-line limit specified in the coding guidelines. Consider splitting into separate modules, e.g.:
test_investigation_core.py— CRUD operationstest_investigation_sharing.py— share/import/fork flowstest_investigation_cypher.py— Cypher integrity checksThe new imports and cypher file entries are correctly added for the new shared investigation flows.
As per coding guidelines: "Python modules must not exceed 400 lines".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/tests/unit/test_investigation.py` around lines 1 - 33, The test module test_investigation.py exceeds the 400-line guideline; split it into smaller focused test modules (e.g., test_investigation_core.py for CRUD tests that exercise investigation_create, investigation_get, investigation_list, investigation_update, investigation_delete; test_investigation_sharing.py for sharing/import/fork flows that exercise investigation_share, investigation_by_token, investigation_shared_list, investigation_shared_count; and test_investigation_cypher.py for cypher-related checks that reference CypherLoader and the INVESTIGATION_CYPHER_FILES list). Move shared fixtures/constants (FAKE_PDF, INVESTIGATION_CYPHER_FILES) into a common test fixture module or conftest.py and update imports in the new modules; ensure tests still import investigation_service and CypherLoader where needed and run unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@api/tests/unit/test_investigation.py`:
- Around line 390-456: The test should verify the actual parsed bundle passed
into bracc.routers.investigation.svc.import_investigation_bundle, not just that
the service was called; after the POST and import_mock.assert_awaited_once(),
retrieve the awaited call args from import_mock (e.g. import_mock.await_args or
import_mock.call_args) and assert the first positional arg equals the expected
parsed payload dictionary (the same structure as
payload/investigation/annotations/tags) so the test validates the router
correctly parsed the uploaded file and forwarded the expected bundle to
import_investigation_bundle.
- Around line 1-33: The test module test_investigation.py exceeds the 400-line
guideline; split it into smaller focused test modules (e.g.,
test_investigation_core.py for CRUD tests that exercise investigation_create,
investigation_get, investigation_list, investigation_update,
investigation_delete; test_investigation_sharing.py for sharing/import/fork
flows that exercise investigation_share, investigation_by_token,
investigation_shared_list, investigation_shared_count; and
test_investigation_cypher.py for cypher-related checks that reference
CypherLoader and the INVESTIGATION_CYPHER_FILES list). Move shared
fixtures/constants (FAKE_PDF, INVESTIGATION_CYPHER_FILES) into a common test
fixture module or conftest.py and update imports in the new modules; ensure
tests still import investigation_service and CypherLoader where needed and run
unchanged.
In `@frontend/src/App.test.tsx`:
- Around line 32-43: The test manually resets authState.token at the end of the
"keeps /shared reachable for authenticated users" test which is fragile; instead
add a shared cleanup using afterEach that clears authState.token so it always
runs regardless of test outcome. Locate the test referencing authState and the
App component (MemoryRouter + initialEntries "/shared") and add an afterEach
hook that sets authState.token = null (or restores a known initial state) to
ensure no cross-test pollution.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a5b81c8a-25da-482f-8ffe-3b66e0c0f80d
📒 Files selected for processing (5)
api/src/bracc/queries/investigation_shared_count.cypherapi/src/bracc/services/investigation_service.pyapi/tests/unit/test_investigation.pyfrontend/src/App.test.tsxfrontend/src/App.tsx
✅ Files skipped from review due to trivial changes (1)
- api/src/bracc/queries/investigation_shared_count.cypher
🚧 Files skipped from review as they are similar to previous changes (1)
- api/src/bracc/services/investigation_service.py
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 878c229159
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| user_id: str, | ||
| title: str | None = None, | ||
| ) -> InvestigationImportResponse | None: | ||
| shared_investigation = await get_shared_investigation(session, token) |
There was a problem hiding this comment.
Fork shared bundles from unmasked source data
fork_shared_investigation is copying entity_ids from get_shared_investigation, but that path applies public CPF masking before returning data. For shared investigations that contain CPF-based entities, the fork bundle now carries masked IDs like ***.***.***.01, so import_investigation_bundle cannot resolve those entities in add_entity_to_investigation, and dependent annotations are skipped as well. This breaks the new “continue from this research” flow for person-centric investigations; the fork path needs to read unmasked token data instead of the public-masked representation.
Useful? React with 👍 / 👎.
Summary
This PR completes the remaining work behind #47, #53, #32, and #33. It adds real shared investigation import/share/fork flows, expands backend coverage around chat and patterns, and hardens the transparency tools fallback path for Brave, DuckDuckGo, and PNCP.
Root cause
The investigation feature only shared an ID/token view without annotations or tags, had no import/fork flow, and copied an incorrect public URL. Separately, the web search fallback still degraded opaquely under provider failures, and PNCP calls could still fail because the API expects compact dates and a required modalidade parameter.
Changes
Validation
=============================== warnings summary ===============================
tests/unit/test_chat.py::test_call_openrouter_executes_tool_calls_and_collects_evidence[asyncio]
/Users/lucaspassos/Documents/EGOS-Inteligencia/api/src/bracc/routers/chat.py:748: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
"timestamp": datetime.utcnow().isoformat() + "Z",
tests/unit/test_investigation.py: 19 warnings
/Users/lucaspassos/Documents/EGOS-Inteligencia/api/.venv/lib/python3.12/site-packages/jwt/api_jwt.py:153: InsecureKeyLengthWarning: The HMAC key is 23 bytes long, which is below the minimum recommended length of 32 bytes for SHA256. See RFC 7518 Section 3.2.
return self._jws.encode(
tests/unit/test_investigation.py: 19 warnings
/Users/lucaspassos/Documents/EGOS-Inteligencia/api/.venv/lib/python3.12/site-packages/jwt/api_jwt.py:371: InsecureKeyLengthWarning: The HMAC key is 23 bytes long, which is below the minimum recommended length of 32 bytes for SHA256. See RFC 7518 Section 3.2.
decoded = self.decode_complete(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
57 passed, 39 warnings in 0.28s
=============================== warnings summary ===============================
.venv/lib/python3.12/site-packages/testcontainers/neo4j/init.py:63
/Users/lucaspassos/Documents/EGOS-Inteligencia/api/.venv/lib/python3.12/site-packages/testcontainers/neo4j/init.py:63: DeprecationWarning: The @wait_container_is_ready decorator is deprecated and will be removed in a future version. Use structured wait strategies instead: container.waiting_for(HttpWaitStrategy(8080).for_status_code(200)) or container.waiting_for(LogMessageWaitStrategy('ready'))
@wait_container_is_ready()
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
13 skipped, 1 warning in 0.10s
RUN v3.2.4 /Users/lucaspassos/Documents/EGOS-Inteligencia/frontend
✓ src/pages/SharedInvestigations.test.tsx (1 test) 62ms
✓ src/components/investigation/InvestigationPanel.test.tsx (5 tests) 106ms
✓ src/pages/SharedInvestigation.test.tsx (4 tests) 103ms
Test Files 3 passed (3)
Tests 10 passed (10)
Start at 21:34:17
Duration 4.79s (transform 529ms, setup 1.41s, collect 2.03s, tests 271ms, environment 9.54s, prepare 456ms)
Risk
Low to moderate. The new investigation sharing/import paths change API surface area and frontend flows, but they are covered by targeted tests. Integration coverage still skips when Docker is unavailable, so testcontainers-backed execution was not exercised end-to-end in this environment.
Summary by CodeRabbit
New Features
Bug Fixes