Skip to content

feat(composer): click pasted/attached image thumbnails to lightbox-zoom them#1758

Merged
1 commit merged intomasterfrom
fix/composer-chip-lightbox
May 6, 2026
Merged

feat(composer): click pasted/attached image thumbnails to lightbox-zoom them#1758
1 commit merged intomasterfrom
fix/composer-chip-lightbox

Conversation

@nesquena-hermes
Copy link
Copy Markdown
Collaborator

feat(composer): click pasted/attached image thumbnails to lightbox-zoom them

Why

When pasting screenshots into the composer — especially multiple in sequence, which now works end-to-end with the companion Mac fix at hermes-webui/hermes-swift-mac#74 — users had no way to verify which image actually attached. The 56×56 thumbnail in the chip is enough to confirm "an image is here" but offers no detail at all.

From the request:

"When I hit Cmd+C and save an image to the clipboard and then paste the clipboard out, I want to be able to click on any one of those uploaded images that's inside the composer bar and have it zoom up like a lightbox so I can see the image in full once it's been pasted in to the composer input."

This is paired with the Mac sequential-paste fix in PR #74 — together they form a clean v0.51.13 + v1.6.4 release: paste multiple screenshots in a row, see them all as distinct chips, click any one to verify before sending.

What changes

static/ui.js — extend the existing lightbox delegate

The image lightbox infrastructure already exists for message-attached images (the _openImgLightbox() helper at line 269 + the document-level click delegate at line 298 for .msg-media-img). This PR extends the same delegate to also fire on .attach-thumb composer chips:

document.addEventListener('click', e => {
  if(!e.target || !e.target.closest) return;
  // Message-attached images (already wired since v0.50.x).
  let img = e.target.closest('.msg-media-img');
  if(img){ _openImgLightbox(img.src, img.alt); return; }
  // Composer attach-tray image thumbnails — click any pasted/dropped image
  // chip to lightbox-zoom it before sending. Excludes audio/video chips,
  // which keep their inline media controls. SVG thumbnails (.attach-thumb--svg)
  // are still images visually, so they qualify.
  img = e.target.closest('.attach-thumb');
  if(img && img.tagName === 'IMG'){
    _openImgLightbox(img.src, img.alt || img.title || 'Attached image');
    return;
  }
});

Three correctness properties:

  • Audio/video chips are excluded. They render .attach-chip--audio or .attach-chip--video with native <audio> / <video> controls — no .attach-thumb img is ever produced for them, so the closest() returns null and the handler does nothing.
  • SVG thumbnails qualify. .attach-thumb attach-thumb--svg is still an <img>; users want to zoom SVGs the same way they zoom rasters.
  • The chip's × remove button is a sibling, not an ancestor, of the thumb. Clicking × → e.target.closest('.attach-thumb') returns null → the handler doesn't fire → the existing remove handler runs unobstructed. No interference.

static/style.css — affordance

Two-line change to make the click discoverable:

/* Was: cursor: default — actively misleading. */
.attach-thumb{... cursor:zoom-in; transition:filter .12s ease, transform .12s ease;}
.attach-thumb:hover{filter:brightness(1.05); transform:scale(1.04);}

cursor: zoom-in is the standard browser cursor for "click to enlarge" — instantly tells the user the thumbnail is interactive. The hover emphasis (4% scale + slight brightness) gives instant visual feedback before click.

tests/test_composer_chip_lightbox.py — 5 regression tests

Pin the wiring at the source level:

  1. Document click delegate handles .attach-thumb clicks.
  2. Document click delegate still handles .msg-media-img (no regression on the v0.50.x behaviour).
  3. Audio/video chips never render an .attach-thumb img.
  4. .attach-thumb cursor is zoom-in (catches anyone reverting to cursor:default).
  5. .attach-thumb:hover emphasis rule present.

Verification

Tests

47 composer-area tests pass (the new file plus all existing paste/composer/attach tests). 0 regressions.

Live browser check on port 8789

Programmatically simulated three sequential pastes via addFiles() with the new Mac-side filename pattern (screenshot-1778097130001-1.png, -2, -3):

{
  count: 3,
  names: ["screenshot-...001-1.png", "...002-2.png", "...003-3.png"],
  chips: 3,
  thumbs: 3,
  cursor: "zoom-in"
}

All three distinct files in pendingFiles, all three chips render, cursor is zoom-in. ✓

Click thumb #2:

{
  thumbs_count: 3,
  lightbox_before: 0,
  lightbox_after: 1,
  lightbox_img_src_present: true,
  lightbox_alt: "screenshot-1778097130002-2.png",
  lightbox_role: "dialog"
}

Lightbox opens with the right image, ARIA role is dialog, alt text matches the filename. ✓

Click × on chip #2 (sibling-of-thumb, after closing lightbox):

{ beforeLb: 0, afterLb: 0, remainingChips: 2, remainingNames: [...] }

× removes only that chip, no lightbox interference. ✓

Escape key:

{ lbBefore: 1, lbAfter: 0 }

Closes the lightbox — the existing _escHandler in _openImgLightbox() handles it. ✓

Companion PR

hermes-webui/hermes-swift-mac#74 — unique filename per paste so sequential screenshot pastes actually appear as distinct chips. Without that, the second paste's 'screenshot.png' collides with the first in addFiles()'s f.name-keyed dedupe and silently disappears. Together: paste, paste, paste, click any to zoom-verify, send.

Refs

Refs #1733.

…om them

When pasting screenshots into the composer (especially multiple in
sequence, now possible end-to-end with hermes-webui/hermes-swift-mac
PR #74) the user has no way to verify the right image attached. The
56x56 thumbnail in the chip is fine as a UI affordance but offers no
detail at all. Quote from the request:

  When I hit Cmd+C and save an image to the clipboard and then paste
  the clipboard out, I want to be able to click on any one of those
  uploaded images that's inside the composer bar and have it zoom up
  like a lightbox so I can see the image in full once it's been
  pasted in to the composer input.

The lightbox infrastructure already exists for message-attached
images (static/ui.js:269 _openImgLightbox + the doc-level click
delegate at :298 for .msg-media-img). This PR extends the same
delegate to also fire on .attach-thumb composer chips:

  - Clicking the thumbnail opens the existing image lightbox with the
    blob URL as src and the file name as alt text.
  - Audio/video chips are excluded (they have their own native
    <audio> / <video> controls and don't render an .attach-thumb
    img).
  - SVG thumbnails (.attach-thumb attach-thumb--svg) qualify — they
    are images visually.
  - The chip's x remove button is a sibling, not an ancestor, of the
    thumb — closest('.attach-thumb') from the button returns null,
    so removing still works without lightbox interference.

Also updates static/style.css:
  - cursor: zoom-in on .attach-thumb (was cursor: default — actively
    misleading).
  - Subtle :hover emphasis (brightness 1.05 + scale 1.04, 120ms ease)
    so users discover the affordance before clicking.

5 regression tests in tests/test_composer_chip_lightbox.py pinning:
  - delegate handles .attach-thumb on IMG elements
  - delegate still handles .msg-media-img (no regression)
  - audio/video chips do NOT render an .attach-thumb img
  - cursor:zoom-in declared on the .attach-thumb selector
  - hover emphasis rule present

Browser-verified live on port 8789:
  - addFiles three distinct screenshot files (mimicking three Mac
    sequential pastes) -> 3 chips, 3 thumbs, all distinct.
  - Click thumb #2 -> lightbox opens with the right image, alt text
    matches filename.
  - Click x on chip #2 -> removes that chip, no lightbox.
  - Escape key closes lightbox.

Companion PR on the Mac side:
hermes-webui/hermes-swift-mac#74 (unique filename per paste so
sequential pastes actually appear as distinct chips).

Refs #1733.
Copy link
Copy Markdown
Owner

@nesquena nesquena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — end-to-end ✅ (clean approve, headless-Chrome harness verifies all 4 click paths)

What this ships

Self-built PR adding lightbox-zoom behavior to composer attachment chips so users can verify the right image attached before sending. Refs #1733; paired with companion Mac PR hermes-webui/hermes-swift-mac#74 (unique filenames per sequential paste). +115/-4 across 3 files:

  • static/ui.js:298-313 — extends the existing document-level click delegate (which already routes .msg-media-img_openImgLightbox) to also fire on .attach-thumb IMG elements. Branch order: message-attached image → composer-chip thumb → no-op.
  • static/style.css:897-898.attach-thumb cursor changed from default (actively misleading) to zoom-in; adds a :hover emphasis rule (4% scale + 5% brightness, 120ms ease) for instant visual feedback.
  • tests/test_composer_chip_lightbox.py — 5 regression tests across 2 classes (delegate wiring + cursor affordance).

Traced against upstream hermes-agent

Frontend-only — pure DOM event delegation in static/ui.js, no server round-trip, no config.yaml write, no session-state mutation, no agent IPC. Cross-tool surface: zero. ✓

End-to-end trace

Pre-fix flow (1733's symptom):

  1. User pastes screenshot → paste listener at static/boot.js synthesises File objects → addFiles() populates S.pendingFilesrenderTray() at static/ui.js:6264 generates an .attach-chip attach-chip--media attach-chip--image div containing <img class="attach-thumb" src="${blobUrl}" alt="${name}"> + <button>×</button>.
  2. User clicks the 56×56 thumbnail to verify the image. Nothing happens — the document-level click delegate at static/ui.js only matched .msg-media-img. ❌

Post-fix flow:

  1. Same up to step 2.
  2. User clicks the thumbnail.
  3. Document delegate at static/ui.js:298 now has a second branch: e.target.closest('.attach-thumb') matches the IMG, tagName==='IMG' guard passes, calls _openImgLightbox(img.src, img.alt || img.title || 'Attached image').
  4. Lightbox opens with the blob URL as src and the filename as alt + ARIA-label. ✅

Behavioural harness — headless Chrome end-to-end

Built /tmp/test_lightbox_delegate.html mirroring the real renderTray() chip structure (image chip with sibling × button, audio chip with <audio> controls and × button, second image chip), wired the actual delegate code from this PR, and ran via Google Chrome --headless --dump-dom:

thumbs found: 2
after clicking thumb 0: lightboxes=1, alt=screenshot-1.png             ← happy path ✓
after clicking remove button on chip 1: lightboxes=0 (should be 0)     ← × sibling, no interference ✓
after clicking audio chip remove button: lightboxes=0 (should be 0)    ← audio chip × ignored ✓
after clicking audio element: lightboxes=0 (should be 0)               ← audio element ignored ✓

All four click paths behave correctly. The PR description claims closest('.attach-thumb') from the × button returns null because the button is a sibling-of-thumb — verified empirically. Audio/video chips genuinely have no .attach-thumb IMG (verified by reading renderTray() at static/ui.js:6276-6283) so the delegate can't fire on them by construction.

Pre-fix test verification

Reverted static/ui.js and static/style.css to origin/master and ran the new tests:

4 failed, 1 passed
- test_delegate_handles_attach_thumb_clicks         FAIL  (delegate doesn't match .attach-thumb)
- test_delegate_still_handles_message_attached_images FAIL (different anchor block on master)
- test_attach_thumb_cursor_is_zoom_in              FAIL  (cursor is :default on master)
- test_attach_thumb_has_hover_emphasis             FAIL  (no :hover rule on master)
- test_delegate_excludes_audio_video_chips         PASS  (asserts existing audio/video render shape)

The right shape of regression coverage — 4/5 are wired tightly enough to catch a regression to the previous state.

Other audit — things that are correct already

  • renderTray() chip shape verified at static/ui.js:6276-6283: image chip = <img class="attach-thumb"> + <button> (siblings); SVG chip = <img class="attach-thumb attach-thumb--svg"> + <button>; audio chip = <span>...</span><audio controls> + <button> (no .attach-thumb); video chip = <span>...</span><video controls> + <button> (no .attach-thumb). The tagName === 'IMG' guard is technically redundant given closest('.attach-thumb') already filters, but it's free defense-in-depth against a future contributor adding .attach-thumb to a non-IMG element.
  • f.name injected via ${esc(f.name)} in alt/title attributes at static/ui.js:6277. The _openImgLightbox reads img.alt (the DOM-decoded form, not raw HTML) and writes it via setAttribute('aria-label', alt) (which serializes safely without re-injection) and img.alt = alt (property assignment, not innerHTML). No XSS surface added.
  • Blob URL as src: pasted images get URL.createObjectURL(f) blob URLs at static/ui.js:6275. Origin-bound, locally-scoped, no remote URL injection. Lightbox renders the same blob URL.
  • Existing .msg-media-img branch preserved with the literal anchor let img = e.target.closest('.msg-media-img'); if(img){ _openImgLightbox(img.src, img.alt); return; } (test test_delegate_still_handles_message_attached_images locks this).
  • Branch ordering matters and is correct: .msg-media-img matched first, then .attach-thumb. A .msg-media-img couldn't simultaneously be an .attach-thumb (different DOM scopes), but the early return after the first match keeps the logic clean.
  • Escape key + click-outside close are inherited from the existing _openImgLightbox (via lb._escHandler and lb.onclick). Verified by inspection of static/ui.js:269-291.
  • Memory hygiene: blob URLs are revoked on chip remove at static/ui.js:6289 (URL.revokeObjectURL(chip.dataset.blobUrl)). Lightbox doesn't take ownership; just reads the same blob URL the chip's IMG already holds. When the chip is removed (× clicked), the blob is revoked — and any open lightbox showing the same URL would lose its image, which is acceptable (user's intent on × is to remove the attachment).
  • Cursor affordance: cursor: zoom-in is the standard CSS cursor for "click to enlarge" — instantly recognisable. Pre-fix cursor: default falsely advertised non-interactivity.
  • No new endpoints, no auth changes, no Python touched.

Edge-case trace

Scenario Expected Actual
Single image chip → click thumb lightbox opens with image ✅ (harness)
Single image chip → click × button chip removed, no lightbox ✅ (harness)
Multiple images → click thumb #2 lightbox shows image #2 with that alt ✅ (harness)
Audio chip → click × button chip removed, no lightbox ✅ (harness)
Audio chip → click audio element audio plays, no lightbox ✅ (harness)
Video chip → click video element video plays, no lightbox ✅ (no .attach-thumb in chip)
SVG chip → click thumb lightbox opens (SVG renders inside <img>) ✅ (.attach-thumb attach-thumb--svg is still IMG)
Plain file chip (paperclip) → click filename nothing happens ✅ (no .attach-thumb rendered)
Message-attached image (existing v0.50.x) → click lightbox opens (unchanged) ✅ (.msg-media-img branch preserved)
Lightbox open → Escape closes ✅ (existing _escHandler)
Lightbox open → click outside img closes ✅ (existing lb.onclick)
Lightbox open → click × button closes ✅ (existing close button)
Cross-tool: agent CLI doesn't render chips n/a
cursor: zoom-in declared on .attach-thumb yes
:hover emphasis rule present yes (brightness 1.05, scale 1.04)
× button is sibling, not ancestor, of .attach-thumb yes (renderTray() confirmed)

Tests

  • tests/test_composer_chip_lightbox.py — 5/5 pass on the PR branch. 4/5 verified to fail on origin/master (the right shape of regression coverage; only the audio/video-exclusion test passes pre-fix because it asserts existing render shape).
  • node -c static/ui.js — clean.
  • Full suite (excluding pre-existing macOS bash 3.2 test_ctl_script.py failures unrelated to this PR): 4578 passed, 57 skipped, 3 xpassed, 0 failed in 43.28s.
  • Behavioural harness via headless Chrome: 4 click paths (thumb-image, ×-on-image, ×-on-audio, audio-element) all behave correctly.

Minor observations (non-blocking)

  • The closest('.msg-media-img') branch returns lightbox(img.src, img.alt) (no fallback to img.title || 'Attached image'), while the new .attach-thumb branch does have that fallback. This asymmetry is fine — message-attached images render with explicit alt text via the message-renderer; chip thumbs render alt=${esc(f.name)} so alt is always populated too. The fallback is purely defensive.
  • No keyboard support for the chip thumb — a click handler only. Users who tab into the chip and press Enter won't get the lightbox. Symmetric with the existing .msg-media-img behavior, so not a regression. Worth a follow-up that adds tabindex="0" + Enter/Space on the chip's IMG; out of scope here.
  • cursor: zoom-in is a Webkit-/Blink-supported cursor but not universally — Firefox honours it; some older engines fall back to pointer. Acceptable graceful degradation.
  • The two transition properties (filter and transform) are kept short (.12s ease) so the hover effect feels snappy. Matches the lightbox's own .12s open animation.
  • PR description mentions paired Mac PR #74 for sequential-paste filename uniqueness; tested case uses screenshot-<ts>-1.png/-2/-3 filenames — matching the contract from PR #1706 (which previously fixed the same bug shape on the WebUI paste-handler side). Sequential-paste end-to-end now works across both surfaces.

Recommendation

Approved. Surgical UI feature with correct scope: extends one existing delegate, adds two CSS lines, wires 5 regression tests. Behavioural harness via headless Chrome verifies all 4 click paths (thumb-opens-lightbox, ×-removes-chip, audio-×-no-lightbox, audio-click-no-lightbox) end-to-end. Pre-fix verification confirms 4/5 tests catch regressions to the previous state (the 5th asserts unchanged audio/video render shape).

Cross-tool safe (frontend-only, no agent surface). XSS audit clean — alt/title come from ${esc(f.name)} and reach the lightbox via DOM property reads + safe setAttribute writes. Pairs with the companion Mac PR for sequential-paste filename uniqueness.

Parked at approval — ready for the release agent's merge/tag pipeline.

@nesquena-hermes nesquena-hermes closed this pull request by merging all changes into master in 704f8ab May 6, 2026
pull Bot pushed a commit to soitun/hermes-webui that referenced this pull request May 6, 2026
pull Bot pushed a commit to soitun/hermes-webui that referenced this pull request May 6, 2026
Constituent PR:
- nesquena#1758 (@nesquena-hermes) — feat(composer): click pasted/attached image
  thumbnails to lightbox-zoom them. Refs nesquena#1733. Companion Mac PR
  hermes-webui/hermes-swift-mac#74 for sequential-paste filename uniqueness.

Independent review: @nesquena APPROVED with exhaustive headless-Chrome
behavioural harness verifying all 4 click paths (thumb-image, ×-on-image,
×-on-audio, audio-element). Pre-fix verification confirmed 4/5 of the new
tests catch regressions to the previous state.

Opus advisor: SHIP, all 6 verification questions clean. One non-blocking
nit absorbed in-release: wrap .attach-thumb:hover in @media (hover: hover)
for iPad sticky-hover hygiene (3-LOC defensive cleanup).

Tests: 4637 → 4642 collected (+5). 4630 passed, 9 skipped, 3 xpassed,
0 failed.

Pre-release verification:
- pytest 4630 passed, 0 failed
- node -c clean on static/ui.js
- 11/11 browser API endpoints PASS
- Pre-stamp re-fetch: PR head still matches local rebase
- Opus advisor: SHIP, 0 MUST-FIX

Refs nesquena#1733.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants