fix: save image to temp file and include path in prompt for Claude CLI #140

Open
haripatel07 wants to merge 3 commits into RichardAtCT:main from haripatel07:fix/image-upload-analysis

Conversation

@haripatel07
Contributor

Problem

Image uploads were silently broken. process_image() built a text prompt
saying "I'm sharing a screenshot" but never actually passed the image to Claude.
The Claude CLI SDK only accepts text prompts — so the image was completely
unreachable and Claude would respond "I don't see any image attached."

Root Cause

ImageHandler.process_image() downloaded the image bytes and encoded them
as base64, but _handle_agentic_media_message only forwards a prompt: str
to claude_integration.run_command(). The image data was stored in
ProcessedImage.base64_data but never used.

Fix

Download the image to /tmp/claude_bot_files/image_<uuid>.<ext> and embed
the file path directly in the prompt so Claude CLI can read the image from disk.

# Before — Claude never saw the image
prompt = "I'm sharing a screenshot with you. Please analyze it..."

# After — Claude can read the actual file
prompt = "I'm sharing a screenshot with you. The image is saved at: /tmp/claude_bot_files/image_abc123.png\n\nPlease analyze it..."

All four prompt builders updated: _create_screenshot_prompt,
_create_diagram_prompt, _create_ui_prompt, _create_generic_prompt.
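
A minimal sketch of what a path-aware prompt builder might look like. The function name `create_screenshot_prompt` and the exact wording are assumptions for illustration; the real `_create_screenshot_prompt` method is not shown in this thread.

```python
from pathlib import Path
from typing import Optional


def create_screenshot_prompt(caption: Optional[str], image_path: Path) -> str:
    """Build a text prompt that states where the image lives on disk.

    Hypothetical stand-in for the PR's _create_screenshot_prompt method.
    """
    parts = [
        f"I'm sharing a screenshot with you. The image is saved at: {image_path}",
        "Please analyze it.",
    ]
    if caption:
        # Append the user's caption only when one was supplied.
        parts.append(f"Specific request: {caption}")
    return "\n\n".join(parts)
```

Passing the path as a parameter keeps the builder a pure function of its inputs, so the prompt text can be unit-tested without touching the filesystem.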

Testing

  • 491 passed, 0 failed
  • No regressions

Closes #137

Image uploads were silently broken — process_image() built a text prompt
saying 'I'm sharing a screenshot' but never passed the image data to Claude.
The Claude CLI SDK only accepts text prompts, so the image was unreachable.

Fix: download the image bytes to /tmp/claude_bot_files/image_<uuid>.<ext>
and embed the file path in the prompt so Claude CLI can read the image
directly from disk.

All four prompt builders (_create_screenshot_prompt, _create_diagram_prompt,
_create_ui_prompt, _create_generic_prompt) updated to accept and include
the image_path parameter.

Closes RichardAtCT#137
Copilot AI review requested due to automatic review settings March 7, 2026 15:11

Copilot AI left a comment

Pull request overview

Fixes broken image uploads by saving Telegram photo bytes to a temp file and embedding that file path into the prompt sent to Claude CLI, allowing the CLI to access the actual image content from disk.

Changes:

  • Save downloaded image bytes to /tmp/claude_bot_files/image_<uuid>.<ext>.
  • Embed the saved file path into all image prompt builders (screenshot/diagram/UI/generic).
  • Include the temp file path in ProcessedImage.metadata.

Comment on lines +52 to +55
temp_dir = Path("/tmp/claude_bot_files")
temp_dir.mkdir(exist_ok=True)
image_path = temp_dir / f"image_{uuid.uuid4()}.{ext}"
image_path.write_bytes(bytes(image_bytes))

Copilot AI Mar 7, 2026

The uploaded bytes are written to a fixed directory under /tmp without restricting permissions. By default this directory/file may be world-readable (depending on umask), which is risky for screenshots that can contain sensitive info. Consider creating the directory with a restrictive mode (e.g., 0o700) and writing the file with 0o600, or using tempfile APIs that create private files.
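
One way the suggestion above could be sketched on a POSIX system. The helper name `write_private_file` is hypothetical and not part of the PR; only standard-library calls are used.

```python
import os
import uuid
from pathlib import Path


def write_private_file(directory: Path, data: bytes, ext: str) -> Path:
    """Write data to a new file that only the current user can read (POSIX)."""
    # mkdir's mode is only applied when the directory is created and is
    # filtered by the umask, so chmod afterwards to enforce 0o700 either way.
    # chmod raises PermissionError if another user owns the directory.
    directory.mkdir(mode=0o700, exist_ok=True)
    directory.chmod(0o700)

    path = directory / f"image_{uuid.uuid4()}.{ext}"
    # O_EXCL fails if the path already exists; passing 0o600 at creation
    # means the file is never briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path
```

Setting the mode at creation time, rather than calling `chmod` after `write_bytes`, avoids a short window in which the file exists with default permissions.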

Copilot uses AI. Check for mistakes.
Comment on lines 52 to 81
+ temp_dir = Path("/tmp/claude_bot_files")
+ temp_dir.mkdir(exist_ok=True)
+ image_path = temp_dir / f"image_{uuid.uuid4()}.{ext}"
+ image_path.write_bytes(bytes(image_bytes))

- # Create appropriate prompt
+ # Detect image type for prompt tailoring
+ image_type = self._detect_image_type(bytes(image_bytes))

+ # Build prompt with actual file path so Claude CLI can see the image
  if image_type == "screenshot":
-     prompt = self._create_screenshot_prompt(caption)
+     prompt = self._create_screenshot_prompt(caption, image_path)
  elif image_type == "diagram":
-     prompt = self._create_diagram_prompt(caption)
+     prompt = self._create_diagram_prompt(caption, image_path)
  elif image_type == "ui_mockup":
-     prompt = self._create_ui_prompt(caption)
+     prompt = self._create_ui_prompt(caption, image_path)
  else:
-     prompt = self._create_generic_prompt(caption)
+     prompt = self._create_generic_prompt(caption, image_path)

- # Convert to base64 for Claude (if supported in future)
  base64_image = base64.b64encode(image_bytes).decode("utf-8")

  return ProcessedImage(
      prompt=prompt,
      image_type=image_type,
      base64_data=base64_image,
      size=len(image_bytes),
      metadata={
-         "format": self._detect_format(image_bytes),
+         "format": fmt,
          "has_caption": caption is not None,
+         "temp_path": str(image_path),
      },
  )

Copilot AI Mar 7, 2026

This introduces a new persistent temp file on disk, but nothing deletes it after Claude finishes. Over time this can fill /tmp and leak old user data. Suggest adding a cleanup path (e.g., have the caller delete ProcessedImage.metadata['temp_path'] in a finally block after run_command, or implement a TTL-based cleanup routine for /tmp/claude_bot_files).
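
Both cleanup options mentioned above can be sketched roughly as follows. The names `run_with_cleanup` and `purge_older_than` are hypothetical, and `run_command` here is any async callable that takes a prompt, not the project's real `claude_integration.run_command` signature.

```python
import time
from pathlib import Path
from typing import Awaitable, Callable


async def run_with_cleanup(
    temp_path: Path,
    prompt: str,
    run_command: Callable[[str], Awaitable[str]],
) -> str:
    """Run the command, then delete the temp image whether or not it succeeded."""
    try:
        return await run_command(prompt)
    finally:
        # missing_ok=True (Python 3.8+) makes the cleanup safe to repeat.
        temp_path.unlink(missing_ok=True)


def purge_older_than(directory: Path, max_age_seconds: float) -> int:
    """TTL-style sweep: remove image_* files last modified before the cutoff."""
    cutoff = time.time() - max_age_seconds
    removed = 0
    for entry in directory.glob("image_*"):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink(missing_ok=True)
            removed += 1
    return removed
```

The `finally` block handles the normal request path; a periodic sweep is a backstop for files orphaned by a crash or restart.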

Comment on lines +49 to +55
# Detect format and save to temp file so Claude CLI can read it
fmt = self._detect_format(bytes(image_bytes))
ext = fmt if fmt != "unknown" else "jpg"
temp_dir = Path("/tmp/claude_bot_files")
temp_dir.mkdir(exist_ok=True)
image_path = temp_dir / f"image_{uuid.uuid4()}.{ext}"
image_path.write_bytes(bytes(image_bytes))

Copilot AI Mar 7, 2026

If _detect_format() returns "unknown", the code still writes the bytes to disk and forces a .jpg extension. That can produce misleading files and bypass the existing size/format checks in validate_image(). Consider validating (size + known format) before writing, and raising a clear error instead of defaulting to .jpg.
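
A rough illustration of validating before writing. The project's `_detect_format()` and `validate_image()` are not shown in this thread, so `detect_format`, `validate_before_write`, and the 10 MiB limit below are assumptions; the magic-byte signatures themselves are the standard ones for these formats.

```python
from typing import Optional

# Leading "magic" bytes for a few common image formats.
_SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def detect_format(data: bytes) -> Optional[str]:
    """Return a format name based on the file signature, or None if unknown."""
    for name, magics in _SIGNATURES.items():
        if data.startswith(magics):
            return name
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def validate_before_write(data: bytes, max_bytes: int = 10 * 1024 * 1024) -> str:
    """Raise ValueError instead of guessing an extension for unknown data."""
    if len(data) > max_bytes:
        raise ValueError(f"image too large: {len(data)} bytes")
    fmt = detect_format(data)
    if fmt is None:
        raise ValueError("unsupported or unrecognised image format")
    return fmt
```

Because the check runs before any file is created, a rejected upload leaves nothing on disk to clean up.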

Comment on lines +42 to +58
"""Process uploaded image — save to temp file and build a path-based prompt."""
+ import uuid

- # Download image
+ # Download image bytes
  file = await photo.get_file()
  image_bytes = await file.download_as_bytearray()

- # Detect image type
- image_type = self._detect_image_type(image_bytes)
+ # Detect format and save to temp file so Claude CLI can read it
+ fmt = self._detect_format(bytes(image_bytes))
+ ext = fmt if fmt != "unknown" else "jpg"
+ temp_dir = Path("/tmp/claude_bot_files")
+ temp_dir.mkdir(exist_ok=True)
+ image_path = temp_dir / f"image_{uuid.uuid4()}.{ext}"
+ image_path.write_bytes(bytes(image_bytes))

- # Create appropriate prompt
+ # Detect image type for prompt tailoring
+ image_type = self._detect_image_type(bytes(image_bytes))

Copilot AI Mar 7, 2026

Minor maintainability: uuid is imported inside the method and the image bytes are repeatedly converted via bytes(image_bytes). Consider importing uuid at module scope (consistent with FileHandler) and assigning bytes_data = bytes(image_bytes) once to reuse for format/type detection and write_bytes.

@FridayOpenClawBot

PR Review
Reviewed head: b00d01812c1ff0761b3f8d8b40117d027d6546cc

Summary

  • Fixes silently broken image uploads by saving the image to /tmp/claude_bot_files/ and embedding the file path in the prompt, since the Claude CLI SDK only accepts text prompts
  • Correct diagnosis of the root cause; approach is pragmatic given the SDK constraint

What looks good

  • Root cause analysis is solid and the fix is the right shape given Claude CLI's text-only prompt interface
  • exist_ok=True on mkdir avoids race conditions; /tmp always exists so no parents=True needed
  • fmt is reused for both the extension and the metadata, avoiding a redundant second _detect_format() call
  • 491 tests passing with no regressions

Issues / questions

  1. [Blocker] src/bot/features/image_handler.py: Does embedding a file path in the prompt actually cause Claude CLI to read the image? The PR asserts this works but provides no evidence. If Claude CLI doesn't auto-read paths mentioned in prompts (it likely doesn't unless explicitly invoked with a file argument), the fix resolves nothing; it just changes the error from "I don't see an image" to "I can't read that path." Needs a smoke-test screenshot or confirmation of which Claude CLI flag/behaviour makes this work before merging.

  2. [Important] src/bot/features/image_handler.py: No temp file cleanup. Every image processed writes a file that is never deleted. On a busy bot this is an unbounded disk leak. Add a try/finally that unlinks the path after run_command() completes, or use tempfile.NamedTemporaryFile(delete=True) scoped to the handler call.

  3. [Important] src/bot/features/image_handler.py: import uuid is inside the function body. CLAUDE.md explicitly requires module-level imports. Move it to the top of the file next to the other stdlib imports.

  4. [Important] src/bot/features/image_handler.py: ProcessedImage.base64_data is now dead code. The base64 is still computed and stored but nothing uses it (the comment # Convert to base64 for Claude (if supported in future) was removed). Either remove the field + computation to avoid confusion, or leave the comment explaining why it's retained. As-is it's misleading.

  5. [Nit] src/bot/features/image_handler.py: _detect_image_type() is now called with bytes(image_bytes) after _detect_format() already converted it. The double conversion is harmless but consider assigning image_bytes_bytes = bytes(image_bytes) once at the top and reusing it.
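
For issue 2, the `tempfile.NamedTemporaryFile(delete=True)` option could look something like the sketch below. The wrapper name `with_temp_image` and the `use` callback are hypothetical. It assumes a POSIX system, where a second reader can open the file by name while the handle is still open; on Windows that is not generally possible.

```python
import tempfile
from pathlib import Path
from typing import Callable


def with_temp_image(data: bytes, suffix: str, use: Callable[[Path], str]) -> str:
    """Expose data as a named temp file for the duration of use(path)."""
    # delete=True (the default) removes the file when the context exits,
    # including when use() raises.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=True) as handle:
        handle.write(data)
        handle.flush()  # push buffered bytes out before anyone reads the path
        return use(Path(handle.name))
```

This ties the file's lifetime to a single call, at the cost of requiring the consumer to finish reading before the `with` block exits.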

Suggested tests (if needed)

  • Integration test: upload an image, assert the generated prompt string contains a valid path under /tmp/claude_bot_files/, and assert that path exists on disk at the time run_command() is called
  • Unit test: assert temp file is cleaned up after processing completes (once cleanup is added)
  • Confirm (manually or in CI) that the Claude CLI actually reads the image when given a path in the prompt text — this is the core assumption the fix rests on
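
The first two suggested tests might be sketched along these lines. `process_image` here is a simplified, synchronous stand-in written for this example, not the project's `ImageHandler.process_image`, and `fake_run_command` replaces the real Claude call, so this does not validate the third bullet.

```python
import tempfile
import uuid
from pathlib import Path
from typing import List, Tuple


def process_image(data: bytes, temp_dir: Path) -> Tuple[str, Path]:
    """Hypothetical stand-in: save the bytes and return (prompt, path)."""
    temp_dir.mkdir(mode=0o700, exist_ok=True)
    path = temp_dir / f"image_{uuid.uuid4()}.png"
    path.write_bytes(data)
    return f"The image is saved at: {path}", path


def test_prompt_path_exists_when_command_runs() -> None:
    observed: List[bool] = []

    def fake_run_command(prompt: str, expected: Path) -> None:
        # Record whether the file named in the prompt is really on disk.
        observed.append(str(expected) in prompt and expected.exists())

    with tempfile.TemporaryDirectory() as root:
        temp_dir = Path(root) / "claude_bot_files"
        prompt, path = process_image(b"\x89PNG\r\n\x1a\n", temp_dir)
        try:
            fake_run_command(prompt, path)
        finally:
            path.unlink(missing_ok=True)

        assert observed == [True]   # file existed while the command ran
        assert not path.exists()    # and was cleaned up afterwards
```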

Verdict
⚠️ Merge after fixes — the disk-leak and dead-code issues need addressing, and the import uuid is a quick style fix. More critically, the fundamental assumption (Claude CLI reads image paths from prompt text) should be validated before this ships; if it doesn't work that way, the approach needs rethinking entirely.

Friday, AI assistant to @RichardAtCT

Owner

@RichardAtCT RichardAtCT left a comment


- Move uuid import to module scope (CLAUDE.md style requirement)
- Validate image format before writing to disk; raise ValueError for unknown formats
- Use restrictive permissions: dir 0o700, file 0o600
- Assign image_bytes once, remove redundant bytes() conversions
- Explicitly instruct Claude to use Read tool for image path in all prompts
- Add finally block in agentic_photo to delete temp file after run_command
- Retain base64_data field with comment clarifying it is for future SDK support
@haripatel07
Contributor Author

Addressed all review feedback in 62f27f7:

  • uuid moved to module scope
  • Format validated before writing to disk — ValueError raised for unknown formats, .jpg fallback removed
  • 0o700 dir + 0o600 file permissions
  • Single bytes() conversion at top of method
  • All four prompt builders now explicitly say "Please read the image file at this path using your Read tool: {path}" — this is how Claude CLI accesses the image since the SDK has no native image attachment API (confirmed by inspecting client.py)
  • finally block in agentic_photo deletes temp file after run_command completes
  • base64_data retained with comment explaining it's reserved for future multimodal SDK support

491 passing, 0 failures.

@FridayOpenClawBot

PR Review (follow-up)
Reviewed head: 62f27f7
Previous head: b00d018

Previous issues — status

  1. Path-based prompt validation — Prompt strings now explicitly instruct Claude CLI to "read the image file at this path using your Read tool", which is a reasonable best-effort fix. Still untestable via static review, but the intent is documented.
  2. Temp file cleanup: finally block in orchestrator.py calls unlink(missing_ok=True). Covers the happy path and most error paths.
  3. import uuid inside function body — Moved to module-level imports.
  4. base64_data dead code — Field moved to optional (default ""), comment added: "Retained for future multimodal SDK support — not currently used." Acceptable.
  5. Double bytes() conversion: image_bytes is now cast once at download; subsequent uses pass it directly.

New observations

  • Cleanup gap on format error: process_image raises ValueError for unknown formats before writing the temp file, so cleanup is fine there. But if write_bytes or chmod throws, processed_image will be None in the finally block and the partially-written file (if any) won't be cleaned up. Low probability, but worth noting.
  • Dict[str, any] → Dict[str, object]: Good fix; any (lowercase) was a type annotation bug.
  • Missing type hints on _create_*_prompt methods — CLAUDE.md requires type hints on all functions. If these were pre-existing and untouched in scope, ignore; but the incremental diff shows them being modified, so they should carry full annotations.
  • Cleanup scope: Cleanup only runs for the agentic photo path. A context-manager wrapper on ProcessedImage would make this airtight long-term — not blocking, just noting for future.

Verdict
✅ Ready to merge — all blocking issues from the previous review are addressed. The remaining observations are minor and mostly pre-existing. Good iteration.

Friday, AI assistant to @RichardAtCT

@FridayOpenClawBot

PR Review (follow-up)
Reviewed head: 62f27f7
Previous head: b00d018

Previous issues — status

  1. Path-based prompt validation — Prompt strings now explicitly instruct Claude CLI to "read the image file at this path using your Read tool", which is a reasonable best-effort fix. Still untestable via static review, but the intent is documented.
  2. Temp file cleanup: finally block in orchestrator.py calls unlink(missing_ok=True). Covers the happy path and most error paths.
  3. import uuid inside function body — Moved to module-level imports.
  4. base64_data dead code — Field moved to optional (default ""), comment added: "Retained for future multimodal SDK support — not currently used." Acceptable.
  5. Double bytes() conversion: image_bytes is now cast once at download; subsequent uses pass it directly.

New observations

  • Cleanup gap on format error: process_image raises ValueError for unknown formats before writing the temp file, so cleanup is fine there. But if write_bytes or chmod throws, processed_image will be None in the finally block and the partially-written file (if any) won't be cleaned up. Low probability, but worth a one-liner except around write_bytes to unlink the path if it exists.
  • _TEMP_DIR.mkdir called on every image. Not a bug — exist_ok=True is cheap — just noting it's fine in practice.
  • Dict[str, any] → Dict[str, object]: Good fix; lowercase any was a type annotation bug.
  • metadata field ordering: base64_data is now the last field with a default, which is required for dataclass field ordering to be valid. ✅ Correct.
  • Missing type hints on _create_*_prompt methods — The diff shows def _create_screenshot_prompt(self, caption, image_path): with no annotations. CLAUDE.md requires type hints on all functions. If these methods were touched in this PR, they need annotations; if pre-existing and out of scope, flag for a follow-up.
  • Cleanup only runs for the photo processing path. If the bot returns early before orchestrator.py's finally (e.g. middleware rejection), files could still leak. Acceptable tradeoff for now, but a context-manager wrapper on ProcessedImage would make this airtight long-term.
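
To illustrate the dataclass field-ordering point and the context-manager idea from the observations above, here is a minimal sketch. This `ProcessedImage` is a simplified, hypothetical version; in particular, the `temp_path` field is an assumption, since the PR stores the path in `metadata` instead.

```python
from dataclasses import dataclass, field
from pathlib import Path
from types import TracebackType
from typing import Dict, Optional, Type


@dataclass
class ProcessedImage:
    # Fields without defaults must precede fields with defaults;
    # otherwise the dataclass decorator raises TypeError at class creation.
    prompt: str
    image_type: str
    size: int
    temp_path: Path
    metadata: Dict[str, object] = field(default_factory=dict)
    base64_data: str = ""  # unused today; kept for possible multimodal support

    def __enter__(self) -> "ProcessedImage":
        return self

    def __exit__(
        self,
        exc_type: Optional[Type[BaseException]],
        exc: Optional[BaseException],
        tb: Optional[TracebackType],
    ) -> None:
        # Always remove the temp file; returning None lets exceptions propagate.
        self.temp_path.unlink(missing_ok=True)
```

With this shape, any caller that writes `with processed_image:` gets cleanup on every exit path, rather than relying on one handler's `finally` block.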

Verdict
⚠️ Merge after fixes — the _create_*_prompt type hints need a quick CLAUDE.md compliance check, and a narrow except around write_bytes would close the partial-write cleanup gap. Everything else is a solid improvement over the last revision.

Friday, AI assistant to @RichardAtCT

- Wrap write_bytes + chmod in try/except to unlink partial file on failure
- Add type hints to all four _create_*_prompt methods (CLAUDE.md compliance)
@haripatel07
Contributor Author

Fixed in 48612f2:

  • Wrapped write_bytes + chmod in try/except — partial file unlinked on failure before re-raising
  • Type hints added to all four _create_*_prompt methods (CLAUDE.md compliance)
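
The partial-write guard described in the first bullet could be sketched as below. The helper name `write_or_clean_up` is hypothetical; the actual change in 48612f2 is not shown in this thread.

```python
import os
from pathlib import Path


def write_or_clean_up(path: Path, data: bytes) -> None:
    """Write data and restrict permissions; remove the file if either step fails."""
    try:
        path.write_bytes(data)
        os.chmod(path, 0o600)
    except Exception:
        # Do not leave a partially written file behind, then surface the error.
        path.unlink(missing_ok=True)
        raise
```

Re-raising after the unlink keeps the failure visible to the caller, so the handler can still report an error to the user.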

491 passing.

Development

Successfully merging this pull request may close these issues.

Image/screenshot upload with analysis doesn't work

4 participants