Add multipart/form-data and signed URL support to /glmocr/parse by majcheradam · Pull Request #161 · zai-org/GLM-OCR

majcheradam · 2026-03-23T17:39:41Z

Summary

server.py: /glmocr/parse now accepts multipart/form-data (uploaded files via files field, remote URLs via urls field) and http/https signed URLs in JSON mode; all inputs are converted to data: URIs in-memory — no temp files written
image_utils.py: added pdf_bytes_to_images_pil and pdf_bytes_to_images_pil_iter for zero-disk-I/O PDF rendering via pypdfium2 in-memory bytes API
page_loader.py: added data:application/pdf;base64,... routing in _load_source / _iter_source; added _load_pdf_bytes and _iter_pdf_bytes methods
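The in-memory conversion described above can be illustrated with a minimal sketch. It uses only the standard library and is not the PR's actual code: the helper names `to_data_uri`, `split_data_uri`, and `route_source` are hypothetical, and the routing rule (PDF versus everything else) is an assumption based on the `data:application/pdf;base64,...` branch mentioned in the summary.

```python
import base64
from typing import Tuple


def to_data_uri(payload: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data: URI without touching the disk."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def split_data_uri(uri: str) -> Tuple[str, bytes]:
    """Return (mime_type, decoded_bytes) for a base64 data: URI."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    header, sep, body = uri[len("data:"):].partition(",")
    if not sep or not header.endswith(";base64"):
        raise ValueError("only base64 data: URIs are handled in this sketch")
    mime_type = header[: -len(";base64")]
    return mime_type, base64.b64decode(body)


def route_source(uri: str) -> str:
    """Hypothetical routing rule: PDFs go to a PDF loader, the rest to an image loader."""
    mime_type, _ = split_data_uri(uri)
    return "pdf" if mime_type == "application/pdf" else "image"
```

In this sketch, an uploaded file or a fetched URL body would be passed through `to_data_uri`, and the loader would later call `split_data_uri` to recover the bytes, which could then be handed to an in-memory PDF renderer.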

Usage

File upload (multipart):

curl -X POST http://localhost:5002/glmocr/parse \
  -F "files=@document.pdf" \
  -F "files=@image.png"

Signed URL (JSON):

curl -X POST http://localhost:5002/glmocr/parse \
  -H "Content-Type: application/json" \
  -d '{"images": ["https://s3.amazonaws.com/bucket/doc.pdf?X-Amz-Signature=..."]}'

Mixed (multipart + remote URLs):

curl -X POST http://localhost:5002/glmocr/parse \
  -F "files=@local.pdf" \
  -F "urls=https://signed-url/remote.png"
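For callers who prefer Python over curl, the JSON mode might be driven as in the sketch below. It uses only the standard library and assumes the same endpoint and `images` field shown in the curl example; the function name `build_parse_request` and the placeholder URL in the usage comment are hypothetical.

```python
import json
import urllib.request
from typing import List

# Endpoint taken from the curl examples above.
PARSE_ENDPOINT = "http://localhost:5002/glmocr/parse"


def build_parse_request(image_urls: List[str],
                        endpoint: str = PARSE_ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) a JSON POST whose "images" field lists the URLs."""
    body = json.dumps({"images": image_urls}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (requires a running server; the URL is a placeholder, not a real signed URL):
# req = build_parse_request(["https://example.com/doc.pdf?signature=PLACEHOLDER"])
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.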

Test plan

Upload a PDF via multipart — verify all pages rendered correctly
Upload an image via multipart — verify result matches JSON path
Pass a signed S3/GCS URL in JSON images — verify fetched in-memory
Pass a signed URL in multipart urls field — verify resolved correctly
Existing application/json with file:// and data: URLs still works
Invalid Content-Type returns 415
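The last item in the test plan implies a dispatch on the request's Content-Type. A minimal, framework-independent sketch of such a check follows; it is an assumption about how the behavior could be structured, not the code in server.py, and the names `select_input_mode` and `UnsupportedMediaType` are hypothetical.

```python
from typing import Optional


class UnsupportedMediaType(Exception):
    """Raised when the Content-Type is neither JSON nor multipart."""
    status_code = 415


def select_input_mode(content_type: Optional[str]) -> str:
    """Map a Content-Type header value to "json" or "multipart"."""
    # Drop parameters such as "; charset=utf-8" or "; boundary=..." before comparing.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        return "json"
    if media_type == "multipart/form-data":
        return "multipart"
    raise UnsupportedMediaType(f"unsupported Content-Type: {content_type!r}")
```

A request handler could catch `UnsupportedMediaType` and return its `status_code` to the client.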

🤖 Generated with Claude Code

- server.py: accept multipart/form-data (file uploads via "files" field, remote URLs via "urls" field) and http/https signed URLs in JSON mode; all inputs converted to data: URIs in-memory, no temp files written
- image_utils.py: add pdf_bytes_to_images_pil and pdf_bytes_to_images_pil_iter for in-memory PDF rendering via pypdfium2
- page_loader.py: add data:application/pdf;base64 branch in _load_source and _iter_source; add _load_pdf_bytes and _iter_pdf_bytes methods

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

majcheradam closed this Mar 24, 2026

majcheradam wants to merge 1 commit into zai-org:main from ocrbase-hq:main
