[docs] [template] [data] Refactor current LLM Batch inference template #59897
Conversation
Signed-off-by: Aydin Abiar <[email protected]>
Code Review
This pull request introduces two new beginner-level Ray Data LLM batch inference examples: one for text data (reformatting dates) and another for vision data (generating image captions). The changes include adding new vocabulary terms like 'postprocess' and 'reformat' to the Vale style guide, updating the documentation configuration to include the new example READMEs, and adding entries for these examples in examples.yml.

For each example, new files were added, including Jupyter notebooks, corresponding Python scripts (for small and scaled datasets), Anyscale job configurations, CI scripts for testing and converting notebooks to markdown, and cluster configurations for AWS and GCE.

Review comments highlighted several inconsistencies and areas for improvement: a grammatical error ('On contrary' instead of 'On the contrary'), trailing blank lines in YAML files, duplication of the nb2py.py utility script across examples, and an out-of-sync README for the vision example. Additionally, there were inconsistencies in the number of partitions used in the vision example's Python script versus its notebook, and a missing repartitioning step for large datasets in the vision notebook.
doc/source/data/examples/llm_batch_inference_text/content/README.ipynb
doc/source/data/examples/llm-batch-inference-vision/content/README.md
doc/source/data/examples/llm_batch_inference_vision/content/batch_inference_vision.py
doc/source/data/examples/llm_batch_inference_vision/content/batch_inference_vision_scaled.py
doc/source/data/examples/llm_batch_inference_vision/ci/nb2py.py
nrghosh
left a comment
- left comments inline
- broader and other detailed comments below

Both Notebooks
- `detokenize=False` issue
- Job config filename mismatches

Comments & Explanations
- `max_model_len` is a hard sequence length cap, not an "estimate." Sequences exceeding it will error at runtime.
- Repartition numbers (64, 128, 256) are arbitrary magic numbers with no explained heuristic (should be ~2-4x worker count).
- `batch_size` recommendations don't explain the actual constraint (KV cache memory per sequence).
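A minimal sketch of the partition heuristic mentioned above, assuming the worker count and multiplier are free parameters (the function name and defaults are hypothetical, not from the PR):

```python
def pick_num_partitions(num_llm_workers: int, factor: int = 2) -> int:
    # Reviewer's heuristic: repartition to ~2-4x the LLM stage's
    # worker count, rather than a hard-coded magic number.
    if not 2 <= factor <= 4:
        raise ValueError("factor should stay in the ~2-4x range")
    return factor * num_llm_workers

# e.g. with 32 vLLM replicas: ds = ds.repartition(pick_num_partitions(32))
```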
Performance Section
- Concurrency and batch size advice is circular, vague, and not very actionable: no guidance on how to determine or actually think about those values.
- Quantization examples need caveats, disclaimers, and an explanation of model architecture, hardware type, and quantization method; they are not interchangeable.
- The model parallelism example suddenly switches to a different model (Llama 90B) without explanation, and the model reference keeps changing between examples, which makes it confusing to walk through.
Missing Content
- No error handling for malformed inputs (corrupt images, bad data)
- No checkpointing example or discussion, despite the intro mentioning it
Text Notebook Specific
Use Case & Prompt
- Date reformatting seems like a trivial use case even for a demo; why do something that a regex could do?
- Why `temperature=0.3`? Why not 0?
- No output validation in postprocess; should probably validate the `MM-DD-YYYY` format. Mention guided/constrained decoding to guarantee valid output?
- String concatenation is missing a space: `"...MM-DD-YYYY.""Be concise..."`
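One hedged way to realize the output-validation suggestion is a regex check in the postprocess step; the column name `generated_text` is an assumption, not taken from the PR:

```python
import re

# Validate the target MM-DD-YYYY format (months 01-12, days 01-31).
DATE_RE = re.compile(r"^(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])-\d{4}$")

def postprocess(row: dict) -> dict:
    # Flag rows whose model output is not a valid MM-DD-YYYY string.
    row["valid"] = bool(DATE_RE.match(row["generated_text"].strip()))
    return row
```

Guided/constrained decoding would make this check unnecessary by construction, but a cheap validation step still catches drift.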
Vision Notebook Specific
Code
- Explain the 225x225 resize: why that size, and is it model specific? Users may not understand why or how to adapt it.
- Notebook does `%pip install datasets`; maybe pin the version?
Explanations
- Explain why `max_model_len=8192` vs. the text example's 256 (a 32x difference); talk about vision overhead?
- "go over your Anyscale Job" → "go to your Anyscale Job"
- Show image validation as part of the demo?
```python
from io import BytesIO

from PIL import Image

def preprocess(row):
    try:
        row["image"] = Image.open(BytesIO(row["jpg"]["bytes"])).convert("RGB")
    except Exception:
        return None
    return row
```

and maybe show off how Ray Data handles None returns (or use filter() to drop failures).
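One way to realize the "drop failures" suggestion is a small wrapper for `flat_map`-style transforms, which can return an empty list for bad rows so they disappear from the pipeline; the wrapper name and usage are a sketch, not code from the PR:

```python
def drop_on_error(fn):
    """Wrap a row transform so that failures produce no output rows.

    Intended for a flat_map-style API, which flattens the returned
    lists, so returning [] silently drops the failing row.
    """
    def wrapper(row):
        try:
            return [fn(row)]
        except Exception:
            return []
    return wrapper

# Sketch of intended use (assumed pipeline):
# ds = ds.flat_map(drop_on_error(preprocess))
```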
Both
The performance remarks/tips and pasted code examples don't read smoothly and are confusing (putting on my user hat and reading it like it's the first time): jumping between models, magic numbers, etc. The explanations could be a lot more thorough. Will defer to https://github.com/anyscale/docs/pull/1626/changes for more detailed tuning and performance focus.
doc/source/data/examples/llm_batch_inference_vision/content/README.ipynb
doc/source/data/examples/llm_batch_inference_text/content/README.ipynb
…of batch size, concurrency, more refs to docs links, refactor quantization and model parallelism section for more readability, add image validation, mention anyscale runtime, pin datasets version Signed-off-by: Aydin Abiar <[email protected]>
@nrghosh
doc/source/data/examples/llm_batch_inference_text/content/README.ipynb
doc/source/data/examples/llm_batch_inference_vision/content/README.ipynb
doc/source/data/examples/llm_batch_inference_vision/content/batch_inference_vision_scaled.py
/gemini review
nrghosh
left a comment
thanks @Aydin-ab - see also the cursor/gemini comments when it comes to code. As long as you're able to run the examples successfully, they should be free of serious bugs now.
Code Review
The pull request introduces new LLM batch inference examples for both text and vision data, along with their corresponding CI configurations and helper scripts. The changes effectively split the existing template into two distinct, independently readable content pieces, which is a good refactoring step. The new examples demonstrate how to use Ray Data LLM APIs for batch inference, including data preparation, processor configuration, and scaling considerations. The addition of CI scripts ensures these examples remain functional.
However, there are a few areas that could be improved for robustness and clarity:
- The `nb2py.py` scripts use specific string matching to modify dataset limits for CI. This approach is brittle and could break if the exact string in the notebook changes.
- Some comments in the Jupyter notebooks are slightly misleading regarding dataset size limits.
- The standalone Python scripts contain hardcoded configuration values that would ideally be configurable for real-world use cases.
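One way to address the hardcoded-configuration point is a small CLI layer; the flag names and defaults below are hypothetical, a sketch rather than the PR's actual scripts:

```python
import argparse

def parse_args(argv=None):
    # Surface the hard-coded values as CLI flags so the standalone
    # scripts can be reused (and capped for CI) without string matching.
    parser = argparse.ArgumentParser(description="LLM batch inference example")
    parser.add_argument("--model", default="<model-id>", help="Model id to serve")
    parser.add_argument("--limit", type=int, default=None,
                        help="Optional cap on dataset size, e.g. for CI")
    parser.add_argument("--batch-size", type=int, default=64)
    return parser.parse_args(argv)
```

A `--limit` flag would also let the CI harness shrink the dataset without the brittle notebook string matching `nb2py.py` currently relies on.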
doc/source/data/examples/llm_batch_inference_text/content/batch_inference_text.py
doc/source/data/examples/llm_batch_inference_vision/content/README.ipynb
doc/source/data/examples/llm_batch_inference_vision/content/batch_inference_vision.py
doc/source/data/examples/llm_batch_inference_text/content/batch_inference_text.py
doc/source/data/examples/llm_batch_inference_vision/content/batch_inference_vision.py
doc/source/data/examples/llm_batch_inference_vision/content/batch_inference_vision_scaled.py
doc/source/data/examples/llm_batch_inference_text/content/batch_inference_text.py
doc/source/data/examples/llm_batch_inference_vision/content/batch_inference_vision.py
Follow-up of #58571 (closed for inactivity over the holidays).
There are a lot of files changed, but the main technical content to review is the README.ipynb files.
For context, the goal is to refactor this current template:
https://console.anyscale.com/template-preview/llm_batch_inference
and split it into two: one on text data, the other on vision data.
Both are very similar, but should be read independently as two distinct pieces of content 👍