[docs] [template] [data] Refactor current LLM Batch inference template #59897
Status: Open. Aydin-ab wants to merge 66 commits into ray-project:master from Aydin-ab:docs/data/examples/llm_batch_inference_small.
Commits (66)
9d3e8d1 move content into content/
cd90a58 make release test ci/ folder
5c4e5ce make template workspace configs/ (compute config)
4a11152 adding release tests
ef2c09f remove README from sphinx discovery
21d6470 move order in examples.yml
f67cab0 add vocabulary for vale compliance
8f76af4 (ci test only) limit size of large dataset to 10k samples instead of …
7be6989 fix link to notebook in examples.yml
16a4d35 fix prompt
ff185f0 adding to examples.yml
d333e83 adding release test config
b6d758c update content with new dataset
1466052 add byod for release test
11486df nitpick template vision text
62b50b2 adding llm batch inference vision workspace
1e4450b refactor content for consistency
31b0634 fix worker node gce
f5d2b61 reformat
15e01ce vale
cfb1979 sphinx orphan for notebook
858144c update image
94006d7 adding batch inference llm visio nhelper modules
345483b reformat nitpicks
611a7b6 fix issues cursor
92243df fix job config
06f24e6 fix job config for vision tempalte
271f2c9 add datasets to byod
7575da0 ignore % command cell in ci testing
4224abe change compute ocnfigs to L4 instead of L40S
fd4beb5 min_node 0 now
6fa6bb8 refactor nitpicks
4d42086 no engine parameters anymore + add stats() tips
a908ea3 rename variables for scaling section
0b6e54c refactor +rebuild helper modules
b30a21f Merge branch 'master' into docs/data/examples/llm_batch_inference_small
51d9054 remove saving dataset section (buggy)
b9e5369 fix typoi
db15a49 fix typoi
863cfa8 sync with notebook
8057e69 suggestions applied
666f623 increase header leavel of performance tips
ca16504 ref to anyscale docs
cab65e3 add model parallelism sectioon
f50dcd8 add markdwon note about repartition()
44ad061 ci test on small datset instead of 1M rows
fbbc96c add helper conversion script + small nitpicks
cc978e6 apply kunling suggestions
42891fd Apply suggestions from code review
44c0188 Merge branch 'master' into docs/data/examples/llm_batch_inference_small
4d1f6c9 refactor/rename for easier maintainability
c97621f follow git bots suggestions
b3b8d3d minor gramatical fix
e3a2445 apply suggestions: fix detokenize issue, typos, thorough explanation …
6ffeb78 change prompt task to something more relevant to LLM capabilities
4448896 fix typos, fix error in nb2py and update byod
41de5fa add parameter to change dataset size
cca6c0a fix vale errors
9dbe446 fix typo bug + add structured output to text task
6289489 fix typo
d109d61 fix structured output bug
22d189e remove mention of notebook
4982f68 reorder examples with more consistent naming/titles
55d6667 consistent .py script + add informative comments about structured output
0f64701 fix header typo
9f126ca remove detokenize=
Files changed
```diff
@@ -25,4 +25,4 @@ filegroup(
         "**/ci/gce.yaml"
     ]),
     visibility = ["//release:__pkg__"],
-)
+)
```
17 changes: 17 additions & 0 deletions
doc/source/data/examples/llm_batch_inference_text/ci/aws.yaml

```yaml
cloud_id: {{env["ANYSCALE_CLOUD_ID"]}}
region: us-west-2

# Head node
head_node_type:
  name: 8CPU-32GB
  instance_type: m5.2xlarge

# Worker nodes
worker_node_types:
- name: 1xL4:8CPU-32GB
  instance_type: g6.2xlarge
  min_workers: 0
  max_workers: 10

flags:
  allow-cross-zone-autoscaling: true
```
17 changes: 17 additions & 0 deletions
doc/source/data/examples/llm_batch_inference_text/ci/gce.yaml

```yaml
cloud_id: {{env["ANYSCALE_CLOUD_ID"]}}
region: us-central1

# Head node
head_node_type:
  name: 8CPU-32GB
  instance_type: n2-standard-8

# Worker nodes
worker_node_types:
- name: 1xL4:8CPU-32GB
  instance_type: g2-standard-8-nvidia-l4-1
  min_workers: 0
  max_workers: 10

flags:
  allow-cross-zone-autoscaling: true
```
78 changes: 78 additions & 0 deletions
doc/source/data/examples/llm_batch_inference_text/ci/nb2py.py

```python
#!/usr/bin/env python3
import argparse
import nbformat


def convert_notebook(
    input_path: str, output_path: str, ignore_cmds: bool = False
) -> None:
    """
    Read a Jupyter notebook and write a Python script, converting all %%bash
    cells and IPython "!" commands into subprocess.run calls that raise on error.
    Cells that load or autoreload extensions are ignored.
    """
    nb = nbformat.read(input_path, as_version=4)
    with open(output_path, "w") as out:
        for cell in nb.cells:
            # Only process code cells
            if cell.cell_type != "code":
                continue

            lines = cell.source.splitlines()

            # Detect a %%bash cell
            if lines:
                # Detect any IPython '!' shell commands in code lines
                has_bang = any(line.lstrip().startswith("!") for line in lines)
                # Start with "serve run" "serve shutdown" "curl" or "anyscale service" commands
                to_ignore_cmd = (
                    "serve run",
                    "serve shutdown",
                    "curl",
                    "anyscale service",
                )
                has_ignored_start = any(
                    line.lstrip().startswith(to_ignore_cmd) for line in lines
                )
                if has_bang or has_ignored_start:
                    if ignore_cmds:
                        continue
                    out.write("import subprocess\n")
                    for line in lines:
                        stripped = line.lstrip()
                        if stripped.startswith("!"):
                            cmd = stripped[1:].lstrip()
                            out.write(
                                f"subprocess.run(r'''{cmd}''',\n"
                                "               shell=True,\n"
                                "               check=True,\n"
                                "               executable='/bin/bash')\n"
                            )
                        else:
                            out.write(line.rstrip() + "\n")
                    out.write("\n")
                else:
                    # Regular Python cell:
                    code = cell.source.rstrip()
                    if "ds_large = ds.limit(1_000_000)" in code:
                        # Instead of testing a large dataset in CI, test a small dataset
                        code = code.replace("ds.limit(1_000_000)", "ds.limit(10_000)")
                    # else, dump as-is
                    out.write(code + "\n\n")


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Convert a Jupyter notebook to a Python script, preserving bash cells and '!' commands as subprocess calls unless ignored with --ignore-cmds."
    )
    parser.add_argument("input_nb", help="Path to the input .ipynb file")
    parser.add_argument("output_py", help="Path for the output .py script")
    parser.add_argument(
        "--ignore-cmds", action="store_true", help="Ignore bash cells and '!' commands"
    )
    args = parser.parse_args()
    convert_notebook(args.input_nb, args.output_py, ignore_cmds=args.ignore_cmds)


if __name__ == "__main__":
    main()
```
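To make the conversion behavior concrete, here is roughly what the script produces for a notebook cell that shells out with `!`. The `pip install` command is a made-up example cell, not one taken from the template:

```python
# Hypothetical notebook cell source:
#   !pip install -U vllm
#
# nb2py.py rewrites it as a checked subprocess call, so a non-zero exit code
# fails the generated script instead of being silently swallowed:
import subprocess

subprocess.run(r'''pip install -U vllm''',
               shell=True,
               check=True,
               executable='/bin/bash')
```

When the script runs with --ignore-cmds, as ci/tests.sh does, such cells are skipped entirely rather than converted.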
14 changes: 14 additions & 0 deletions
doc/source/data/examples/llm_batch_inference_text/ci/tests.sh

```bash
#!/bin/bash

# Install requirements first (done by CI automatically):
# release/ray_release/byod/byod_llm_batch_inference_text.sh

# Don't use nbconvert or jupytext unless you're willing
# to check each subprocess unit and validate that errors
# aren't being consumed/hidden

set -exo pipefail

python ci/nb2py.py "content/README.ipynb" "content/README.py" --ignore-cmds
python "content/README.py"
rm "content/README.py"
```
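The warning about nbconvert and jupytext in this script comes down to error propagation: the generated README.py runs each shell command with check=True, so a failing command aborts the CI job instead of being consumed. A minimal sketch of the difference, using a deliberately failing placeholder command:

```python
import subprocess

# Without check=True, the failure is only visible on the returned object;
# execution continues unless the return code is inspected explicitly.
result = subprocess.run("exit 1", shell=True)
print(result.returncode)  # prints 1, no exception raised

# With check=True, the same failure raises CalledProcessError,
# which is what makes the converted notebook fail loudly in CI.
try:
    subprocess.run("exit 1", shell=True, check=True)
except subprocess.CalledProcessError as err:
    print(f"command failed with exit code {err.returncode}")
```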
14 changes: 14 additions & 0 deletions
doc/source/data/examples/llm_batch_inference_text/configs/aws.yaml

```yaml
# Head node
head_node_type:
  name: 8CPU-32GB
  instance_type: m5.2xlarge

# Worker nodes
worker_node_types:
- name: 1xL4:8CPU-32GB
  instance_type: g6.2xlarge
  min_workers: 0
  max_workers: 10

flags:
  allow-cross-zone-autoscaling: true
```
14 changes: 14 additions & 0 deletions
doc/source/data/examples/llm_batch_inference_text/configs/gce.yaml

```yaml
# Head node
head_node_type:
  name: 8CPU-32GB
  instance_type: n2-standard-8

# Worker nodes
worker_node_types:
- name: 1xL4:8CPU-32GB
  instance_type: g2-standard-8-nvidia-l4-1
  min_workers: 0
  max_workers: 10

flags:
  allow-cross-zone-autoscaling: true
```
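Both the CI and workspace compute configs provision workers with a single NVIDIA L4 GPU, which is the cluster shape the template's batch inference stage scales over. As a rough sketch only (the model name, prompt, and parameters below are placeholders rather than what the refactored template actually uses, and the ray.data.llm processor API is assumed), a Ray Data LLM pipeline on this kind of cluster looks like:

```python
import ray
from ray.data.llm import vLLMEngineProcessorConfig, build_llm_processor

# Placeholder model and settings; the template's notebook defines the real ones.
config = vLLMEngineProcessorConfig(
    model_source="Qwen/Qwen2.5-0.5B-Instruct",
    engine_kwargs={"max_model_len": 4096},
    concurrency=1,   # one vLLM replica per 1xL4 worker
    batch_size=64,
)

processor = build_llm_processor(
    config,
    # Turn each input row into a chat request for the engine.
    preprocess=lambda row: dict(
        messages=[{"role": "user", "content": row["prompt"]}],
        sampling_params=dict(temperature=0.3, max_tokens=256),
    ),
    # Keep the generated text alongside the original columns.
    postprocess=lambda row: dict(answer=row["generated_text"], **row),
)

ds = ray.data.from_items([{"prompt": "Summarize what Ray Data does."}])
ds = processor(ds)
ds.show(limit=1)
```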