Commit 388d786

Merge branch 'master' into feat/device_name
2 parents fe6121f + e088694 commit 388d786

10 files changed: +215 -43 lines changed
.github/checkgroup.yml

Lines changed: 2 additions & 2 deletions
@@ -47,7 +47,7 @@ subprojects:
  - "!*.md"
  - "!**/*.md"
  checks:
- - "pytorch.yml / Lit Job (nvidia/cuda:12.1.1-runtime-ubuntu22.04, pytorch, 3.10, A100_X_2)"
+ - "pytorch.yml / Lit Job (nvidia/cuda:12.1.1-runtime-ubuntu22.04, pytorch, 3.10, L4_X_2)"
  - "pytorch.yml / Lit Job (nvidia/cuda:12.6.3-runtime-ubuntu22.04, lightning, 3.12, L4_X_2)"
  - "pytorch.yml / Lit Job (nvidia/cuda:12.6.3-runtime-ubuntu22.04, pytorch, 3.12, L4_X_2)"

@@ -148,7 +148,7 @@ subprojects:
  - "!*.md"
  - "!**/*.md"
  checks:
- - "fabric.yml / Lit Job (nvidia/cuda:12.1.1-runtime-ubuntu22.04, fabric, 3.10, A100_X_2)"
+ - "fabric.yml / Lit Job (nvidia/cuda:12.1.1-runtime-ubuntu22.04, fabric, 3.10, L4_X_2)"
  - "fabric.yml / Lit Job (nvidia/cuda:12.6.3-runtime-ubuntu22.04, fabric, 3.12, L4_X_2)"
  - "fabric.yml / Lit Job (nvidia/cuda:12.6.3-runtime-ubuntu22.04, lightning, 3.12, L4_X_2)"

.github/markdown-links-config.json

Lines changed: 3 additions & 3 deletions
@@ -23,8 +23,8 @@
  }
  }
  ],
- "timeout": "20s",
+ "timeout": "30s",
  "retryOn429": true,
- "retryCount": 5,
- "fallbackRetryDelay": "20s"
+ "retryCount": 10,
+ "fallbackRetryDelay": "10s"
  }
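
As a rough check on the retry settings in this file, the sketch below compares the worst-case time spent sleeping between retries before and after the change. It assumes every retry falls back to `fallbackRetryDelay` (that is, the server never supplies a usable retry hint) and that delays are plain second strings such as "20s". `parse_seconds` and `worst_case_wait` are hypothetical helpers written for this illustration; they are not part of the link checker.

```python
def parse_seconds(value: str) -> int:
    """Parse a duration like "20s" into whole seconds (only the "s" suffix is handled)."""
    if not value.endswith("s"):
        raise ValueError(f"unsupported duration: {value!r}")
    return int(value[:-1])


def worst_case_wait(retry_count: int, fallback_delay: str) -> int:
    """Total seconds spent sleeping if every retry uses the fallback delay (assumption)."""
    return retry_count * parse_seconds(fallback_delay)


old = worst_case_wait(5, "20s")   # values removed by this commit
new = worst_case_wait(10, "10s")  # values added by this commit
print(old, new)
```

Under these assumptions the total waiting budget is unchanged; the new values retry twice as often at half the interval.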

.github/workflows/docs-build.yml

Lines changed: 19 additions & 7 deletions
@@ -67,26 +67,33 @@ jobs:
  - uses: actions/checkout@v5
  with:
  ref: ${{ inputs.checkout }}
+ token: ${{ secrets.GITHUB_TOKEN }}
  # only Pytorch has/uses notebooks
  submodules: ${{ matrix.pkg-name == 'pytorch' }}
  lfs: ${{ matrix.pkg-name == 'pytorch' }}
- - uses: actions/setup-python@v6
+
+ - name: Install uv and set Python version
+ uses: astral-sh/setup-uv@v6
  with:
  python-version: "3.10"
+ # TODO: Avoid activating environment like this
+ # see: https://github.com/astral-sh/setup-uv/tree/v6/?tab=readme-ov-file#activate-environment
+ activate-environment: true
+ enable-cache: true

  - name: List notebooks
  if: ${{ matrix.pkg-name == 'pytorch' }}
  working-directory: _notebooks/
  run: |
- pip install -q py-tree
+ uv pip install -q py-tree
  py-tree .notebooks/
  ls -lhR .notebooks/

  - name: Pull sphinx template
  run: |
- pip install -q -r requirements/ci.txt
+ uv pip install -q -r requirements/ci.txt
  aws s3 sync --no-sign-request s3://sphinx-packages/ ${PYPI_LOCAL_DIR}
- pip install lai-sphinx-theme -U -f ${PYPI_LOCAL_DIR}
+ uv pip install lai-sphinx-theme -U -f ${PYPI_LOCAL_DIR}

  - name: pip wheels cache
  uses: actions/cache/restore@v4
@@ -100,25 +107,29 @@ jobs:
  run: |
  sudo apt-get update --fix-missing
  sudo apt-get install -y pandoc
+
  - name: Install package & dependencies
  timeout-minutes: 20
  run: |
  mkdir -p ${PYPI_CACHE_DIR} # in case cache was not hit
  ls -lh ${PYPI_CACHE_DIR}
- pip install .[all] -U -r requirements/${{ matrix.pkg-name }}/docs.txt \
+ uv pip install .[all] -U -r requirements/${{ matrix.pkg-name }}/docs.txt \
  -f ${PYPI_LOCAL_DIR} -f ${PYPI_CACHE_DIR} --extra-index-url="${TORCH_URL}"
- pip list
+ uv pip list
+
  - name: Install req. for Notebooks/tutorials
  if: matrix.pkg-name == 'pytorch'
  timeout-minutes: 10
- run: pip install -q -r _notebooks/.actions/requires.txt
+ run: uv pip install -q -r _notebooks/.actions/requires.txt

  - name: Full build for deployment
  if: github.event_name != 'pull_request'
  run: echo "DOCS_FETCH_ASSETS=1" >> $GITHUB_ENV
+
  - name: Build without warnings
  if: github.event_name != 'workflow_dispatch'
  run: echo "BUILD_SPHINX_OPTS=-W --keep-going" >> $GITHUB_ENV
+
  - name: Make ${{ matrix.target }}
  working-directory: ./docs/source-${{ matrix.pkg-name }}
  # allow failing link check and doctest if you run with dispatch
@@ -128,6 +139,7 @@ jobs:
  - name: Keep artifact
  if: github.event_name == 'pull_request'
  run: echo "ARTIFACT_DAYS=7" >> $GITHUB_ENV
+
  - name: Upload built docs
  if: ${{ matrix.target == 'html' }}
  uses: actions/upload-artifact@v4

.gitmodules

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
  [submodule "_notebooks"]
  path = _notebooks
- url = https://github.com/Lightning-AI/lightning-tutorials.git
+ url = https://github.com/Lightning-AI/tutorials.git
  branch = publication

.lightning/workflows/benchmark.yml

Lines changed: 13 additions & 10 deletions
@@ -22,12 +22,13 @@ env:
  RUN_ONLY_CUDA_TESTS: "1"

  run: |
- # Install Python and UV
- apt-get update -qq --fix-missing
+ echo "Installing dependencies"
+ apt-get update -qq --fix-missing -o=Dpkg::Use-Pty=0 &> /dev/null
  apt-get install -q -y software-properties-common curl
- # Add deadsnakes PPA for newer Python versions if needed
+ echo "Add deadsnakes PPA for newer Python versions if needed"
  add-apt-repository ppa:deadsnakes/ppa -y
- apt-get update -qq --fix-missing
+ apt-get update -qq --fix-missing -o=Dpkg::Use-Pty=0 &> /dev/null
+ echo "Install Python ${python_version} and other dependencies"
  apt-get install -q -y --no-install-recommends --allow-downgrades --allow-change-held-packages \
  build-essential \
  pkg-config \
@@ -36,23 +37,25 @@ run: |
  libopenmpi-dev \
  openmpi-bin

+ echo "Install Python ${python_version} and UV"
  apt-get install -y python${python_version} python${python_version}-venv python${python_version}-dev
  ln -sf /usr/bin/python${python_version} /usr/bin/python
  curl -LsSf https://astral.sh/uv/install.sh | sh

- # Source the environment and ensure UV is in PATH
+ echo "Source the environment and ensure UV is in PATH"
  [ -f "$HOME/.local/bin/env" ] && . "$HOME/.local/bin/env"
  export PATH="$HOME/.local/bin:$PATH"
  source $HOME/.cargo/env 2>/dev/null || true
  export PATH="$HOME/.cargo/bin:$PATH"

- # Verify UV installation
+ echo "Verify UV installation"
  command -v uv || (echo "UV not found in PATH" && exit 1)
  # Create and activate a local uv virtual environment
  uv venv .venv -p "/usr/bin/python${python_version}" || uv venv .venv -p "python${python_version}" || uv venv .venv
  . .venv/bin/activate
  hash -r

+ echo "Show system information"
  whereis nvidia
  nvidia-smi
  python --version
@@ -68,26 +71,26 @@ run: |
  CUDA_VERSION_MM="${CUDA_VERSION_M_M//./}" # "126"
  export UV_TORCH_BACKEND=cu${CUDA_VERSION_MM}

- # Adjust tests
+ echo "Adjust tests"
  uv pip install -q -r .actions/requirements.txt
  python .actions/assistant.py copy_replace_imports --source_dir="./tests" \
  --source_import="lightning.fabric,lightning.pytorch" \
  --target_import="lightning_fabric,pytorch_lightning"

- # Install package
+ echo "Install package"
  uv pip install ".[dev]"

  # Env details
  python requirements/collect_env_details.py
  python -c "import torch ; mgpu = torch.cuda.device_count() ; assert mgpu >= 2, f'GPU: {mgpu}'"

  cd tests/
- # Testing: benchmarks
+ echo "Testing: benchmarks"
  export PL_RUNNING_BENCHMARKS=1
  python -m pytest parity_${PACKAGE_NAME} -v --durations=0
  export PL_RUNNING_BENCHMARKS=0

- # Testing: fabric standalone tasks
+ echo "Testing: fabric standalone tasks"
  export PL_RUN_STANDALONE_TESTS=1
  if [ "${PACKAGE_NAME}" == "fabric" ]; then
  cd parity_fabric/
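
The context lines in the last hunk turn a dotted CUDA version into the value exported as `UV_TORCH_BACKEND`. A minimal Python sketch of that string transformation follows. It assumes `CUDA_VERSION_M_M` holds a "major.minor" string such as "12.6" (how it is derived is not shown in this diff), and `torch_backend_tag` is a hypothetical name used only here.

```python
def torch_backend_tag(cuda_major_minor: str) -> str:
    """Turn a "major.minor" CUDA version such as "12.6" into a tag like "cu126".

    Mirrors the shell expansion "${CUDA_VERSION_M_M//./}" followed by the "cu" prefix.
    """
    return "cu" + cuda_major_minor.replace(".", "")


print(torch_backend_tag("12.6"))
```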

.lightning/workflows/fabric.yml

Lines changed: 12 additions & 8 deletions
@@ -12,7 +12,7 @@ parametrize:
  - image: "nvidia/cuda:12.1.1-runtime-ubuntu22.04"
  PACKAGE_NAME: "fabric"
  python_version: "3.10"
- machine: "A100_X_2"
+ machine: "L4_X_2"
  - image: "nvidia/cuda:12.6.3-runtime-ubuntu22.04"
  PACKAGE_NAME: "fabric"
  python_version: "3.12"
@@ -37,12 +37,13 @@ env:
  RUN_ONLY_CUDA_TESTS: "1"

  run: |
- # Install Python and UV
- apt-get update -qq --fix-missing
+ echo "Installing dependencies"
+ apt-get update -qq --fix-missing -o=Dpkg::Use-Pty=0 &> /dev/null
  apt-get install -q -y software-properties-common curl
- # Add deadsnakes PPA for newer Python versions if needed
+ echo "Add deadsnakes PPA for newer Python versions if needed"
  add-apt-repository ppa:deadsnakes/ppa -y
- apt-get update -qq --fix-missing
+ apt-get update -qq --fix-missing -o=Dpkg::Use-Pty=0 &> /dev/null
+ echo "Install Python ${python_version} and other dependencies"
  apt-get install -q -y --no-install-recommends --allow-downgrades --allow-change-held-packages \
  build-essential \
  pkg-config \
@@ -54,23 +55,25 @@ run: |
  libnccl2 \
  libnccl-dev

+ echo "Install Python ${python_version} and UV"
  apt-get install -y python${python_version} python${python_version}-venv python${python_version}-dev
  ln -sf /usr/bin/python${python_version} /usr/bin/python
  curl -LsSf https://astral.sh/uv/install.sh | sh

- # Source the environment and ensure UV is in PATH
+ echo "Source the environment and ensure UV is in PATH"
  [ -f "$HOME/.local/bin/env" ] && . "$HOME/.local/bin/env"
  export PATH="$HOME/.local/bin:$PATH"
  source $HOME/.cargo/env 2>/dev/null || true
  export PATH="$HOME/.cargo/bin:$PATH"

- # Verify UV installation
+ echo "Verify UV installation"
  command -v uv || (echo "UV not found in PATH" && exit 1)
  # Create and activate a local uv virtual environment
  uv venv .venv -p "/usr/bin/python${python_version}" || uv venv .venv -p "python${python_version}" || uv venv .venv
  . .venv/bin/activate
  hash -r

+ echo "Show system information"
  whereis nvidia
  nvidia-smi
  python --version
@@ -98,7 +101,7 @@ run: |
  uv pip install "cython<3.0" wheel # for compatibility
  fi

- # install the base so we can adjust other packages
+ echo "Install the base so we can adjust other packages"
  uv pip install .
  echo "Adjust torch versions in requirements files"
  PYTORCH_VERSION=$(python -c "import torch; print(torch.__version__.split('+')[0])")
@@ -119,6 +122,7 @@ run: |
  --target_import="lightning_fabric"
  fi

+ echo "Install package with [${PACKAGE_NAME}] extras"
  extra=$(python -c "print({'lightning': 'fabric-'}.get('$(PACKAGE_NAME)', ''))")
  uv pip install ".[${extra}dev]" --upgrade
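
The last hunk builds the extras name passed to `uv pip install` from a small dictionary lookup. The sketch below restates that lookup as a standalone function to show which install target each package name would produce. `install_target` is a hypothetical helper, and the sketch assumes the package name reaches the lookup as a plain string.

```python
def install_target(package_name: str, prefix_by_package: dict[str, str]) -> str:
    """Return the pip target, e.g. ".[fabric-dev]" or ".[dev]".

    Packages missing from the mapping get an empty prefix, as with dict.get(..., '').
    """
    extra = prefix_by_package.get(package_name, "")
    return f".[{extra}dev]"


fabric_prefixes = {"lightning": "fabric-"}  # mapping used in the fabric workflow
print(install_target("lightning", fabric_prefixes))
print(install_target("fabric", fabric_prefixes))
```

Only the unified `lightning` package gets a prefixed extra; any other name falls through to the plain `dev` extra.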

.lightning/workflows/pytorch.yml

Lines changed: 13 additions & 8 deletions
@@ -12,7 +12,7 @@ parametrize:
  - image: "nvidia/cuda:12.1.1-runtime-ubuntu22.04"
  PACKAGE_NAME: "pytorch"
  python_version: "3.10"
- machine: "A100_X_2"
+ machine: "L4_X_2"
  - image: "nvidia/cuda:12.6.3-runtime-ubuntu22.04"
  PACKAGE_NAME: "pytorch"
  python_version: "3.12"
@@ -37,12 +37,13 @@ env:
  RUN_ONLY_CUDA_TESTS: "1"

  run: |
- # Install Python and UV
- apt-get update -qq --fix-missing
+ echo "Installing dependencies"
+ apt-get update -qq --fix-missing -o=Dpkg::Use-Pty=0 &> /dev/null
  apt-get install -q -y software-properties-common curl
- # Add deadsnakes PPA for newer Python versions if needed
+ echo "Add deadsnakes PPA for newer Python versions if needed"
  add-apt-repository ppa:deadsnakes/ppa -y
- apt-get update -qq --fix-missing
+ apt-get update -qq --fix-missing -o=Dpkg::Use-Pty=0 &> /dev/null
+ echo "Install Python ${python_version} and other dependencies"
  apt-get install -q -y --no-install-recommends --allow-downgrades --allow-change-held-packages \
  build-essential \
  pkg-config \
@@ -54,23 +55,25 @@ run: |
  libnccl2 \
  libnccl-dev

+ echo "Install Python ${python_version} and UV"
  apt-get install -y python${python_version} python${python_version}-venv python${python_version}-dev
  ln -sf /usr/bin/python${python_version} /usr/bin/python
  curl -LsSf https://astral.sh/uv/install.sh | sh

- # Source the environment and ensure UV is in PATH
+ echo "Source the environment and ensure UV is in PATH"
  [ -f "$HOME/.local/bin/env" ] && . "$HOME/.local/bin/env"
  export PATH="$HOME/.local/bin:$PATH"
  source $HOME/.cargo/env 2>/dev/null || true
  export PATH="$HOME/.cargo/bin:$PATH"

- # Verify UV installation
+ echo "Verify UV installation"
  command -v uv || (echo "UV not found in PATH" && exit 1)
  # Create and activate a local uv virtual environment
  uv venv .venv -p "/usr/bin/python${python_version}" || uv venv .venv -p "python${python_version}" || uv venv .venv
  . .venv/bin/activate
  hash -r

+ echo "Show system information"
  whereis nvidia
  nvidia-smi
  python --version
@@ -98,7 +101,7 @@ run: |
  uv pip install "cython<3.0" wheel # for compatibility
  fi

- # install the base so we can adjust other packages
+ echo "Install the base so we can adjust other packages"
  uv pip install .
  echo "Adjust torch versions in requirements files"
  PYTORCH_VERSION=$(python -c "import torch; print(torch.__version__.split('+')[0])")
@@ -119,9 +122,11 @@ run: |
  --target_import="lightning_fabric,pytorch_lightning"
  fi

+ echo "Install package"
  extra=$(python -c "print({'lightning': 'pytorch-'}.get('$(PACKAGE_NAME)', ''))")
  uv pip install -e ".[${extra}dev]" --upgrade

+ echo "Ensure only a single package is installed"
  if [ "${PACKAGE_NAME}" == "pytorch" ]; then
  echo "uninstall lightning to have just single package"
  uv pip uninstall lightning
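
The `PYTORCH_VERSION` context line in this file keeps only the part of `torch.__version__` before the `+`. The sketch below shows the same split on literal strings so it runs without torch installed. `base_version` is a hypothetical name, and the sample version strings are illustrative rather than taken from this commit.

```python
def base_version(version: str) -> str:
    """Drop a local version label: "2.5.1+cu121" becomes "2.5.1".

    Same expression as the workflow uses: version.split('+')[0].
    """
    return version.split("+")[0]


print(base_version("2.5.1+cu121"))
print(base_version("2.5.1"))  # strings without "+" pass through unchanged
```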

docs/source-pytorch/common/checkpointing_basic.rst

Lines changed: 36 additions & 3 deletions
@@ -58,12 +58,45 @@ Lightning automatically saves a checkpoint for you in your current working direc
  # simply by using the Trainer you get automatic checkpointing
  trainer = Trainer()

- To change the checkpoint path use the `default_root_dir` argument:
+
+ Checkpoint save location
+ ========================
+
+ The location where checkpoints are saved depends on whether you have configured a logger:
+
+ **Without a logger**, checkpoints are saved to the ``default_root_dir``:
+
+ .. code-block:: python
+
+ # saves checkpoints to 'some/path/checkpoints/'
+ trainer = Trainer(default_root_dir="some/path/", logger=False)
+
+ **With a logger**, checkpoints are saved to the logger's directory, **not** to ``default_root_dir``:

  .. code-block:: python

- # saves checkpoints to 'some/path/' at every epoch end
- trainer = Trainer(default_root_dir="some/path/")
+ from lightning.pytorch.loggers import CSVLogger
+
+ # checkpoints will be saved to 'logs/my_experiment/version_0/checkpoints/'
+ # NOT to 'some/path/checkpoints/'
+ trainer = Trainer(
+ default_root_dir="some/path/", # This will be ignored for checkpoints!
+ logger=CSVLogger("logs", "my_experiment")
+ )
+
+ To explicitly control the checkpoint location when using a logger, use the
+ :class:`~lightning.pytorch.callbacks.ModelCheckpoint` callback:
+
+ .. code-block:: python
+
+ from lightning.pytorch.callbacks import ModelCheckpoint
+
+ # explicitly set checkpoint directory
+ checkpoint_callback = ModelCheckpoint(dirpath="my/custom/checkpoint/path/")
+ trainer = Trainer(
+ logger=CSVLogger("logs", "my_experiment"),
+ callbacks=[checkpoint_callback]
+ )


  ----
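
The rule the new documentation describes can be summarised as a small path calculation. The sketch below is a simplified illustration of that rule only, not Lightning's actual implementation. `expected_checkpoint_dir` and its parameters are hypothetical, and the logger directory is passed in as a ready-made string rather than computed from a logger object.

```python
import posixpath
from typing import Optional


def expected_checkpoint_dir(default_root_dir: str, logger_dir: Optional[str] = None) -> str:
    """Return where checkpoints land under the documented rule (simplified).

    With a logger, the logger's directory wins and default_root_dir is ignored;
    without one, checkpoints go under default_root_dir.
    """
    base = logger_dir if logger_dir is not None else default_root_dir
    return posixpath.join(base, "checkpoints")


# No logger: falls back to default_root_dir
print(expected_checkpoint_dir("some/path/"))
# With a logger directory such as the CSVLogger example in the diff
print(expected_checkpoint_dir("some/path/", "logs/my_experiment/version_0"))
```

An explicit `ModelCheckpoint(dirpath=...)`, as in the last added example, bypasses this rule entirely.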
