Torch and Lightning 2.0.0 #60

kylebgorman · 2023-03-17T23:30:27Z

The library does not work with Lightning >= 2.0.0 (and one suspects that Torch >= 2.0.0 is also an issue). The first issue I encounter when running yoyodyne-train with no arguments is related to a change in how Lightning command-line arguments are handled---I suspect there are at least a few more.

So that the library is not broken at head---which I consider unacceptable---I have pinned as follows:

pytorch-lightning>=1.7.0,<2.0.0
torch>=1.11.0,<2.0.0

What we need to do is migrate to 2.0.0 by fixing Lightning (and Torch, if any) bugs until things work, and then re-pin these two dependencies to >=2.0.0. I have initially assigned myself, but I would welcome help.
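To make concrete what the temporary pins admit and exclude, here is a minimal sketch of a specifier check. The helpers `parse_version` and `satisfies` are hypothetical names written for this illustration; they handle only the `>=` and `<` operators and plain numeric versions, whereas real installers follow the full PEP 440 rules.

```python
def parse_version(text):
    """Turns "1.13" into (1, 13, 0); plain numeric versions only."""
    parts = [int(part) for part in text.strip().split(".")]
    # Pads to three components so that "2.0" compares equal to "2.0.0".
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def satisfies(version, spec):
    """Checks a version against a comma-separated spec of >= and < clauses."""
    installed = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            if installed < parse_version(clause[2:]):
                return False
        elif clause.startswith("<") and not clause.startswith("<="):
            if installed >= parse_version(clause[1:]):
                return False
        else:
            raise ValueError(f"Unsupported clause in this sketch: {clause!r}")
    return True


# The pin currently in place for pytorch-lightning.
LIGHTNING_PIN = ">=1.7.0,<2.0.0"

ok_old = satisfies("1.9.4", LIGHTNING_PIN)   # inside the pinned range
ok_new = satisfies("2.0.0", LIGHTNING_PIN)   # excluded by the upper bound
```

Under this pin, any 1.x release from 1.7.0 onward is accepted and 2.0.0 is rejected, which is what keeps head working until the migration lands.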


kylebgorman · 2023-03-22T14:01:50Z

@Adamits I have assigned this to you since you wanna take a stab.

There is a new interface for CLI stuff (probably an improvement over the somewhat ad hoc one they gave us):

https://lightning.ai/docs/pytorch/stable/cli/lightning_cli.html
https://lightning.ai/docs/pytorch/stable/cli/lightning_cli_expert.html

Looks like an improvement, but some work to make things conformant.

kylebgorman · 2023-07-02T14:20:47Z

I have read these docs and have the basic lay of the land now. The migration should go in two steps:

First, we should implement subclasses of LightningDataModule (this is available in Lightning pre-2.0 too). The train_dataloader method should return the training data dataloader (naturally), and similarly val_dataloader (for the dev set) and test_dataloader. (Even though we don't need access to train, dev, and test in any one process, we can probably set it up so we can use the same one for training and inference.) Most of the logic of get_datasets and get_loaders would be moved into this too. This is thus mostly just cleanup.
Then, we have to move arg parsing to LightningCLI and migrate Torch and Lightning to >=2.0. This is a bit magical (it introspects to find flags for the dataset and models) but our house is mostly in order so it might just work. At this point there are probably one or two low-level Torch issues we'll run up against too, and we can handle those as they arise...

Adamits · 2023-07-03T16:21:17Z

Sounds good, I think I understand. Though I feel a bit annoyed that we have to use LightningCLI instead of standard Python libraries... I will try to prioritize it this week.

kylebgorman · 2023-07-03T16:22:50Z

> Sounds good, I think I understand. Though I feel a bit annoyed that we have to use LightningCLI instead of standard Python libraries... I will try to prioritize it this week.

I'm working on the data module part of things in the next few days---haven't thought about the model side yet.

FWIW, LightningCLI is basically a tool that can introspect to populate the argparse parser...which should help us enormously with the kind of bug we saw with not passing around the label smoothing...
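As a toy illustration of that idea (this is not LightningCLI's implementation, only a standard-library sketch of the same principle), one can derive the parser's flags from a class's `__init__` signature. `ToyModel`, `add_init_args`, and `build_from_cli` are hypothetical names invented for this example and are not part of Yoyodyne or Lightning.

```python
import argparse
import inspect


class ToyModel:
    """Hypothetical stand-in for a model class; not a Yoyodyne class."""

    def __init__(
        self,
        hidden_size: int = 128,
        dropout: float = 0.2,
        label_smoothing: float = 0.0,
    ):
        self.hidden_size = hidden_size
        self.dropout = dropout
        self.label_smoothing = label_smoothing


def add_init_args(parser, cls):
    """Adds one flag per argument of cls.__init__, using its annotation."""
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        # Falls back to str when the argument has no type annotation.
        arg_type = (
            str
            if param.annotation is inspect.Parameter.empty
            else param.annotation
        )
        if param.default is inspect.Parameter.empty:
            parser.add_argument(f"--{name}", type=arg_type, required=True)
        else:
            parser.add_argument(
                f"--{name}", type=arg_type, default=param.default
            )


def build_from_cli(cls, argv):
    """Parses argv and passes every parsed flag to the constructor."""
    parser = argparse.ArgumentParser()
    add_init_args(parser, cls)
    args = parser.parse_args(argv)
    return cls(**vars(args))


model = build_from_cli(ToyModel, ["--label_smoothing", "0.1"])
```

Because the flags are generated from the constructor signature, a newly added argument such as `label_smoothing` cannot be declared on the model yet forgotten in the CLI plumbing, which is the class of bug mentioned above.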

kylebgorman · 2024-04-04T15:27:07Z

@Adamits believes it is possible to migrate to Torch 2 while staying on the 1.13 branch of PTL. The ability to complete this migration in two steps (one to Torch 2, then later migrating to PTL 2 focusing on CLI changes) would make this much easier.

See CUNY-CL#60 for context, though we are not yet done with it since we are not migrating to PyTorch-Lightning 2.

kylebgorman · 2024-04-08T16:39:19Z

Draft of the migration to PyTorch 2 in #173. Testing now---focusing on the transformer since we can use the causal mask flag there, hopefully.

kylebgorman · 2024-07-28T14:36:15Z

I have been using LightningCLI in another project and I have to say, it's a total vibe shift and it's way better than having big Bash scripts (without any nice syntax support), so I think we should prioritize this now.

At some earlier point we were stuck on PyTorch <2, and there was no support for Python 3.10 or later. CUNY-CL#173 freed us from this restriction, but left in the restriction to PyTorch-Lightning <2. However, we were pinned on Python <= 3.10 by PyTorch, not PyTorch-Lightning, so we can now add support for additional Pythons. Python 3.12 is the actively developed branch; Python 3.8-3.11 are on long-term (i.e., security-fix) support; Python 3.8 is just about to go out of date and Python 3.13 is approaching prerelease: https://devguide.python.org/versions/. So this adds support for all supported Python versions (except 3.8, which was already old when we started this project and which we never supported in the first place). This is closely related to, but does not close, CUNY-CL#60.

kylebgorman · 2024-10-05T15:08:08Z

Another feature of this migration is that at least some if not all the models will work on M1/M2 Macs using their MPS accelerators.

Closes CUNY-CL#60. Closes CUNY-CL#218. LightningCLI removes our need to create separate training and prediction CLI programs, moving nearly all of that logic into the base model. This commit in particular sets the stage:

* Updates dependencies.
* Increments minor version number.
* Creates an empty `cli.py` where the CLI-specific logic will live.

kylebgorman added the "bug" (Something isn't working) and "release blocker" (Should be solved before release) labels Mar 17, 2023

kylebgorman assigned kylebgorman and Adamits and unassigned kylebgorman Mar 17, 2023

kylebgorman mentioned this issue Jul 3, 2023

Label smoothing update #83

Closed

kylebgorman mentioned this issue Jul 7, 2023

Python interface for loading #107

Closed

kylebgorman mentioned this issue Jul 17, 2023

Migrates to data modules #110

Merged

kylebgorman added a commit to kylebgorman/yoyodyne that referenced this issue Apr 8, 2024

Attempts migration to PyTorch >= 2.0.0.

36f119d

See CUNY-CL#60 for context, though we are not yet done with it since we are not migrating to PyTorch-Lightning 2.

kylebgorman mentioned this issue Apr 8, 2024

Attempts migration to PyTorch >= 2.0.0. #173

Merged

This was referenced Jul 29, 2024

Adds support for additional Python versions #217

Merged

Unpin Numpy #218

Closed

Hydra Config File Support (Post LightningCLI integration) #232

Open

kylebgorman mentioned this issue Aug 27, 2024

Epoch-based warmup #243

Closed

kylebgorman linked a pull request Nov 20, 2024 that will close this issue

Migrates to LightningCLI #265

Draft
