
bug: Boolean values are represented as strings in default fsdp config translates to True #80

Open · kmehant opened this issue Mar 6, 2024 · 8 comments

@kmehant (Collaborator) commented Mar 6, 2024

Describe the bug

Boolean values in the fsdp config (https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/tuning/config/fsdp_config.json#L4-L6) are represented as strings. They do not translate to actual boolean values when the config is loaded by the hf/transformers library: https://github.com/huggingface/transformers/blob/2a002d073a337051bdc3fbdc95ff1bc0399ae2bb/src/transformers/training_args.py#L1654
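For illustration, `json.loads` leaves quoted values as Python strings, and any non-empty string is truthy, so even `"false"` behaves as `True` downstream. A minimal sketch; the key name here is hypothetical, the real entries are at the lines linked above:

```python
import json

# Hypothetical key mirroring the kind of quoted flag in fsdp_config.json;
# the actual keys live at the linked lines.
cfg = json.loads('{"fsdp_forward_prefetch": "false"}')

value = cfg["fsdp_forward_prefetch"]
print(type(value))  # <class 'str'> -- the quoted value stays a string
print(bool(value))  # True -- any non-empty string is truthy, even "false"
```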

Solution

Use JSON boolean values instead of the string representation. Meanwhile, I have also raised an issue on the transformers library to discuss adding a dataclass and argument parser, with a motivating example (huggingface/transformers#29476).
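A sketch of the proposed fix, again with hypothetical keys: writing the JSON literals `true`/`false` unquoted makes `json.loads` produce real Python booleans, which transformers then consumes as intended.

```json
{
    "fsdp_forward_prefetch": false,
    "fsdp_use_orig_params": true
}
```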

kmehant changed the title from "bug: Boolean values are represented as strings in default fsdp config defaults to True" to "bug: Boolean values are represented as strings in default fsdp config translates to True" on Mar 6, 2024
@fabianlim (Collaborator)

@kmehant while the fix is as you described, now that #53 is merged, I think it may be best to switch to accelerate, which uses a YAML config defaults file. With YAML, explicit encasement of strings in " or ' is not necessary, so it is more robust to this kind of issue. I suggest the following changes:

  1. Update the README.md, removing the instructions for torch.run and replacing them with accelerate.launch.
  2. Replace the FSDP JSON with a config YAML like the sketch below.
    • BTW, I suggest moving fsdp_config.json out of tuning/config (which houses code) into somewhere that only houses config fixtures.

@Ssukriti
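For reference, a sketch of what such an accelerate defaults YAML could look like; the field names follow accelerate's FSDP plugin, but the values are illustrative rather than the settings #53 settled on:

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch_policy: BACKWARD_PRE
  fsdp_sharding_strategy: 1
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_use_orig_params: true  # a bare YAML boolean, no quoting needed
mixed_precision: 'no'
num_machines: 1
num_processes: 2
```

Training would then be launched with something like `accelerate launch --config_file <path-to-yaml> ...` in place of the current torchrun invocation.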

@Ssukriti (Collaborator)

Thanks Fabian, created an issue for the README updates: #87. We will prioritize it at the earliest.

@kmehant (Collaborator, Author) commented Mar 13, 2024

#80 (comment)

> I think it may be best to switch to accelerate, which uses a yaml config defaults file.

Thanks @fabianlim, I am aware of this; isn't accelerate a wrapper over torch.distributed?

> I suggest the following changes

I guess @Ssukriti is tracking them in a different issue, #87.

@Ssukriti (Collaborator) commented Mar 13, 2024

@kmehant I was planning to get to issue #87 in the next 2 days, as it's high priority for our deliverables, but if you are interested and want to contribute instead, feel free to do so. Just let me know so I can plan accordingly :).
We do need it completed at the earliest so we can also start some testing with multi-GPU on our end.

@kmehant (Collaborator, Author) commented Mar 13, 2024

#80 (comment)

@Ssukriti I will be glad to raise a PR in a couple of hours.

@fabianlim (Collaborator)

@kmehant it's up to you, but I should be able to get to #87 pretty soon.

@kmehant (Collaborator, Author) commented Mar 13, 2024

@Ssukriti @fabianlim I have raised a PR here: #92. Thanks.

@fabianlim (Collaborator)

> @Ssukriti @fabianlim I have raised a PR here: #92. Thanks.

@kmehant ok, looks like we duplicated work; see #91.
