[WAN] Use different sharding strategy for self and cross attention. #250
Open
hyeygit wants to merge 1 commit into AI-Hypercomputer:main from