Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial gfx1036 and gfx1103 support #111

Merged
merged 2 commits into from
Jul 17, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion binfo/040_02_onnxruntime_deepspeed.binfo
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ BINFO_APP_PRE_CONFIG_CMD_ARRAY=(

BINFO_APP_BUILD_CMD_ARRAY=(
"cd ${BINFO_APP_SRC_DIR}"
"./build_rocm.sh \"${SEMICOLON_SEPARATED_GPU_TARGET_LIST_DEFAULT}\""
"./build_rocm.sh ${SEMICOLON_SEPARATED_GPU_TARGET_LIST_DEFAULT}"
)

BINFO_APP_INSTALL_CMD_ARRAY=(
Expand Down
4 changes: 2 additions & 2 deletions binfo/user_config.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
# Function to select ROCM SDK build target GPUs
func_build_cfg_user() {
local message="Select ROCM SDK build target GPUs. Space to select, Enter to finish save, ESC to cancel."
local options="gfx906|gfx90a|gfx940|gfx1010|gfx1011|gfx1012|gfx1030|gfx1031|gfx1035|gfx1100|gfx1101|gfx1102|gfx1150|gfx1151"
local options="gfx906|gfx90a|gfx940|gfx1010|gfx1011|gfx1012|gfx1030|gfx1031|gfx1035|gfx1036|gfx1100|gfx1101|gfx1102|gfx1103|gfx1150|gfx1151"
local script_path="./build/checkbox.sh"

# Check if the script exists and is executable
Expand All @@ -14,4 +14,4 @@ func_build_cfg_user() {

# Execute the checkbox script with the specified parameters
"$script_path" --message="$message" --options="$options" --multiple
}
}
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
From 108755dda9e284382bd0b63e5e351cba9f7abb16 Mon Sep 17 00:00:00 2001
From 3d237f904f4f74618da216179f9279fd027b05e8 Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Mon, 20 May 2024 22:36:23 -0700
Subject: [PATCH 1/4] deepspeed rocm preconfig and build_install scripts
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
From 40f44ac1ca3cc73851bfb7c3016276894022eb4f Mon Sep 17 00:00:00 2001
From ff7f27c49a5ac7ac31f7ebc70e02c18f3aa0ac91 Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Tue, 21 May 2024 07:57:53 -0700
Subject: [PATCH 2/4] check rocm path from installed pytorch variables
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
From 59fe85c08a0900acaf07bd2aa506853af3fbd83e Mon Sep 17 00:00:00 2001
From e423db20d2eb75934214c4487fb4193f3df389a9 Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Wed, 26 Jun 2024 14:44:04 -0700
Subject: [PATCH 3/4] allow building deepspeed for rocm in virtual linux
Expand All @@ -10,24 +10,26 @@ fixes: https://github.com/lamikr/rocm_sdk_builder/issues/75

Signed-off-by: Mika Laitio <[email protected]>
---
build_rocm.sh | 2 ++
build_rocm.sh | 4 +++-
op_builder/builder.py | 2 +-
setup.py | 18 ++++++++++--------
3 files changed, 13 insertions(+), 9 deletions(-)
3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/build_rocm.sh b/build_rocm.sh
index f4a75855..3f3ce526 100755
index f4a75855..b48a0ad9 100755
--- a/build_rocm.sh
+++ b/build_rocm.sh
@@ -15,6 +15,8 @@ mkdir -p deepspeed/ops/spatial
@@ -15,7 +15,9 @@ mkdir -p deepspeed/ops/spatial
#export CFLAGS="-I/usr/include"
#export LDFLAGS="-L/usr/lib64"

+# needed by real accelerator.py to detect the cuda when build on virtual linux without access to real hardware
+export DS_ACCELERATOR=cuda
# install command will create wheel and install it. bdist_wheel comamnd would only create the wheel
AMDGPU_TARGETS=${amd_target_gpu} DS_BUILD_AIO=0 DS_BUILD_FP_QUANTIZER=0 DS_BUILD_QUANTIZER=0 DS_BUILD_SPARSE_ATTN=0 DS_BUILD_RAGGED_DEVICE_OPS=0 DS_BUILD_CUTLASS_OPS=0 DS_BUILD_EVOFORMER_ATTN=0 DS_BUILD_OPS=1 python setup.py bdist_wheel
-AMDGPU_TARGETS=${amd_target_gpu} DS_BUILD_AIO=0 DS_BUILD_FP_QUANTIZER=0 DS_BUILD_QUANTIZER=0 DS_BUILD_SPARSE_ATTN=0 DS_BUILD_RAGGED_DEVICE_OPS=0 DS_BUILD_CUTLASS_OPS=0 DS_BUILD_EVOFORMER_ATTN=0 DS_BUILD_OPS=1 python setup.py bdist_wheel
+AMDGPU_TARGETS="${amd_target_gpu}" DS_BUILD_AIO=0 DS_BUILD_FP_QUANTIZER=0 DS_BUILD_QUANTIZER=0 DS_BUILD_SPARSE_ATTN=0 DS_BUILD_RAGGED_DEVICE_OPS=0 DS_BUILD_CUTLASS_OPS=0 DS_BUILD_EVOFORMER_ATTN=0 DS_BUILD_OPS=1 python setup.py bdist_wheel

#DS_BUILD_UTILS=1 DS_BUILD_CPU_ADAGRAD=1 DS_BUILD_RANDOM_LTD=1 DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 DS_BUILD_FUSED_LAMB=1 DS_BUILD_CCL_COMM=1 python setup.py develop
diff --git a/op_builder/builder.py b/op_builder/builder.py
index a27b134c..4980a528 100644
--- a/op_builder/builder.py
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
From bae0b1c89880e23f8ec885eba5f18043726319a0 Mon Sep 17 00:00:00 2001
From 44b7a213e104e7f7b9e6f9b7e63f5eb8ff7be88f Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Tue, 21 May 2024 11:41:20 -0700
Subject: [PATCH 4/4] remove linear_kernel which fails on rocm
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
From 915a3c074ff1c3694566fd74f2e83003a1253104 Mon Sep 17 00:00:00 2001
From 7b51b6d06842de4851b6c20b61dda2ead78e68ff Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Wed, 8 May 2024 13:48:21 -0700
Subject: [PATCH 1/4] Do not fail on install for missing kdb.bz2 file
Subject: [PATCH 1/5] Do not fail on install for missing kdb.bz2 file

Do not fail if kdb.bz2 file for some GPU does not exist because
these kdb files are not mandatory. Their function is to speed up the
Expand Down Expand Up @@ -43,5 +43,5 @@ index 32d9a2e5b..d6c2db704 100644
endif()

--
2.41.1
2.45.2

Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
From ecb981cd1b66749186404fa76c56237c758953b5 Mon Sep 17 00:00:00 2001
From fdc0e268b2edbb0a57af8e8820b388816af38471 Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Fri, 3 May 2024 14:15:09 -0700
Subject: [PATCH 2/4] fix libroctx64.so linking error
Subject: [PATCH 2/5] fix libroctx64.so linking error

search the library and if found link it from there
instead of expecting it to be in the ld library path
Expand Down Expand Up @@ -31,5 +31,5 @@ index 0741a6023..ae4405eed 100644

############################################################
--
2.41.1
2.45.2

Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
From 6578a68e3226e97716aad12d445632358f2a463e Mon Sep 17 00:00:00 2001
From 4b65dfcb0208bfe1eb64c474fb97e31e2b8bf60f Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Tue, 19 Dec 2023 15:13:46 -0800
Subject: [PATCH 3/4] MIOpen gfx1010 and gfx1035 support
Subject: [PATCH 3/5] MIOpen gfx1010 and gfx1035 support

- todo: check gfx1010 specific parts

Expand Down Expand Up @@ -174,5 +174,5 @@ index 16ce78f04..2ec3eaf09 100644
"gfx1101",
"gfx1102"};
--
2.41.1
2.45.2

6 changes: 3 additions & 3 deletions patches/rocm-6.1.2/MIOpen/0004-improved-gfx1010-support.patch
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
From 5e7803271cbbe475da352ab188f09b345006d9c0 Mon Sep 17 00:00:00 2001
From 3c0552223df66774e8f3613826a6094939763cce Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Mon, 8 Jul 2024 21:44:10 +0300
Subject: [PATCH 4/4] improved gfx1010 support
Subject: [PATCH 4/5] improved gfx1010 support

- allows running pytorch gpu benchmark
on gfx1010/amd rx 5700
Expand Down Expand Up @@ -135,5 +135,5 @@ index bf02d4d55..c3fa2bd3a 100644
{"Rembrandt", "gfx1035"},
};
--
2.41.1
2.45.2

119 changes: 119 additions & 0 deletions patches/rocm-6.1.2/MIOpen/0005-gfx1036-and-gfx1103-support.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
From 08071937d4c2c34f619ed5b49bd0ced4805875fa Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Sat, 13 Jul 2024 21:07:11 -0400
Subject: [PATCH 5/5] gfx1036 and gfx1103 support

Signed-off-by: Mika Laitio <[email protected]>
---
.../composable_kernel/include/utility/config.hpp | 15 ++++++++-------
src/include/miopen/solver/ck_utility_common.hpp | 8 +++++++-
src/target_properties.cpp | 14 ++++++++++++--
3 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/src/composable_kernel/composable_kernel/include/utility/config.hpp b/src/composable_kernel/composable_kernel/include/utility/config.hpp
index 5957a79d8..6ca920b5e 100644
--- a/src/composable_kernel/composable_kernel/include/utility/config.hpp
+++ b/src/composable_kernel/composable_kernel/include/utility/config.hpp
@@ -16,8 +16,8 @@
#if !(defined(CK_AMD_GPU_GFX803) || defined(CK_AMD_GPU_GFX900) || defined(CK_AMD_GPU_GFX906) || \
defined(CK_AMD_GPU_GFX940) || defined(CK_AMD_GPU_GFX908) || defined(CK_AMD_GPU_GFX90A) || \
defined(CK_AMD_GPU_GFX941) || defined(CK_AMD_GPU_GFX942) || defined(CK_AMD_GPU_GFX1010) || defined(CK_AMD_GPU_GFX1030) || \
- defined(CK_AMD_GPU_GFX1031) || defined(CK_AMD_GPU_GFX1035) || defined(CK_AMD_GPU_GFX1100) || defined(CK_AMD_GPU_GFX1101) || \
- defined(CK_AMD_GPU_GFX1102))
+ defined(CK_AMD_GPU_GFX1031) || defined(CK_AMD_GPU_GFX1035) || defined(CK_AMD_GPU_GFX1036) || defined(CK_AMD_GPU_GFX1100) || defined(CK_AMD_GPU_GFX1101) || \
+ defined(CK_AMD_GPU_GFX1102) || defined(CK_AMD_GPU_GFX1103))
#error Need to define (only) one GPU target
#endif

@@ -29,14 +29,15 @@
#define CK_MIN_BLOCK_PER_CU 2
#endif

-// TODO: gfx1010 check CK_BUFFER_RESOURCE_3RD_DWORD
+// TODO: composable_kernel has differend CK_BUFFER_RESOURCE_3RD_DWORD for gfx110* devices
// buffer resourse
#if defined(CK_AMD_GPU_GFX803) || defined(CK_AMD_GPU_GFX900) || defined(CK_AMD_GPU_GFX906) || \
defined(CK_AMD_GPU_GFX941) || defined(CK_AMD_GPU_GFX942) || defined(CK_AMD_GPU_GFX940) || \
- defined(CK_AMD_GPU_GFX908) || defined(CK_AMD_GPU_GFX90A) || defined(CK_AMD_GPU_GFX1010)
+ defined(CK_AMD_GPU_GFX908) || defined(CK_AMD_GPU_GFX90A)
#define CK_BUFFER_RESOURCE_3RD_DWORD 0x00020000
-#elif defined(CK_AMD_GPU_GFX1030) || defined(CK_AMD_GPU_GFX1031) || defined(CK_AMD_GPU_GFX1035) || defined(CK_AMD_GPU_GFX1100) || \
- defined(CK_AMD_GPU_GFX1101) || defined(CK_AMD_GPU_GFX1102)
+#elif defined(CK_AMD_GPU_GFX1010) || defined(CK_AMD_GPU_GFX1030) || defined(CK_AMD_GPU_GFX1031) || \
+ defined(CK_AMD_GPU_GFX1035) || defined(CK_AMD_GPU_GFX1036) || \
+ defined(CK_AMD_GPU_GFX1100) || defined(CK_AMD_GPU_GFX1101) || defined(CK_AMD_GPU_GFX1102) || defined(CK_AMD_GPU_GFX1103)
#define CK_BUFFER_RESOURCE_3RD_DWORD 0x31014000
#endif

@@ -49,7 +50,7 @@
#elif defined(CK_AMD_GPU_GFX906) || defined(CK_AMD_GPU_GFX908) || defined(CK_AMD_GPU_GFX90a) || \
defined(CK_AMD_GPU_GFX941) || defined(CK_AMD_GPU_GFX942) || defined(CK_AMD_GPU_GFX940) || \
defined(CK_AMD_GPU_GFX1030) || defined(CK_AMD_GPU_GFX1031) || defined(CK_AMD_GPU_GFX1100) || \
- defined(CK_AMD_GPU_GFX1101) || defined(CK_AMD_GPU_GFX1102)
+ defined(CK_AMD_GPU_GFX1101) || defined(CK_AMD_GPU_GFX1102) || defined(CK_AMD_GPU_GFX1103)
#define CK_USE_AMD_V_FMAC_F32
#define CK_USE_AMD_V_DOT2_F32_F16
#define CK_USE_AMD_V_DOT4_I32_I8
diff --git a/src/include/miopen/solver/ck_utility_common.hpp b/src/include/miopen/solver/ck_utility_common.hpp
index aea036066..ea5629871 100644
--- a/src/include/miopen/solver/ck_utility_common.hpp
+++ b/src/include/miopen/solver/ck_utility_common.hpp
@@ -61,9 +61,11 @@ static inline bool is_ck_supported_hardware(const Handle& handle)
StartsWith(handle.GetDeviceName(), "gfx1030") ||
StartsWith(handle.GetDeviceName(), "gfx1031") ||
StartsWith(handle.GetDeviceName(), "gfx1035") ||
+ StartsWith(handle.GetDeviceName(), "gfx1036") ||
StartsWith(handle.GetDeviceName(), "gfx1100") ||
StartsWith(handle.GetDeviceName(), "gfx1101") ||
- StartsWith(handle.GetDeviceName(), "gfx1102");
+ StartsWith(handle.GetDeviceName(), "gfx1102") ||
+ StartsWith(handle.GetDeviceName(), "gfx1103");
}

// MI100 : gfx908
@@ -121,12 +123,16 @@ static inline auto get_ck_common_compiler_flag(const Handle& handle)
compiler_flag << " -DCK_AMD_GPU_GFX1031";
else if(StartsWith(device_name, "gfx1035"))
compiler_flag << " -DCK_AMD_GPU_GFX1035";
+ else if(StartsWith(device_name, "gfx1036"))
+ compiler_flag << " -DCK_AMD_GPU_GFX1036";
else if(StartsWith(device_name, "gfx1100"))
compiler_flag << " -DCK_AMD_GPU_GFX1100";
else if(StartsWith(device_name, "gfx1101"))
compiler_flag << " -DCK_AMD_GPU_GFX1101";
else if(StartsWith(device_name, "gfx1102"))
compiler_flag << " -DCK_AMD_GPU_GFX1102";
+ else if(StartsWith(device_name, "gfx1103"))
+ compiler_flag << " -DCK_AMD_GPU_GFX1103";
// NOLINTEND(*-braces-around-statements)

// buffer atomic-fadd
diff --git a/src/target_properties.cpp b/src/target_properties.cpp
index c3fa2bd3a..de979aae9 100644
--- a/src/target_properties.cpp
+++ b/src/target_properties.cpp
@@ -52,9 +52,19 @@ static std::string GetDeviceNameFromMap(const std::string& in)
{"gfx804", "gfx803"},
{"Vega10", "gfx900"},
{"gfx901", "gfx900"},
- {"Navi10", "gfx1010"},
+ {"navi10", "gfx1010"},
+ {"navi12", "gfx1011"},
+ {"navi14", "gfx1012"},
{"10.3.0 Sienna_Cichlid 18", "gfx1030"},
- {"Rembrandt", "gfx1035"},
+ {"navi22", "gfx1031"},
+ {"navi23", "gfx1032"},
+ {"navi24", "gfx1034"},
+ {"rembrandt", "gfx1035"},
+ {"raphael", "gfx1036"},
+ {"navi31", "gfx1100"},
+ {"navi32", "gfx1101"},
+ {"navi33", "gfx1102"},
+ {"phoenix", "gfx1103"},
};

const auto& dev_str = miopen::GetStringEnv(ENV(MIOPEN_DEBUG_ENFORCE_DEVICE));
--
2.45.2

Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
From dd39290f90df98f928452720e622c1497b4fd7da Mon Sep 17 00:00:00 2001
From 3e2d2f891001a8d2a8f74a46884cbec84a2fd8c8 Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Fri, 10 May 2024 20:34:13 -0700
Subject: [PATCH 1/3] Tensile fix fallback arch build
Subject: [PATCH 1/4] Tensile fix fallback arch build

fixes build error which happens if only the rx 5700 is enabled
(only GPU_BUILD_AMD_NAVI10_GFX1010=1 enabled in rocm_sdk_builder envsetup.sh)
Expand Down Expand Up @@ -61,5 +61,5 @@ index ca3ef322..9e37b4b0 100644
for arch in archs:
if arch in architectureMap:
--
2.41.0
2.45.2

Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
From 73281f46189f7012334d3b1a7e52baffade5295f Mon Sep 17 00:00:00 2001
From cbec649b52abb1e45d72a3755fa01a77cf9784e7 Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Fri, 3 May 2024 13:13:02 -0700
Subject: [PATCH 2/3] Tensile, add gfx1035 support
Subject: [PATCH 2/4] Tensile, add gfx1035 support

Signed-off-by: Mika Laitio <[email protected]>
---
Expand Down Expand Up @@ -84,5 +84,5 @@ index 6ececf1c..a89b7c39 100644
else:
printWarning("Assembler not present, asm caps loaded from cache are unverified")
--
2.41.0
2.45.2

15 changes: 8 additions & 7 deletions patches/rocm-6.1.2/Tensile/0003-llvm-path-changes.patch
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
From 3ffffba383528d997372861fb5d940d09ebe2996 Mon Sep 17 00:00:00 2001
From 80776357e8f44019675a224474e314bbb551bc8a Mon Sep 17 00:00:00 2001
From: Mika Laitio <[email protected]>
Date: Wed, 15 May 2024 21:09:56 -0700
Subject: [PATCH 3/3] llvm path changes
Subject: [PATCH 3/4] llvm path changes

Signed-off-by: Mika Laitio <[email protected]>
---
Expand All @@ -18,17 +18,18 @@ index a89b7c39..9b24aee7 100644
else:
- globalParameters["AssemblerPath"] = locateExe(os.path.join(globalParameters["ROCmPath"], "llvm/bin"), "clang++")
+ globalParameters["AssemblerPath"] = locateExe(os.path.join(globalParameters["ROCmPath"], "bin"), "clang++")

globalParameters["ROCmSMIPath"] = locateExe(globalParameters["ROCmBinPath"], "rocm-smi")

@@ -2277,7 +2277,7 @@ def assignGlobalParameters( config ):
if os.name == "nt":
globalParameters["ClangOffloadBundlerPath"] = locateExe(globalParameters["ROCmBinPath"], "clang-offload-bundler.exe")
else:
- globalParameters["ClangOffloadBundlerPath"] = locateExe(os.path.join(globalParameters["ROCmPath"], "llvm/bin"), "clang-offload-bundler")
+ globalParameters["ClangOffloadBundlerPath"] = locateExe(os.path.join(globalParameters["ROCmPath"], "bin"), "clang-offload-bundler")

if "ROCmAgentEnumeratorPath" in config:
globalParameters["ROCmAgentEnumeratorPath"] = config["ROCmAgentEnumeratorPath"]
--
2.41.0
--
2.45.2

Loading