initial rocBLAS logic files for iGPUs
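- Add initial rocBLAS logic files for the
  Rembrandt (gfx1035), Raphael (gfx1036)
  and Phoenix (gfx1103) iGPUs.
- Tested with
  https://github.com/LeiWang1999/rocblas-benchmark
  using the case std::make_tuple(8192, 8192, 8192, false, false, enable_tune).
  In the results below, f16-f16 ran about 3.6-4.3x faster and
  int8-int32 about 4.6-8.5x faster with the logic files. fp32 and
  fp16-f32 changed much less, and fp32 on gfx1103 got slower.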
- gfx1035 without logic files

Device 0: AMD Radeon Graphics
m,n,k,a_t,b_t,enable_tune,fp32 time (msec),fp16-f32 time (msec), f16-f16 time (msec), int8-int32 time (msec)
8192,8192,8192,n,n,0,912.287,814.502,854.257,865.103

- gfx1035 with logic files

Device 0: AMD Radeon Graphics
m,n,k,a_t,b_t,enable_tune,fp32 time (msec),fp16-f32 time (msec), f16-f16 time (msec), int8-int32 time (msec)
8192,8192,8192,n,n,0,652.499,834.796,237.42,189.945

- gfx1103 without logic files

Device 0: AMD Radeon 780M
m,n,k,a_t,b_t,enable_tune,fp32 time (msec),fp16-f32 time (msec), f16-f16 time (msec), int8-int32 time (msec)
8192,8192,8192,n,n,0,916.684,820.721,823.48,1018.46

- gfx1103 with logic files
ROCR_VISIBLE_DEVICES="1" ./rocblas_benchmark
Device 0: AMD Radeon 780M
m,n,k,a_t,b_t,enable_tune,fp32 time (msec),fp16-f32 time (msec), f16-f16 time (msec), int8-int32 time (msec)
8192,8192,8192,n,n,0,1346.02,634.836,193.613,119.29
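
The per-column speedups implied by the four result rows above can be worked out with a short script. This is only a sketch for reading the numbers and is not part of the commit; the dictionary layout and the names `RESULTS`, `speedups` and `report` are made up for illustration. The times are copied from the benchmark output above.

```python
# Times in msec copied from the benchmark output above, in column order:
# fp32, fp16-f32, f16-f16, int8-int32.
COLUMNS = ["fp32", "fp16-f32", "f16-f16", "int8-int32"]

# Hypothetical layout: one entry per GPU, with the run without and
# the run with the rocBLAS logic files.
RESULTS = {
    "gfx1035": {
        "without": [912.287, 814.502, 854.257, 865.103],
        "with": [652.499, 834.796, 237.42, 189.945],
    },
    "gfx1103": {
        "without": [916.684, 820.721, 823.48, 1018.46],
        "with": [1346.02, 634.836, 193.613, 119.29],
    },
}


def speedups(without, with_logic):
    """Return old_time / new_time per column (a value above 1 means faster)."""
    return [old / new for old, new in zip(without, with_logic)]


def report(results=RESULTS):
    """Map each GPU to {column: speedup}, rounded to two decimals."""
    out = {}
    for gpu, times in results.items():
        ratios = speedups(times["without"], times["with"])
        out[gpu] = {col: round(r, 2) for col, r in zip(COLUMNS, ratios)}
    return out


if __name__ == "__main__":
    for gpu, row in report().items():
        print(gpu, row)
```

A ratio below 1 in this report means the run with logic files was slower, which is the case for fp32 on gfx1103.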

Signed-off-by: Mika Laitio <[email protected]>
lamikr committed Jul 17, 2024
1 parent 44e5e79 commit ee4b0fa
Showing 5 changed files with 2,909,834 additions and 9 deletions.
@@ -1,7 +1,7 @@
-From b4e555f2d5c996b528cc13602f78012671383f3a Mon Sep 17 00:00:00 2001
+From c651124a45f972bdb57e15d55c80b34007c55240 Mon Sep 17 00:00:00 2001
 From: Mika Laitio <[email protected]>
 Date: Sat, 18 May 2024 18:17:42 -0700
-Subject: [PATCH 1/3] add mageia 9 support to install.sh
+Subject: [PATCH 1/5] add mageia 9 support to install.sh

 Signed-off-by: Mika Laitio <[email protected]>
 ---
@@ -54,5 +54,5 @@ index fc644b87..46c95775 100755
 elevate_if_not_root zypper -n --no-gpg-checks install rocblas-*.rpm
 ;;
 --
-2.45.2
+2.41.1

@@ -1,7 +1,7 @@
-From 98c87b3db281d5048524ecb0c14c33c1fac0719c Mon Sep 17 00:00:00 2001
+From 5f26b83d1d6483decbd393ebadacfaf3cabcb2f8 Mon Sep 17 00:00:00 2001
 From: Mika Laitio <[email protected]>
 Date: Sat, 18 May 2024 18:18:33 -0700
-Subject: [PATCH 2/3] add gfx1035,gfx1036 and gfx1103 to gpulist
+Subject: [PATCH 2/5] add gfx1035,gfx1036 and gfx1103 to gpulist

 Signed-off-by: Mika Laitio <[email protected]>
 ---
@@ -106,5 +106,5 @@ index 1f0349fd..073bb244 100644
 }

 --
-2.45.2
+2.41.1

@@ -1,7 +1,7 @@
-From 8e12ea9b4770c29d36ae205d55e19f085229ab1a Mon Sep 17 00:00:00 2001
+From b1b3f0d3e9ea5d4dac0e9df3930ec6e3f188f461 Mon Sep 17 00:00:00 2001
 From: Mika Laitio <[email protected]>
 Date: Sat, 18 May 2024 18:15:13 -0700
-Subject: [PATCH 3/3] OpenBLAS and BLIS library search improvements
+Subject: [PATCH 3/5] OpenBLAS and BLIS library search improvements

 - OpenBLAS and BLIS can now be found from
 rocm_sdk build by rocm sdk builder
@@ -50,5 +50,5 @@ index dc8040ea..704414b5 100755
 else() # WIN32
 set( BLAS_INCLUDE_DIR ${OPENBLAS_DIR}/include CACHE PATH "OpenBLAS library include path" )
 --
-2.45.2
+2.41.1
