Fix LLVM fdiv and fptosi to Neura conversion and reorganize benchmarks #162
Merged
Commits (27)
3b19db4 Add histogram testbench files (n0thingNoob)
6da8b5c Fix conversion for llvm.fdiv and llvm.fptosi, add e2e/histogram kerne… (n0thingNoob)
40aa68f Delete test/testbench/histogram/histogram.cpp (n0thingNoob)
4b9f1c6 Delete test/testbench/histogram/histogram_kernel_neura.mlir (n0thingNoob)
3cd3901 Update test/e2e/histogram/histogram_kernel.mlir (n0thingNoob)
cdb66da Delete test/testbench/histogram/histogram_kernel.cpp (n0thingNoob)
d694f69 Delete test/testbench/histogram/histogram_kernel.ll (n0thingNoob)
1b854f3 Delete test/testbench/histogram/histogram_kernel.mlir (n0thingNoob)
4fb939a Clean up .gitmodules by removing duplicates (n0thingNoob)
71c28c3 add fir and modify LlvmToNeuraPass.cpp for llvm.fmuladd conversion (n0thingNoob)
180c3ef Add FIR kernel support and llvm.fmuladd conversion (n0thingNoob)
f70a11e Merge remote-tracking branch 'origin/testbench' (n0thingNoob)
411b066 Clean up repository: remove temporary and generated files (n0thingNoob)
9a846b1 Fix Neura_OrOp type definition to support neura.data types (n0thingNoob)
e73bf40 Remove FFT and fusion test files (n0thingNoob)
10dfd4b remove histogram.cpp (n0thingNoob)
e59f4de remove ll file (n0thingNoob)
5a4c2ff rm testbench folder (n0thingNoob)
096359f backup for fir kernel and histogram kernel (n0thingNoob)
db4e012 Use llvm extract to extract kernel from benchmarks (n0thingNoob)
ebcf014 unify the kernel name in the llvm extract command (n0thingNoob)
d4314ae add issue link to mlir file (n0thingNoob)
a83fe7f Fix GitHub CI: Add LLVM tools to PATH for llvm-extract (n0thingNoob)
811a674 Add TODO, remove redundant file (n0thingNoob)
6f683fa rm extra file (n0thingNoob)
bf45e98 upload gitignore and remove the unnecessary mlir.llvm in test (n0thingNoob)
3176f6f rm adding the build/bin to PATH (n0thingNoob)
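The headline fix is the lowering of llvm.fdiv and llvm.fptosi in LlvmToNeuraPass.cpp. A minimal before/after sketch of the intended mapping follows; the op names neura.fdiv and neura.cast come from the FileCheck lines in the histogram test below, while the generic-op syntax, the cast_type attribute, and the value names are illustrative assumptions:

```mlir
// Before (LLVM dialect, as it appears in the histogram kernel below;
// value names are illustrative):
%q = llvm.fdiv %num, %den : f32
%i = llvm.fptosi %q : f32 to i32

// After (hedged sketch of what the pass is expected to emit; only the
// op names are confirmed by the tests, the attribute is an assumption):
%q2 = "neura.fdiv"(%num, %den) : (f32, f32) -> f32
%i2 = "neura.cast"(%q2) {cast_type = "fptosi"} : (f32) -> i32
```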
.gitmodules
@@ -1,3 +1,3 @@
-[submodule "test/CGRA-Bench"]
-	path = test/CGRA-Bench
-	url = https://github.com/tancheng/CGRA-Bench
+[submodule "test/benchmark/CGRA-Bench"]
+	path = test/benchmark/CGRA-Bench
+	url = https://github.com/tancheng/CGRA-Bench.git
Submodule CGRA-Bench updated from 000000 to f75e27
New file (FIR kernel test):
@@ -0,0 +1,89 @@
// Compiles the original C kernel to mlir, then lowers it via Neura.
// TODO: Got error when using -O3 -fno-vectorize -fno-slp-vectorize -mllvm -force-vector-width=1
// Issue: https://github.com/coredac/dataflow/issues/164
// RUN: clang++ -S -emit-llvm -O0 -o %t-kernel-full.ll %S/../../benchmark/CGRA-Bench/kernels/fir/fir.cpp
// RUN: llvm-extract --rfunc=".*kernel.*" %t-kernel-full.ll -o %t-kernel-only.ll
// RUN: mlir-translate --import-llvm %t-kernel-only.ll -o %t-kernel.mlir

// RUN: mlir-neura-opt %t-kernel.mlir \
// RUN: --assign-accelerator \
// RUN: --lower-llvm-to-neura \
// RUN: --canonicalize-live-in \
// RUN: --leverage-predicated-value \
// RUN: --transform-ctrl-to-data-flow \
// RUN: --promote-func-arg-to-const \
// RUN: --insert-data-mov \
// RUN: --map-to-accelerator="mapping-strategy=heuristic" \
// RUN: --architecture-spec=../../arch_spec/architecture.yaml \
// RUN: --generate-code -o %t-mapping.mlir
// RUN: FileCheck %s --input-file=%t-mapping.mlir -check-prefix=MAPPING
// RUN: FileCheck %s --input-file=tmp-generated-instructions.yaml --check-prefix=YAML
// RUN: FileCheck %s --input-file=tmp-generated-instructions.asm --check-prefix=ASM

#loop_annotation = #llvm.loop_annotation<mustProgress = true>
module attributes {dlti.dl_spec = #dlti.dl_spec<!llvm.ptr<271> = dense<32> : vector<4xi64>, !llvm.ptr<272> = dense<64> : vector<4xi64>, f128 = dense<128> : vector<2xi64>, f16 = dense<16> : vector<2xi64>, !llvm.ptr<270> = dense<32> : vector<4xi64>, i128 = dense<128> : vector<2xi64>, f64 = dense<64> : vector<2xi64>, i64 = dense<64> : vector<2xi64>, !llvm.ptr = dense<64> : vector<4xi64>, i1 = dense<8> : vector<2xi64>, f80 = dense<128> : vector<2xi64>, i32 = dense<32> : vector<2xi64>, i8 = dense<8> : vector<2xi64>, i16 = dense<16> : vector<2xi64>, "dlti.stack_alignment" = 128 : i64, "dlti.endianness" = "little">, llvm.ident = "clang version 20.1.7 (https://github.com/llvm/llvm-project.git 6146a88f60492b520a36f8f8f3231e15f3cc6082)"} {
  llvm.func @_Z6kernelPfS_S_(%arg0: !llvm.ptr {llvm.noundef}, %arg1: !llvm.ptr {llvm.noundef}, %arg2: !llvm.ptr {llvm.noundef}) attributes {frame_pointer = #llvm.framePointerKind<all>, no_inline, no_unwind, optimize_none, passthrough = ["mustprogress", ["uwtable", "2"], ["min-legal-vector-width", "0"], ["no-trapping-math", "true"], ["stack-protector-buffer-size", "8"], ["target-cpu", "x86-64"]], target_cpu = "x86-64", target_features = #llvm.target_features<["+cmov", "+cx8", "+fxsr", "+mmx", "+sse", "+sse2", "+x87"]>, tune_cpu = "generic"} {
    %0 = llvm.mlir.constant(1 : i32) : i32
    %1 = llvm.mlir.constant(0.000000e+00 : f32) : f32
    %2 = llvm.mlir.constant(0 : i32) : i32
    %3 = llvm.mlir.constant(32 : i32) : i32
    %4 = llvm.mlir.constant(0 : i64) : i64
    %5 = llvm.alloca %0 x !llvm.ptr {alignment = 8 : i64} : (i32) -> !llvm.ptr
    %6 = llvm.alloca %0 x !llvm.ptr {alignment = 8 : i64} : (i32) -> !llvm.ptr
    %7 = llvm.alloca %0 x !llvm.ptr {alignment = 8 : i64} : (i32) -> !llvm.ptr
    %8 = llvm.alloca %0 x i32 {alignment = 4 : i64} : (i32) -> !llvm.ptr
    %9 = llvm.alloca %0 x f32 {alignment = 4 : i64} : (i32) -> !llvm.ptr
    llvm.store %arg0, %5 {alignment = 8 : i64} : !llvm.ptr, !llvm.ptr
    llvm.store %arg1, %6 {alignment = 8 : i64} : !llvm.ptr, !llvm.ptr
    llvm.store %arg2, %7 {alignment = 8 : i64} : !llvm.ptr, !llvm.ptr
    llvm.store %1, %9 {alignment = 4 : i64} : f32, !llvm.ptr
    llvm.store %2, %8 {alignment = 4 : i64} : i32, !llvm.ptr
    llvm.br ^bb1
  ^bb1:  // 2 preds: ^bb0, ^bb3
    %10 = llvm.load %8 {alignment = 4 : i64} : !llvm.ptr -> i32
    %11 = llvm.icmp "slt" %10, %3 : i32
    llvm.cond_br %11, ^bb2, ^bb4
  ^bb2:  // pred: ^bb1
    %12 = llvm.load %5 {alignment = 8 : i64} : !llvm.ptr -> !llvm.ptr
    %13 = llvm.load %8 {alignment = 4 : i64} : !llvm.ptr -> i32
    %14 = llvm.sext %13 : i32 to i64
    %15 = llvm.getelementptr inbounds %12[%14] : (!llvm.ptr, i64) -> !llvm.ptr, f32
    %16 = llvm.load %15 {alignment = 4 : i64} : !llvm.ptr -> f32
    %17 = llvm.load %7 {alignment = 8 : i64} : !llvm.ptr -> !llvm.ptr
    %18 = llvm.load %8 {alignment = 4 : i64} : !llvm.ptr -> i32
    %19 = llvm.sext %18 : i32 to i64
    %20 = llvm.getelementptr inbounds %17[%19] : (!llvm.ptr, i64) -> !llvm.ptr, f32
    %21 = llvm.load %20 {alignment = 4 : i64} : !llvm.ptr -> f32
    %22 = llvm.load %9 {alignment = 4 : i64} : !llvm.ptr -> f32
    %23 = llvm.intr.fmuladd(%16, %21, %22) : (f32, f32, f32) -> f32
    llvm.store %23, %9 {alignment = 4 : i64} : f32, !llvm.ptr
    llvm.br ^bb3
  ^bb3:  // pred: ^bb2
    %24 = llvm.load %8 {alignment = 4 : i64} : !llvm.ptr -> i32
    %25 = llvm.add %24, %0 overflow<nsw> : i32
    llvm.store %25, %8 {alignment = 4 : i64} : i32, !llvm.ptr
    llvm.br ^bb1 {loop_annotation = #loop_annotation}
  ^bb4:  // pred: ^bb1
    %26 = llvm.load %9 {alignment = 4 : i64} : !llvm.ptr -> f32
    %27 = llvm.load %6 {alignment = 8 : i64} : !llvm.ptr -> !llvm.ptr
    %28 = llvm.getelementptr inbounds %27[%4] : (!llvm.ptr, i64) -> !llvm.ptr, f32
    llvm.store %26, %28 {alignment = 4 : i64} : f32, !llvm.ptr
    llvm.return
  }
}
// MAPPING: module
// MAPPING: func @_Z6kernelPfS_S_
// MAPPING: neura.constant
// MAPPING: neura.fmul_fadd
// MAPPING: neura.load
// MAPPING: neura.store

// YAML: instructions:
// YAML: - opcode: "CONSTANT"
// YAML: - opcode: "FMUL_FADD"
// YAML: - opcode: "LOAD"
// YAML: - opcode: "STORE"

// ASM: PE(0,0):
// ASM: CONSTANT
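The llvm.intr.fmuladd conversion exercised by this test follows the same route as the fdiv/fptosi fix; a minimal sketch (neura.fmul_fadd is the op name the MAPPING and YAML checks expect, while the generic-op syntax and value names are illustrative assumptions):

```mlir
// Before (LLVM dialect, matching the %23 line in the kernel above;
// value names are illustrative):
%acc1 = llvm.intr.fmuladd(%x, %c, %acc0) : (f32, f32, f32) -> f32

// After (hedged sketch of the converted fused multiply-add):
%acc2 = "neura.fmul_fadd"(%x, %c, %acc0) : (f32, f32, f32) -> f32
```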
New file (histogram kernel test):
@@ -0,0 +1,85 @@
// Compiles the original C kernel to mlir, then lowers it via Neura.
// TODO: Got error when using -O3 -fno-vectorize -fno-slp-vectorize -mllvm -force-vector-width=1
// Issue: https://github.com/coredac/dataflow/issues/164
// RUN: clang++ -S -emit-llvm -O2 -o %t-kernel-full.ll %S/../../benchmark/CGRA-Bench/kernels/histogram/histogram.cpp
// RUN: llvm-extract --rfunc=".*kernel.*" %t-kernel-full.ll -o %t-kernel-only.ll
// RUN: mlir-translate --import-llvm %t-kernel-only.ll -o %t-kernel.mlir

// RUN: mlir-neura-opt %t-kernel.mlir \
// RUN: --assign-accelerator \
// RUN: --lower-llvm-to-neura \
// RUN: --canonicalize-live-in \
// RUN: --leverage-predicated-value \
// RUN: --transform-ctrl-to-data-flow \
// RUN: --promote-func-arg-to-const \
// RUN: --insert-data-mov \
// RUN: --map-to-accelerator="mapping-strategy=heuristic" \
// RUN: --architecture-spec=%S/../../arch_spec/architecture.yaml \
// RUN: --generate-code -o %t-mapping.mlir
// RUN: FileCheck %s --input-file=%t-mapping.mlir -check-prefix=MAPPING
// RUN: FileCheck %s --input-file=tmp-generated-instructions.yaml --check-prefix=YAML
// RUN: FileCheck %s --input-file=tmp-generated-instructions.asm --check-prefix=ASM

#loop_annotation = #llvm.loop_annotation<mustProgress = true>
#tbaa_root = #llvm.tbaa_root<id = "Simple C++ TBAA">
#tbaa_type_desc = #llvm.tbaa_type_desc<id = "omnipotent char", members = {<#tbaa_root, 0>}>
#tbaa_type_desc1 = #llvm.tbaa_type_desc<id = "float", members = {<#tbaa_type_desc, 0>}>
#tbaa_type_desc2 = #llvm.tbaa_type_desc<id = "int", members = {<#tbaa_type_desc, 0>}>
#tbaa_tag = #llvm.tbaa_tag<base_type = #tbaa_type_desc1, access_type = #tbaa_type_desc1, offset = 0>
#tbaa_tag1 = #llvm.tbaa_tag<base_type = #tbaa_type_desc2, access_type = #tbaa_type_desc2, offset = 0>
module attributes {dlti.dl_spec = #dlti.dl_spec<f80 = dense<128> : vector<2xi64>, !llvm.ptr = dense<64> : vector<4xi64>, !llvm.ptr<272> = dense<64> : vector<4xi64>, !llvm.ptr<271> = dense<32> : vector<4xi64>, i128 = dense<128> : vector<2xi64>, i64 = dense<64> : vector<2xi64>, f16 = dense<16> : vector<2xi64>, i32 = dense<32> : vector<2xi64>, f128 = dense<128> : vector<2xi64>, !llvm.ptr<270> = dense<32> : vector<4xi64>, f64 = dense<64> : vector<2xi64>, i1 = dense<8> : vector<2xi64>, i16 = dense<16> : vector<2xi64>, i8 = dense<8> : vector<2xi64>, "dlti.stack_alignment" = 128 : i64, "dlti.endianness" = "little">, llvm.ident = "clang version 20.1.7 (https://github.com/llvm/llvm-project.git 6146a88f60492b520a36f8f8f3231e15f3cc6082)"} {
  llvm.func local_unnamed_addr @_Z6kernelPfPi(%arg0: !llvm.ptr {llvm.nocapture, llvm.noundef, llvm.readonly}, %arg1: !llvm.ptr {llvm.nocapture, llvm.noundef}) attributes {memory_effects = #llvm.memory_effects<other = none, argMem = readwrite, inaccessibleMem = none>, no_unwind, passthrough = ["mustprogress", "nofree", "norecurse", "nosync", ["uwtable", "2"], ["min-legal-vector-width", "0"], ["no-trapping-math", "true"], ["stack-protector-buffer-size", "8"], ["target-cpu", "x86-64"]], target_cpu = "x86-64", target_features = #llvm.target_features<["+cmov", "+cx8", "+fxsr", "+mmx", "+sse", "+sse2", "+x87"]>, tune_cpu = "generic"} {
    %0 = llvm.mlir.constant(0 : i64) : i64
    %1 = llvm.mlir.constant(-1.000000e+00 : f32) : f32
    %2 = llvm.mlir.constant(5.000000e+00 : f32) : f32
    %3 = llvm.mlir.constant(1.800000e+01 : f32) : f32
    %4 = llvm.mlir.constant(1 : i32) : i32
    %5 = llvm.mlir.constant(1 : i64) : i64
    %6 = llvm.mlir.constant(2 : i64) : i64
    %7 = llvm.mlir.constant(20 : i64) : i64
    llvm.br ^bb1(%0 : i64)
  ^bb1(%8: i64):  // 2 preds: ^bb0, ^bb1
    %9 = llvm.getelementptr inbounds %arg0[%8] : (!llvm.ptr, i64) -> !llvm.ptr, f32
    %10 = llvm.load %9 {alignment = 4 : i64, tbaa = [#tbaa_tag]} : !llvm.ptr -> f32
    %11 = llvm.fadd %10, %1 : f32
    %12 = llvm.fmul %11, %2 : f32
    %13 = llvm.fdiv %12, %3 : f32
    %14 = llvm.fptosi %13 : f32 to i32
    %15 = llvm.sext %14 : i32 to i64
    %16 = llvm.getelementptr inbounds %arg1[%15] : (!llvm.ptr, i64) -> !llvm.ptr, i32
    %17 = llvm.load %16 {alignment = 4 : i64, tbaa = [#tbaa_tag1]} : !llvm.ptr -> i32
    %18 = llvm.add %17, %4 overflow<nsw> : i32
    llvm.store %18, %16 {alignment = 4 : i64, tbaa = [#tbaa_tag1]} : i32, !llvm.ptr
    %19 = llvm.or disjoint %8, %5 : i64
    %20 = llvm.getelementptr inbounds %arg0[%19] : (!llvm.ptr, i64) -> !llvm.ptr, f32
    %21 = llvm.load %20 {alignment = 4 : i64, tbaa = [#tbaa_tag]} : !llvm.ptr -> f32
    %22 = llvm.fadd %21, %1 : f32
    %23 = llvm.fmul %22, %2 : f32
    %24 = llvm.fdiv %23, %3 : f32
    %25 = llvm.fptosi %24 : f32 to i32
    %26 = llvm.sext %25 : i32 to i64
    %27 = llvm.getelementptr inbounds %arg1[%26] : (!llvm.ptr, i64) -> !llvm.ptr, i32
    %28 = llvm.load %27 {alignment = 4 : i64, tbaa = [#tbaa_tag1]} : !llvm.ptr -> i32
    %29 = llvm.add %28, %4 overflow<nsw> : i32
    llvm.store %29, %27 {alignment = 4 : i64, tbaa = [#tbaa_tag1]} : i32, !llvm.ptr
    %30 = llvm.add %8, %6 overflow<nsw, nuw> : i64
    %31 = llvm.icmp "eq" %30, %7 : i64
    llvm.cond_br %31, ^bb2, ^bb1(%30 : i64) {loop_annotation = #loop_annotation}
  ^bb2:  // pred: ^bb1
    llvm.return
  }
}
// MAPPING: module
// MAPPING: func @_Z6kernelPfPi
// MAPPING: neura.constant
// MAPPING: neura.fdiv
// MAPPING: neura.cast

// YAML: instructions:
// YAML: - opcode: "CONSTANT"
// YAML: - opcode: "FDIV"
// YAML: - opcode: "CAST"

// ASM: PE(0,0):
// ASM: CONSTANT
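For intuition on the ops these checks pin down: the kernel bins each sample by computing (a[i] - 1.0) * 5.0 / 18.0 and truncating. A sample of 10.0, for example, gives (10.0 - 1.0) * 5.0 / 18.0 = 2.5; llvm.fptosi truncates toward zero, so the count at index 2 is incremented. The neura.fdiv and neura.cast checks above cover exactly the division and float-to-int steps of that path.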
(The diff also includes 8 empty files.)