Commit

test
labath committed Jul 17, 2024
2 parents 30c7cef + e093109 commit 85daf40
Showing 7,698 changed files with 344,162 additions and 175,976 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
14 changes: 7 additions & 7 deletions .github/CODEOWNERS
@@ -67,11 +67,11 @@ clang/test/AST/Interp/ @tbaederr
/mlir/include/mlir/Dialect/Linalg @dcaballe @nicolasvasilache @rengolin
/mlir/lib/Dialect/Linalg @dcaballe @nicolasvasilache @rengolin
/mlir/lib/Dialect/Linalg/Transforms/DecomposeLinalgOps.cpp @MaheshRavishankar @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp @MaheshRavishankar @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp @dcaballe @MaheshRavishankar @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/ElementwiseOpFusion.cpp @MaheshRavishankar @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/DataLayoutPropagation.cpp @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp @dcaballe @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp @banach-space @dcaballe @hanhanW @nicolasvasilache

# MemRef Dialect in MLIR.
/mlir/lib/Dialect/MemRef/Transforms/EmulateNarrowType.cpp @MaheshRavishankar @nicolasvasilache
@@ -85,10 +85,10 @@ clang/test/AST/Interp/ @tbaederr
/mlir/**/*VectorToSCF* @banach-space @dcaballe @matthias-springer @nicolasvasilache
/mlir/**/*VectorToLLVM* @banach-space @dcaballe @nicolasvasilache
/mlir/**/*X86Vector* @aartbik @dcaballe @nicolasvasilache
/mlir/include/mlir/Dialect/Vector @dcaballe @nicolasvasilache
/mlir/lib/Dialect/Vector @dcaballe @nicolasvasilache
/mlir/lib/Dialect/Vector/Transforms/* @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp @MaheshRavishankar @nicolasvasilache
/mlir/include/mlir/Dialect/Vector @banach-space @dcaballe @nicolasvasilache
/mlir/lib/Dialect/Vector @banach-space @dcaballe @nicolasvasilache
/mlir/lib/Dialect/Vector/Transforms/* @banach-space @dcaballe @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp @banach-space @dcaballe @MaheshRavishankar @nicolasvasilache
/mlir/**/*EmulateNarrowType* @dcaballe @hanhanW

# Presburger library in MLIR
23 changes: 23 additions & 0 deletions .github/new-prs-labeler.yml
@@ -775,6 +775,29 @@ backend:AArch64:
- clang/include/clang/Sema/SemaARM.h
- clang/lib/Sema/SemaARM.cpp

backend:Hexagon:
- clang/include/clang/Basic/BuiltinsHexagon*.def
- clang/include/clang/Sema/SemaHexagon.h
- clang/lib/Basic/Targets/Hexagon.*
- clang/lib/CodeGen/Targets/Hexagon.cpp
- clang/lib/Driver/ToolChains/Hexagon.*
- clang/lib/Sema/SemaHexagon.cpp
- lld/ELF/Arch/Hexagon.cpp
- lldb/source/Plugins/ABI/Hexagon/**
- lldb/source/Plugins/DynamicLoader/Hexagon-DYLD/**
- llvm/include/llvm/BinaryFormat/ELFRelocs/Hexagon.def
- llvm/include/llvm/IR/IntrinsicsHexagon*
- llvm/include/llvm/Support/Hexagon*
- llvm/lib/Support/Hexagon*
- llvm/lib/Target/Hexagon/**
- llvm/test/CodeGen/Hexagon/**
- llvm/test/CodeGen/*/Hexagon/**
- llvm/test/DebugInfo/*/Hexagon/**
- llvm/test/Transforms/*/Hexagon
- llvm/test/MC/Disassembler/Hexagon/**
- llvm/test/MC/Hexagon/**
- llvm/test/tools/llvm-objdump/ELF/Hexagon/**

backend:loongarch:
- llvm/include/llvm/IR/IntrinsicsLoongArch.td
- llvm/test/MC/LoongArch/**
8 changes: 4 additions & 4 deletions .github/workflows/libcxx-build-and-test.yaml
@@ -63,8 +63,8 @@ jobs:
cxx: [ 'clang++-19' ]
include:
- config: 'generic-gcc'
cc: 'gcc-13'
cxx: 'g++-13'
cc: 'gcc-14'
cxx: 'g++-14'
steps:
- uses: actions/checkout@v4
- name: ${{ matrix.config }}.${{ matrix.cxx }}
@@ -101,8 +101,8 @@ jobs:
cxx: [ 'clang++-19' ]
include:
- config: 'generic-gcc-cxx11'
cc: 'gcc-13'
cxx: 'g++-13'
cc: 'gcc-14'
cxx: 'g++-14'
- config: 'generic-cxx23'
cc: 'clang-17'
cxx: 'clang++-17'
12 changes: 11 additions & 1 deletion bolt/docs/CommandLineArgumentReference.md
@@ -283,6 +283,12 @@

List of functions to pad with amount of bytes

- `--print-mappings`

Print mappings in the legend, between characters/blocks and text sections
(default false).


- `--profile-format=<value>`

Format to dump profile output in aggregation mode, default is fdata
@@ -688,6 +694,10 @@

Use a modified clustering algorithm geared towards minimizing branches

- `--name-similarity-function-matching-threshold=<uint>`

Match functions using namespace and edit distance.

- `--no-inline`

Disable all inlining (overrides other inlining options)
@@ -1236,4 +1246,4 @@

- `--print-options`

Print non-default options after command line parsing
Print non-default options after command line parsing
Binary file added bolt/docs/HeatmapHeader.png
68 changes: 56 additions & 12 deletions bolt/docs/Heatmaps.md
@@ -1,9 +1,9 @@
# Code Heatmaps

BOLT has gained the ability to print code heatmaps based on
sampling-based LBR profiles generated by `perf`. The output is produced
in colored ASCII to be displayed in a color-capable terminal. It looks
something like this:
sampling-based profiles generated by `perf`, either with `LBR` data or not.
The output is produced in colored ASCII to be displayed in a color-capable
terminal. It looks something like this:

![](./Heatmap.png)

@@ -32,20 +32,64 @@ $ llvm-bolt-heatmap -p perf.data <executable>
```

By default the heatmap will be dumped to *stdout*. You can change it
with `-o <heatmapfile>` option. Each character/block in the heatmap
shows the execution data accumulated for corresponding 64 bytes of
code. You can change this granularity with a `-block-size` option.
E.g. set it to 4096 to see code usage grouped by 4K pages.
Other useful options are:
with `-o <heatmapfile>` option.

```bash
-line-size=<uint> - number of entries per line (default 256)
-max-address=<uint> - maximum address considered valid for heatmap (default 4GB)
```

If you prefer to look at the data in a browser (or would like to share
it that way), then you can use an HTML conversion tool. E.g.:

```bash
$ aha -b -f <heatmapfile> > <heatmapfile>.html
```

---

## Background on heatmaps:
A heatmap is effectively a histogram that is rendered into a grid for better
visualization.
In theory we can generate a heatmap using any binary and a perf profile.

Each block/character in the heatmap shows the execution data accumulated for
corresponding 64 bytes of code. You can change this granularity with a
`-block-size` option.
E.g. set it to 4096 to see code usage grouped by 4K pages.
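
The bucketing described here can be sketched in Python. This is only an illustration of the idea, not BOLT's implementation: the function name `bucket_samples` and the sample addresses are hypothetical, and only the 64-byte default and the 4096-byte example come from the text.

```python
from collections import Counter

def bucket_samples(addresses, block_size=64):
    """Count samples per block; each block covers block_size bytes of code."""
    # Integer division maps every address inside one block to the same index.
    return Counter(addr // block_size for addr in addresses)

# Hypothetical sampled addresses, for illustration only.
samples = [0x401000, 0x401010, 0x40103F, 0x401040, 0x402000]

fine = bucket_samples(samples)          # default granularity: 64-byte blocks
coarse = bucket_samples(samples, 4096)  # coarser view: one block per 4K page
```

With the larger block size, the first four samples collapse into a single block, which is why a page-sized granularity is useful for seeing overall code usage rather than individual hot spots.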


When a block is shown as a dot, it means that no samples were found for that
address.
When it is shown as a letter, it indicates a captured sample on a particular
text section of the binary.
To show a mapping between letters and text sections in the legend, use
`-print-mappings`.
When a sampled address does not belong to any of the text sections, the
characters 'o' or 'O' will be shown.

The legend shows by default the ranges in the heatmap according to the number
of samples per block.
A color is assigned per range, except for the first two ranges, which are
distinguished by lowercase and uppercase letters.

On the Y axis, each row/line starts with an actual address of the binary.
Consecutive lines in the heatmap advance by the same amount, with the binary
size covered by a line dependent on the block size and the line size.
An empty new line is inserted for larger gaps between samples.
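
As a rough worked example of the row arithmetic, the sketch below computes how many bytes one line covers and the addresses that consecutive rows would start at. The helper names and the first-row address are hypothetical, and the sketch ignores the empty lines inserted for gaps; the defaults (64-byte blocks, 256 entries per line) are the ones documented here.

```python
def bytes_per_line(block_size=64, line_size=256):
    """Bytes of the binary covered by one heatmap line."""
    return block_size * line_size

def row_addresses(first_row, rows, block_size=64, line_size=256):
    """Start addresses of consecutive rows, assuming no gaps between them."""
    span = bytes_per_line(block_size, line_size)
    return [first_row + i * span for i in range(rows)]

default_span = bytes_per_line()        # 64 * 256 bytes per line
narrow_span = bytes_per_line(64, 128)  # default block size, line size 128

# Hypothetical first-row address, for illustration only.
rows = [hex(a) for a in row_addresses(0x400000, 3)]
```

With the defaults, each row advances by 16384 bytes (0x4000); halving the line size to 128 halves that to 8192 bytes.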

On the X axis, the horizontally emitted hex numbers can help *estimate* where
in the line the samples lie, but they cannot be combined to provide a full
address, as they are relative to both the bucket and line sizes.

In the example below, the highlighted `0x100` column is not an offset to each
row's address, but instead, it points to the middle of the line.
For the generation, the default bucket size was used with a line size of 128.


![](./HeatmapHeader.png)


Some useful options are:

```
-line-size=<uint> - number of entries per line (default 256)
-max-address=<uint> - maximum address considered valid for heatmap (default 4GB)
-print-mappings - print mappings in the legend, between characters/blocks and text sections (default false)
```
5 changes: 5 additions & 0 deletions bolt/docs/OptimizingLinux.md
@@ -44,6 +44,11 @@ $ perf2bolt -p perf.data -o perf.fdata vmlinux

Under a high load, `perf.data` should be several gigabytes in size and you should expect the converted `perf.fdata` not to exceed 100 MB.

Profiles collected from multiple workloads can be joined into a single profile using the `merge-fdata` utility:
```bash
$ merge-fdata perf.1.fdata perf.2.fdata ... perf.<N>.fdata > perf.merged.fdata
```

Two changes are required for the kernel build. The first one is optional but highly recommended. It introduces a BOLT-reserved space into `vmlinux` code section:


9 changes: 0 additions & 9 deletions bolt/include/bolt/Core/BinaryBasicBlock.h
@@ -842,15 +842,6 @@ class BinaryBasicBlock {
bool analyzeBranch(const MCSymbol *&TBB, const MCSymbol *&FBB,
MCInst *&CondBranch, MCInst *&UncondBranch);

/// Return true if iterator \p I is pointing to the first instruction in
/// a pair that could be macro-fused.
bool isMacroOpFusionPair(const_iterator I) const;

/// If the basic block has a pair of instructions suitable for macro-fusion,
/// return iterator to the first instruction of the pair.
/// Otherwise return end().
const_iterator getMacroOpFusionPair() const;

/// Printer required for printing dominator trees.
void printAsOperand(raw_ostream &OS, bool PrintType = true) {
if (PrintType)
4 changes: 0 additions & 4 deletions bolt/include/bolt/Core/BinaryContext.h
@@ -698,10 +698,6 @@ class BinaryContext {

/// Binary-wide aggregated stats.
struct BinaryStats {
/// Stats for macro-fusion.
uint64_t MissedMacroFusionPairs{0};
uint64_t MissedMacroFusionExecCount{0};

/// Stats for stale profile matching:
/// the total number of basic blocks in the profile
uint32_t NumStaleBlocks{0};
4 changes: 0 additions & 4 deletions bolt/include/bolt/Core/BinaryFunction.h
@@ -835,10 +835,6 @@ class BinaryFunction {
/// them.
void calculateLoopInfo();

/// Calculate missed macro-fusion opportunities and update BinaryContext
/// stats.
void calculateMacroOpFusionStats();

/// Returns if BinaryDominatorTree has been constructed for this function.
bool hasDomTree() const { return BDT != nullptr; }
