4 changes: 4 additions & 0 deletions .editorconfig
@@ -7,3 +7,7 @@ charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true

[*.md]
indent_style = space
indent_size = 4
2 changes: 1 addition & 1 deletion docs/Index.md
@@ -116,4 +116,4 @@ The Triton file implements a similar fused attention kernel entirely in Python u
By using PyTorch’s distributed package in the same script, the Triton implementation shows how to scale the fused attention kernel across multiple GPUs with minimal boilerplate code. This gives researchers an accessible route to experiment with and iterate on advanced GPU kernels without delving into low-level CUDA programming.


In short, this project re-implements a sophisticated, high-performance attention mechanism in a more maintainable, experiment-friendly environment, providing a research prototype that can serve as the basis for future production-grade attention mechanisms.
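
As a rough illustration of the multi-GPU usage described in the doc above, the sketch below shows the minimal torch.distributed boilerplate involved: one process per GPU launched with torchrun, a per-rank attention call, and a single collective. The function name, tensor shapes, and the plain-PyTorch attention stand-in are assumptions for the sketch only, not the repository's actual Triton kernel or API.

```python
# Minimal sketch (not from this PR): running an attention kernel under
# torch.distributed with one process per GPU. Launch with e.g.
#   torchrun --nproc_per_node=4 attention_demo.py
import os

import torch
import torch.distributed as dist


def scaled_dot_product_attention(q, k, v):
    # Stand-in for the fused Triton kernel: plain PyTorch attention so the
    # sketch stays runnable without Triton installed.
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v


def main():
    # torchrun sets RANK, WORLD_SIZE and LOCAL_RANK for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each rank works on its own shard of the batch (shapes are illustrative).
    q = torch.randn(2, 8, 128, 64, device="cuda")
    k = torch.randn_like(q)
    v = torch.randn_like(q)

    out = scaled_dot_product_attention(q, k, v)

    # One collective call to combine per-rank results, just to show the pattern.
    dist.all_reduce(out, op=dist.ReduceOp.SUM)

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

The point of the sketch is that the distributed plumbing is a handful of lines around an otherwise unchanged kernel call, which is what makes the single-script approach convenient for experimentation.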