ardfork

ComfyUI-flash-attention-triton Public

A ComfyUI node that lets you select the Flash Attention Triton implementation as the sampling attention.

Python 2
exllama Public

Forked from turboderp/exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.

Python 1
text-generation-webui Public

Forked from oobabooga/text-generation-webui

A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.

Python
whisper.cpp Public

Forked from ggml-org/whisper.cpp

Port of OpenAI's Whisper model in C/C++

C
exllamav2 Public

Forked from turboderp-org/exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

Python
llama.cpp Public

Forked from ggml-org/llama.cpp

Port of Facebook's LLaMA model in C/C++

C