# avx2

Here are 227 public repositories matching this topic...

h1me01 / SIMD

Implementations of the same operations in regular (scalar) CPU code, SSE, and AVX. I created this project to test whether the AVX and SSE code for my neural network works correctly and to compare its performance with regular CPU code. The focus is on operations like the dot product, the Adam optimizer, and gradient updates.

sse simd avx2 adam-optimizer dot-product

Updated Jun 30, 2024
C++

rsonquery / rsonpath

Blazing fast JSONPath query engine written in Rust.

search rust cli json query simd avx2 command-line-tool jsonpath rq

Updated Jun 30, 2024
Rust

bgin / Radar-ElectroOptical-Simulation

(REOS) Radar and Electro-Optical Simulation Framework written in C++.

simulation radar avx radiative-transfer high-performance-computing gpu-acceleration avx2 cuda-kernels modelling vectorization avx512 simd-instructions control-theory fortran90 amd-gpu infrared-sensors atmosphere-model radar-signal-processing

Updated Jun 30, 2024
C++

axze-az / cftal

A template-based C++ short vector library with vectorized, faithfully rounded elementary functions.

avx2 math-library intrinsic-functions sollya expression-templates short-vector vectorized-elementary-functions cxx20-library

Updated Jun 30, 2024
C++

jfalcou / eve

Expressive Vector Engine - SIMD in C++ Goes Brrrr

cpp hpc neon avx simd avx2 sse2 simd-programming cpp-library aarch64 simd-parallelism altivec ssse3 simd-library

Updated Jun 30, 2024
C++

cloudflare / sliceslice-rs

A fast implementation of single-pattern substring search using SIMD acceleration.

search-in-text simd avx2 simd-programming text-processing simd-instructions substring-search

Updated Jun 29, 2024
Rust

andyD123 / DR3

DR3 enables users to write vectorised code using generic lambdas and filters. Switch instruction sets just by changing the enclosing namespace.

gcc intel clang simd avx2 filters simd-programming avx512 lambdas simd-library simd-intrinsics

Updated Jun 29, 2024
C++

kimwalisch / libpopcnt

🚀 Fast C/C++ bit population count library

c cpp neon simd avx2 avx512 popcnt popcount sve

Updated Jun 29, 2024
C

google / highway

Performance-portable, length-agnostic SIMD with runtime dispatch

neon wasm avx simd intrinsics avx2 simd-programming avx512 simd-parallelism simd-instructions simd-library sse42 avx-instructions simd-intrinsics avx-512

Updated Jun 28, 2024
C++

bgin / Radar_ElectroOptical_Simulation

(REOS) Radar and ElectroOptical Simulation Framework written in Fortran.

simulation modeling radar c99 openmp avx radiative-transfer simd high-performance-computing gpu-acceleration avx2 cuda-kernels control-systems vectorization amdgpu fortran90 infrared-sensors avx-512

Updated Jun 28, 2024
Fortran

OpenNMT / CTranslate2

Fast inference engine for Transformer models

Updated Jun 28, 2024
C++

RoaringBitmap / CRoaring

Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks

c bitset arm visual-studio roaring-bitmaps neon gcc clang avx2 bitset-library avx-512

Updated Jun 27, 2024
C

simdjson / simdjson

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

c-plus-plus json arm neon x64 clang cpp11 simd json-parser avx2 json-pointer arm64 aarch64 avx512 gcc-compiler sse42 vs2019 clang-cl loongarch

Updated Jun 27, 2024
C++

simd-everywhere / simde

Implementations of SIMD instruction sets for systems which don't natively support them.

Updated Jun 27, 2024
C

libxsmm / libxsmm

Library for specialized dense and sparse matrix operations, and deep learning primitives.

machine-learning fortran vector matrix intel avx sse jit simd matrix-multiplication sparse blas convolution avx2 amx tensor avx512 transpose bfloat16

Updated Jun 26, 2024
C

minio / highwayhash

Native Go version of HighwayHash with optimized assembly implementations on Intel and ARM. Able to process over 10 GB/sec on a single core on Intel CPUs - https://en.wikipedia.org/wiki/HighwayHash

neon assembly hash-functions plan9 avx2 highway-hash

Updated Jun 26, 2024
Go

simdutf / simdutf

Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.

unicode cpp transcoding neon simd avx2 sse2 utf8 risc-v utf16 avx-512

Updated Jun 25, 2024
C++

EricGrange / DWScript

Delphi Web Script general purpose scripting engine

javascript windows programming-language components delphi pascal json script webserver jit transpiler sqlite3 avx2 pascal-compiler pascal-language

Updated Jun 24, 2024
Pascal

JohT / convolution-benchmarks

Benchmark convolution implementations in C++ with Catch2 visualized with Vega-Lite

testing charts benchmark performance neon avx simd vega-lite convolution avx2 sse2 avx512

Updated Jun 24, 2024
C++

Daniel-Liu-c0deb0t / block-aligner

SIMD-accelerated library for computing global and X-drop affine gap penalty sequence-to-sequence or sequence-to-profile alignments using an adaptive block-based algorithm.

rust bioinformatics algorithms neon webassembly wasm simd alignment avx2

Updated Jun 23, 2024
Jupyter Notebook
