Achieve peak performance on x86 CPUs and NVIDIA GPUs
Topics: performance, cpu, gpu, assembly, cuda, avx, nvidia, intrinsics, microarchitecture, cpu-frequency, microbenchmark, cpu-microarchitecture, gflop
Updated Apr 5, 2026 - C++