CUDA’s Fine-Grained Multithreading: Why GPUs are SPMD, not True SIMD

CUDA
Author: Imad Dabbura

Published: March 20, 2025

CUDA uses a Single Program, Multiple Data (SPMD) programming model, not Single Instruction, Multiple Data (SIMD). Threads within a GPU warp do execute in a SIMD-like fashion, with the same instruction issued to all of them at once, but the programming model is SPMD: a kernel is written as a scalar program for a single thread, and each thread has its own index and registers and can branch independently of its neighbors.

In essence, CUDA’s SPMD model allows programmers to write a single kernel program that is executed by many threads, with the flexibility for those threads to have unique execution paths, while the underlying hardware uses SIMD-like execution for performance within a warp.
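To make this concrete, here is a minimal sketch of a kernel in which threads of the same warp take different branches. The names `branch_by_parity` and `run_branch_by_parity` are hypothetical, chosen only for this illustration, and the block size of 32 assumes the warp size of current NVIDIA GPUs. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread runs this same program (SPMD),
// but even- and odd-indexed threads follow different branches.
__global__ void branch_by_parity(const int* in, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // unique per-thread index
    if (i >= n) return;                             // threads past the end exit early
    if (i % 2 == 0) {
        out[i] = in[i] * 2;    // path taken by even-indexed threads
    } else {
        out[i] = in[i] + 100;  // path taken by odd-indexed threads
    }
}

// Hypothetical host helper: runs the kernel on inputs 0..n-1 and returns the output.
std::vector<int> run_branch_by_parity(int n) {
    std::vector<int> h_in(n), h_out(n);
    for (int i = 0; i < n; ++i) h_in[i] = i;

    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    int threads = 32;  // assumed warp size, so each block is a single warp
    int blocks = (n + threads - 1) / threads;
    branch_by_parity<<<blocks, threads>>>(d_in, d_out, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}

int main() {
    std::vector<int> out = run_branch_by_parity(8);
    for (int v : out) printf("%d ", v);
    printf("\n");
    return 0;
}
```

Compiled with `nvcc` and run on a machine with a working CUDA device, this should print `0 101 4 103 8 105 12 107`. The source contains no vector instructions and no explicit masks: the `if`/`else` is ordinary scalar code. When threads of one warp disagree on a branch, the hardware runs each path with only the matching threads active, so the result is correct but the two paths within a warp are not executed at the same time.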