History
v0.7.2
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5, 2.6, 2.7, 2.8 | 12.8, 12.9 |
Manylinux 2_24 x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.6, 2.7, 2.8 | 12.8, 12.9 |
Manylinux2014 x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5 | 12.8 |
v0.7.0
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.9 | 12.8, 13.0 |
v0.6.9
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3 | 3.14 | 2.9 | 13.0 |
v0.6.4
Release
Linux arm64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.9 | 12.4, 12.8, 13.0 |
v0.6.3
Release
Linux arm64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.9 | 12.4, 12.8, 13.0 |
v0.5.4
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8, 2.9 | 12.4, 12.6, 12.8, 13.0 |
v0.4.22
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.1 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 12.8, 13.0 |
v0.4.18
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 13.0 |
v0.4.17
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 12.6, 12.8 |
v0.4.16
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.3 | 3.9 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.6 |
v0.4.15
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.11, 3.12, 3.13 | 2.9 | 12.6, 12.8 |
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.11, 3.12, 3.13 | 2.9 | 12.6 |
v0.4.12
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.13 | 2.6, 2.7, 2.8 | 12.4, 12.6, 12.8, 12.9 |
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.2 | 3.13 | 2.6, 2.7, 2.8 | 12.4, 12.6 |
v0.4.11
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.6, 12.8, 12.9 |
v0.4.10
Release
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4, 2.8.2 | 3.10, 3.11, 3.12 | 2.7, 2.8 | 12.8 |
v0.4.9
Release
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4 | 3.11 | 2.7 | 12.8 |
v0.3.18
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.8, 12.9 |
v0.3.14
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.2 | 3.10, 3.11, 3.12 | 2.5.1, 2.6.0, 2.7.1, 2.8.0 | 12.4.1, 12.8.1, 12.9.1 |
v0.3.13
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.1 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.1 | 12.8.1 |
v0.3.12
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.0 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.1 | 12.4.1, 12.8.1 |
v0.3.10
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4 | 3.10, 3.11, 3.12 | 2.7.1 | 12.8.1 |
v0.3.9
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.7.1 | 12.8.1 |
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0 | 12.4.1 |
> [!IMPORTANT]
> ⚠️ Building flash-attn v2.7.4 with CUDA 12.8 on Windows cannot be completed because of GitHub Actions' processing-time limits. In the future, I plan to add a self-hosted Windows runner to resolve this issue.
v0.3.1
Release
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3 | 3.11 | 2.6.0 | 12.6.3 |
Starting with this version, wheels for Windows are released.
However, they have not been tested thoroughly yet, so reports on whether they work are welcome.
v0.2.1
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.8.0.dev20250523 | 12.8.1 |
v0.2.0
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.8.0.dev20250523 | 12.8.1 |
v0.1.0
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.7.0 | 12.8.1 |
v2.7.4 and v2.7.4.post1 are the same version.
Starting with this release, self-hosted runners are used to build some of the wheels.
v0.0.9
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.7.0 | 12.8.1 |
v0.0.8
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.0 | 11.8.0, 12.4.1, 12.6.3 |
v0.0.7
Skipped for experimental reasons.
v0.0.6
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.6.0 | 12.4.1, 12.6.3 |
v0.0.5
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.6.0 | 12.4.1, 12.6.3 |
v0.0.4
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.3 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |
v0.0.3
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.2.post1 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |
v0.0.2
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.6, 2.6.3, 2.7.0.post2 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |
v0.0.1
Release
| flash-attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 1.0.9, 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |
v0.0.0
Release
| flash-attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |