# flash-attention-prebuild-wheels

## History

### v0.7.2

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5, 2.6, 2.7, 2.8 | 12.8, 12.9 |

#### Manylinux 2_24 x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.6, 2.7, 2.8 | 12.8, 12.9 |

#### Manylinux2014 x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5 | 12.8 |
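Each row of these tables corresponds to a wheel attached to the matching GitHub Release. As a convenience, the sketch below composes a plausible download URL from one row of the v0.7.2 Linux x86_64 table; the `flash_attn-{FA}+cu{CUDA}torch{PT}-cp{PY}-cp{PY}-linux_x86_64.whl` filename pattern and the `<owner>` placeholder are assumptions, not confirmed by this README, so verify against the actual assets on the Release page.

```python
# A minimal sketch, assuming release assets follow a
# "flash_attn-{FA}+cu{CUDA}torch{PT}-cp{PY}-cp{PY}-linux_x86_64.whl"
# naming pattern. Both the pattern and <owner> are assumptions --
# check the Release page for the exact asset names.

def wheel_url(tag: str, flash_attn: str, cuda: str, torch: str, python: str) -> str:
    cp = "cp" + python.replace(".", "")  # "3.12" -> "cp312"
    cu = "cu" + cuda.replace(".", "")    # "12.8" -> "cu128"
    wheel = f"flash_attn-{flash_attn}+{cu}torch{torch}-{cp}-{cp}-linux_x86_64.whl"
    return (
        "https://github.com/<owner>/flash-attention-prebuild-wheels"
        f"/releases/download/{tag}/{wheel}"
    )

# One row of the v0.7.2 Linux x86_64 table above:
print(wheel_url("v0.7.2", "2.8.3", "12.8", "2.8", "3.12"))
# Then: pip install <printed URL>
```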

### v0.7.0

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.9 | 12.8, 13.0 |

### v0.6.9

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3 | 3.14 | 2.9 | 13.0 |

### v0.6.4

Release

#### Linux arm64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.9 | 12.4, 12.8, 13.0 |

### v0.6.3

Release

#### Linux arm64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.9 | 12.4, 12.8, 13.0 |

### v0.5.4

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8, 2.9 | 12.4, 12.6, 12.8, 13.0 |

### v0.4.22

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.1 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 12.8, 13.0 |

### v0.4.18

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 13.0 |

### v0.4.17

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 12.6, 12.8 |

### v0.4.16

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.3 | 3.9 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.6 |

### v0.4.15

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.11, 3.12, 3.13 | 2.9 | 12.6, 12.8 |

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.11, 3.12, 3.13 | 2.9 | 12.6 |

### v0.4.12

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.13 | 2.6, 2.7, 2.8 | 12.4, 12.6, 12.8, 12.9 |

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.2 | 3.13 | 2.6, 2.7, 2.8 | 12.4, 12.6 |

### v0.4.11

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.6, 12.8, 12.9 |

### v0.4.10

Release

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4, 2.8.2 | 3.10, 3.11, 3.12 | 2.7, 2.8 | 12.8 |

### v0.4.9

Release

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4 | 3.11 | 2.7 | 12.8 |

### v0.3.18

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.8, 12.9 |

### v0.3.14

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.2 | 3.10, 3.11, 3.12 | 2.5.1, 2.6.0, 2.7.1, 2.8.0 | 12.4.1, 12.8.1, 12.9.1 |

### v0.3.13

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.1 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.1 | 12.8.1 |

### v0.3.12

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.0 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.1 | 12.4.1, 12.8.1 |

### v0.3.10

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4 | 3.10, 3.11, 3.12 | 2.7.1 | 12.8.1 |

### v0.3.9

Release

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.7.1 | 12.8.1 |

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0 | 12.4.1 |

> [!IMPORTANT]
> ⚠️ Building flash-attn v2.7.4 with CUDA 12.8 on Windows cannot be completed because of GitHub Actions’ processing-time limits. In the future, I plan to add a self-hosted Windows runner to resolve this issue.

### v0.3.1

Release

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3 | 3.11 | 2.6.0 | 12.6.3 |

Starting with this version, wheels for Windows are released. However, they have not been tested thoroughly, so reports on whether they work are welcome.
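For anyone willing to report back, here is a minimal smoke test for an installed wheel. It assumes a CUDA-capable GPU is available; `flash_attn_func` is flash-attn's public attention entry point.

```python
# Minimal smoke test for an installed flash-attn wheel (requires a CUDA GPU).
import torch
from flash_attn import flash_attn_func

# flash-attn expects (batch, seqlen, num_heads, head_dim) tensors in fp16/bf16.
q, k, v = (
    torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
    for _ in range(3)
)
out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # expected: torch.Size([1, 128, 8, 64])
```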

### v0.2.1

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.8.0.dev20250523 | 12.8.1 |

### v0.2.0

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.8.0.dev20250523 | 12.8.1 |

### v0.1.0

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.7.0 | 12.8.1 |

flash-attn v2.7.4 and v2.7.4.post1 are the same version.

Starting with this release, self-hosted runners are used to build some wheels.

### v0.0.9

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.7.0 | 12.8.1 |

### v0.0.8

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.0 | 11.8.0, 12.4.1, 12.6.3 |

### v0.0.7

Skipped for experimental reasons.

### v0.0.6

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.6.0 | 12.4.1, 12.6.3 |

### v0.0.5

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.6.0 | 12.4.1, 12.6.3 |

### v0.0.4

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.3 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |

### v0.0.3

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.2.post1 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |

### v0.0.2

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.6, 2.6.3, 2.7.0.post2 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |

### v0.0.1

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 1.0.9, 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |

### v0.0.0

Release

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |