History
v0.7.2
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5, 2.6, 2.7, 2.8 | 12.8, 12.9 |
Manylinux 2_24 x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.6, 2.7, 2.8 | 12.8, 12.9 |
Manylinux2014 x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5 | 12.8 |
v0.7.0
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.9 | 12.8, 13.0 |
v0.6.9
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3 | 3.14 | 2.9 | 13.0 |
v0.6.4
Release
Linux arm64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.9 | 12.4, 12.8, 13.0 |
v0.6.3
Release
Linux arm64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.9 | 12.4, 12.8, 13.0 |
v0.5.4
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8, 2.9 | 12.4, 12.6, 12.8, 13.0 |
v0.4.22
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.1 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 12.8, 13.0 |
v0.4.18
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 13.0 |
v0.4.17
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 12.6, 12.8 |
v0.4.16
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.3 | 3.9 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.6 |
v0.4.15
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.11, 3.12, 3.13 | 2.9 | 12.6, 12.8 |
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.11, 3.12, 3.13 | 2.9 | 12.6 |
v0.4.12
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.13 | 2.6, 2.7, 2.8 | 12.4, 12.6, 12.8, 12.9 |
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.2 | 3.13 | 2.6, 2.7, 2.8 | 12.4, 12.6 |
v0.4.11
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.6, 12.8, 12.9 |
v0.4.10
Release
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4, 2.8.2 | 3.10, 3.11, 3.12 | 2.7, 2.8 | 12.8 |
v0.4.9
Release
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4 | 3.11 | 2.7 | 12.8 |
v0.3.18
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.8, 12.9 |
v0.3.14
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.8.2 | 3.10, 3.11, 3.12 | 2.5.1, 2.6.0, 2.7.1, 2.8.0 | 12.4.1, 12.8.1, 12.9.1 |
v0.3.13
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.1 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.1 | 12.8.1 |
v0.3.12
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.0 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.1 | 12.4.1, 12.8.1 |
v0.3.10
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.4 | 3.10, 3.11, 3.12 | 2.7.1 | 12.8.1 |
v0.3.9
Release
Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.7.1 | 12.8.1 |
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0 | 12.4.1 |
> [!IMPORTANT]
> ⚠️ Building flash-attn v2.7.4 with CUDA 12.8 on Windows cannot be completed because of GitHub Actions' processing-time limits. In the future, I plan to add a self-hosted Windows runner to resolve this issue.
v0.3.1
Release
Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3 | 3.11 | 2.6.0 | 12.6.3 |
Starting with this version, wheels for Windows are released.
However, they have not been tested thoroughly yet, so reports on whether they work are welcome.
v0.2.1
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.8.0.dev20250523 | 12.8.1 |
v0.2.0
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.8.0.dev20250523 | 12.8.1 |
v0.1.0
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.7.0 | 12.8.1 |
v2.7.4 and v2.7.4.post1 are the same version.
Starting with this release, self-hosted runners are used to build some of the wheels.
v0.0.9
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.7.0 | 12.8.1 |
v0.0.8
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.0 | 11.8.0, 12.4.1, 12.6.3 |
v0.0.7
Skipped for experimental reasons.
v0.0.6
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.6.0 | 12.4.1, 12.6.3 |
v0.0.5
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.6.0 | 12.4.1, 12.6.3 |
v0.0.4
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.3 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |
v0.0.3
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.7.2.post1 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |
v0.0.2
Release
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.6, 2.6.3, 2.7.0.post2 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |
v0.0.1
Release
| flash-attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 1.0.9, 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |
v0.0.0
Release
| flash-attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |