The Wire · Showcase
FLASH-ATTN3 TARGETS TORCH 2.9 AS UPLOAD TOOLING GETS OVERHAUL
By RepoJournal · Filed · About Hugging Face
Flash-attn3 is moving to Torch 2.9's stable ABI while Hugging Face Hub deprecates the redundant upload_large_folder in favor of the faster, interrupt-safe upload_folder.
The kernels team is locking flash-attn3 onto Torch 2.9's stable ABI [1], a move that standardizes kernel compatibility and trials CUDA 12.6 builds on an experimental basis. Meanwhile, the Hub team is sunsetting upload_large_folder [2] across both the API and the CLI, now that upload_folder handles interrupted uploads and uses Xet. Callers of the old path will see prominent deprecation warnings that spell out the equivalent hf upload command [2].
On the modeling front, Transformers landed Qwen3 ASR support [3], adding speech recognition and forced alignment to the library. For continuous batching, Transformers also raised max_batch_tokens through a redesigned cache estimator [4] that bounds the value by available VRAM and defaults to 8192 tokens per batch, which allows larger prefill operations.
Diffusers fixed DreamLite model loading [5] by normalizing legacy block type aliases, so carlofkl/DreamLite-base now loads without failure. The diffusers team also bumped safetensors to 0.8.0 [6].
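To make the "default capped by a VRAM bound" idea concrete, here is a minimal sketch. It is not the estimator from [4]: the function names, the 0.9 cache fraction, and the choice to size the bound from per-token KV cache bytes are all assumptions made for illustration. Only the 8192-token default comes from the report above.

```python
def kv_bytes_per_token(num_layers: int, num_kv_heads: int, head_dim: int,
                       dtype_bytes: int = 2) -> int:
    """Bytes of KV cache one token occupies across all layers.

    The factor 2 accounts for storing both a key and a value tensor.
    """
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes


def bounded_max_batch_tokens(free_vram_bytes: int, bytes_per_token: int,
                             default_tokens: int = 8192,
                             cache_fraction: float = 0.9) -> int:
    """Cap a default token budget by a memory-derived bound.

    Hypothetical helper: `cache_fraction` is an assumed safety margin,
    not a value taken from the transformers change.
    """
    if bytes_per_token <= 0:
        raise ValueError("bytes_per_token must be positive")
    budget = int(free_vram_bytes * cache_fraction)
    vram_bound = budget // bytes_per_token
    # Never return less than one token, never more than the default.
    return max(1, min(default_tokens, vram_bound))


if __name__ == "__main__":
    # Illustrative model shape (assumed): 32 layers, 8 KV heads, head_dim 128, fp16.
    per_token = kv_bytes_per_token(32, 8, 128, 2)
    print(bounded_max_batch_tokens(24 * 1024**3, per_token))   # roomy GPU
    print(bounded_max_batch_tokens(512 * 1024**2, per_token))  # tight budget
```

With plenty of free memory the default wins; with a tight budget the memory bound takes over, which is the behavior the summary attributes to the redesigned estimator.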
Action items
- → Test flash-attn3 Torch 2.9 builds in your kernel CI pipeline (huggingface/kernels-community) [plan]
- → Migrate from upload_large_folder to hf upload or upload_folder before the deprecation window closes (huggingface/huggingface_hub) [plan]
- → Verify that DreamLite model loading works in your diffusers workflows (huggingface/diffusers) [monitor]
- → Trial Qwen3 ASR if you ship speech models in production (huggingface/transformers) [monitor]
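For the upload migration, a minimal sketch of routing a folder upload through upload_folder might look like the following. The wrapper name and the injectable `uploader` parameter are hypothetical and exist only so the call can be exercised without network access; `HfApi.upload_folder` with `folder_path`, `repo_id`, and `repo_type` keyword arguments is the existing huggingface_hub API. The deprecation warning described in [2] names the exact CLI equivalent, so treat that message as authoritative over this sketch.

```python
from pathlib import Path
from typing import Any, Callable, Optional


def upload_with_upload_folder(folder_path: str, repo_id: str,
                              repo_type: str = "model",
                              uploader: Optional[Callable[..., Any]] = None) -> Any:
    """Upload a local folder via upload_folder instead of upload_large_folder.

    `uploader` is a hypothetical injection point for tests; when omitted,
    the real HfApi().upload_folder is used (requires huggingface_hub and
    a logged-in token).
    """
    folder = Path(folder_path)
    if not folder.is_dir():
        raise NotADirectoryError(f"not a directory: {folder}")
    if uploader is None:
        # Imported lazily so the sketch can be read and tested offline.
        from huggingface_hub import HfApi
        uploader = HfApi().upload_folder
    return uploader(folder_path=str(folder), repo_id=repo_id, repo_type=repo_type)
```

A call such as `upload_with_upload_folder("./checkpoints", "your-org/your-model")` (placeholder names) would replace a former upload_large_folder call for the same folder and repo.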
References
- [1] flash-attn3: target Torch 2.9 stable ABI (huggingface/kernels-community)
- [2] [Upload] Deprecate upload_large_folder (API + CLI) (huggingface/huggingface_hub)
- [3] Qwen3 ASR and Forced Aligner (huggingface/transformers)
- [4] [CB] Changes to increase max_batch_tokens (huggingface/transformers)
- [5] Fix DreamLite legacy block type aliases (huggingface/diffusers)
- [6] feat: bump safetensors to 0.8.0 (huggingface/diffusers)
FAQ
- What changed in Hugging Face on June 27, 2026?
- Flash-attn3 is moving to Torch 2.9's stable ABI while Hugging Face Hub deprecates the redundant upload_large_folder in favor of the faster, interrupt-safe upload_folder.
- What should Hugging Face teams do about it?
- Test flash-attn3 Torch 2.9 builds in your kernel CI pipeline • Migrate from upload_large_folder to hf upload or upload_folder before the deprecation window closes • Verify that DreamLite model loading works in your diffusers workflows
- Which Hugging Face repositories shipped on June 27, 2026?
- huggingface/kernels-community, huggingface/huggingface_hub, huggingface/transformers, huggingface/diffusers