RepoJournal
Hugging Face

@huggingface

Transformers, Datasets, and the open AI-model layer

The Wire · Showcase

TRANSFORMERS SHIPS ROPE FIXES AND FINE-GRAINED QUANTIZATION WHILE HUB PATCHES PYTHON 3.15 BREAKAGE

By RepoJournal · Filed · About Hugging Face

Transformers landed critical correctness fixes for GLM sparse attention and new Triton-backed fp8/fp4 quantization, while huggingface_hub addressed a hard dependency break on Python 3.15.

The transformers team shipped a fix for interleaved RoPE application in MLA, together with index-cache support for the DSA indexer on GLM5 [1]. The index cache accelerates sparse attention by reusing previous layers' top-k indices instead of recomputing them. Alongside that, fine-grained fp8/fp4 quantization via Triton landed [2], giving you native support for sub-tensor quantization that is compatible with torch.compile. Image processing got faster with native torchvision LANCZOS interpolation replacing the PIL fallback [3], which matters for batch inference throughput.

On the hub side, huggingface_hub fixed a critical import error on Python 3.15, where the private _MISSING_TYPE name disappeared from dataclasses [4] and the import failed immediately at startup. The same release improves auth precedence in Colab environments [5], so user-provided tokens now take priority over Colab's vault token. The CLI's quiet mode now also suppresses hints [6].

Over in xet-core, russh was bumped from 0.60 to 0.61 [7], and the team is working through an sdist release issue in which LICENSE was missing from the root of the tarball [8].
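The Python 3.15 break in [4] came from importing a private name. As a minimal sketch of the general pattern (not the actual huggingface_hub patch, which this brief does not show), code that needs to detect "no default" on a dataclass field can compare against the public `dataclasses.MISSING` sentinel instead. The `Config` class and the `has_default` and `fields_with_defaults` helpers are hypothetical names used only for illustration.

```python
import dataclasses


def has_default(field: dataclasses.Field) -> bool:
    """Return True if the field declares a default or a default_factory.

    Only the public dataclasses.MISSING sentinel is used, so nothing here
    depends on the private dataclasses._MISSING_TYPE class.
    """
    return (
        field.default is not dataclasses.MISSING
        or field.default_factory is not dataclasses.MISSING
    )


# Hypothetical dataclass, present only to exercise the helper.
@dataclasses.dataclass
class Config:
    repo_id: str  # required, no default
    revision: str = "main"  # plain default
    tags: list = dataclasses.field(default_factory=list)  # factory default


def fields_with_defaults(cls) -> list:
    """List the names of dataclass fields that have some kind of default."""
    return [f.name for f in dataclasses.fields(cls) if has_default(f)]
```

Identity checks against the sentinel (`is` / `is not`) avoid any need to name its type, which is the part that lives behind a private identifier.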

References

  [1] Fix: interleaved RoPE application for MLA and Support Index Cache DSA indexer skip-topk sharing for GLM5 (#46372), huggingface/transformers
  [2] Triton finegrained fp8/fp4 (#46407), huggingface/transformers
  [3] Use torchvision's native LANCZOS interpolation instead of PIL fallback (#46496), huggingface/transformers
  [4] [Fix] Remove private _MISSING_TYPE import from dataclasses module (#4322), huggingface/huggingface_hub
  [5] [Auth] Take google colab token from env first, huggingface/huggingface_hub
  [6] [CLI] Suppress hints in quiet output mode, huggingface/huggingface_hub
  [7] chore: bump russh from 0.60 to 0.61, huggingface/xet-core
  [8] Try to fix sdist release due to LICENSE missing from tarball root directory (#867), huggingface/xet-core

FAQ

What changed in Hugging Face on June 8, 2026?
Transformers landed critical correctness fixes for GLM sparse attention and new Triton-backed fp8/fp4 quantization, while huggingface_hub addressed a hard dependency break on Python 3.15.
What should Hugging Face teams do about it?
  • If you support Python 3.15, update huggingface_hub immediately to avoid import failures on startup.
  • Test the transformers upgrade if you use GLM5 or sparse attention workflows, to validate RoPE behavior.
  • Evaluate the new Triton fp8/fp4 quantization for inference performance gains in your pipelines.
Which Hugging Face repositories shipped on June 8, 2026?
huggingface/transformers, huggingface/huggingface_hub, huggingface/xet-core
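
For teams validating RoPE behavior after the transformers upgrade, it may help to see why the pairing layout matters. The sketch below is generic rotary position embedding in plain Python, not GLM5's or transformers' actual code: it assumes the common base of 10000 and compares an interleaved layout (adjacent dimensions are paired) with a half-split layout (dimension i is paired with i + dim/2). All function names are hypothetical.

```python
import math


def rope_angles(position: int, dim: int, base: float = 10000.0) -> list:
    """One rotation angle per dimension pair: position * base^(-2i/dim)."""
    return [position * base ** (-2.0 * i / dim) for i in range(dim // 2)]


def rope_interleaved(x: list, position: int) -> list:
    """Rotate adjacent pairs (x[0], x[1]), (x[2], x[3]), ..."""
    out = list(x)
    for i, theta in enumerate(rope_angles(position, len(x))):
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * math.cos(theta) - b * math.sin(theta)
        out[2 * i + 1] = a * math.sin(theta) + b * math.cos(theta)
    return out


def rope_half_split(x: list, position: int) -> list:
    """Rotate pairs split across halves: (x[i], x[i + dim // 2])."""
    half = len(x) // 2
    out = list(x)
    for i, theta in enumerate(rope_angles(position, len(x))):
        a, b = x[i], x[i + half]
        out[i] = a * math.cos(theta) - b * math.sin(theta)
        out[i + half] = a * math.sin(theta) + b * math.cos(theta)
    return out


def max_abs_diff(u: list, v: list) -> float:
    """Largest elementwise absolute difference between two vectors."""
    return max(abs(p - q) for p, q in zip(u, v))


def squared_norm(u: list) -> float:
    """Sum of squares; a pure rotation leaves this unchanged."""
    return sum(p * p for p in u)
```

Both layouts are valid rotations and preserve vector norms, but they produce different outputs for the same input and position. Weights trained with one layout therefore give wrong attention scores when run with the other, which is the kind of mismatch a regression check on model outputs should catch.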
