RepoJournal
Hugging Face

@huggingface

Transformers, Datasets, and the open AI-model layer

FSDP DISTRIBUTED TRAINING AND PARAKEET ASR LAND IN TRANSFORMERS

By RepoJournal · Filed · About Hugging Face

Transformers shipped native FSDP and tensor parallelism support with distributed save/load in one release, while the kernels team hardened security auditing across its repos and added build-time validation of operation registrations.

The transformers FSDP integration [1] adds fully sharded data parallel (FSDP) training with auto/manual mode detection, shard-on-read loading via DtensorShardOperation, and FSDP-aware flash attention checks, giving teams a native foundation for distributed training. Parakeet ASR models [2] now support Token-and-Duration Transducer (TDT) decoding with per-token timestamps and full AutoModel integration, extending the models beyond CTC-only decoding and bringing them closer to production ASR pipelines. PP-OCRv6 [3] support adds four new model variants for document recognition, with updated backbones and dedicated image processors.

Across kernels and kernels-community, the team locked down security workflows [4] [5] [6] and rolled out operation registration validation [7] to catch misconfigured custom ops at build time. The flash attention variable-length backward pass now works on XPU [8], closing a gap for Intel accelerator users.

In trl, AsyncGRPOTrainer [9] now handles models that use final logits softcapping, such as Gemma 2, fixing a trainer compatibility issue.
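For readers unfamiliar with final logits softcapping, the operation itself is small: each logit is squashed into the open interval (-cap, cap) with `cap * tanh(logit / cap)` before the softmax. The sketch below is a plain-Python illustration of that formula only. It is not the code from [9], and the helper names `softcap` and `log_softmax` are hypothetical. The default of 30.0 is assumed from Gemma 2's published `final_logit_softcapping` setting. A trainer that computes log-probabilities itself would presumably need to apply the same cap as the model's forward pass, which is likely where the compatibility issue came from.

```python
import math


def softcap(logits, cap=30.0):
    """Squash each logit smoothly into (-cap, cap) via cap * tanh(x / cap)."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    return [cap * math.tanh(x / cap) for x in logits]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


# Hypothetical raw logits for a three-token vocabulary.
raw = [100.0, 1.0, -50.0]
capped = softcap(raw)

# Log-probabilities differ depending on whether the cap is applied,
# so both sides of a comparison must use the same convention.
logp_uncapped = log_softmax(raw)
logp_capped = log_softmax(capped)
```

Small logits pass through almost unchanged, while large ones saturate near the cap, so the ordering of tokens is preserved but the resulting distribution is flatter than the uncapped one.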

References

  [1] FSDP + TP & native save/load distributed (#45028), huggingface/transformers
  [2] Parakeet tdt (#44171), huggingface/transformers
  [3] [Model] Add PP-OCRv6 Models Support (#45838), huggingface/transformers
  [4] feat: mention maintainers in the slack security auditing. (#567), huggingface/kernels
  [5] fix(security): remediate workflow vulnerability in .github/workflows/security-audit.yml (#884), huggingface/kernels-community
  [6] feat: mention maintainers in the slack security auditing. (#881), huggingface/kernels-community
  [7] nix-builder: add a hook to detect incorrect op registrations, huggingface/kernels
  [8] flash-attn2: Add flash_attn_varlen_func backward support for XPU, huggingface/kernels-community
  [9] Final logits softcapping support for async GRPO Trainer (#5691), huggingface/trl

FAQ

What changed in Hugging Face on May 20, 2026?
Transformers shipped native FSDP and tensor parallelism support with distributed save/load in one release, while the kernels team hardened security auditing across its repos and added build-time validation of operation registrations.
What should Hugging Face teams do about it?
  • Test the FSDP integration in staging if you run distributed training at scale: for large models, FSDP takes the place of DDP.
  • Pin kernels builds and run the security audit workflow on your fork to verify that the remediation took effect [5].
  • Update any custom ops to follow the naming prefix convention documented in the nix-builder hook [7].
Which Hugging Face repositories shipped on May 20, 2026?
huggingface/transformers, huggingface/kernels, huggingface/kernels-community, huggingface/trl
