What should PyTorch teams do about it?

• If using sparse half-precision ops on CUDA, upgrade to pick up the `torch.sparse.sampled_addmm` fixes [ref:5]
• Review the TorchTitan EP overlap scheduler and chunking PRs if building distributed expert-parallel models [ref:9] [ref:10]
• ExecuTorch Windows CI is stable again: Windows unittest jobs should go green with the CPU-only fix [ref:17]

Which PyTorch repositories shipped on June 27, 2026?

pytorch/pytorch, pytorch/torchtitan, pytorch/test-infra, pytorch/executorch

PYTORCH SHIPS METAL ACCELERATORS AND SPARSE DTYPE SUPPORT WHILE TORCHTITAN ADVANCES EXPERT PARALLELISM

By RepoJournal · Filed 06:03 UTC on June 27, 2026 · About PyTorch

PyTorch landed critical MPS kernel migrations and sparse tensor improvements overnight while the Titan team built out the distributed training pipeline for expert-parallel models.

The core team completed Metal Performance Shaders (MPS) implementations for two essential ops: the GLU forward pass [1] now runs 2x faster via MPSGraph instead of TensorIterator, and CTC loss [2] shipped with full forward-pass support optimized for batch parallelism. On the sparse front, `torch.sparse.sampled_addmm` now handles float16 and bfloat16 on CUDA [3], fixing a critical backward-pass gap that broke half-precision sparse matrix multiplication.

Meanwhile, TorchTitan merged the graph trainer's expert parallelism (EP) infrastructure: the EP overlap scheduler [5] and chunking pass [6] enable Inductor to optimize token dispatch across distributed model experts, and a new `DPRequestRouter` [4] centralizes data-parallel routing logic.

Test infrastructure tightened its CI safeguards with an AI-advisor outage guard [7] that won't bail entire PRs on expected broad failures, and the CRCR zombie-workflow cleaner [8] now purges stale cross-repo CI entries from Redis.

ExecuTorch fixed Windows CI by forcing CPU-only builds [10] to avoid CUDA toolkit conflicts, shipped Arm TOSA binary op support [11], and bumped Vela to 5.1.0 [9].
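
For readers unfamiliar with the sparse op mentioned above, here is a minimal pure-Python sketch of what `torch.sparse.sampled_addmm(input, mat1, mat2, beta=..., alpha=...)` computes: the product `mat1 @ mat2` is evaluated only at the positions stored in the sparse CSR `input`, scaled by `alpha`, and added to `beta` times the stored values. The helper name `sampled_addmm_reference` and the list-based CSR layout are assumptions made for illustration; this is not PyTorch's kernel and says nothing about the half-precision changes themselves.

```python
def sampled_addmm_reference(crow_indices, col_indices, values,
                            mat1, mat2, beta=1.0, alpha=1.0):
    """Hypothetical reference for the semantics of sampled_addmm.

    The sparse input is given in CSR form (crow_indices, col_indices, values).
    Returns the new list of stored values:
        beta * values + alpha * (mat1 @ mat2) sampled at the stored positions.
    """
    inner = len(mat2)  # shared dimension of mat1 (m x k) and mat2 (k x n)
    out = []
    for row in range(len(crow_indices) - 1):
        # Walk only the entries that are stored for this row.
        for idx in range(crow_indices[row], crow_indices[row + 1]):
            col = col_indices[idx]
            dot = sum(mat1[row][k] * mat2[k][col] for k in range(inner))
            out.append(beta * values[idx] + alpha * dot)
    return out


# A 2x2 sparsity pattern with stored entries at (0, 0) and (1, 1).
crow = [0, 1, 2]
col = [0, 1]
vals = [1.0, 2.0]
mat1 = [[1.0, 2.0], [3.0, 4.0]]
mat2 = [[5.0, 6.0], [7.0, 8.0]]

result = sampled_addmm_reference(crow, col, vals, mat1, mat2)
print(result)  # [20.0, 52.0]
```

Only two dot products are computed here instead of four, which is the point of sampling: the cost scales with the number of stored entries rather than with the full dense output.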

