RepoJournal
PyTorch

@pytorch

PyTorch and the broader machine-learning ecosystem

FLEX_ATTENTION TIGHTENS, VMAP GAINS VECTORIZATION, MACOS CLEARS WARNINGS

By RepoJournal

PyTorch tightened flex_attention input validation and added the missing vmap batching rule for count_nonzero, while macOS builds now suppress the cascade of deprecated-declaration warnings.

The flex_attention operator now explicitly rejects NestedTensor inputs [1] instead of letting them fall through to compiler errors, fixing #177377 and adding a regression test on the compiled fullgraph path. In parallel, count_nonzero gained its missing batching rule [2], which eliminates the performance warning and enables vectorized execution under torch.vmap. The macOS backend now suppresses the flood of -Wdeprecated-declarations warnings from Apple framework includes [3] by wrapping the Foundation, Metal, MPS, and MPSGraph headers, clearing the noise introduced by recent SDK upgrades.

On the infrastructure side, inductor.yml migrated to OSDC using the dial-up pattern [4], plumbing ARC inputs through the CUDA and CPU build/test pairs, and _FastCudaLauncher now silently handles oversized kernels [5] instead of raising an unexpected ValueError.

In Helion, the autotuner's LLM search now fails loudly on errors [6], and the cache backends are wired to RemoteAutotuneCache for warm-start enrichment [7]. Helion also added H100 (sm90) pretuned heuristics [8]. The LLM search additionally gained an Opus 4.6/4.7 fast mode [9] and an effort_level knob spanning none/low/medium/high/max [10], with support for Anthropic adaptive thinking.

TorchTitan pinned GITHUB_TOKEN to contents: read [11] in response to CVE-2025-30066, and skipped its dense numerics tests [12] because of an upstream DTensor regression in mixed-dtype sharding propagation.
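To make the count_nonzero change concrete, here is a minimal sketch of calling it under torch.vmap. The helper name count_nonzero_per_sample and the sample data are illustrative assumptions, not taken from the PR. On builds without the new batching rule the same call should still run through vmap's slower fallback path, possibly with a performance warning.

```python
import torch


def count_nonzero_per_sample(batch: torch.Tensor) -> torch.Tensor:
    """Count nonzero entries in each sample of a batch (hypothetical helper)."""
    # torch.vmap maps torch.count_nonzero over dim 0, so each logical call
    # sees a single 1-D sample and returns a scalar count.
    return torch.vmap(torch.count_nonzero)(batch)


# Illustrative data: three samples with 2, 1, and 0 nonzero entries.
batch = torch.tensor([[0, 1, 2], [0, 0, 3], [0, 0, 0]])
counts = count_nonzero_per_sample(batch)

# Reference result computed with an explicit Python loop over samples.
expected = [int(torch.count_nonzero(row)) for row in batch]

print(counts.tolist(), expected)
```

The vmapped call and the explicit loop should agree; the practical difference is that a dedicated batching rule lets vmap handle the whole batch in one vectorized operation rather than looping per sample.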

References

  [1] Reject NestedTensor inputs in flex_attention (#183516), pytorch/pytorch
  [2] Add batching rule for count_nonzero (#183860), pytorch/pytorch
  [3] [BE][MacOS] Suppress deprecated declarations warnings (#183927), pytorch/pytorch
  [4] [OSDC] Migrate inductor.yml to OSDC (ARC) via dial-up pattern (#183646), pytorch/pytorch
  [5] [inductor] Silence _FastCudaLauncher ValueError on oversized kernels (#183967), pytorch/pytorch
  [6] [Autotuner] LLM search: fail loudly + mTLS gateway compatibility (#2448), pytorch/helion
  [7] [cache] Wire from_best_available / from_cache to RemoteCacheBackend, pytorch/helion
  [8] Add H100 (sm90) pretuned heuristics and perf gates, pytorch/helion
  [9] [Autotuner] LLM search: Anthropic Opus 4.6/4.7 fast mode, pytorch/helion
  [10] [Autotuner] LLM search: effort_level knob + Anthropic adaptive thinking + OpenAI xhigh, pytorch/helion
  [11] ci: declare workflow-level `contents: read` on 2 workflows (#3367), pytorch/torchtitan
  [12] [graph_trainer] Skip dense numerics tests due to upstream DTensor regression, pytorch/torchtitan

FAQ

What changed in PyTorch on May 17, 2026?
PyTorch tightened flex_attention validation and shipped the missing vmap rule for count_nonzero, while macOS builds finally silence the deprecated declarations cascade.
What should PyTorch teams do about it?
  • Merge the flex_attention validation fix to unblock compiled fullgraph callers
  • Update count_nonzero vmap tests and remove xfail markers
  • Apply the macOS deprecated-declarations warning suppression to your MPS builds
Which PyTorch repositories shipped on May 17, 2026?
pytorch/pytorch, pytorch/helion, pytorch/torchtitan
