RepoJournal
PyTorch

@pytorch

PyTorch and the broader machine-learning ecosystem

CUDA BACKWARD KERNEL BUG FIXED AS EXECUTORCH EXPANDS DYNAMIC SHAPE SUPPORT

By RepoJournal · Filed · About PyTorch

PyTorch fixed the avg_pool2d CUDA backward kernel, which computed wrong window coordinates for channels_last inputs with padding, while ExecuTorch landed SymInt arithmetic ops that enable dynamic shape inference on WebGPU.

The avg_pool2d_backward_out_cuda_frame_nhwc kernel recovered input coordinates from the flat index without accounting for padding, so its window formulas operated on unpadded indices [1]. Any channels_last workflow that uses pooling with padding on CUDA was affected. Separately, PyTorch merged a scalar bias gradient fix [2] that had to wait for upstream buffer gradient changes to land first. It unblocks a FlexAttention learnable-scalar pattern whose existing workarounds were too clunky for production.

On the build side, print_sccache_stats now warns when sccache is optional but missing and fails loudly when it is expected but missing [3]. This addresses a silent failure on Linux CI, where engineers could not tell whether the cache layer was actually present.

ExecuTorch shipped SymInt arithmetic operations (add, sub, mul, floordiv) for dynamic shapes on WebGPU [6], letting mobile models express variable dimensions in compiled graphs without recompilation.

Infrastructure improvements landed in both repos. PyTorch hardened its CI with absolute action references for cross-repo compatibility [4] and extended the Scalar(long long) constructor guard to NetBSD and other LP64 BSDs, whose builds were failing [5]. ExecuTorch also cached SwiftShader prebuilts to skip the per-run from-source build [7], cutting CI overhead for graphics-heavy test suites.
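To make the avg_pool2d padding issue concrete, here is a simplified one-dimensional sketch in plain Python. It is not the CUDA kernel's code: the function names and the `use_padding` switch are hypothetical, introduced only for illustration. It shows why a backward pass must shift an input index into padded coordinates before asking which output windows cover it, and it checks the closed-form range against a brute-force scan.

```python
def windows_covering(i, kernel, stride, pad, out_len, use_padding=True):
    """Output indices whose pooling window contains input index i (1D).

    Hypothetical helper for illustration. With use_padding=False the
    padding offset is skipped, mimicking the class of bug described above.
    """
    p = i + pad if use_padding else i  # position in padded coordinates
    # First window that can still reach p, clamped at 0.
    start = 0 if p < kernel else (p - kernel) // stride + 1
    # One past the last window that starts at or before p.
    end = min(p // stride + 1, out_len)
    return list(range(start, end))


def windows_covering_brute(i, kernel, stride, pad, out_len):
    """Reference answer: test every output window explicitly."""
    covering = []
    for o in range(out_len):
        lo = o * stride - pad  # window start in unpadded input coordinates
        if lo <= i < lo + kernel:
            covering.append(o)
    return covering


if __name__ == "__main__":
    in_len, kernel, stride, pad = 4, 3, 1, 1
    out_len = (in_len + 2 * pad - kernel) // stride + 1
    for i in range(in_len):
        print(
            i,
            windows_covering(i, kernel, stride, pad, out_len),
            windows_covering(i, kernel, stride, pad, out_len, use_padding=False),
        )
```

With the padding offset applied, the closed-form range agrees with the brute-force scan for every input index. Without it, some input positions are matched to the wrong set of windows, so gradients would be accumulated from the wrong outputs.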

Action items

References

  [1] Fix avg_pool2d CUDA backward for channels_last inputs with padding (#188345) pytorch/pytorch
  [2] Fix from 20260702-pytorch-adhoc-6f35e0 (#188869) pytorch/pytorch
  [3] print_sccache_stats: warn if sccache optional, fail if expected (#188920) pytorch/pytorch
  [4] [CI] Use absolute action references in teardown-xpu for cross-repo compatibility (#188769) pytorch/pytorch
  [5] Add Scalar(long long) constructor guard for NetBSD and other LP64 BSDs (#188941) pytorch/pytorch
  [6] [ExecuTorch][WebGPU] SymInt arithmetic ops (add/sub/mul/floordiv) for dynamic shapes (#20712) pytorch/executorch
  [7] CI: cache SwiftShader prebuilt to skip per-run from-source build (#20203) pytorch/executorch

FAQ

What changed in PyTorch on July 5, 2026?
PyTorch fixed the avg_pool2d CUDA backward kernel, which computed wrong window coordinates for channels_last inputs with padding, while ExecuTorch landed SymInt arithmetic ops that enable dynamic shape inference on WebGPU.
What should PyTorch teams do about it?
  • If you use channels_last avg_pool2d with padding on CUDA, upgrade immediately to pick up the coordinate fix.
  • Review the sccache configuration on your Linux CI systems; the new warnings will surface gaps in cache setup.
  • If you ship dynamic shapes on WebGPU, integrate the new SymInt arithmetic ops into your model export pipeline.
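For teams reviewing their own sccache setup, the warn-versus-fail policy described for print_sccache_stats can be sketched as a small preflight check. This is an illustrative sketch only, not the code from the PyTorch change: the function name `check_sccache`, its `expected` flag, and the injectable `which` parameter are all hypothetical.

```python
import shutil
import sys


def check_sccache(expected, which=shutil.which):
    """Return a process exit code for a CI preflight step.

    Hypothetical policy sketch: a missing sccache is a warning when the
    cache is optional and an error when the job expects it.
    """
    path = which("sccache")
    if path is not None:
        print(f"sccache found at {path}")
        return 0
    if expected:
        # Fail loudly so the job cannot silently run uncached.
        print("ERROR: sccache is expected but was not found on PATH", file=sys.stderr)
        return 1
    print("WARNING: sccache not found; the build will run uncached", file=sys.stderr)
    return 0


if __name__ == "__main__":
    # Assumption: the CI job passes "--expect-sccache" when caching is mandatory.
    sys.exit(check_sccache(expected="--expect-sccache" in sys.argv[1:]))
```

Passing the lookup function as a parameter keeps the policy testable without touching the real PATH.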
Which PyTorch repositories shipped on July 5, 2026?
pytorch/pytorch, pytorch/executorch
