The Wire · Showcase
TRANSFORMERS SHIPS EXPORT STANDARDIZATION, GEMMA ATTENTION FIXES LAND
By RepoJournal · Filed · About Hugging Face
Transformers just locked in broad modeling changes that make ONNX, torch.export, and ExecuTorch export work cleanly across a huge set of models, while Gemma 3/4 gets critical attention masking corrections.
The modeling standardization PR [1] reshapes how models handle compilation and export paths, removing one of the sharpest pain points for production deployments. It lands alongside a targeted fix for Gemma 3/4 attention masking at sliding window boundaries [2], which stops bidirectional attention in local layers from crossing the window boundary, a behavioral drift that would have broken inference consistency. TRL is doing housekeeping of its own: dropping vLLM 0.14 support [3], removing the defunct sft_video_llm.py script [4], and integrating the new response parsing API [5]. Over in robotics, LeRobot shipped a configurable MIT control mode for ReBot [6], letting users switch between position-velocity and torque-based control per joint with tunable stiffness parameters. The physics-intern-skills repo continues internal iteration on Codex integration [7], the workspace bootstrap flow [8], and plugin distribution [9], with documentation catching up to match.
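For readers unfamiliar with the term, "MIT mode" on many motor drivers conventionally refers to a single impedance-style control law that blends a position target, a velocity target, and a feedforward torque. The sketch below illustrates that general convention only. Whether ReBot's implementation in LeRobot matches it exactly is an assumption, and the `MitCommand` class, its field names, and the `mit_torque` function are hypothetical names, not LeRobot's API.

```python
from dataclasses import dataclass


@dataclass
class MitCommand:
    """Hypothetical per-joint command (illustrative names, not LeRobot's API)."""

    q_des: float = 0.0   # target position [rad]
    dq_des: float = 0.0  # target velocity [rad/s]
    kp: float = 0.0      # stiffness gain [N*m/rad]
    kd: float = 0.0      # damping gain [N*m*s/rad]
    tau_ff: float = 0.0  # feedforward torque [N*m]


def mit_torque(cmd: MitCommand, q: float, dq: float) -> float:
    """Conventional MIT-mode law: PD term on position/velocity plus feedforward torque."""
    return cmd.kp * (cmd.q_des - q) + cmd.kd * (cmd.dq_des - dq) + cmd.tau_ff


# Position-velocity behavior: nonzero gains, no feedforward torque.
position_velocity = MitCommand(q_des=1.0, kp=20.0, kd=1.0)

# Torque-based behavior: gains set to zero, only the feedforward term acts.
torque_only = MitCommand(tau_ff=0.5)

if __name__ == "__main__":
    # Measured joint state (made-up values for illustration).
    q, dq = 0.5, 0.2
    print("position-velocity torque:", mit_torque(position_velocity, q, dq))
    print("torque-only torque:", mit_torque(torque_only, q, dq))
```

Under this convention, switching a joint between the two behaviors is a matter of which terms are nonzero: raising `kp` stiffens the joint around its position target, while zeroing `kp` and `kd` leaves pure torque control.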
Action items
- Review transformers modeling export changes for your deployment targets (ONNX/torch.export/ExecuTorch) · huggingface/transformers [plan]
- If you're running Gemma 3/4 in production, pull the attention masking fix before your next inference deployment · huggingface/transformers [immediate]
- TRL users on vLLM 0.14: upgrade vLLM before your next training run · huggingface/trl [plan]
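For the vLLM item, a quick pre-flight check can flag an affected environment before a training run starts. This is a minimal sketch, not part of TRL: it reads the installed package version with the standard library's `importlib.metadata`. The `DROPPED_SERIES` threshold, which treats 0.14 and anything older as unsupported, is an assumption for illustration, and the helper names are hypothetical. Check TRL's own requirements for the actual supported range.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Assumption for illustration: the 0.14 series and anything older is flagged.
# TRL's real supported range should be taken from its own requirements.
DROPPED_SERIES = (0, 14)


def major_minor(version_string: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.14.1' or '0.15.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def needs_upgrade(version_string: str, dropped: tuple[int, int] = DROPPED_SERIES) -> bool:
    """Return True when the version falls in or below the dropped series."""
    return major_minor(version_string) <= dropped


def check_installed_vllm() -> str:
    """Report on the locally installed vLLM, if any."""
    try:
        installed = version("vllm")
    except PackageNotFoundError:
        return "vLLM is not installed in this environment"
    if needs_upgrade(installed):
        return f"vLLM {installed} falls in a dropped series; upgrade before training"
    return f"vLLM {installed} is newer than the 0.14 series"


if __name__ == "__main__":
    print(check_installed_vllm())
```

Running the script in the training environment prints a one-line status. The comparison looks only at the major and minor components, so patch releases and pre-release suffixes within 0.14 are all flagged the same way.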
References
- [1] 🚨 Modeling changes for export, compile, and hybrid-attention standardization · huggingface/transformers
- [2] 🚨 [gemma 3/4] Fix bidirectional attention masking crossing sliding window boundaries · huggingface/transformers
- [3] Drop vLLM 0.14 support (#6209) · huggingface/trl
- [4] Remove sft_video_llm.py script (#6193) · huggingface/trl
- [5] Integrate the new response parsing API · huggingface/trl
- [6] Feat(robot): add MIT control mode to ReBot · huggingface/lerobot
- [7] Codex host: migrate sub-agent glue to auto-discovered .codex/agents/ · huggingface/physics-intern-skills
- [8] Drop the problem one-liner / start-research bootstrap step · huggingface/physics-intern-skills
- [9] Add Codex CLI plugin: $init-physics-intern + build-codex-plugin.sh · huggingface/physics-intern-skills
FAQ
- What changed in Hugging Face on July 1, 2026?
- Transformers just locked in broad modeling changes that make ONNX, torch.export, and ExecuTorch export work cleanly across a huge set of models, while Gemma 3/4 gets critical attention masking corrections.
- What should Hugging Face teams do about it?
- Review transformers modeling export changes for your deployment targets (ONNX/torch.export/ExecuTorch) • If you're running Gemma 3/4 in production, pull the attention masking fix before your next inference deployment • TRL users on vLLM 0.14: upgrade vLLM before your next training run
- Which Hugging Face repositories shipped on July 1, 2026?
- huggingface/transformers, huggingface/trl, huggingface/lerobot, huggingface/physics-intern-skills