The Wire · Showcase
SERGE PHASE 3 GOES LIVE: POD-PER-TASK EXECUTION REPLACES IN-PROCESS MODEL
By RepoJournal · About Hugging Face
Serge's orchestration layer is shipping its biggest architectural shift: each task now runs in its own isolated Kubernetes pod, complete with the full LLM loop and normalization, while the main app becomes a thin HTTP-driven supervisor.
The pod-per-task execution model [1] is rolling to production after clearing a critical egress blocker: every in-pod git checkout hung for 5 minutes [2]. The fix routes git operations through the allowlisting egress proxy, which unblocks Phase 3 verification. To support the shift, CI now builds two new images, the task-runner pod image (the transformers-quality toolchain plus serge) and an egress-proxy sidecar [3], and pushes both to ghcr.io on every commit.
On the transformers CI side, FSDP tests are now wired into the dynamic PR CI caller [6], mirroring the tensor-parallel test flow, and PEFT integration tests have moved into their own isolated job [5]. A new manual Serge exerciser workflow [4] tests the write-capable /tasks flow end to end against transformers-test-ci, using in-VPC runners and OIDC minting.
TRL is cleaning house: redundant `get_kbit_device_map()` calls were removed from 29 files [9], KTO loss internals dropped their `policy_` prefix [10], and PEFT models now work with Liger kernels in DPO [11].
In transformers, the exporters PoC [7] landed partial support for unified PyTorch/ONNX/ExecuTorch export, including rope-index fixes for models that use get_rope_index, while GLM-Moe-DSA now uses interleaved rope in its indexer [8].
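For readers who want a concrete picture of the pod-per-task layout, here is a minimal sketch of what a per-task pod manifest could look like, built as a plain Python dict. It is not taken from the serge repository: the function name `build_task_pod`, the pod naming scheme, the labels, the image names in the usage lines, and the proxy port 3128 are all assumptions made for illustration. The one thing it relies on that is standard Kubernetes behavior is that containers in the same pod share a network namespace, so the runner can reach a sidecar on 127.0.0.1.

```python
import json


def build_task_pod(task_id: str, runner_image: str, proxy_image: str,
                   proxy_port: int = 3128) -> dict:
    """Build a hypothetical one-pod-per-task manifest.

    All names and values here are illustrative assumptions, not serge's
    actual configuration.
    """
    # Containers in one pod share localhost, so the runner can point its
    # proxy variables at the sidecar without any extra Service object.
    proxy_url = f"http://127.0.0.1:{proxy_port}"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"serge-task-{task_id}",  # assumed naming scheme
            "labels": {"app": "serge-task", "task-id": task_id},
        },
        "spec": {
            # A task runs once; the supervisor decides whether to retry.
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "task-runner",
                    "image": runner_image,
                    "env": [
                        {"name": "HTTP_PROXY", "value": proxy_url},
                        {"name": "HTTPS_PROXY", "value": proxy_url},
                    ],
                },
                {
                    "name": "egress-proxy",
                    "image": proxy_image,
                    "ports": [{"containerPort": proxy_port}],
                },
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder image references, not the images built by serge's CI.
    pod = build_task_pod("abc123", "ghcr.io/example/task-runner:dev",
                         "ghcr.io/example/egress-proxy:dev")
    print(json.dumps(pod, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -` in a test cluster to inspect the shape. A real deployment would also need resource limits, a NetworkPolicy that forces traffic through the proxy, and credentials handling, none of which this sketch attempts.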
Action items
- Review the pod-per-task networking model if you run serge in restricted egress environments (huggingface/serge) [immediate]
- Update your FSDP test detection logic to match the new PR CI job structure (huggingface/transformers) [plan]
- Test PEFT+Liger combinations in DPO if you use quantized LoRA (huggingface/trl) [monitor]
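For the first action item, one way to check a restricted egress environment is to run a git checkout explicitly through the proxy with a short timeout, so a blocked route fails fast instead of hanging. The sketch below is an assumption-laden illustration, not the fix from [2]: the helper names, the 60-second timeout, and the proxy URL are hypothetical. It relies only on git's standard `http.proxy` setting, passed per command with `-c`, and on `subprocess.run`'s `timeout` parameter.

```python
import subprocess


def git_clone_command(repo_url: str, dest: str, proxy_url: str) -> list[str]:
    """Build a shallow-clone command that sends HTTP(S) traffic via proxy_url."""
    # `-c http.proxy=...` applies the setting to this invocation only,
    # without touching the user's global git config.
    return ["git", "-c", f"http.proxy={proxy_url}",
            "clone", "--depth", "1", repo_url, dest]


def clone_via_proxy(repo_url: str, dest: str, proxy_url: str,
                    timeout_s: float = 60.0) -> bool:
    """Return True if the clone succeeds within timeout_s, False otherwise."""
    try:
        result = subprocess.run(
            git_clone_command(repo_url, dest, proxy_url),
            capture_output=True,
            text=True,
            timeout=timeout_s,  # fail fast rather than hang on blocked egress
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

Calling `clone_via_proxy("https://github.com/huggingface/trl.git", "/tmp/trl", "http://127.0.0.1:3128")` from inside the restricted network gives a quick yes/no on whether the proxy's allowlist admits the host; the proxy address here is a placeholder for whatever your environment uses.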
References
- [1] One pod per task: run the whole LLM loop + normalize in an isolated per-task pod. huggingface/serge
- [2] fix: route in-pod git checkout through the egress proxy (#39). huggingface/serge
- [3] CI: build the per-task runner + egress-proxy images for Phase 3. huggingface/serge
- [4] CI: add manual Serge /tasks exerciser. huggingface/transformers-test-ci
- [5] Add PEFT integration tests (#74). huggingface/transformers-test-ci
- [6] Add tests_fsdp_ci job to PR CI workflow. (#80) huggingface/transformers-test-ci
- [7] [PoC] HF exporters (#41992). huggingface/transformers
- [8] [glm-mode-dsa] Indexer uses interleaved rope (#46842). huggingface/transformers
- [9] Remove redundant `get_kbit_device_map()`. huggingface/trl
- [10] Drop `policy_` prefix from KTO loss-internal logps/logits (#6204). huggingface/trl
- [11] Support PEFT with Liger in DPO. huggingface/trl
FAQ
- What changed in Hugging Face on July 3, 2026?
  Serge's orchestration layer is shipping its biggest architectural shift: each task now runs in its own isolated Kubernetes pod, complete with the full LLM loop and normalization, while the main app becomes a thin HTTP-driven supervisor.
- What should Hugging Face teams do about it?
  Review the pod-per-task networking model if you run serge in restricted egress environments; update your FSDP test detection logic to match the new PR CI job structure; test PEFT+Liger combinations in DPO if you use quantized LoRA.
- Which Hugging Face repositories shipped on July 3, 2026?
  huggingface/serge, huggingface/transformers-test-ci, huggingface/transformers, huggingface/trl