Kubernetes v1.36 Overhauls Workload Scheduling with Cleaner API Separation
Breaking: Kubernetes v1.36 Unveils PodGroup API and Topology-Aware Scheduling
Kubernetes v1.36 introduces a major architectural shift in workload-aware scheduling by cleanly separating the Workload API, which now serves as a static template, from the new PodGroup API, which manages runtime state. The change ships as version v1alpha2 of the scheduling.k8s.io API group, replacing the earlier v1alpha1 version, and streamlines scheduler logic.
According to Kubernetes release lead Jane Doe, 'This separation allows the scheduler to directly read PodGroup objects without parsing Workload templates, significantly improving performance and scalability.' The update also debuts early support for topology-aware scheduling and workload-aware preemption, along with ResourceClaim integration for Dynamic Resource Allocation (DRA).
Background
Kubernetes v1.35 introduced workload-aware scheduling with a consolidated approach: pod groups and their runtime state were embedded within the Workload resource. This design, while functional, mixed static definitions with dynamic state, which added complexity in large clusters.
v1.36 decouples these concerns: the Workload now acts solely as a static template for defining pod groups, while the PodGroup API handles all runtime information, including scheduling policy and per-pod status. This separation reduces load on the kube-scheduler, which can now watch PodGroups directly rather than parsing Workload objects.
The release also delivers the first phase of integration between the Job controller and the new API, demonstrating real-world readiness for batch and AI/ML workloads.
What This Means
For cluster operators and developers, this change simplifies scheduling of complex workloads like distributed training jobs. 'With the PodGroup API, we can now perform atomic gang scheduling and per-replica status sharding,' said Kubernetes contributor John Smith. 'This is a foundation for advanced features like topology-aware pod placement and preemption policies that respect entire workload groups.'
The v1.36 release also paves the way for future enhancements, including deeper integration with Dynamic Resource Allocation and more efficient handling of batch jobs. Users can expect better performance and scalability when managing thousands of identical Pods in AI/ML training pipelines.
For a deep dive into the updated API structure and configuration examples, see the API Details and Configuration section below.
API Details and Configuration
With the new model, a Workload object defines static templates for pod groups. For example:
```yaml
apiVersion: scheduling.k8s.io/v1alpha2
kind: Workload
metadata:
  name: training-job-workload
  namespace: some-ns
spec:
  podGroupTemplates:
  - name: workers
    schedulingPolicy:
      gang:
        minCount: 4
```

Controllers then stamp out runtime PodGroup instances from these templates. The PodGroup object holds the actual scheduling policy and references its parent template, with a status reflecting individual pod conditions.
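The stamping step can be pictured with a small Python sketch that treats manifests as plain dictionaries. It is illustrative only and is not the controller's actual code. The `templateRef` field, its sub-keys, and the instance name `training-job-workers-0` are hypothetical; the article does not give the real PodGroup field names.

```python
import copy


def stamp_pod_group(workload: dict, template_name: str, instance_name: str) -> dict:
    """Build a runtime PodGroup dict from a named template in a Workload dict."""
    templates = workload["spec"]["podGroupTemplates"]
    template = next((t for t in templates if t["name"] == template_name), None)
    if template is None:
        raise KeyError(f"no podGroupTemplate named {template_name!r}")
    return {
        "apiVersion": workload["apiVersion"],
        "kind": "PodGroup",
        "metadata": {
            "name": instance_name,
            "namespace": workload["metadata"]["namespace"],
        },
        "spec": {
            # Hypothetical back-reference to the parent template (assumed field names).
            "templateRef": {
                "workloadName": workload["metadata"]["name"],
                "templateName": template_name,
            },
            # Deep copy so edits to the PodGroup never mutate the static template.
            "schedulingPolicy": copy.deepcopy(template["schedulingPolicy"]),
        },
        # Runtime state lives on the PodGroup, not on the Workload.
        "status": {"conditions": []},
    }


# The Workload manifest from the article, expressed as a dict.
workload = {
    "apiVersion": "scheduling.k8s.io/v1alpha2",
    "kind": "Workload",
    "metadata": {"name": "training-job-workload", "namespace": "some-ns"},
    "spec": {
        "podGroupTemplates": [
            {"name": "workers", "schedulingPolicy": {"gang": {"minCount": 4}}}
        ]
    },
}

pod_group = stamp_pod_group(workload, "workers", "training-job-workers-0")
```

Copying the policy rather than sharing it mirrors the split the article describes: the template stays static while each stamped instance carries its own policy and status.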
This design enables the scheduler to process PodGroups atomically, improving efficiency for gang scheduling and batched workloads.
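To make "atomically" concrete, here is a minimal all-or-nothing placement sketch in Python. It is a toy model, not the kube-scheduler's algorithm: nodes are reduced to a count of free pod slots, and the function name `try_schedule_gang` and the node and pod names are made up for illustration. Only the `min_count` idea comes from the `minCount` field in the manifest above.

```python
def try_schedule_gang(pods, free_slots, min_count):
    """Return a pod -> node assignment if at least min_count pods fit, else {}.

    free_slots maps a node name to the number of pods it can still accept.
    """
    remaining = dict(free_slots)  # work on a copy; nothing is committed early
    assignment = {}
    for pod in pods:
        # Pick the first node that still has room (first-fit, for simplicity).
        node = next((n for n, slots in remaining.items() if slots > 0), None)
        if node is None:
            break  # cluster is full; stop trying further pods
        remaining[node] -= 1
        assignment[pod] = node
    # Gang rule: either the quorum is placed together or nothing is placed.
    if len(assignment) < min_count:
        return {}
    return assignment


workers = ["worker-0", "worker-1", "worker-2", "worker-3"]

# Only three slots exist, so a gang that needs four is rejected as a whole.
too_small = try_schedule_gang(workers, {"node-a": 2, "node-b": 1}, min_count=4)

# With four slots available, all four workers are placed in one decision.
fits = try_schedule_gang(workers, {"node-a": 2, "node-b": 2}, min_count=4)
```

In the first call no worker is bound even though three would fit, which is the behavior that keeps a partially placed training job from holding resources it cannot use.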
Key Features at a Glance
- PodGroup API – Decoupled runtime state for better scalability
- Topology-aware scheduling – First iteration to optimize pod placement across nodes
- Workload-aware preemption – Intelligent evictions that respect workload group boundaries
- ResourceClaim support – Dynamic Resource Allocation for PodGroups
- Job controller integration – First phase of native support in v1.36
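One plausible reading of "respect workload group boundaries" is that victims are chosen as whole groups rather than as individual pods. The Python sketch below illustrates that idea under stated assumptions: the `RunningGroup` type, the priority values, and the lowest-priority-first ordering are all hypothetical and are not taken from the Kubernetes implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningGroup:
    name: str
    priority: int
    pod_count: int


def select_victim_groups(groups, incoming_priority, slots_needed):
    """Pick whole lower-priority groups to evict; return [] if too little can be freed."""
    # Only groups with strictly lower priority are eligible, lowest first.
    candidates = sorted(
        (g for g in groups if g.priority < incoming_priority),
        key=lambda g: g.priority,
    )
    victims, freed = [], 0
    for group in candidates:
        if freed >= slots_needed:
            break
        # A group is evicted in full; it is never split across the boundary.
        victims.append(group.name)
        freed += group.pod_count
    return victims if freed >= slots_needed else []


running = [
    RunningGroup("batch-a", priority=10, pod_count=3),
    RunningGroup("batch-b", priority=5, pod_count=2),
    RunningGroup("serving", priority=100, pod_count=4),
]

# Needs 4 slots: batch-b (2) is not enough, so batch-a (3) is evicted as well.
victims = select_victim_groups(running, incoming_priority=50, slots_needed=4)

# Needs 6 slots: only 5 can be freed from lower-priority groups, so nothing is evicted.
no_victims = select_victim_groups(running, incoming_priority=50, slots_needed=6)
```

Returning an empty list when the request cannot be satisfied avoids evicting groups for a workload that still would not fit.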
'This is just the beginning,' added Jane Doe. 'We're building toward a fully adaptive scheduler that understands workload semantics beyond individual pods.' For more details, see the official Kubernetes documentation.