In this scenario, a job scheduler has allocated one third of a virtualized GPU to each job. When Job #1 finishes after 2 hours, its allocated resources remain idle; the same goes for Job #2 when it finishes after 4 hours. With a scheduler, then, jobs are limited to the resources they are initially allocated. This isn't the case with GVM Server.
Note that these idle resources would be reallocated to inactive jobs waiting in the queue, if there were any.
Unlike MIG, SMVGPU can allocate VRAM both contiguously and non-contiguously, enabling the automated redistribution of idle resources at runtime.
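The difference is easiest to see with a toy model. The sketch below is purely illustrative (the page sizes, allocator names, and layout are hypothetical, not GVM Server's actual implementation): a MIG-style allocator that must hand out an unbroken run of VRAM pages fails on a fragmented pool, while a page-mapped allocator can satisfy the same request from scattered free pages.

```python
# Toy model of VRAM allocation (hypothetical, for illustration only).

def contiguous_alloc(free_pages, n, total):
    """Return a run of n consecutive free page indices, or None.

    Models a MIG-style slice: the region must be one unbroken run.
    """
    run = []
    for p in range(total):
        run = run + [p] if p in free_pages else []
        if len(run) == n:
            return run
    return None

def paged_alloc(free_pages, n):
    """Return any n free page indices, contiguous or not.

    Models page-mapped allocation: fragmentation doesn't matter.
    """
    return sorted(free_pages)[:n] if len(free_pages) >= n else None

# A fragmented 8-page pool: pages 0-2 and 5-7 are free, 3-4 in use.
free = {0, 1, 2, 5, 6, 7}
print(contiguous_alloc(free, 4, 8))  # None: no 4-page run exists
print(paged_alloc(free, 4))          # [0, 1, 2, 5]
```

Six of eight pages are free, yet the contiguous allocator cannot place a 4-page request; the page-mapped one can, which is what makes redistributing a finished job's leftover memory to running jobs practical.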
With GVM Server, when Job #1 finishes after 2 hours, its GPU resources are automatically reallocated to Jobs #2 and #3, letting both remaining jobs train more efficiently. Job #2 finishes an hour faster than it would under MIG. When Job #2 finishes, its resources are reallocated to Job #3, which finishes 2 hours faster, as it is allocated the entire GPU for its last stretch of training. This automated reallocation of GPU resources at runtime makes high levels of GPU optimization, up to 100% utilization, achievable.
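This reallocation pattern can be sketched as a small simulation, under the simplifying assumption that active jobs split the GPU evenly and a finished job's share is redistributed instantly (the workload sizes below are hypothetical, chosen so each job would finish at 2 h, 4 h, and 6 h on a fixed one-third share):

```python
from fractions import Fraction

def finish_times(work_units):
    """Wall-clock finish time of each job under equal-share
    dynamic reallocation.

    work_units[i] is the total GPU-hours job i needs. All jobs
    start together; active jobs split the GPU evenly, and a
    finished job's share is immediately redistributed.
    """
    remaining = {i: Fraction(w) for i, w in enumerate(work_units)}
    t = Fraction(0)
    done = {}
    while remaining:
        share = Fraction(1, len(remaining))  # equal GPU share per active job
        # time until the next job finishes at the current shares
        dt = min(r / share for r in remaining.values())
        t += dt
        for i in list(remaining):
            remaining[i] -= share * dt
            if remaining[i] == 0:
                done[i] = t
                del remaining[i]
    return [float(done[i]) for i in sorted(done)]

# Hypothetical workloads: 2/3, 4/3, and 2 GPU-hours of work.
print(finish_times([Fraction(2, 3), Fraction(4, 3), 2]))
```

Under these assumptions the jobs finish at 2.0 h, about 3.33 h, and 4.0 h: Job #2 gains roughly an hour over its static 4-hour schedule and Job #3 gains a full 2 hours over its static 6-hour schedule, in line with the scenario described above.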
While the premier version of GVM Server doesn't replace all of the functionality of a GPU scheduler, it sits below schedulers in the data center tech stack, so a scheduler can be seamlessly integrated with GVM Server.
GPU Job Scheduler
- Increases GPU utilization
- Enables REAL utilization
- Reduces workflow bottlenecks
- Allocates VRAM on a non-contiguous basis
- GPU architecture backwards compatibility
- Enables 100% GPU utilization