GPU Job Scheduler Vs. GVM Server

What is a GPU Job Scheduler?

A GPU Job Scheduler is a tool that manages and schedules the allocation of GPUs in a cluster environment. It enables efficient utilization of GPU resources by allocating them to the jobs that need them. Schedulers also provide a unified interface for submitting, monitoring, and controlling the execution of GPU jobs in clusters.

Although schedulers can be very useful to Systems Administrators, they have drawbacks when it comes to truly maximizing utilization and performance. These issues are addressed by Arc Compute's GVM Server.

To increase GPU utilization when multiple jobs need to be trained, many Schedulers rely on multi-tenancy through MIG (Multi-Instance GPU). Let's run through an example to highlight the differences between a Scheduler using MIG and GVM Server using its SMVGPU (Simultaneous Multi-Virtual GPU) feature.

[Figure: Data center tech stack with GVM Server]

MIG Vs. SMVGPU

[Figures: GPU utilization with a GPU Job Scheduler (MIG); GPU utilization with SMVGPU]

In both instances, three jobs are sharing a single virtualized GPU to train their workloads, with no additional jobs queued.
GPU Allocation Across Jobs
In this scenario, a job scheduler has allocated a third of a virtualized GPU to each job. When Job #1 finishes after 2 hours, its allocated resources remain idle. The same goes for Job #2 when it finishes after 4 hours. Jobs are therefore limited to the resources they are initially allocated when using a Scheduler. This isn't the case with GVM Server.

It should be noted that these idle resources would be reallocated to inactive jobs that are queued up, if there were any.

Unlike MIG, SMVGPU can allocate VRAM on a continuous and non-continuous basis, enabling the automated redistribution of idle resources at runtime.

With GVM Server, when Job #1 finishes after 2 hours, its GPU resources are automatically reallocated to Jobs #2 and #3. This enables both remaining jobs to train more efficiently. Job #2 can finish an hour faster than it can while using MIG. When Job #2 finishes, its resources are reallocated to Job #3, enabling Job #3 to finish 2 hours faster, as it’s allocated the entire GPU for its last hour of training. This automated reallocation of GPU resources at runtime makes it possible to reach high levels of GPU optimization and 100% utilization.
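
The arithmetic behind this comparison can be sketched in a few lines of Python. This is only an idealized model, not a description of how MIG or SMVGPU work internally. It assumes that training speed scales linearly with the share of the GPU a job holds, that freed capacity is split equally among the jobs still running, and that Job #3 would take 6 hours on its fixed third of the GPU (a figure inferred from the "2 hours faster" statement, not given explicitly). The function names are hypothetical.

```python
from fractions import Fraction


def static_finish_times(work, share):
    """Finish time of each job when its GPU share stays fixed for the whole run."""
    return [w / share for w in work]


def dynamic_finish_times(work):
    """Finish times when the whole GPU is always split equally among running jobs."""
    remaining = {i: Fraction(w) for i, w in enumerate(work)}
    finish = {}
    now = Fraction(0)
    while remaining:
        share = Fraction(1, len(remaining))
        # Time until the job with the least remaining work completes.
        step = min(remaining.values()) / share
        now += step
        for i in list(remaining):
            remaining[i] -= step * share
            if remaining[i] == 0:
                finish[i] = now
                del remaining[i]
    return [finish[i] for i in range(len(work))]


def utilization(work, finish):
    """Average fraction of the GPU kept busy until the last job finishes."""
    return sum(work) / max(finish)


third = Fraction(1, 3)
# Hours each job needs on a fixed third of the GPU.
# The 6 hours for Job #3 is an assumption (see the text above).
static_hours = [2, 4, 6]
work = [h * third for h in static_hours]  # work per job, in GPU-hours

static = static_finish_times(work, third)
dynamic = dynamic_finish_times(work)

for job, (s, d) in enumerate(zip(static, dynamic), start=1):
    print(f"Job #{job}: fixed share {float(s):.2f} h, reallocated {float(d):.2f} h")
print(f"Utilization, fixed shares: {float(utilization(work, static)):.0%}")
print(f"Utilization, reallocated:  {float(utilization(work, dynamic)):.0%}")
```

Under these assumptions, fixed shares keep the GPU about 67% busy over the 6 hours it takes all three jobs to finish, while runtime reallocation keeps it fully busy and finishes everything in 4 hours. Job #3 finishes 2 hours earlier, as in the example above. Job #2 finishes after 3 hours 20 minutes in this simple model rather than after exactly 3 hours, so the per-job figures quoted above are best read as approximate.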


While the premier version of GVM Server doesn't replace all of the functionality of a GPU Scheduler, it sits below Schedulers in the data center tech stack, meaning that a Scheduler can be seamlessly integrated into GVM Server.

GPU Job Scheduler: Increases GPU Utilization
GVM Server: Enables REAL Utilization

Feature                                     GPU Job Scheduler   GVM Server
Reduces workflow bottlenecks                Yes                 Yes
GPU pooling                                 Yes                 Yes
Ensures VRAM assignment                     Yes                 Yes
Cluster management                          Yes                 Yes
Allocates VRAM on non-continuous basis      No                  Yes
Automated optimization                      No                  Yes
VDI capabilities                            No                  Yes
No command line required                    No                  Yes
GPU architecture backwards compatibility    No                  Yes
Enables 100% GPU utilization                No                  Yes

Virtualize Your GPUs with GVM Server

Built on open-source technology, GVM Server is a KVM + GVM-based hypervisor solution. GVM Server offers a browser-based suite of tools for the provisioning and management of GPU-accelerated virtual machines in enterprise environments.
All-In-One GPU Virtualization
Leveraging powerful open-source technology, everything needed to virtualize your GPU infrastructure comes included, with the ability to spin up fully customizable VMs.
Organization-Level Provisioning
Nested roles allow organizations to manage data and resources for teams hierarchically. Admins can assign roles for managers and their staff, allocating compute resources as needed.
Browser-Based Remote Management
Virtual machines are visible through the browser-based management dashboard, ensuring that you're always connected to the work that matters to you, regardless of your location.
Hardware-Enforced Security
GVM Server enforces memory separation using hardware MMU controllers. Separation between VMs is enforced by the hardware itself, ensuring the maximum level of protection for your data.