devices: add support for first_available device prioritization #27391

Draft
chrisboulton wants to merge 1 commit into hashicorp:main from chrisboulton:device-first-available

Conversation

@chrisboulton

(Note: this is in a bit of a draft state right now. That said, I'd love feedback from HashiCorp on the chances of having something like this incorporated and how best to align it with y'all's design goals -- a rough first pass on the design would be amazing. Don't spend a lot of time on the changes themselves until we're happy with the design and I've done more of my own homework.)

This PR introduces a new first_available block for device requests in Nomad job specifications. This enables more flexible device scheduling by allowing you to specify a prioritized list of device reservation sizes, where the scheduler attempts each option in order and selects the first one that can be fulfilled.

This is particularly useful in heterogeneous clusters with varying device types (such as a mix of different GPU models), where you want to prioritize one type of GPU over another, but the number of devices (GPUs) needed to carry the workload differs between types.

A concrete example: I've got a workload which fits on a single 96GB GH200, but if I don't have that available I can also carry it on two H100s with 80GB of memory each. I want to be able to do this in one job, and have Nomad figure out what the resource reservation should be. Today, this needs multiple jobs or multiple task groups (sketched below), because device only accepts a single reservation size (count).
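
For reference (not part of this PR), a rough sketch of what that workaround looks like today, using two task groups with only one active at a time -- the job/group names, image, and group counts here are purely illustrative:

job "inference" {
  # one task group per acceptable GPU shape
  group "gh200" {
    task "serve" {
      driver = "docker"

      config {
        image = "example/inference:latest" # illustrative
      }

      resources {
        device "nvidia/gpu" {
          count = 1
          constraint {
            attribute = "${device.attr.model}"
            value = "GH200"
          }
        }
      }
    }
  }

  # second group kept at count = 0 and scaled up (or run as a separate job)
  # when no GH200 capacity is available
  group "h100" {
    count = 0

    task "serve" {
      driver = "docker"

      config {
        image = "example/inference:latest" # illustrative
      }

      resources {
        device "nvidia/gpu" {
          count = 2
          constraint {
            attribute = "${device.attr.model}"
            value = "H100"
          }
        }
      }
    }
  }
}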

To support this, the following is introduced:

device "nvidia/gpu" {
  # i would prefer this workload to land on a GH200 and if it does, it needs one GPU
  first_available {
    count = 1
    constraint {
      attribute = "${device.attr.model}"
      value = "GH200"
    }
  }
  # otherwise, i'll take a pair of H100s
  first_available {
    count = 2
    constraint {
      attribute = "${device.attr.model}"
      value = "H100"
    }
  }
}

With a job configuration like this, Nomad will first try to schedule the workload on a single GH200. If that's not available, it will then try to schedule it on two H100s. If neither option can be satisfied, the job will fail to place.

count, affinity, and constraint without first_available are supported as before (a quick reminder of that existing syntax is sketched below).
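
For context, the existing device syntax this PR leaves untouched looks roughly like this -- the model name and memory threshold are illustrative values modelled on the Nomad device block documentation:

device "nvidia/gpu" {
  count = 2

  # hard requirement: only consider this model (illustrative value)
  constraint {
    attribute = "${device.attr.model}"
    value = "H100"
  }

  # soft preference: prefer devices with more memory (illustrative threshold)
  affinity {
    attribute = "${device.attr.memory}"
    operator = ">="
    value = "40 GiB"
    weight = 50
  }
}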

Implementation Notes

I'm open to feedback on the implementation of this -- this was just the first take that came to mind.

first_available is an ordered list of options where the first match wins. Inside first_available, constraint is supported, which lets you perform additional filtering per option.

first_available and count are mutually exclusive at the device level.
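
Put differently, under that rule something like the following would presumably be rejected at validation time (exact error wording is an implementation detail):

device "nvidia/gpu" {
  # invalid: a device-level count ...
  count = 1

  # ... combined with a first_available block
  first_available {
    count = 2
    constraint {
      attribute = "${device.attr.model}"
      value = "H100"
    }
  }
}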

Alternative Approach

Would it make sense to have a syntax like this instead, where the constraints are specified inline instead of in their own constraint block?

device "nvidia/gpu" {
  first_available {
    count = 1
    attribute = "${device.attr.model}"
    value = "GH200"
  }
}

Testing Notes

I've gone in and added a bunch of E2E tests for this as a first pass - these cover the existing device scheduling functionality and the new first_available functionality.

I've only run these tests locally - I've not used the Terraform E2E test suite, and am mostly certain (given I let Claude do the work almost exclusively for the tests) that at least the TF test setup needs some work... but otherwise, the tests themselves are passing and seem to do the right thing.

AI Use

I noticed a new callout for this in the contributing guidelines, so to be upfront about it:

  • A bunch of the initial scaffolding I implemented myself. For work around the scheduler (especially the feasibility checks), I asked for help from Opus 4.5 w/ CC -- mostly because I've not navigated that part of Nomad much before. On review, it looks like the right things are happening - these are design decisions I'd probably make myself... but I still need to do a more exhaustive review before I'm willing to say I'm happy with the approach.
  • CC was used to generate the bulk of the tests, especially in the E2E suites -- I'm pretty happy with these.

@jrasell
Member

jrasell commented Jan 22, 2026

Hi @chrisboulton and thanks for raising this PR, adding all the detail, and clearly having read our documentation. Given the size of the addition I think a good first step would be to open up an issue where we can better discuss the use cases and design specifics. I'll be able to raise this internally and get the right people involved to try and move it forward. That being said, a quick glance by a few of us indicates we do like this idea, so we'd be keen to see it progress.

@chrisboulton
Author

Hey @jrasell let's do it! 🚀 #27402
