NERC - H100 GPU Availability Notice – Incident details

GETTING HELP
Email: help@nerc.mghpcc.org, or use NERC's Support Ticketing System
NERC Documentation: https://nerc-project.github.io/nerc-docs/
Status page for the New England Research Cloud (NERC) and other resources.
Please scroll down to see details on any incidents or maintenance notices.

H100 GPU Availability Notice

Resolved
Operational
Started 7 days ago
Lasted 6 days

Affected

NERC OPENSHIFT CONTAINER PLATFORM - PRODUCTION

Operational from 4:53 PM to 9:27 PM

OPENSHIFT WORKERS NODES

Operational from 4:53 PM to 9:27 PM

NERC RED HAT OPENSHIFT AI (RHOAI)

Operational from 4:53 PM to 9:27 PM

Updates
  • Resolved

    This incident has been resolved. The H100 GPUs are now available.

  • Investigating

    The H100 GPUs are currently unavailable while a required firmware update is being completed. We will provide an update once the firmware update has finished and the GPUs have been validated for production use.

    Thank you for your patience.