rdc@fri - RDC cooling malfunction – Incident details

RDC cooling malfunction

Resolved
Major outage
Started 6 months ago
Lasted 1 day

Affected

Frida

Operational from 1:00 PM to 9:35 PM

Compute

Operational from 1:00 PM to 9:35 PM

Updates
  • Resolved

    We have implemented a fix and are bringing the cluster back into operation. We will continue to monitor the cooling. Please checkpoint your jobs regularly to avoid data loss in case of further shutdowns.

    We appreciate your patience.

  • Identified

    The RDC cooling is malfunctioning. As a preventive measure, we are forced to shut down the cluster; all jobs will be canceled. While we work on the issue, we will also perform the scheduled FRIDA maintenance to keep downtime to a minimum.

    We appreciate your patience.