rdc@fri - RDC cooling malfunction – Incident details

RDC cooling malfunction

Resolved
Major outage
Started 7 months ago
Lasted 13 days

Affected

Frida

Degraded performance from 3:37 PM to 9:47 AM, Operational from 3:37 PM to 9:47 AM, Under maintenance from 9:47 AM to 10:51 AM, Major outage from 10:51 AM to 9:58 AM, Operational from 10:51 AM to 9:58 AM

Login

Operational from 3:37 PM to 9:47 AM, Under maintenance from 9:47 AM to 10:51 AM, Operational from 10:51 AM to 9:58 AM

Storage

Operational from 3:37 PM to 9:47 AM, Under maintenance from 9:47 AM to 10:51 AM, Operational from 10:51 AM to 9:58 AM

Compute

Degraded performance from 3:37 PM to 9:47 AM, Under maintenance from 9:47 AM to 10:51 AM, Major outage from 10:51 AM to 9:58 AM

Updates
  • Resolved

    The RDC cooling has been partially fixed, and we're bringing the cluster back into production. We'll continue to monitor the status. The remaining RDC cooling issues will be resolved during the week.

    We appreciate your patience.

  • Identified

    The RDC cooling is malfunctioning. As a preventative measure, we're forced to shut down the cluster; all jobs will be canceled.

    We appreciate your patience.

  • Monitoring

    The RDC cooling has been fixed. We're currently bringing the cluster back online, performing the scheduled FRIDA maintenance, and monitoring the status.

  • Identified

    The RDC cooling is malfunctioning. As a preventative measure, we're forced to shut down the cluster; all jobs will be canceled. While working to fix the issue, we'll also perform the scheduled FRIDA maintenance to keep downtime to a minimum.

    We appreciate your patience.