rdc@fri - Major malfunction of one of the nodes – Incident details

Storage: Experiencing degraded performance

Major malfunction of one of the nodes

Resolved
Degraded performance
Started 5 months ago
Lasted 13 days

Affected services

Frida

Degraded performance from 12:21 PM to 10:40 AM

Compute

Degraded performance from 12:21 PM to 10:40 AM

Updates
  • Resolved

    The GPUs on node ixh have been successfully replaced and the node is back in production. Please benchmark your runs against earlier ones and report any discrepancies.

    Thank you for your patience.

  • Update

    Node ixh remains down: during the replacement of GPU6, issues with GPU0 were detected. We are coordinating a resolution with support.

    Thank you for your patience.

  • Identified

    Node ixh is down due to overheating of GPU6. We are working on a resolution with support.

    Thank you for your patience.

  • Investigating
    We are currently investigating this incident.