Unplanned Space Downtime 8 July 2016

luke
11 Jul 2016

Late in the evening of 7 July an unexpected event triggered an incident in the Space.intersect.org.au storage subsystems. As a result of pre-provisioning a number of customer filesystems, a virtual/physical disk ratio threshold was unexpectedly crossed at 6pm AEST on 8 July.

When this happens, the standard safety protocol in all underlying Space hardware arrays is to enter a locked (that is, customer-inaccessible) state. This measure is by design and protects data integrity above all else, but it made Space appear generally unavailable to customers.

A combined team from Intersect and suppliers began resolving the problem immediately to minimise customer impact. However, the standard protocol to return to service requires verification of integrity for every aspect of each Space Plan, a time-consuming operation over many thousands of terabytes of data.

Normal service returned for all Space customers after 54 hours. No data was lost or corrupted, and none had to be replicated or restored from redundant copies. We publicised the status of the issue at Twitter.com/@intersectops throughout, declaring the incident resolved as of midnight Sunday.

The problem that caused this incident will be permanently resolved in the planned August 2nd downtime window. In the meantime, additional controls are in place within our procedures to ensure it does not recur.

We sincerely apologise for the unplanned downtime and any disruption caused. Please get in touch via space@intersect.org.au if you experience any residual effects or would like more information.
