I had this email earlier:
02/11/2010 12:30 AM
My sincerest apologies to everyone affected by this absolutely unintended downtime. What started as a routine maintenance upgrade turned into a catastrophic nightmare!
What Happened?
The back-up drive was scanned on Monday afternoon for corruption and security issues. Some problems were identified, and I was contacted late Monday evening and told that the techs would need to change the drives ASAP. The job would only require 4-6 hours, though the server would have to be offline for that time. We discussed the best possible window and decided that 7pm EST was the best time to start: most North American and South American clients would be done for the day, and EU clients would be in bed - by the time they got to work, they wouldn't know of the downtime - on both sides of the pond.
I agreed that Tuesday night would be best; given the severity of the issue (though the fix itself was routine), I did not want to jeopardize any of the clients' data.
First Mistake…
I did not send out an immediate notification. Honestly, we've done many such maintenance upgrades over the years, so I 1) was confident in the techs doing the job and 2) didn't want to cause you, the client, any undue stress over something that wasn't really a big deal - and nobody would really be affected given the time we'd chosen to proceed.
Murphy’s Law
So the maintenance started on schedule at 7pm EST Tuesday night. All seemed to be going well, and hearing nothing from the techs, I went to bed at 11pm - only to wake up and find that all he** had broken loose!
During the changing of the drives, the main and secondary RAID drives were also scanned, and they had the same issue! The techs immediately - without first consulting me - decided to do a complete change of all three drives. Had I known this, I would most assuredly have stopped the process and shifted the maintenance to begin Friday night and run into the weekend. But as I said, I was not consulted, and like you, I woke to a Wednesday morning mess.
Why No Communication?
Unfortunately, the support system that I am sending this announcement with resides on the same server that was down! So I was unable to do anything but respond as quickly as I could to the emails that I did receive from clients. At times I had no idea what was going on, and I was on hold for a very long time today trying to get information so that I was at least abreast of the (albeit very slow) progress and could keep pushing for the whole mess to be resolved as quickly as possible.
So, now what?
I do hope that you continue to host with me! I absolutely would never do anything to intentionally cause any of you any loss of business!
What’s the Go-Forward Plan, then?
I've immediately set up a 3rd-party, off-server support site. There you can see what is coming up for maintenance and also submit tickets if you are experiencing any issues or have questions. Here's the link: http://havehost.zendesk.com
In addition, you can also just send an email to: email@hidden
As a new iPhone owner, I have also installed the support system's ticket app, so I will receive a push notification on my iPhone the moment you submit a ticket or forum post.
In addition, I am switching all support email to GMail instead of the old have-host.com address. As today showed, when the server is down, that address fails: I didn't receive many of your emails until late this evening, when the server returned to normal.
In Closing…
Once again, I apologize for any inconvenience the downtime caused you and/or your organization. Rest assured that, with the new 3rd-party support site, you will always have an immediate means of communicating with me.
What if my site isn’t working, right now?
Please submit a ticket via either the new site or via the new email and I will have a fix in place as soon as possible.
Best Regards,
James Wilkinson
offtopic mailing list
email@hidden
Update your subscriptions at:
http://freewaytalk.net/person/options