SingleHop Down???
Posted by webhostinggeek, 08-11-2013, 03:12 AM |
Hello,
All our nodes seem down on SingleHop. Some major network outage? |
Posted by webhostinggeek, 08-11-2013, 03:16 AM |
Worst part, LEAP is also not working, so we can't open a ticket! |
Posted by Visiba, 08-11-2013, 03:24 AM |
Confirmed, it's down here as well. |
Posted by VortexServers, 08-11-2013, 03:35 AM |
Same issues with some of our nodes there as well :/ |
Posted by webhostinggeek, 08-11-2013, 04:57 AM |
Around 3 hours now and still not resolved. Is SingleHop quality going down? |
Posted by onzehost, 08-11-2013, 05:25 AM |
I managed to open a ticket about 2 hours ago, but there is still no answer. When I try to access LEAP, the following message appears:
mongoroute01.dft.singlehop.net:27017: cursor timed out (timeout: 30000, time left: 0:0, status: 0) |
Posted by webhostinggeek, 08-11-2013, 05:28 AM |
The situation is now getting out of hand. All our premium clients are hosted on SH, and all of them are going crazy now.
I guess we need to move on from here.
A router issue taking 3+ hours to resolve is not acceptable. |
Posted by inferno[DGT], 08-11-2013, 05:33 AM |
What is the reason for this downtime? We can't create a ticket because LEAP is not working. We should know when it will be solved. |
Posted by webhostinggeek, 08-11-2013, 05:34 AM |
Router issue. Details here: http://ot.singlehop.com/status-details.php
Quote:
Originally Posted by inferno[DGT]
What is the reason for this downtime? We can't create a ticket because LEAP is not working. We should know when it will be solved.
|
Posted by Ethernet Servers, 08-11-2013, 05:39 AM |
My machine at SingleHop is down too |
Posted by webhostinggeek, 08-11-2013, 06:01 AM |
I get the feeling that, it being Sunday morning, there are not enough DC tech experts on hand to fix this.
Otherwise it wouldn't take so long.
And they have not even posted updates after 3:12 AM, which makes me worry even more. |
Posted by onzehost, 08-11-2013, 06:37 AM |
I think they are now going to solve the problem. |
Posted by webhostinggeek, 08-11-2013, 06:56 AM |
Back online after around 5 hours. But the confidence is all gone now. |
Posted by onzehost, 08-11-2013, 07:01 AM |
Worst of all, there was no way of contacting them. |
Posted by MannDude, 08-11-2013, 07:46 AM |
Looks back up now. |
Posted by danushman, 08-11-2013, 11:05 AM |
Hi guys,
I am very sorry about the outage. We had some pretty serious issues last night that impacted a number of clients in our largest data center, the Chicago-2 facility. A full team of our staff were working from 3am until 7am on Sunday night to identify and resolve the issues. We posted status updates to our status blog (http://ot.singlehop.com) every 30-45 minutes during the incident to let our clients know that we were working on it and trying our best to resolve it.
During the outage, we had over 500 tickets opened and over 300 phone calls come in, on what is typically our slowest shift. The normal Sunday night team was rapidly expanded from a handful of people to well over a dozen techs and engineers. Despite the increased staffing, we struggled to respond to the influx of tickets and calls as quickly as we are usually able; the load was simply huge. Even so, we did not rest until every ticket was answered, every voicemail and email returned, and every server back online.
@webhostinggeek and everyone else -- We have had many years of perfect uptime. Murphy's law provides that eventually something will break and something will go wrong. This has happened at many other large providers over the years; we have been fortunate to have avoided it and been able to provide 100% uptime.
What matters now is not that we had an outage. What matters is how we handle the aftermath. All customers that were impacted will receive a detailed RFO form -- that will be sent tomorrow. This will explain 3 things: 1) What happened in detail, 2) What was done last night to identify and resolve the issue and 3) What will be done to prevent it from happening again. In addition, all impacted customers will receive automatic SingleHop Bill of Rights Credits to their account to compensate them for the issues.
I know this is unusual. I know this was bad. I know that you all expect more from SingleHop. Hopefully our 7-year track record of providing a highly dependable service, coupled with the way in which we are handling this outage, our first major outage in years and years, helps you feel comfortable with us again going forward. Our goal is only to provide the best in automated infrastructure. And Murphy is a jerk.
Details coming soon to all inboxes of impacted customers (tomorrow morning most likely) -- in the meantime, sorry again. If you need something escalated or want to talk to me directly, my personal e-mail is dan-at-singlehop-dot-com. |
Posted by onzehost, 08-11-2013, 11:11 AM |
Dan,
Actually, the problem just left us a little "frustrated", but we appreciate the efforts made to resolve the issue ASAP.
Thank you. |
Posted by stablehost, 08-11-2013, 12:05 PM |
Every provider has outages; what matters is how the provider handles it, communicates it to customers and prevents it from happening in the future.
SingleHop handled this perfectly. Keep up the good work! |
Posted by danushman, 08-11-2013, 12:48 PM |
Hi Encrypted, we are aware of that page being inaccurate right now and are working to update it. For some reason the data center uptime percentages did not update correctly despite the staff logging the outage in the system. I notified one of our developers about this a few hours ago and we expect the uptime percentages to reflect reality shortly. This issue impacted the Chicago-2 data center's network, a range of network devices, and a good number of clients in the facility (the exact list of impacted customers is being worked on as well). The uptime percentages will be updated today and the detailed RFO form will be sent to all impacted customers tomorrow, along with automated service credits. |
Posted by danushman, 08-11-2013, 01:19 PM |
Encrypted, I am being told that we are fixing some code that updates the percentages right now and pushing the change shortly. The status site should reflect accurate uptime in just a few minutes on the data center level, then a little later, at the device level.
Quick update for everyone: The network team has been monitoring things all morning and has informed me that everything has been operating normally since they identified and repaired the issue that was causing the problems last night. They will continue to watch things throughout the day and are on a heightened state of alert. We have extra staff available to make sure that customers are able to get through, and hold times and ticket response times have been and are normal. The RFO should be out early tomorrow morning at the latest.
As always, if anyone has questions or comments, please do not hesitate to post them here, call, chat or email the support team, or contact/email me directly at dan-at-singlehop-dot-com. |
Posted by danushman, 08-14-2013, 04:25 PM |
Hi Everyone,
I just wanted to chime in. If you are a client who was impacted by this outage, just log into LEAP and visit the Bill of Rights Report Card. You should be able to see the details of how your services were impacted and claim a service credit there. We have also published the RFO on our blog for those who are interested. If you have any questions and are a current client, we ask that you please submit them via a ticket in LEAP and we will respond to it promptly. Thanks for your patience and for the support of the community.
Dan |