

Secure Dragon Down




Posted by MziB, 06-06-2013, 07:36 AM
Is it offline for anyone else? I have a small server with them.

Posted by DWS2006, 06-06-2013, 07:39 AM
Up for me.

Posted by Tuguhost, 06-06-2013, 07:42 AM
Did you mean the Secure Dragon main website?
I can access it from Indonesia.

Posted by MziB, 06-06-2013, 07:44 AM
Their VPS service. I have one with them. I just want to know whether others who have a VPS with them are experiencing the same.

Posted by Tuguhost, 06-06-2013, 07:47 AM
Quote:
Originally Posted by MziB
Their VPS service. I have one with them. I just want to know whether others who have a VPS with them are experiencing the same.
Have you contacted them?

Posted by OakHosting_James, 06-06-2013, 07:52 AM
Mine went off about 45 minutes ago. I'm in Tampa - which DC are you in?

Posted by OakHosting_James, 06-06-2013, 07:53 AM
... and back up.

Posted by Loon, 06-06-2013, 07:54 AM
Moved to outages.

Posted by ZKuJoe, 06-06-2013, 09:46 AM
For some reason there was a routing loop between our routers and our data center's switches. We're not sure why at this time, but according to our NodePing monitors and SolusVM the downtime lasted about 3-4 minutes, while Pingdom shows 5 minutes and StatusCake shows 6 minutes.

We are waiting to hear back from our data center because routing loops don't normally appear without some human interaction. Hopefully we can provide an accurate RFO later today.

Can somebody confirm the outage times for me since it looks like the outage may have been longer for some than others?

http://drgn.biz
http://status.securedragon.net
http://stats.pingdom.com/f9bng3rw047t

Posted by OakHosting_James, 06-06-2013, 09:53 AM
Quote:
Originally Posted by ZKuJoe
Can somebody confirm the outage times for me since it looks like the outage may have been longer for some than others?
Happily.

According to my NodePing probe: Down at 11:09:09 GMT; up at 11:53:15 GMT. 44 minutes downtime.
According to my Pingdom probe: Down at 11:08:17 GMT; up at 11:53:17 GMT. 45 minutes downtime.
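
For anyone who wants to check those durations, here is a minimal sketch that recomputes them from the down/up timestamps quoted above. The `downtime` helper is a hypothetical name used only for illustration, and it assumes both timestamps fall on the same day in the same time zone.

```python
from datetime import datetime


def downtime(down: str, up: str) -> str:
    """Return the gap between two same-day HH:MM:SS timestamps as 'Xm YYs'."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(up, fmt) - datetime.strptime(down, fmt)
    minutes, seconds = divmod(int(delta.total_seconds()), 60)
    return f"{minutes}m {seconds:02d}s"


# Timestamps as reported by the two probes (GMT).
print("NodePing:", downtime("11:09:09", "11:53:15"))
print("Pingdom: ", downtime("11:08:17", "11:53:17"))
```

Both probes agree to within about a minute, which is what you would expect from two independent monitors on different polling intervals.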

Posted by ZKuJoe, 06-06-2013, 10:43 AM
Quote:
Originally Posted by JamesOakley
Happily.

According to my NodePing probe: Down at 11:09:09 GMT; up at 11:53:15 GMT. 44 minutes downtime.
According to my Pingdom probe: Down at 11:08:17 GMT; up at 11:53:17 GMT. 45 minutes downtime.
Thanks. I am not sure why some people experienced 45 minutes while all of our monitors only show a few minutes. We're waiting for a network tech to take a look for us at the DC to get some answers hopefully.

Posted by StatusCake, 06-06-2013, 11:33 AM
ZKuJoe, in addition to the standard pings for your StatusCake account, we also tested from all locations (part of a new beta feature), and I can confirm that for all 40 of our locations the downtime was only in the region of 6 minutes. We use different DNS servers (including OpenDNS and Google DNS) to rule out a localized issue and saw no signs of one, so I would say the vast majority of your users would have had only a few minutes of downtime.

Posted by OakHosting_James, 06-06-2013, 11:59 AM
The public reports from StatusCake were curious. When I looked at it, I had been off for 15-20 minutes. The actual node I'm on (fl1ovz02) showed as up, but the location (Tampa) showed as down - and had been for some time. After some time, Tampa came back up on the public status page, but my container was still unreachable. A little while later, I was back online. When I looked back at the public status page later on, Tampa only showed as having been off for 6 minutes.

Posted by ZKuJoe, 06-06-2013, 12:36 PM
This is indeed very confusing. There are a lot of things that were experienced that do not add up.

It looks like only some of our IPs were offline, which makes no sense if the problem was a routing loop like the one we experienced. I'm still trying to piece everything together and hope to have a timeline of the events so I can best analyze what happened. Unfortunately I woke up after it was already fixed, so I am working from log files and a second-hand account from my partner, whose primary goal was resolving the problem rather than finding the cause. If the downtime was at the 45 minute mark, I completely agree with him.

Posted by DotVPS-J, 06-07-2013, 09:23 AM
Nodeping: Host Secure Dragon : PING is back up after 45 minutes of downtime as of Thu Jun 06 2013 12:53:27 GMT+1.

OnePoundWebHosting server monitoring:
Up Alert
Date/Time: Jun 6 2013 12:53:06 PM BST
Check Name: Securedragon
Hostname/URL: 199.167.30.xx
Type: PING
Went down at: Jun 6 2013 12:08:16 PM BST
Was down for 45 minute(s)

Posted by ZKuJoe, 06-07-2013, 10:12 PM
Thanks. It looks like we had a problem with our routers going split brain. I'm not sure how that's possible with VRRP and 2 different connections to each router, but we're hoping that adding a 3rd cross-connect between the routers will prevent it in the future. I've also added a watchdog script on the primary router to reboot itself if it loses connection to the WAN and LAN interfaces, although in the event of a split brain situation it won't help much.

What's weird is that I'm pretty sure our data center's network is designed for failover and not load-balancing so only one link should ever be active at a time.

The fact that some of our IPs experienced only ~5 minutes of downtime while others experienced 45 minutes tells me there is more going on than the logs are telling us.
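
For readers curious what a watchdog like the one mentioned above might look like, here is a rough sketch. It is not the script actually running on the router. Everything in it is an assumption made for illustration: the `Watchdog` class, the `ping_ok` helper, the three-failure threshold, and the reading that a reboot is triggered only when both the WAN and LAN probes fail together. The `ping` flags are the Linux iputils ones.

```python
import subprocess
from typing import Callable


def ping_ok(host: str, timeout_s: int = 2) -> bool:
    """Return True if a single ICMP echo to host succeeds (Linux ping flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class Watchdog:
    """Counts consecutive rounds in which both WAN and LAN probes fail."""

    def __init__(
        self,
        wan_target: str,
        lan_target: str,
        max_failures: int = 3,  # assumed threshold, not from the thread
        probe: Callable[[str], bool] = ping_ok,
    ) -> None:
        self.wan_target = wan_target
        self.lan_target = lan_target
        self.max_failures = max_failures
        self.probe = probe
        self.failures = 0

    def check(self) -> bool:
        """Run one probe round; return True once a reboot is warranted."""
        wan_up = self.probe(self.wan_target)
        lan_up = self.probe(self.lan_target)
        if not wan_up and not lan_up:
            self.failures += 1
        else:
            # Any successful probe resets the counter, so a brief blip
            # on one side never triggers a reboot by itself.
            self.failures = 0
        return self.failures >= self.max_failures
```

In use, something like `Watchdog("192.0.2.1", "198.51.100.1")` (placeholder documentation addresses) would have `check()` called on a timer, with the reboot command issued only when it returns True. Requiring several consecutive failures avoids rebooting on a single dropped ping. As noted above, this does nothing for a split brain, where each router can still see its own links.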
