PowerVPS / Virtacore hosting down for 80hrs and counting
Posted by kish, 07-10-2012, 04:43 AM |
The VPS I host with PowerVPS (now Virtacore) has been down since Friday evening (6 July). Apparently a couple of their nodes failed and had to be rebuilt. Since the evening of 7 July, they've been telling me that they're trying to restore from an R1Soft backup, which has been running ever since and is apparently taking forever.
I've been a customer for nearly 7 years and haven't had any serious problems with them. But this is something else entirely. They aren't able to tell me how long the restore will take or what contingency plans they have to get us back online quickly. Support keep repeating the same boilerplate messages - "we're working on it" and "we're monitoring the situation". Their status page has gone offline too, and they tell me that is unrelated =/
A real nightmare situation. Has anyone had a similar experience with them, or another company? How did it work out? Any tips? |
Posted by NYCServers-Nick, 07-10-2012, 05:11 AM |
Do you have a backup of your files saved locally to your PC or another server? If so, why not purchase another VPS at least temporarily just to get your site back up. |
Posted by kish, 07-10-2012, 05:14 AM |
it's not just one site. it's a couple hundred customer sites.
The backup is part of the service we have from powervps, but we don't have direct access to it ourselves. They've always been so responsive in the past - this is just out of character. |
Posted by morrisonhosting, 07-10-2012, 12:33 PM |
Quote:
Originally Posted by kish
it's not just one site. it's a couple hundred customer sites.
The backup is part of the service we have from powervps, but we don't have direct access to it ourselves. They've always been so responsive in the past - this is just out of character.
|
Sometimes when things go bad, they really go bad. Just hang in there. If they have full backups of the servers then it's just watching paint dry until it's done.
-Tyler Morrison |
Posted by kish, 07-10-2012, 12:35 PM |
thanks Tyler. This is certainly a bad one. Not much else we can do other than wait. Not sure that this amount of time is normal though |
Posted by morrisonhosting, 07-10-2012, 12:37 PM |
Quote:
Originally Posted by kish
thanks Tyler. This is certainly a bad one. Not much else we can do other than wait. Not sure that this amount of time is normal though
|
That entirely depends on the size of the backup and the network speed. It could take a very long time.
-Tyler Morrison |
Posted by 48-14, 07-10-2012, 12:41 PM |
7 years is a good run to not have such an issue. Trust me I've seen some hosts that go through this every 2 months.
I would say if this is their first time going through this, they're most likely in the same panic mode as you, fixing everything. They most likely turned their chat off (which is why it's unrelated) due to everyone coming on asking the same thing over and over. Answering everyone would turn a 5 minute task into an hour.
Backups do take time to transfer and restore. If it were a few accounts, maybe a few hours, but if it's hundreds, there's no exact time frame. The right backup system can move a lot of data quickly between servers, but there is also manual work involved in making sure everything is restored properly.
Your best option, if you have backups like NYCServers-Nick mentioned, is to find another VPS and set up your sites there until your host resolves everything. |
Posted by kish, 07-12-2012, 01:33 PM |
We're approaching 6 days offline now. The restore is still in progress and actually getting slower.
:-( |
Posted by HmH-Melissa, 07-12-2012, 02:33 PM |
Kish,
Is there a number you may call for support or a live chat?
Six days is a very long time to be down, even if there are backups that are being restored. |
Posted by kish, 07-13-2012, 04:51 AM |
Yes. I have spoken to them by phone. They apologised and have basically said that there's nothing more they can do to make this go any faster. It will take as long as it takes.
Such a painful process :-( |
Posted by awaken, 07-14-2012, 03:30 PM |
Hi Kish,
My company was also affected by this. We have four "Force" servers with PowerVPS, two of which are on the failed node and are still down almost 8 days later.
We had a little over 200 websites on those two servers. Luckily, we keep nightly cpanel account backups on Dropbox. I was able to get another VPS provider to provision me a server last Sunday after PowerVPS support told me that it would be at least another 72 hours before the servers were back up and refused to provision me replacements.
It was an all-nighter, but we were fully operational Monday morning excluding a couple of sites that host their DNS outside of our infrastructure.
A PowerVPS support tech told me via email that the backup they needed to restore was only 1.1TB. Seems like 6 days is a long time to restore that amount of data, but I don't know much about their infrastructure.
Apart from the clear failure to have a reasonable restoration process, communication has been equally poor. The only communication I've had with them is when I initiate it. The web link given out for updates stopped working last Sunday and they haven't bothered to fix it.
What really bums me out is that I liked and recommended them for almost 5 years. Their VPSs are fast compared with others I've tried. We've had very good uptime before this. My other two servers haven't been restarted in almost a year. The only other time I've had to contact their support is when their billing system failed to allocate one of my payments resulting in a VPS being taken offline. They were able to resolve that issue within a few hours.
I emailed support asking how they intended to compensate for the downtime. I haven't received a response, but it hasn't been long since I sent it. I'm now evaluating other providers as a long term solution. I'm not looking forward to configuring more servers and moving another 200 sites.
Best of luck Kish. I hope they are able to restore your server long enough for you to retrieve your account data.
-Ken |
Posted by Greg-HostsVault, 07-14-2012, 04:46 PM |
Wow that is awful that it's been down for so long. I hope that the company learned from this and will improve their recovery process for future issues.
Ken does make a good point that having an outside backup provider makes a world of difference in a worst-case scenario like this.
When you do get back online, I'd be interested to see what powervps has to say about the cause of the delay.
I hope they get back online soon! |
Posted by DracH, 07-14-2012, 07:47 PM |
Any update Kish? Did you get your backup and websites online? |
Posted by gasyoun, 07-14-2012, 10:39 PM |
Quote:
Originally Posted by awaken
backup they needed to restore was only 1.1TB. Seems like 6 days is a long time to restore that amount of data, but I don't know much about their infrastructure.
-Ken
|
Ken, do you back up to Dropbox with a cron job? 1.1 TB is not small either |
Posted by awaken, 07-15-2012, 10:13 AM |
Quote:
Originally Posted by gasyoun
Ken, do you back up to Dropbox with a cron job?
|
The short answer is yes, I use the WHM backup tool which is invoked by cron. I leave dropbox running under an unprivileged account.
Here is how I set up my backups:
- Set up an unprivileged account, i.e.
- "su dropbox" and install Dropbox for Linux
- Instruct WHM to place account backups in the Dropbox folder, i.e. /home/dropbox/Dropbox/<servername>/
- Modify or create /scripts/postcpbackup to change the ownership of files in the Dropbox folders since they will be set to root by the WHM backup tool, i.e.
Code:
chown -Rf dropbox:dropbox /home/dropbox/Dropbox/*
- We store all of our server backups under the same dropbox account, so I have to exclude folders on each server so the backups from other servers don't sync and eat up all the disk space.
Code:
./dropbox.py exclude add /home/dropbox/Dropbox/<server_folder_to_exclude>
One note is that since some of my servers are running an older version of Python, I had to set up a newer Python version under the dropbox account in order to use the dropbox.py script. I ran into this on my CentOS 5 servers, but not on my new CentOS 6 server.
Overall, this is working well and gives us full account history in Dropbox. The major downside is that we're storing upwards of 50GB of backups on each of our servers.
Since we have 200GB servers, I add accounts until they take up around 100GB. Then with the 50GB of backups, we are left with 25% free disk space. I receive a warning when the disk space hits 80% of capacity and provision a new server at that time.
-Ken |
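The ownership step Ken describes could be wrapped into a small hook along these lines. This is only a sketch, assuming a POSIX shell and a `df` that supports `-P`; the function names (`fix_backup_ownership`, `disk_used_pct`, `disk_over_threshold`) are made up for illustration and are not part of WHM or Dropbox. The path and account in the commented-out wiring come from the post, and the 80 mirrors the warning level Ken mentions.

```shell
#!/bin/sh
# Sketch of a /scripts/postcpbackup hook (hypothetical helper names).

# Recursively hand a backup directory to the unprivileged sync account.
# $1 = directory, $2 = owner in chown syntax (user:group or uid:gid).
fix_backup_ownership() {
    dir=$1
    owner=$2
    if [ ! -d "$dir" ]; then
        echo "backup directory not found: $dir" >&2
        return 1
    fi
    chown -R "$owner" "$dir"
}

# Print the used percentage (integer, no % sign) of the filesystem holding $1.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage has reached the threshold ($2, in percent).
disk_over_threshold() {
    [ "$(disk_used_pct "$1")" -ge "$2" ]
}

# Example wiring, left commented out so the sketch has no side effects.
# fix_backup_ownership /home/dropbox/Dropbox dropbox:dropbox
# disk_over_threshold /home 80 && echo "disk usage at or above 80%" >&2
```

Keeping the steps as functions makes it easy to try them against a scratch directory before pointing them at the real backup folder.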
Posted by awaken, 07-15-2012, 10:32 AM |
Quote:
Originally Posted by gasyoun
1.1 TB is not small either
|
Perhaps not, but I was able to restore nearly 150GB worth of cPanel accounts from Dropbox in one night using what I'm guessing are much less sophisticated means than PowerVPS has available to them via R1Soft.
-Ken |
Posted by awaken, 07-15-2012, 10:39 AM |
Quote:
Originally Posted by DracH
Any update Kish? Did you get your backup and websites online?
|
I can't speak for Kish, but our servers are still down. It's now been 9 days since the node failed and 7 days since they started the restore process.
I sent them a support request asking how they plan to compensate us for the downtime. Here is the response:
Quote:
Hello Ken,
Firstly, our sincere apologies for the downtime involved.
I'll escalate this ticket to our support manager for further assistance. We will get back to you as soon as possible.
Thank you.
Best Regards,
Prasad S
Technical Support Analyst
Virtacore Systems/PowerVPS Support
|
I'll update when I hear more.
-Ken |
Posted by awaken, 07-16-2012, 04:35 PM |
PowerVPS has just brought the repaired node back online. The total downtime was approx. 9 days, 20 hours.
I received this email from them about 10 minutes ago:
Quote:
Hello.
First I wanted to say Thank You regarding your patience regarding the full restore process of your vps container. We had a complete raid failure several days ago and all data was lost and due to the nature of the force containers there was a sheer amount of data that needed restored hence why it took several days for a complete restore process. What we did to resolve the issue was replaced the hw node raid controller to elminate another hardware failure and we are also are going to be looking into other backup solutions other than r1 to ensure backups complete in a more faster time table than what we saw.
Again we apologize for the inconvenience and we appreciate your continued support
Any questions regarding this matter please email us @ support@powervps.com
Thank You
|
|
Posted by Jawany, 07-16-2012, 07:28 PM |
I thought I was alone! It's been 10 days now. Has anyone got their websites back online? |
Posted by ewelin, 07-16-2012, 08:52 PM |
Our node is still down :-( |
Posted by DracH, 07-16-2012, 08:54 PM |
You guys really need to change provider... |
Posted by ewelin, 07-16-2012, 08:56 PM |
First major issue I've had with them in over 7 years. I'll see how they handle it once it's back up and how they compensate me. |
Posted by DracH, 07-16-2012, 09:00 PM |
Quote:
Originally Posted by ewelin
First major issue I've had with them in over 7 years. I'll see how they handle it once it's back up and how they compensate me.
|
You know best what's good for you. But waiting 10 days for services to come back online is too much. |
Posted by Jawany, 07-18-2012, 02:53 AM |
I had cPanel backups on the same VPS and they restored them for me; the websites are online now. |
Posted by Jonchun, 07-18-2012, 03:01 AM |
The compensation better be good and you better make sure it doesn't happen again... 10 days was way too long to wait. I'm surprised you didn't switch after 3. |
Posted by Jawany, 07-18-2012, 03:07 AM |
I was going to switch after 12 hours but I didn't have any backups! I'm gonna send them an email and see what they come up with for the compensation, cuz I'm leaving for another provider next month! |
Posted by kish, 07-21-2012, 07:22 AM |
A quick update from me:
back online since yesterday... so down on 6th July 2012 and restored on 20th July 2012.
I'm trying to download individual cpanel backups for my own peace of mind... it's a long job though with hundreds of websites to go through :-( |
Posted by hostingvince, 07-21-2012, 11:46 AM |
Well, it seems they haven't learnt from the issues at the start of this thread, as the same issue is happening now on another node that I am on.
Down 24hrs due to RAID failure - again - and apparently restoring 'metal to metal' as it's faster but will still take 3 more days!
Not looking forward to my clients' anger :-( |
Posted by trinivps, 07-21-2012, 02:46 PM |
I also have a couple servers with Powervps. This delay in restoring is completely unacceptable. Would other providers have this same problem if they have a RAID failure? I also have separate servers with site5 and I will ask them what their procedure is like if there is a RAID failure.
Something is off with this entire scenario.
I am also very disappointed that they (powervps) cannot give me access to my backups. It seems my clients' sites will be down for the entire coming week. |
Posted by 48-14, 07-21-2012, 05:00 PM |
Quote:
Originally Posted by trinivps
I also have a couple servers with Powervps. This delay in restoring is completely unacceptable. Would other providers have this same problem if they have a RAID failure? I also have separate servers with site5 and I will ask them what their procedure is like if there is a RAID failure.
Something is off with this entire scenario.
I am also very disappointed that they (powervps) cannot give me access to my backups. It seems my clients' sites will be down for the entire coming week.
|
Replace the drive, rebuild and secure it, move all the accounts back on. I could easily say about a full 24 hours to do this whole process, but I don't know how many accounts they had on that server... but either way it should not take a month.
And to think in another discussion, one host admitted to only doing backups once a month because "his" system is not able to be backed up. Imagine if that host were in this scenario.
Who knows if they learned their lesson, but to everyone else... backup backup backup. You never know. Even some of the hosts that have good recommendations on here can have serious moments. |
Posted by Ronald_Craft, 07-21-2012, 06:27 PM |
I have experience with bare metal restores via R1Soft, and restoring an entire server is generally a nightmare. First of all, I'm sure they store the backups on separate backup servers - so when they're restoring the data, the major bottleneck is the link speed between the servers. If it's only 10 Mbps or 100 Mbps, that's going to cause a major slowdown right there, since the network will only be able to transfer so much data at a time. For a bare metal restore you should be going with a 1 Gbps link between the servers, at least for the duration of the restore process.
Next, because of the way R1soft backs up and restores data this leads to further slowness. A server of 1 TB or more can take up to 2 - 3 days to restore in its entirety with a full bare metal restore.
The 9+ days it took them to restore this data is definitely not right, as it should not have taken that long to restore a single node. I'm guessing they were bottlenecked by something, the restore failed during the process (at least once), or they had much more than 1.1 TB of data to restore. Simply put, that restore shouldn't have taken that long.
While these kinds of issues can and do happen and a RAID failure is simply unavoidable, I do wonder if they have any kind of internal monitoring of their RAID arrays. If a drive fails out of the array they should be immediately aware of it and have that drive replaced before it leads to, well, this.
I'd be interested in hearing what happened from their perspective to see why it took that long to restore a single node. |
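One rough way to sanity-check the durations discussed here is to work backwards from the reported backup size to the average throughput a given restore time implies. The sketch assumes the 1.1 TB figure means 1.1 × 10^12 bytes and that the whole period was spent transferring data, which may well not be the case; `implied_kbps` is a hypothetical helper name.

```shell
#!/bin/sh
# implied_kbps BYTES SECONDS
# Prints the average throughput in kbit/s (integer, truncated) needed to
# move BYTES within SECONDS.
implied_kbps() {
    echo $(( $1 * 8 / $2 / 1000 ))
}

BACKUP_BYTES=1100000000000   # assumption: 1.1 TB read as 1.1 * 10^12 bytes
DAY=86400                    # seconds per day

# 3 days is the upper estimate given in the post; 7 and 9 bracket the
# durations reported in this thread.
for days in 3 7 9; do
    echo "${days} days: $(implied_kbps "$BACKUP_BYTES" $(( days * DAY ))) kbit/s"
done
```

With these inputs a 7-day restore works out to roughly 14.5 Mbit/s on average, well under what a 100 Mbit/s link could carry. That would fit either a severe bottleneck or a restore that had to be restarted, though the numbers alone cannot say which.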
Posted by Sordell Media, 07-23-2012, 09:20 AM |
Same issues here, including the lacklustre responses from support. Already got an account with another host, conveniently my PowerVPS package is due for renewal at the end of the week.... just hope their 'metal' restore is done before then so I can get my data and cut my losses. |
Posted by Waddie, 07-25-2012, 07:06 PM |
Another node has also been down for 5 days now and they are still saying that it's restoring, but I don't understand how long it will take. Don't you think 5 days is more than enough time for any professional hosting provider to rebuild a RAID? |
Posted by trinivps, 07-25-2012, 07:09 PM |
24 hours should be a worst-case scenario. This is 2012, after all. |
Posted by morrisonhosting, 07-26-2012, 11:06 AM |
Quote:
Originally Posted by trinivps
24 hours should be a worst-case scenario. This is 2012, after all.
|
That is not true at all. At 100 Mbit/s full speed a 1TB transfer would take 1 day and 26 minutes. That's theoretical. Realistically, you are looking at a quarter to half a day at a minimum to get all the data back over a dedicated 1 Gbit connection. At 100 Mbit/s it could take a day or two to transfer. Add in the time it took them to get new hardware and replace everything and you have a disaster.
-Tyler Morrison |
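Tyler's figure can be reproduced with plain integer arithmetic. The sketch below assumes "1TB" means 2^40 bytes and that Mbit/s is decimal (10^6 bits per second). It ignores protocol overhead, disk speed, and the backup software's own processing, so it is a lower bound rather than a prediction. `transfer_time` is a hypothetical helper name.

```shell
#!/bin/sh
# transfer_time BYTES MBITS_PER_SEC
# Prints the ideal (zero-overhead) transfer time as "<d>d <h>h <m>m".
# Integer arithmetic only; partial minutes are truncated.
transfer_time() {
    bytes=$1
    mbps=$2
    secs=$(( $bytes * 8 / ($mbps * 1000000) ))
    days=$(( $secs / 86400 ))
    hours=$(( $secs % 86400 / 3600 ))
    mins=$(( $secs % 3600 / 60 ))
    echo "${days}d ${hours}h ${mins}m"
}

ONE_TIB=1099511627776   # assumption: "1TB" read as 2^40 bytes

for rate in 10 100 1000; do
    echo "${rate} Mbit/s: $(transfer_time "$ONE_TIB" "$rate")"
done
```

Under those assumptions the 100 Mbit/s line prints `1d 0h 26m`, matching the post, and the 1000 Mbit/s line prints `0d 2h 26m`.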
Posted by Waddie, 07-26-2012, 01:49 PM |
Is it different from the failure of a drive in a RAID?
Or is a RAID controller failure a different thing? |
Posted by Firm vps, 07-26-2012, 03:24 PM |
If you have a backup, then shift it to another company. |
Posted by morrisonhosting, 07-27-2012, 12:43 PM |
Quote:
Originally Posted by Firm vps
If you have a backup, then shift it to another company.
|
Some people that are waiting do not:
Quote:
Originally Posted by Jawany
I was going to switch after 12 hours but I didn't have any backups! I'm gonna send them an email and see what they come up with for the compensation, cuz I'm leaving for another provider next month!
|
-Tyler Morrison |
Posted by Waddie, 07-27-2012, 07:25 PM |
I am afraid that it's now coming up on one week and the rebuild is still going on... are they building Google's data? |
Posted by morrisonhosting, 07-30-2012, 12:16 PM |
Quote:
Originally Posted by Waddie
I am afraid that it's now coming up on one week and the rebuild is still going on... are they building Google's data?
|
Nah, probably restoring over dialup.
-Tyler Morrison |
Posted by nexxterra, 07-31-2012, 12:37 AM |
This sounds more like a financial issue than an equipment issue.
I have seen systems go down at the DC when the provider did not have the $$ to get a new router or other piece of equipment that was needed.
When we first moved into our datacenter with 60 servers, we went in at 10 am and left at 7pm that same night with everything ready to go online (yes that included ordering pizza). 80 hours seems like a bit much for just fixing a few pieces of equipment. |
Posted by morrisonhosting, 07-31-2012, 03:10 PM |
Quote:
Originally Posted by nexxterra
This sounds more like a financial issue than an equipment issue.
I have seen systems go down at the DC when the provider did not have the $$ to get a new router or other piece of equipment that was needed.
When we first moved into our datacenter with 60 servers, we went in at 10 am and left at 7pm that same night with everything ready to go online (yes that included ordering pizza). 80 hours seems like a bit much for just fixing a few pieces of equipment.
|
It's the data transfer that would take the time. Also, you probably could have used the heat from 60 servers to make your own pizza. That's way more satisfying!
-Tyler Morrison |
Posted by 48-14, 08-01-2012, 07:31 PM |
Quote:
Originally Posted by nexxterra
This sounds more like a financial issue than an equipment issue.
I have seen systems go down at the DC when the provider did not have the $$ to get a new router or other piece of equipment that was needed.
When we first moved into our datacenter with 60 servers, we went in at 10 am and left at 7pm that same night with everything ready to go online (yes that included ordering pizza). 80 hours seems like a bit much for just fixing a few pieces of equipment.
|
Sounds like the host year (name I forgot) that owed the DC between $20,000 and $30,000 in server fees, and kept giving excuses, then asked all the customers to pay a 60% emergency increase, then closed shop and ran. |