Enough Downtime!!! 3rd time in 2 months!



Necrosaro420
06-08-2007, 06:24 PM
For the third time in 2 months, all of my sites went down today and stayed down for a while. Last time it lasted 5 hours or so.

Is there going to be much more of this? I need for my sites to be up.

And today they were down for a few hours

admin
06-08-2007, 09:46 PM
Hours?

We haven't had any multi-hour outages ... 5 minutes MAYBE

http://www.hostpc.com/uptime

If you're experiencing multi-hour outages, we need to check your DNS - open a helpdesk ticket

tonydi
06-09-2007, 03:49 AM
I've found that the monitoring service isn't always an accurate representation of the true status of a server. For instance, www58 was inaccessible (no web, ftp or email access) on Friday for at least 30 mins and probably more, but the monitor shows only 7 mins of downtime for the whole month so far.

admin
06-09-2007, 06:11 AM
Are you saying the Hyperspin network isn't reporting true downtime?

http://hyperspin.com/network.php

They monitor from 9 locations, on 20 different networks:

Seattle, Washington
San Jose, California
Dallas, Texas
Ashburn, Virginia
Chicago, Illinois
Maidenhead, Berkshire, UK
Amsterdam, Netherlands
Singapore
Brisbane, Queensland, Australia



Now, as for www58, there was a downtime issue yesterday. I was out of town, and my techs contacted me to report the load was very high - and the server needed to be rebooted. While I don't need to "authorize" a reboot, they wanted me to be aware. Once I returned last night, I found the client's script that caused the high load, and disabled it (spam exploited script).

Other than that, I've never seen a discrepancy in the Hyperspin monitoring - and if there was, I'm not sure we could find any other reporting that would have monitored it better than 9 locations and 20+ networks.

tonydi
06-09-2007, 04:09 PM
I'm saying that what Hyperspin considers downtime and what a user considers downtime can be different.

The www58 server was still technically "up" and whatever process Hyperspin uses to gauge that must have still functioned. So to Hyperspin, things were just peachy. But to anyone wanting to access a web site, do ftp or send/receive email from a domain hosted on 58, it was essentially down. That, unfortunately, also included hostpc.com and these forums!

To be clear, I'm not bustin' your chops over this, Joe. System outages are unavoidable and thankfully they are pretty rare here at HostPC. I'm just saying that based on past experience I tend to believe what my browser/email client/ftp client tells me about the status of a site here rather than what Hyperspin does. Also, since Hyperspin appears to only monitor HTTP Port 80, it's never helpful when I notice a problem with email access.
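
For anyone who wants to see the difference between a port-80-only check and what a user actually experiences, here is a minimal Python sketch that probes several service ports on one host. The port list (HTTP 80, FTP 21, SMTP 25, IMAP 143) and the function names are assumptions for illustration only; this is not a description of how Hyperspin performs its checks.

```python
import socket

# Assumed service ports for illustration; not Hyperspin's actual check list.
SERVICE_PORTS = {"http": 80, "ftp": 21, "smtp": 25, "imap": 143}

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_services(host, ports=SERVICE_PORTS, probe=port_open):
    """Map each service name to True (reachable) or False (not reachable)."""
    return {name: probe(host, port) for name, port in ports.items()}

def summarize(results):
    """Report 'up' only if every service answered, else list the failures."""
    failed = sorted(name for name, ok in results.items() if not ok)
    return "up" if not failed else "degraded: " + ", ".join(failed)

# Usage (host name is a placeholder):
#   print(summarize(check_services("www58.example.com")))
```

Even this only shows that a port accepts connections. A heavily loaded server can accept a connection and still time out before answering, so a successful probe is a weaker signal than a page actually loading.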

admin
06-09-2007, 07:25 PM
True, the public report only monitors port 80 - which should be an indicator of whether websites are up and accessible - but the private report also monitors email and alerts me and all techs to email issues.

Part of the www58 issue is users have HUGE IMAP boxes - that REALLY need to be cleared out. One site in particular has like 10 active IMAP connections 24x7 - and there's several hundred MB in each mailbox :(

time to send out some email on that subject I guess.

Geologyrox
06-12-2007, 09:14 AM
I've also noticed that the uptime given via hyperspin does not match up with my actual experience. 89 has been down via my web browser, ftp, and e-mail client, but hyperspin says it's plugging along just fine.

I wonder if I'm part of the problem, since I use the IMAP & catchall addresses and noticed the insane number of messages I downloaded after my most recent reformat. I think I set all the old stuff to delete at that time, but I'll check just to make sure.

admin
06-12-2007, 11:11 AM
I just found a Major problem on www89 - a user set up a crontab and obviously doesn't know how to do it

* * * * * perl awstats.pl

The server is trying to run awstats CONSTANTLY for that domain. Every minute, every hour, every day.
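
To put a number on "CONSTANTLY", here is a rough Python sketch that counts how many times a day a cron schedule fires, based on its minute and hour fields. It only understands the simple field forms named in the docstring, and the once-a-day schedule `0 3 * * *` used for comparison is an illustrative assumption, not a HostPC recommendation.

```python
def expand_field(field, low, high):
    """Expand one cron field ('*', '*/n', 'a-b', 'a-b/n', 'a,b', or 'n')."""
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = low, high
        elif "-" in part:
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values

def runs_per_day(schedule):
    """Count how often a 5-field cron schedule fires on a day it matches."""
    minute, hour = schedule.split()[:2]
    return len(expand_field(minute, 0, 59)) * len(expand_field(hour, 0, 23))

print(runs_per_day("* * * * *"))   # the schedule from the crontab above
print(runs_per_day("0 3 * * *"))   # assumed alternative: once a day at 3:00
```

The first schedule matches all 60 minutes of all 24 hours, so it fires 1440 times a day, while the once-a-day schedule fires a single time.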

As soon as that user was modified, the problem appears to have cleared right up.

1. Catchall should be set to Fail - please be sure your account, as well as your users, are all set to FAIL. Yes, FAIL.

champion6
06-12-2007, 01:54 PM
I thought a user-installed copy of AWStats is forbidden - that HostPC would provide an installation of AWStats from a link on the DA control panel.

rmcb5
06-14-2007, 04:19 PM
Not trying to add anything negative here, but www37 was just down for 20+ minutes (no biggie). Hyperspin updates every 5 minutes, so it obviously didn't report it being down until a few minutes into the downtime. However, while watching it like a hawk for details on my sites being accessible again, I noticed it stopped updating at 3:51pm, though the rest of the servers continued updating at 5-minute intervals. That means at 4:04pm ET the report showed the next update was scheduled for 3:56pm, even though we were 8 minutes beyond that (I have a screenshot). Now the report shows:

www37 HTTP port 80 (http://www.hyperspin.com/publicreport/55351/5259/3/46541) | UP | 04:06 PM | 04:11 PM | 5 min | 99.949% | 10min | 1
But the server was down for more than 20 mins. There seems to be a 10 minute window it did not check. If this service is that reliable, and there are so many locations checking, why would it just stop for ten minutes? Beyond that it shows inaccurate data for downtime.
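
As a rough check on those numbers, here is a small Python sketch of the uptime arithmetic. It assumes the report covers the month to date (June 1 to about 4:00pm on June 14); that window is a guess for illustration, not something the report states.

```python
def uptime_percent(window_minutes, down_minutes):
    """Uptime as a percentage of the reporting window."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    return 100.0 * (window_minutes - down_minutes) / window_minutes

# Assumption: the report covers the month to date, June 1 00:00 to about
# 4:00pm on June 14, i.e. 13 days and 16 hours.
WINDOW = 13 * 24 * 60 + 16 * 60  # 19680 minutes

reported = uptime_percent(WINDOW, 10)  # the 10 minutes the report shows
observed = uptime_percent(WINDOW, 20)  # the 20 minutes of downtime observed
print(f"{reported:.3f}% vs {observed:.3f}%")
```

Under that assumption, 10 minutes of downtime works out to about 99.949%, which matches the figure in the report, while 20 minutes would come to about 99.898%.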

Now, I am not bitching about the downtime, 20 minutes is tiny and HostPC was very responsive to getting the server back up and running. In fact www37 has been a great server and this is surely just some little fluke. Hell we had a brief downtime last month too.

So don't take this as a complaint or a questioning of HostPC and its services. It's more that I happened upon this thread and my site went down (coincidentally). It was all just a timing thing, so I thought I would report what I witnessed, nothing more, nothing less.

:love

admin
06-14-2007, 10:13 PM
Hmm

Our internal log shows

<HostPC> HOSTPC www37 (72.35.81.203) HTTP (Port 80) is DOWN: HOSTPC Err: invalid HTTP response (first line of response: "")
<HostPC> HOSTPC www37 (72.35.81.203) HTTP (Port 80) is DOWN: HOSTPC Err: response timeout
<HostPC> HOSTPC www37 (72.35.81.203) HTTP (Port 80) is DOWN: HOSTPC Err: cannot connect to port 80
<HostPC> HOSTPC www37 (72.35.81.203) HTTP (Port 80) is UP: HOSTPC Was down for: 20min

I'll bring this up to Hyperspin ... maybe they can shed some light on it.

Necrosaro420
07-22-2007, 05:16 PM
I had another 20-30 minutes or so of downtime last night. No connection to any of my sites, including hostpc.com ..... all others worked, and it was not just me; I had a few people from around the country who could not connect either.

admin
07-22-2007, 06:15 PM
Did you open a ticket with this issue?

The only server we were doing maintenance on was www300 .. I'm not aware of any downtime, and I was here till about 4am ET

Necrosaro420
07-22-2007, 07:09 PM
No, no ticket... It wouldn't even let you get to hostpc.com itself. This happened around 10:30pm EST last night for roughly 20-30 minutes.

app-o-rama.com
07-22-2007, 09:05 PM
According to my Clicky (http://getclicky.com/4680)* logs I had people visiting my site on a HostPC server at 9:26pm, 9:32pm, 9:34pm, 9:35pm, 9:36pm, 9:42pm, 9:49pm (all central time).

* This is a referral link. Remove the last four digits if you don't like the referral being there.

admin
07-22-2007, 09:15 PM
I don't see _any_ reason for the site to be inaccessible during that time .. I was right here, and my mail syncs from hostpc every 2 minutes - maybe you had a routing issue?

The helpdesk is on a server that's not related to hostpc - and will soon be on its own dedicated server, with a backup in another network so that type of issue won't happen again. Be sure you bookmark support.yareo.net

_IF_ that happens again, PLEASE get a traceroute during the issue and either email me the results, or paste them here as soon as you can so we can investigate.
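
One way to have that ready ahead of time is a small script that runs the trace and stamps it with the time. This is only a sketch: the function names are made up for illustration, and it assumes the standard `traceroute` tool (or `tracert` on Windows) is installed on the machine.

```python
import platform
import subprocess
from datetime import datetime, timezone

def traceroute_command(host, system=None):
    """Build the traceroute command for this OS ('tracert' on Windows)."""
    system = system or platform.system()
    tool = "tracert" if system == "Windows" else "traceroute"
    return [tool, host]

def capture_traceroute(host, timeout=120):
    """Run a traceroute and return its output with a UTC timestamp header."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    try:
        result = subprocess.run(
            traceroute_command(host),
            capture_output=True, text=True, timeout=timeout,
        )
        body = result.stdout or result.stderr
    except FileNotFoundError:
        body = "traceroute tool not found on this machine"
    except subprocess.TimeoutExpired:
        body = "traceroute did not finish within the timeout"
    return f"Traceroute to {host} at {stamp}\n{body}"

# Usage during an outage:
#   print(capture_traceroute("hostpc.com"))
```

The timestamp matters because it lets the trace be lined up against the server-side logs afterwards.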

dbmasters
07-22-2007, 10:32 PM
I was actually on my HostPC server AND dealing with a support ticket late last night and early this morning and it was all functional for me, I didn't get offline until 4:30 AM or so...was on all evening troubleshooting code issues.

Necrosaro420
07-23-2007, 02:46 PM
Not sure. I am on the East Coast and was unable to access. I had 2 people from KY, 1 from Tennessee, 1 from Michigan and 2 from California who were unable to connect. That was to all of my sites, as well as hostpc.com itself.

It does this several times a month. It's very aggravating, as I never had this problem before I switched to hostpc.

starfighter
07-23-2007, 10:59 PM
This is just one very happy customer's opinion, but you probably want to put in for a move to the new server bank. They're much nicer, and are running circles around the old servers. They're making everyone really happy, from what I've heard from Joe. Just my 2 cents.