Packet Loss to Ft. Lauderdale area



admin
04-13-2007, 09:55 PM
We're experiencing approx 7-10% packet loss to our Ft Lauderdale datacenter.

The DC is aware of this, and is working on the issue - which is most likely a network issue beyond their control - but I will post updates tonight as they become available.

Joe

admin
04-13-2007, 10:57 PM
This has been narrowed down a bit. There's some majorly severe weather in the Dallas / Ft. Worth area, which is a MAJOR US internet hub. They've got some massive fiber outages as a result.

Packet loss should subside as the fibers are repaired, which I would assume would be overnight. Until then, the network and servers are up and running without issues.

The issue may be amplified depending on the area of the country you're in - or the trace that your connection takes to any particular datacenter.



From http://www.weather.com

Severe storms continue to cross northern and eastern sections of Texas into western Louisiana to southwest Arkansas through the overnight hours. Severe thunderstorms are capable of producing large hail, damaging winds and tornadoes.

There have been reports of tornadoes and severe thunderstorms with large hail that have produced damage all around north Texas, especially near Haltom City and Haskell, Texas.


Reports of 5 injuries and 1 death from a reported tornado have come in from Haltom City, Texas, a suburb north of Dallas.

Residents from eastern Texas across the deep South need to monitor and be ready to take cover tonight into Saturday from these dangerous thunderstorms.

On Saturday, the severe weather threat will move eastward and will include parts of Mississippi, Alabama and the western Florida Panhandle during the day and into Georgia and the rest of the Florida Panhandle late in the day and through the evening. Tornadoes are also possible with these storms, as well. Severe weather will then shift into the Carolinas for Saturday night and Sunday morning.


There is a wintry side of the storm as heavy snow continues through parts of Kansas. Parts of southwestern Kansas have already picked up nearly a foot of snow, along with gusty winds. Rain will change to snow in Kansas City overnight. Rain mixed with snow will fall on Saturday from central Illinois, through Indiana to north central Ohio. Snow will have a difficult time accumulating during the daylight hours.


Sunday will feature a major storm system along the Eastern Seaboard and that will produce heavy rain and strong winds along the coast from the Middle Atlantic to New England from Sunday through Monday. Interior areas of the Northeast, especially in the higher elevations, could pick up a considerable amount of snow before the storm winds down on Tuesday.

admin
04-14-2007, 09:46 AM
If you're still experiencing issues, please run a traceroute to your domain and post the results here.

Thank you
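For anyone who wants to check a trace before posting it, here is a minimal Python sketch that picks out the hops with lost probes (the `*` entries) from Windows `tracert` output. It assumes each hop sits on a single line, so re-join any wrapped lines first. The function names and the sample text are made up for illustration, and this is not a tool we use or provide.

```python
import re

# One Windows tracert hop line: hop number, three probe fields, then the host.
# Each probe field is "<n> ms", "<1 ms", or "*" (no reply).
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+\s+ms|\*)\s+){3})(.*)$")

def parse_tracert(text):
    """Return a list of (hop, lost_probes, worst_ms, host) tuples."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # header, footer, or anything that is not a hop line
        probes = m.group(2)
        times = [int(t) for t in re.findall(r"(\d+)\s+ms", probes)]
        worst = max(times) if times else None  # None when all probes were lost
        hops.append((int(m.group(1)), probes.count("*"), worst, m.group(3).strip()))
    return hops

def lossy_hops(text):
    """Return only the hops where at least one probe got no reply."""
    return [hop for hop in parse_tracert(text) if hop[1] > 0]

# Hypothetical sample in the same shape as tracert output.
SAMPLE = """Tracing route to example.com [192.0.2.1]
over a maximum of 30 hops:
  1     1 ms     1 ms     1 ms  192.168.2.1
 15   121 ms   143 ms   126 ms  g60.c02.bb.1vault.net [192.204.10.6]
 16   117 ms   118 ms     *     www98.hostpc.com [72.35.81.213]
Trace complete."""

if __name__ == "__main__":
    for hop, lost, worst, host in lossy_hops(SAMPLE):
        print(f"hop {hop}: {lost} of 3 probes lost, worst reply {worst} ms, {host}")
```

A single `*` on an intermediate hop is often just a router rate-limiting its replies, so loss that shows up at the final hop is usually the more useful thing to report.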

stuartf
04-14-2007, 10:10 AM
Access from UK (which is where 99% of our members are) not working for about 18 hours - timeout error on home page:

Description: Too much time has passed without sending any data for
document http://www.bikerwales.com/

Here's a tracert as requested above:

Tracing route to www.bikerwales.com [72.35.81.213]
over a maximum of 30 hops:
1 1 ms 1 ms 1 ms 192.168.2.1
2 9 ms 9 ms 21 ms 10.217.244.1
3 8 ms 11 ms 9 ms swan-t2cam1-b-v105.inet.ntl.com [80.0.254.149]
4 9 ms 20 ms 11 ms swan-t2core-b-ge-wan61.inet.ntl.com [195.182.176.9]
5 14 ms 12 ms 13 ms win-bb-b-so-710-0.inet.ntl.com [62.253.187.241]
6 14 ms 14 ms 14 ms po9-0-0.brsbb1.Bristol.opentransit.net [193.251.254.161]
7 17 ms 29 ms 20 ms so-5-2-0-0.loncr5.London.opentransit.net [193.251.243.37]
8 15 ms 51 ms 14 ms p16-2-0-2.r22.londen03.uk.bb.gin.ntt.net [129.250.8.49]
9 15 ms 19 ms 16 ms xe-0-2-0.r23.londen03.uk.bb.gin.ntt.net [129.250.2.66]
10 89 ms 84 ms 85 ms p64-1-0-0.r20.nycmny01.us.bb.gin.ntt.net [129.250.3.254]
11 108 ms 99 ms 98 ms as-0.r21.asbnva01.us.bb.gin.ntt.net [129.250.2.9]
12 101 ms 91 ms 90 ms ae-0.r20.asbnva01.us.bb.gin.ntt.net [129.250.2.16]
13 119 ms 121 ms 117 ms as-1.r01.miamfl02.us.bb.gin.ntt.net [129.250.4.117]
14 119 ms 120 ms 117 ms so-1-2-0.r01.miamfl02.us.ce.gin.ntt.net [192.204.248.66]
15 121 ms 143 ms 126 ms g60.c02.bb.1vault.net [192.204.10.6]
16 117 ms 118 ms * www98.hostpc.com [72.35.81.213]
17 118 ms 116 ms 121 ms www98.hostpc.com [72.35.81.213]
Trace complete.

app-o-rama.com
04-14-2007, 10:15 AM
I put my traceroute info in my ticket (tid=273310).

MGrisafi
04-14-2007, 10:15 AM
Getting some major slowdown here in Pennsylvania also.


Tracing route to www.treyalexander.net [72.35.81.198]
over a maximum of 30 hops:

1 11 ms 9 ms 6 ms 10.229.80.97
2 14 ms 8 ms 8 ms gateway-g3-1-250-ephblocal1.eph.ptd.net [216.144.187.222]
3 9 ms 9 ms 9 ms gateway2-atm4-0-0-305-sm22eph.sm.ptd.net [207.44.124.197]
4 19 ms 14 ms 12 ms POS4-1.GW2.PHL6.ALTER.NET [157.130.43.33]
5 17 ms 14 ms 12 ms 0.so-4-2-0.XL2.PHL6.ALTER.NET [152.63.37.178]
6 17 ms 14 ms 15 ms 0.so-4-0-2.XT2.DCA5.ALTER.NET [152.63.42.245]
7 13 ms 16 ms 14 ms 0.so-7-0-0.BR1.DCA5.ALTER.NET [152.63.43.177]
8 15 ms 21 ms 14 ms 204.255.169.46
9 29 ms 14 ms 31 ms dcx-core-01.inet.qwest.net [205.171.251.33]
10 48 ms 50 ms 50 ms tpa-core-01.inet.qwest.net [67.14.3.2]
11 53 ms 55 ms 56 ms nap-edge-01.inet.qwest.net [205.171.27.50]
12 55 ms 53 ms 54 ms 72.164.248.26
13 56 ms 56 ms 56 ms g60.c02.bb.1vault.net [192.204.10.6]
14 56 ms 56 ms 55 ms www35.hostpc.com [72.35.81.198]

Trace complete.

rmcb5
04-14-2007, 10:28 AM
Still having lots of problems, I can't even retrieve my email as it times out.

Tracing route to www.rmcb5.com [72.35.81.203]
over a maximum of 30 hops:

1 * * * Request timed out.
2 10 ms * 25 ms ge-2-4-ur01.littleton.co.denver.comcast.net [68.86.105.89]
3 12 ms 12 ms 10 ms te-8-2-ar01.denver.co.denver.comcast.net [68.86.103.33]
4 10 ms 16 ms 12 ms 68.86.103.150
5 20 ms 13 ms 13 ms 12.124.158.17
6 38 ms 38 ms 38 ms tbr1-p013302.dvmco.ip.att.net [12.123.207.162]
7 41 ms 41 ms 41 ms tbr2-cl31.sffca.ip.att.net [12.122.12.133]
8 36 ms 41 ms 41 ms ggr3-ge110.sffca.ip.att.net [12.122.82.169]
9 41 ms 42 ms 43 ms 192.205.33.178
10 43 ms 38 ms 44 ms ae-0.r21.snjsca04.us.bb.gin.ntt.net [129.250.2.97]
11 61 ms 61 ms 59 ms p64-1-0-0.r21.dllstx09.us.bb.gin.ntt.net [129.250.3.153]
12 72 ms 62 ms 59 ms ae-0.r20.dllstx09.us.bb.gin.ntt.net [129.250.2.58]
13 88 ms 88 ms 88 ms as-1.r00.miamfl02.us.bb.gin.ntt.net [129.250.5.13]
14 89 ms 89 ms 88 ms 129.250.12.90
15 99 ms 93 ms 105 ms 192.204.10.22
16 * 90 ms 91 ms www37.hostpc.com [72.35.81.203]

Trace complete.

champion6
04-14-2007, 10:49 AM
Very slow from central Illinois.

Tracing route to mikenmag.com [72.35.81.218]
over a maximum of 30 hops:
1 7 ms 8 ms 6 ms 74-136-192-1.dhcp.insightbb.com [74.136.192.1]
2 10 ms 7 ms 6 ms 74-134-3-61.dhcp.insightbb.com [74.134.3.61]
3 7 ms 6 ms 7 ms 74.128.8.225
4 11 ms 11 ms 11 ms sl-gw36-chi-6-0.sprintlink.net [144.228.154.197]
5 12 ms 10 ms 11 ms sl-bb20-chi-5-0.sprintlink.net [144.232.26.69]
6 11 ms 15 ms 13 ms sl-st20-chi-12-0.sprintlink.net [144.232.8.219]
7 11 ms 11 ms 11 ms sl-st21-chi-1-0.sprintlink.net [144.232.8.103]
8 17 ms 12 ms 11 ms 144.232.8.222
9 12 ms 17 ms 10 ms xe-0-0-0.r21.chcgil09.us.bb.gin.ntt.net [129.250.2.238]
10 34 ms 35 ms 35 ms p64-2-2-0.r21.dllstx09.us.bb.gin.ntt.net [129.250.2.22]
11 36 ms 36 ms 34 ms ae-0.r20.dllstx09.us.bb.gin.ntt.net [129.250.2.58]
12 61 ms 61 ms 65 ms as-1.r00.miamfl02.us.bb.gin.ntt.net [129.250.5.13]
13 62 ms 61 ms 62 ms 129.250.12.90
14 65 ms 58 ms 53 ms 192.204.10.22
15 64 ms * 63 ms www99.hostpc.com [72.35.81.218]
Trace complete.

dbmasters
04-14-2007, 10:58 AM
I'm getting mostly request timeouts at some point or another in the traces...

admin
04-14-2007, 11:02 AM
We're trying a series of options. This issue appears to only be affecting one of our (many) racks. We've tried taking servers offline to see if the issue resolves - so far, it has not.

Right now, we're configuring a backup switch to swap in - it will rule out part of the hardware, but it looks like it's going to be trial and error for a bit. The folks are all "hands on" in the datacenter - we've got everyone working on it from there.

I'll post updates through the event as it unfolds.

Joe

admin
04-14-2007, 11:04 AM
This is only affecting the following servers (as far as we can tell)

www98
www97
www37
www35
www34
www33
www32
www98
www38
www47
www45
www95
www96
www93
www92
www91
www90

If you're on a server NOT listed above, I need to know that.

Joe

admin
04-14-2007, 11:23 AM
We're changing out the switch to this rack now.

dbmasters
04-14-2007, 11:26 AM
Since 98 is listed twice, I assume one of them is meant to be 99...I have a client with problems on 99, and another on 93.

Oh, and myself on 91. All three are having issues: tracerts timing out, etc.

MGrisafi
04-14-2007, 11:37 AM
I don't know what you guys did but whatever it was, it seemed to work. All my sites are back to their speedy old selves :)

admin
04-14-2007, 11:53 AM
We finished replacing the switch. The new one has packet storm control on it, which should bring the response speeds back up.

The core issue is still being investigated, as we're experiencing some significant packet discards.

PLEASE folks - make sure your software is updated - that includes WordPress which has had some serious compromises in the past month.

Joe

champion6
04-14-2007, 12:07 PM
I'm on www99 - see my tracert above. Performance has improved, but it's not back to normal.

champion6
04-14-2007, 12:15 PM
Minutes later and performance is MUCH BETTER - back to normal level, I'd say. Thanks guys!

stuartf
04-14-2007, 12:54 PM
Mine on 98 are back to normal if not a bit faster - please leave whatever you did where it is :)

Thanks guys :thumbUP

stuartf
04-14-2007, 01:21 PM
mmmm.. spoke too soon

slowing down a lot again now :eek:

jhorowitz
04-14-2007, 01:24 PM
seemed speedy for a bit, then back down to a crawl again right now...

Traceroute has started ...

traceroute to iiki.org (72.35.81.127), 64 hops max, 40 byte packets
1 dd-wrt (192.168.1.1) 2.187 ms 1.683 ms 3.424 ms
2 10.126.224.1 (10.126.224.1) 8.945 ms 7.080 ms 7.849 ms
3 pch15.50.tampflerl-rtr1.tampabay.rr.com (65.32.14.254) 26.448 ms 12.839 ms 11.687 ms
4 pos6-0-oc-192.tampflerl-rtr3.tampabay.rr.com (65.32.8.133) 11.505 ms 12.285 ms 12.307 ms
5 pop1-tby-p0-1.atdn.net (66.185.136.169) 12.614 ms 12.991 ms 11.925 ms
6 bb1-tby-p0-0.atdn.net (66.185.136.160) 13.194 ms 11.799 ms 15.925 ms
7 bb2-atm-p2-0.atdn.net (66.185.152.186) 25.899 ms 25.409 ms 25.883 ms
8 pop1-atm-p4-0.atdn.net (66.185.150.1) 23.650 ms 28.641 ms 28.950 ms
9 atl-brdr-04.inet.qwest.net (65.112.33.129) 43.787 ms 43.925 ms 47.178 ms
10 atl-core-01.inet.qwest.net (205.171.21.173) 47.439 ms 44.551 ms 43.315 ms
11 * * *
12 nap-edge-01.inet.qwest.net (205.171.27.46) 62.371 ms 68.082 ms 62.394 ms
13 72.164.248.26 (72.164.248.26) 42.984 ms 45.725 ms 43.835 ms
14 g60.c02.bb.1vault.net (192.204.10.6) 60.507 ms 59.093 ms 68.365 ms
15 www92.hostpc.com (72.35.81.127) 44.179 ms 42.945 ms 43.313 ms

app-o-rama.com
04-14-2007, 02:47 PM
I'm still experiencing problems too.

Nick
04-14-2007, 02:53 PM
haven't had any complaints about earlier .. and mail hasn't timed out

right now my sites are smokin .. page after page

Nick

BigBen
04-14-2007, 03:16 PM
I'm on www92 and it's still pretty slow, but better than it was last night. Having some trouble running a traceroute from behind my router right now otherwise I'd post one. I'm in Delaware on Comcast if that helps.

gergev
04-14-2007, 04:02 PM
I'm on 45 and my mail is timing out

admin
04-14-2007, 04:18 PM
We're still working on this - I've called in another network admin to help identify the problem that's exclusive to this rack.

Joe

rmcb5
04-14-2007, 05:22 PM
Thanks for working on this, any more progress? The site was slow but usable just a little bit ago, now it is at a crawl again.

The timing of this issue is terrible as it impacted one of our events we had today.

Infiltraitor
04-14-2007, 05:22 PM
Hmmmm.... so this is why my website is operating super slow? I was going to ask the Helpdesk about this, but luckily I saw this thread.

It's been slow and "timing out" since last night... when will the speeds go back to normal??

I am in PA on Comcast

admin
04-14-2007, 05:23 PM
Originally posted by rmcb5: "The timing of this issue is terrible as it impacted one of our events we had today."

I'll try to schedule issues at a better time in the future :rolleyes::(

We've got a team of 5 network engineers working on it. When the issue is found, it will be resolved.

rmcb5
04-14-2007, 05:54 PM
I wasn't insinuating that it's your fault, poo happens. Just frustrating that's all.

Thank you for the update. Remember, we are helpless and rely on you and the services of HostPC. :love

admin
04-14-2007, 06:09 PM
We have a couple of options. At this point we don't know where the issue is, but for whatever reason, hardware or software, it's isolated to this one rack.

I'll toss this option out for anyone that is really pulling their hair out. If you'd like to move to another rack, we have open slots on other servers.

To move, you MUST have a copy of your data - a fresh backup - that you can upload yourself to the new server. We can NOT at this time move data for you - we're all working on this issue.

Important notes:

1. There is no data loss on the servers. This is strictly a network problem, all servers are up and healthy.

2. At any moment, one of the 6 technicians working on the servers/rack could find the issue, fix it, and bang, fixed. This could be 5 minutes - it could be 5 hours - we just don't know until we find it.


If you have your data, and would like to move to an unaffected rack/server - open a helpdesk ticket (link below) and we'll get it set up ASAP. DNS propagation happens in about an hour.
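If you do move, one rough way to see whether the DNS change has reached your own connection is to check what your domain resolves to. The Python sketch below uses the standard `socket.gethostbyname` call. The function name, the domain, and the IP address in the usage lines are placeholders, not real HostPC values.

```python
import socket

def resolves_to(domain, expected_ip, resolver=socket.gethostbyname):
    """Return True if `domain` resolves to `expected_ip` from this machine.

    `resolver` is injectable so the check can be exercised without a network.
    """
    try:
        return resolver(domain) == expected_ip
    except OSError:
        # socket.gaierror is a subclass of OSError: the name did not resolve
        return False

if __name__ == "__main__":
    # Placeholder values: substitute your own domain and the new server's IP.
    if resolves_to("example.com", "192.0.2.10"):
        print("This machine sees the new server.")
    else:
        print("Still pointing elsewhere (or not resolving) from here.")
```

This only reports what your own resolver sees. Visitors on other ISPs may still get the cached old address until their resolvers refresh.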

admin
04-14-2007, 08:54 PM
It appears that the broadcast storm has stopped, hopefully permanently. We've put some new monitoring in place to continue to watch it. We may want to replace some hardware from this rack in the near future to eliminate the possibility of a bad NIC, but it wouldn't involve any downtime or user intervention if that happens. We'll just move everyone to the backup servers (essentially a RAID of the entire server) - but we'll cross that bridge when we come to it, if necessary.

Joe
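To give an idea of what watching for a broadcast storm can look like, here is a minimal Python sketch. It is not the monitoring that was actually put in place. It assumes you can periodically sample a cumulative broadcast packet counter from somewhere (a switch port, for example) and it flags intervals where the rate exceeds a threshold. The function names, sample numbers, and threshold are all made up for illustration.

```python
def broadcast_rates(samples):
    """Turn counter samples into per-interval rates.

    samples: list of (timestamp_seconds, cumulative_broadcast_packets),
    ordered by time. Returns a list of (timestamp, packets_per_second).
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0 or c1 < c0:
            continue  # skip out-of-order samples or a counter reset
        rates.append((t1, (c1 - c0) / dt))
    return rates

def storm_intervals(samples, threshold_pps):
    """Return the intervals whose broadcast rate exceeds threshold_pps."""
    return [(t, r) for t, r in broadcast_rates(samples) if r > threshold_pps]

if __name__ == "__main__":
    # Hypothetical samples taken every 10 seconds.
    samples = [(0, 0), (10, 500), (20, 250500), (30, 251000)]
    for t, rate in storm_intervals(samples, threshold_pps=1000):
        print(f"t={t}s: {rate:.0f} broadcast packets/sec")
```

What counts as a sensible threshold depends on the network. The value above is only there to make the example do something.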

rmcb5
04-14-2007, 09:59 PM
Thanks Joe

BigBen
04-20-2007, 01:09 AM
My site on www92 is completely unreachable right now, is it due to this same issue?

admin
04-20-2007, 08:52 AM
Not that anyone here is aware of. The server is responding fine, and this incident cleared up 6 days ago.

BigBen
04-20-2007, 09:03 AM
My site has been offline for going on 12 hours now. :( Submitted a ticket about 8 hours ago but no response yet. I can log in to DA by IP, is there something wrong with the nameservers? Checked with my registrar and the domain is in good standing.

EDIT - And just as I post this, my site became available again :P

admin
04-20-2007, 09:06 AM
Sorry BigBen - we haven't had any network outages