US-based connectivity issue

This morning we received reports of network alerts from third-party providers indicating trouble reaching the Netnorth network.

After further investigation, we narrowed the issue to a part of the Level3 network located in the US (traceroutes are included below). We raised this with Level3, who confirmed that they currently have a network issue ticket open for UK-US traffic with packet loss exceeding 70% (although we see 90-100% in our tests).

We have extensive monitoring throughout the UK and Europe, but only key-point monitoring within the USA, which did not flag any issues automatically.

We will look to add further test points in the US and Asia so that we can locate remote issues like this more quickly.

This issue should not have affected any UK or EU traffic. However, many “site uptime” services test from multiple locations and report a problem if any one location fails.
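
To illustrate why a fault confined to one region can still raise a general alert, here is a minimal Python sketch of how multi-location results might be grouped by region. The probe locations, regions, and results are hypothetical examples, not data from our monitoring, and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical probe results: (probe location, region, destination reachable?)
RESULTS = [
    ("Manchester", "UK", True),
    ("London", "UK", True),
    ("Amsterdam", "EU", True),
    ("Washington", "US", False),
    ("Los Angeles", "US", False),
    ("Tokyo", "Asia", False),
]


def failures_by_region(results):
    """Return {region: (failed, total)} for a list of probe results."""
    counts = defaultdict(lambda: [0, 0])
    for _location, region, reachable in results:
        counts[region][1] += 1
        if not reachable:
            counts[region][0] += 1
    return {region: tuple(pair) for region, pair in counts.items()}


def classify(results):
    """Label the outcome 'ok', 'regional' or 'global', with the failing regions."""
    summary = failures_by_region(results)
    failing = sorted(r for r, (failed, _total) in summary.items() if failed)
    if not failing:
        return "ok", failing
    if len(failing) == len(summary):
        return "global", failing
    return "regional", failing


status, regions = classify(RESULTS)
print(status, regions)
```

A checker that alerts on any single failed location would flag this example as an outage, while grouping by region shows that the UK and EU probes are unaffected.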

Here are a few traceroutes showing that the issue lies within Level3’s network:

 1  vzd114.mediatemple.net (205.186.158.19)  0.060 ms  0.024 ms  0.022 ms
 2  e1.2.cr01.iad01.mtsvc.net (70.32.64.249)  0.286 ms  0.276 ms  0.251 ms
 3  65.97.50.1 (65.97.50.1)  6.981 ms  7.347 ms  7.680 ms
 4  br01-1-1.iad2.netdc.com (65.97.48.205)  0.468 ms  0.460 ms  0.545 ms
 5  209.48.42.149 (209.48.42.149)  0.382 ms  0.372 ms  0.420 ms
 6  206.111.0.66.ptr.us.xo.net (206.111.0.66)  0.935 ms  0.947 ms  0.957 ms
 7  * * *
 8  * * ae-14-14.bar1.Toronto1.Level3.net (4.69.200.93)  142.194 ms
 9  ae-0-11.bar2.Toronto1.Level3.net (4.69.151.242)  144.597 ms  144.601 ms *
 10  * * *
 11  * * *
 12  ae-41-41.ebr2.London1.Level3.net (4.69.137.65)  234.130 ms * *
 13  * * vlan102.ebr1.London1.Level3.net (4.69.143.89)  224.504 ms
 14  ae-4-4.car1.Manchesteruk1.Level3.net (4.69.133.101)  224.407 ms * *
 15  * NETNORTH-LT.car1.Manchester1.Level3.net (195.50.119.74)  219.169 ms  219.394 ms

HOST: stats.netnorth.co.uk                    Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. po1-16.router.tcw.netnorth.co.uk         0.0%    10    0.7   0.7   0.5   0.8   0.1
 2. ge-6-18.car1.Manchester1.Level3.net      0.0%    10    0.9   0.9   0.7   1.1   0.1
 3. ???                                     100.0    10    0.0   0.0   0.0   0.0   0.0
 4. AMAZON.COM.edge2.Washington1.Level3.net 90.0%    10  221.4 221.4 221.4 221.4   0.0
 5. 72.21.220.149                           80.0%    10  223.5 223.4 223.3 223.5   0.1
 6. 205.251.245.232                         70.0%    10  223.9 223.6 223.0 224.0   0.5
 7. ???                                     100.0    10    0.0   0.0   0.0   0.0   0.0

Sprint Source Region: Anaheim, CA (sl-crs3-ana)
 IP Destination: 82.148.224.24
 Performing: ICMP Traceroute
Wed May  6 08:11:49.079 UTC
 Tracing the route to 82.148.224.24
 1  144.232.13.244 4 msec  3 msec  2 msec
 2  144.232.24.40 6 msec  6 msec  5 msec
 3  ae14.edge1.LosAngeles9.Level3.net (4.68.111.89) 4 msec  3 msec  2 msec
 4   *  *  *
 5   *  *  *
 6   *  *  *

core1.tyo1.he.net> traceroute 82.148.224.24
 traceroute to 82.148.224.24 (82.148.224.24), 30 hops max, 60 byte packets
 1  74.82.46.5  3.918 ms  3.946 ms  4.023 ms
 2  184.105.223.105  133.830 ms  133.818 ms  133.894 ms
 3  80.239.167.189  98.187 ms  98.262 ms  98.247 ms
 4  213.155.137.58  98.174 ms 213.155.134.252  98.211 ms 213.155.130.126  98.086 ms
 5  4.68.70.129  102.706 ms  97.844 ms  102.817 ms
 6  * * *
 7  * * *
 8  * * *
 9  * * 4.69.151.242  232.329 ms
 10  * * *
 11  * * *
 12  * * 4.69.137.77  309.750 ms
 13  4.69.143.97  318.915 ms * 4.69.143.85  309.842 ms
 14  4.69.133.101  319.404 ms * *
 15  * 195.50.119.74  318.680 ms *
 16  * * *
 17  82.148.224.24  309.715 ms *  318.889 ms
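
As a rough sketch of how a report like the one from stats.netnorth.co.uk above could be read programmatically, the Python below parses the hop lines and finds the last hop with no loss and the first hop at or above a loss threshold. The regular expression, the 70% threshold, and the function names are assumptions made for illustration; this is not the tooling we use.

```python
import re
from typing import NamedTuple


class Hop(NamedTuple):
    number: int
    host: str
    loss_pct: float
    sent: int


# Hop lines copied from the report above (header line omitted).
REPORT = """\
 1. po1-16.router.tcw.netnorth.co.uk         0.0%    10    0.7   0.7   0.5   0.8   0.1
 2. ge-6-18.car1.Manchester1.Level3.net      0.0%    10    0.9   0.9   0.7   1.1   0.1
 3. ???                                     100.0    10    0.0   0.0   0.0   0.0   0.0
 4. AMAZON.COM.edge2.Washington1.Level3.net 90.0%    10  221.4 221.4 221.4 221.4   0.0
 5. 72.21.220.149                           80.0%    10  223.5 223.4 223.3 223.5   0.1
 6. 205.251.245.232                         70.0%    10  223.9 223.6 223.0 224.0   0.5
 7. ???                                     100.0    10    0.0   0.0   0.0   0.0   0.0
"""

# Hop number, host, loss (the % sign is absent on the 100.0 rows), packets sent.
HOP_RE = re.compile(r"^\s*(\d+)\.\s+(\S+)\s+([\d.]+)%?\s+(\d+)\s")


def parse_report(text):
    """Parse report lines into Hop records, skipping anything that does not match."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if match:
            number, host, loss, sent = match.groups()
            hops.append(Hop(int(number), host, float(loss), int(sent)))
    return hops


def replies_received(hop):
    """Convert a loss percentage back into a count of replies."""
    return round(hop.sent * (1 - hop.loss_pct / 100))


def locate_loss(hops, threshold=70.0):
    """Return (last hop with 0% loss, first hop with loss >= threshold)."""
    last_clean = None
    for hop in hops:
        if hop.loss_pct >= threshold:
            return last_clean, hop
        if hop.loss_pct == 0.0:
            last_clean = hop
    return last_clean, None


hops = parse_report(REPORT)
last_clean, first_lossy = locate_loss(hops)
print("last clean hop:", last_clean.number, last_clean.host)
print("first lossy hop:", first_lossy.number, first_lossy.host)
print("replies at hop 4:", replies_received(hops[3]), "of", hops[3].sent)
```

With ten packets sent, 90% loss at hop 4 means a single reply, which is why the Last, Avg, Best and Wrst columns on that row are identical and StDev is 0.0. Loss at a single intermediate hop can simply be a router limiting its ICMP responses; loss that persists through the later hops is the stronger indicator.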

6 Responses to US-based connectivity issue

  1. After raising this issue with Level3, we have removed the Level3 connectivity from our network while they work on the issue internally.
    Most paths should recover as traffic re-routes via our alternate providers.

    NOTE: some paths may still choose to use Level3 into the United Kingdom, but these are outside of our control.

  2. We have received an update from Level3, but it only confirms that the issue is ongoing. They provide these updates at set intervals in accordance with their SLA.
    NOTE: we are attached to the main network issue ticket, so we will receive updates covering services we do not use (such as the CDN part of this ticket).

    Update below:

    [SUMMARY OF WORK]
    Please be advised your service is currently being impacted by an event on the Level 3 network.

    Investigations are on-going and this ticket has been related to the main event ticket in order for you to be kept updated with the event progress.

    Please see the most recent update below.

    Updates

    08:58 GMT – The IP NOC responded to alarms indicating CDN services in multiple markets are being impacted by a packet loss issue. The trouble is being investigated and an estimate time of restoral cannot be provided at this time.

    [PLAN OF ACTION]
    Level 3 to provide further updates accordingly.

  3. Level3 have updated the ticket as follows:

    *** CASCADED EXTERNAL NOTES 06-May-2015 10:14:56 GMT From CASE: 9198136 – Event
    Through investigations the IP NOC isolated the trouble to overutilization on a link from Toronto to Chicago. Traffic was rerouted to avoid the trouble and services are now restored. The IP NOC will continue monitoring services to ensure continued stability. If additional issues are experienced, please contact the Level 3 Technical Service Center.

  4. The latency to the US appears to have been stable for the last two hours, so we have re-established our Level3 connectivity into the network.
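
One way a "stable for the last two hours" check could be expressed is sketched below in Python. The sampling interval, the loss and jitter thresholds, and the sample values are all assumptions chosen for illustration; they are not the criteria we applied before re-enabling Level3.

```python
from statistics import pstdev


def is_stable(samples, max_loss_pct=1.0, max_jitter_ms=5.0):
    """Decide whether a window of round-trip times looks stable.

    samples: RTTs in milliseconds across the window; None marks a lost probe.
    """
    if not samples:
        return False
    replies = [rtt for rtt in samples if rtt is not None]
    loss_pct = 100.0 * (len(samples) - len(replies)) / len(samples)
    if not replies or loss_pct > max_loss_pct:
        return False
    # Population standard deviation as a simple measure of latency variation.
    return pstdev(replies) <= max_jitter_ms


# Hypothetical window: one probe per minute for two hours (120 samples).
steady = [95.0 + (i % 3) * 0.5 for i in range(120)]
lossy = [rtt if i % 2 else None for i, rtt in enumerate(steady)]

print(is_stable(steady))
print(is_stable(lossy))
```

Checking loss before jitter means a window with many dropped probes is rejected even if the few replies that did arrive had consistent latency.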
