Earlier this evening we identified that one of our Cisco switch stacks was misbehaving, causing a constant stream of stack reconvergences. These reconvergences have been causing layer 2 network instability for traffic flowing via the stack of nine switches.
We have just completed a physical inspection of all stack cables, including a full reseating of the cables as per Cisco’s guidelines; however, the problem persists.
It is possible that the stack issues are caused by a fault in the Cisco IOS software.
We are currently applying an upgrade to the switch stack and will reboot the full stack afterwards to activate the changes.
Cisco advise doing this as a full cold reboot, removing the power from the stack members, so this reload will take longer than usual.
Any customers connected to a different switch stack will only see a momentary outage during a layer 2 reconvergence.
Customers directly connected to switch stack DS-101 will see a total outage for up to 15 minutes.
The symptoms of the stack issue are hampering our ability to update the software quickly, but we are almost ready to reboot the stack.
We are now rebooting the affected switch stack as per Cisco guidelines.
This switch stack is still showing issues despite parts being swapped out overnight. We have taken the decision to retire the entire stack from service.
Customers will be moved port-by-port to other switches.
All layer 3 operations on the stack were moved during office hours today, and all high-profile ports have been migrated to new switches.
Some lower-profile ports remain and will be migrated during the week.
The switch stack is no longer part of our ring network and will be fully decommissioned shortly.