GSR12406 SFC Switch Fabric Card Troubleshooting

During a routine equipment check, one SFC card in the GSR12406 was found to have been shut down. The system log showed:
Aug 11 19:42:42 GMT+8: %FABRIC-3-ERR_HANDLE: Due to CRC error from slot 16,shutdown the fabric card on slot 18
Aug 11 19:42:42 GMT+8: %MBUS-6-FABCONFIG: Switch Cards 0x1F (bitmask)
Primary Clock is CSC_0
Fabric Clock is Redundant
Bandwidth Mode : 10Gbps Bandwidth

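To determine whether the fault lay with the SFC card in slot 18 or the CSC card in slot 16, we reseated the slot 18 SFC card (pulled it out and reinserted it) and then captured the following information: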

1. Use the show controllers errors fabric and show controllers errors fabric counters commands to view the errors that were generated:

gsr#show controllers errors fabric counters
LC/RP FIA Software Error Counters/Bitmaps:
SLOT 0 :
CellDrop (lane0..0) 0

         CRC    CRC    CRC    CRC    CRC    LOS    LOS    LOS    LOS    LOS
Counter  XBAR0  XBAR1  XBAR2  XBAR3  XBAR4  XBAR0  XBAR1  XBAR2  XBAR3  XBAR4
Lane0    0      0      1274   0      0      0      0      3      0      0

SLOT 1 :
CellDrop (lane0..0) 0

         CRC    CRC    CRC    CRC    CRC    LOS    LOS    LOS    LOS    LOS
Counter  XBAR0  XBAR1  XBAR2  XBAR3  XBAR4  XBAR0  XBAR1  XBAR2  XBAR3  XBAR4
Lane0    0      0      1275   0      0      0      0      4      0      0

SLOT 2 :
CellDrop (lane0..0) 0

         CRC    CRC    CRC    CRC    CRC    LOS    LOS    LOS    LOS    LOS
Counter  XBAR0  XBAR1  XBAR2  XBAR3  XBAR4  XBAR0  XBAR1  XBAR2  XBAR3  XBAR4
Lane0    0      0      1275   0      0      0      0      4      0      0

SLOT 4 :
CellDrop (lane0..0) 0

         CRC    CRC    CRC    CRC    CRC    LOS    LOS    LOS    LOS    LOS
Counter  XBAR0  XBAR1  XBAR2  XBAR3  XBAR4  XBAR0  XBAR1  XBAR2  XBAR3  XBAR4
Lane0    0      0      0      0      0      0      0      0      0      0

gsr#show controllers errors fabric
       SCA192  SCA192  SCA192  SCA192  XBAR192  XBAR192  CSCFPGA  CSCFPGA  CLKFPGA
       LC_ENA  BP_FRC  LC_TYP  DE_GNT  DAT_LOS  SEL_IDL  LP_BAK   LC_PRE   CLKSTS
SLOT0  OK      OK      OK      OK      OK       00100    00100    OK       OK
SLOT1  OK      OK      OK      OK      OK       OK       OK       00100    OK
SLOT2  OK      OK      OK      OK      OK       OK       OK       OK       OK
SLOT4  OK      OK      OK      OK      OK       OK       OK       OK       OK

Fabric error handling : enabled

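The output above makes it fairly clear that the CRC errors are occurring on XBAR2: slots 0, 1 and 2 each record roughly 1,275 CRC errors and 3 to 4 LOS events on XBAR2, while every other XBAR column is zero.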
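When the chassis has many line cards, scanning the counter tables by eye gets tedious. The following is a minimal sketch of how the check could be scripted, assuming the output has the same layout as the capture above (a `Lane` row with five CRC counters followed by five LOS counters). The function names `crc_errors_by_xbar` and `suspect_xbars` are hypothetical helpers, not part of any Cisco tool.

```python
# Sample text copied from the capture above (slots 0 and 4 only).
SAMPLE = """\
SLOT 0 :
CellDrop (lane0..0) 0

CRC CRC CRC CRC CRC LOS LOS LOS LOS LOS
Counter XBAR0 XBAR1 XBAR2 XBAR3 XBAR4 XBAR0 XBAR1 XBAR2 XBAR3 XBAR4
Lane0 0 0 1274 0 0 0 0 3 0 0

SLOT 4 :
CellDrop (lane0..0) 0

CRC CRC CRC CRC CRC LOS LOS LOS LOS LOS
Counter XBAR0 XBAR1 XBAR2 XBAR3 XBAR4 XBAR0 XBAR1 XBAR2 XBAR3 XBAR4
Lane0 0 0 0 0 0 0 0 0 0 0
"""


def crc_errors_by_xbar(text):
    """Sum the CRC counters per XBAR over every 'Lane' row in the text.

    Assumes each Lane row carries 10 counters: 5 CRC, then 5 LOS.
    """
    totals = [0] * 5
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 11 and parts[0].startswith("Lane"):
            counters = [int(v) for v in parts[1:]]
            for i in range(5):  # first five columns are the CRC counters
                totals[i] += counters[i]
    return {f"XBAR{i}": n for i, n in enumerate(totals)}


def suspect_xbars(text):
    """Return the XBAR names whose summed CRC counter is non-zero."""
    return [name for name, n in crc_errors_by_xbar(text).items() if n > 0]


if __name__ == "__main__":
    print(crc_errors_by_xbar(SAMPLE))
    print(suspect_xbars(SAMPLE))
```

Because the parser splits on whitespace, it works whether or not the columns are aligned in the pasted text.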

2. Use execute-on all show control fia to check how the fabric is operating:

gsr#execute-on all show control fia
========= Line Card (Slot 1) =========

From Fabric FIA Errors
-----------------------
redund overflow 0 cell drops 0
cell parity 0
Switch cards present 0x001B Slots 16 17 19 20
Switch cards monitored 0x001B Slots 16 17 19 20
Slot: 16 17 18 19 20
Name: csc0 csc1 sfc0 sfc1 sfc2
-------- -------- -------- -------- --------
los 0 0 0 0 0
state Off Off Off Off Off
crc16 0 0 0 0 0

To Fabric FIA Errors
-----------------------
sca not pres 0 req error 0 uni fifo overflow 0
grant parity 0 multi req 0 uni fifo undrflow 0
cntrl parity 0 uni req 0
multi fifo 0 empty dst req 0 handshake error 0
cell parity 0
========= Line Card (Slot 2) =========

From Fabric FIA Errors
-----------------------
redund overflow 0 cell drops 0
cell parity 0
Switch cards present 0x001B Slots 16 17 19 20
Switch cards monitored 0x001B Slots 16 17 19 20
Slot: 16 17 18 19 20
Name: csc0 csc1 sfc0 sfc1 sfc2
-------- -------- -------- -------- --------
los 0 0 0 0 0
state Off Off Off Off Off
crc16 0 0 0 0 0

To Fabric FIA Errors
-----------------------
sca not pres 1 req error 0 uni fifo overflow 0
grant parity 0 multi req 0 uni fifo undrflow 0
cntrl parity 0 uni req 0
multi fifo 0 empty dst req 0 handshake error 0
cell parity 0

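The output above shows that slot 18 is not present: the "Switch cards present" mask 0x001B lists only slots 16, 17, 19 and 20. The show controllers clock command shows the same thing: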
gsr#sh controllers clock
Switch Card Configured 0x1F (bitmask), Primary Clock for system is CSC_0
System Fabric Clock is Redundant

Slot # Primary Clock Mode

0 CSC_0 Redundant
1 CSC_0 Redundant
2 CSC_0 Redundant
4 CSC_0 Redundant
16 CSC_0 Redundant
17 CSC_0 Redundant
18 None
19 CSC_0 Redundant
20 CSC_0 Redundant
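The masks in these outputs can be decoded mechanically. The sketch below assumes that bit 0 of the mask corresponds to slot 16 (csc0) and bit 4 to slot 20 (sfc2), which is what the pairing of 0x001B with "Slots 16 17 19 20" in the capture suggests; it is an inference from this output rather than a documented mapping. `slots_from_mask` and `missing_slots` are hypothetical helper names.

```python
# Assumption: bit 0 of the switch-card mask maps to slot 16 on this chassis,
# inferred from "0x001B Slots 16 17 19 20" in the capture above.
FIRST_SWITCH_SLOT = 16
SWITCH_SLOT_COUNT = 5  # csc0, csc1, sfc0, sfc1, sfc2


def slots_from_mask(mask):
    """Return the switch-card slot numbers whose bit is set in the mask."""
    return [
        FIRST_SWITCH_SLOT + bit
        for bit in range(SWITCH_SLOT_COUNT)
        if mask & (1 << bit)
    ]


def missing_slots(configured, present):
    """Return slots that are configured but not reported as present."""
    return slots_from_mask(configured & ~present)


if __name__ == "__main__":
    print(slots_from_mask(0x001B))      # slots reported present
    print(missing_slots(0x1F, 0x001B))  # configured (0x1F) but absent
```

With the configured mask 0x1F and the present mask 0x001B from the captures, the only bit that differs is bit 2, which under this assumption is slot 18.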

3. Check the system log:
gsr#sh log

Aug 12 08:52:19 GMT+8: %MBUS-6-OIR: Switch Fabric Card(6) OC-192 Removed from Slot 18
Aug 12 08:52:19 GMT+8: %MBUS-6-OIR: Switch Fabric Card(6) OC-192 Inserted into Slot 18
Aug 12 08:52:26 GMT+8: %MBUS-6-FABANALYZED: Switch card in slot 18 analyzed
Aug 12 08:52:26 GMT+8: %MBUS-6-FABCONFIG: Switch Cards 0x1F (bitmask)
Primary Clock is CSC_0
Fabric Clock is Redundant
Bandwidth Mode : 10Gbps Bandwidth
Aug 12 08:52:36 GMT+8: %FIA-3-LOS: LOS for slot 18 was detected.
SLOT 1:Aug 12 08:52:36 GMT+8: %FIA-3-LOS: LOS for slot 18 was detected.
SLOT 2:Aug 12 08:52:36 GMT+8: %FIA-3-LOS: LOS for slot 18 was detected.
Aug 12 08:52:38 GMT+8: %FABRIC-3-ERR_HANDLE: Due to CRC error from slot 2,shutdown the fabric card on slot 18
Aug 12 08:52:38 GMT+8: %MBUS-6-FABCONFIG: Switch Cards 0x1F (bitmask)
Primary Clock is CSC_0
Fabric Clock is Redundant
Bandwidth Mode : 10Gbps Bandwidth
Aug 12 08:52:42 GMT+8: %FIA-3-LOS: LOS for slot 18 was cleared.
SLOT 1:Aug 12 08:52:42 GMT+8: %FIA-3-LOS: LOS for slot 18 was cleared.
SLOT 2:Aug 12 08:52:42 GMT+8: %FIA-3-LOS: LOS for slot 18 was cleared.
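The log shows that as soon as the SFC card is inserted into slot 18, CRC errors appear and the card is shut down again. From this we can tentatively conclude that the slot 18 SFC card itself is faulty. Fortunately the device was covered by a support contract, so we opened a case with Cisco for an RMA.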
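To make the timing in the log explicit, here is a small sketch that measures how long the card survived after insertion. It assumes syslog lines in the format captured above; the timestamps carry no year, so `ASSUMED_YEAR` is a placeholder, and `parse_events` and `seconds_insert_to_shutdown` are hypothetical helper names.

```python
import re
from datetime import datetime

# Lines copied from the capture above.
LOG = """\
Aug 12 08:52:19 GMT+8: %MBUS-6-OIR: Switch Fabric Card(6) OC-192 Inserted into Slot 18
Aug 12 08:52:26 GMT+8: %MBUS-6-FABANALYZED: Switch card in slot 18 analyzed
Aug 12 08:52:36 GMT+8: %FIA-3-LOS: LOS for slot 18 was detected.
SLOT 1:Aug 12 08:52:36 GMT+8: %FIA-3-LOS: LOS for slot 18 was detected.
Aug 12 08:52:38 GMT+8: %FABRIC-3-ERR_HANDLE: Due to CRC error from slot 2,shutdown the fabric card on slot 18
"""

ASSUMED_YEAR = 2009  # assumption: the log timestamps do not include a year

# Optional "SLOT n:" prefix, timestamp, timezone token, %FACILITY-SEV-MNEMONIC, text.
LINE_RE = re.compile(
    r"^(?:SLOT \d+:)?(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ "
    r"%([A-Z0-9_]+-\d-[A-Z0-9_]+): (.*)$"
)


def parse_events(log):
    """Return (timestamp, message tag, message text) for each recognised line."""
    events = []
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(
                f"{ASSUMED_YEAR} {m.group(1)}", "%Y %b %d %H:%M:%S"
            )
            events.append((ts, m.group(2), m.group(3)))
    return events


def seconds_insert_to_shutdown(log, slot):
    """Seconds between the last insertion into `slot` and its next shutdown."""
    inserted = None
    for ts, tag, msg in parse_events(log):
        if tag == "MBUS-6-OIR" and msg.endswith(f"Inserted into Slot {slot}"):
            inserted = ts
        elif (
            tag == "FABRIC-3-ERR_HANDLE"
            and msg.endswith(f"fabric card on slot {slot}")
            and inserted is not None
        ):
            return (ts - inserted).total_seconds()
    return None


if __name__ == "__main__":
    print(seconds_insert_to_shutdown(LOG, 18))
```

In the captured log the card is inserted at 08:52:19 and shut down at 08:52:38, so the error handler disables it within seconds of each insertion attempt.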

Posted on 2009-08-19 09:36 by 梯玛. Category: Network Knowledge
