This article was last updated 794 days ago; the information in it may have evolved or changed since then.
When a 3PAR storage array raises an alert for a failed disk, an operator has to replace the faulty disk by hand.
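Commands used
Check data copy-back progress:
    servicemag status
    -d shows detailed information
Check disk status:
    showpd
    -c shows chunklet status
    -i shows the disk WWN
Check node status: shownode
View alert information: showalert
Have the array recognize a new disk: admitpd
Write data back to the disk (cage 0, magazine 2 here): servicemag resume 0 2
Remove a disk from the system: dismisspd [pdId]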
Before the replacement
First, connect over SSH to the 3PAR array that has the faulty disk and run:
    Nodename cli% shownode
    Node ---Name----- -State- Master InCluster -Service_LED ---LED--- Control Mem Data Mem Cache
       0 CN00000MV0-0 OK      Yes    Yes       Off          GreenBlnk       16384    16384   100
       1 CN00000MV0-1 OK      No     Yes       Off          GreenBlnk       16384    16384   100
This step rules out a failed controller node as the cause of the alert. Next, check the disk status:
    Nodename cli% showpd
    Id CagePos Type RPM State      Total   Free A      B      Capacity(GB)
     0 0:0:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
     1 0:1:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
     2 0:2:0   FC    10 failed   1142784      0 0:1:1- 1:1:1-         1200
     3 0:3:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
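This confirms that the disk with Id 2 at CagePos 0:2:0 (cage 0, magazine 2) has failed; the CagePos column tells you where the disk physically sits. The Id value is also what the dismisspd command takes when a disk has to be removed from the system manually.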
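When there are many disks, the check above can be scripted. The following is a minimal Python sketch, not an official tool: it assumes the showpd text has already been captured (for example from an SSH session) and that it uses the column order shown in this article (Id, CagePos, Type, RPM, State). The names `PhysicalDisk`, `parse_showpd`, and `unhealthy` are made up for illustration, and real showpd output may contain extra header or total lines that this parser simply skips.

```python
from typing import List, NamedTuple


class PhysicalDisk(NamedTuple):
    pd_id: int      # value of the Id column
    cage_pos: str   # cage:magazine:disk, e.g. "0:2:0"
    state: str      # value of the State column


def parse_showpd(output: str) -> List[PhysicalDisk]:
    """Parse the data rows of showpd output laid out as in this article."""
    disks = []
    for line in output.splitlines():
        fields = line.split()
        # A data row starts with a numeric Id followed by a
        # cage:magazine:disk position; anything else is skipped.
        if len(fields) >= 5 and fields[0].isdigit() and fields[1].count(":") == 2:
            disks.append(PhysicalDisk(int(fields[0]), fields[1], fields[4]))
    return disks


def unhealthy(disks: List[PhysicalDisk]) -> List[PhysicalDisk]:
    """Return every disk whose State is anything other than 'normal'."""
    return [d for d in disks if d.state != "normal"]


# Sample text copied from the showpd output in this article.
SAMPLE = """\
Id CagePos Type RPM State      Total   Free A      B      Capacity(GB)
 0 0:0:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
 1 0:1:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
 2 0:2:0   FC    10 failed   1142784      0 0:1:1- 1:1:1-         1200
 3 0:3:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
"""

if __name__ == "__main__":
    for disk in unhealthy(parse_showpd(SAMPLE)):
        print(f"pd {disk.pd_id} at {disk.cage_pos} is {disk.state}")
```

Splitting on whitespace keeps the sketch independent of exact column widths, at the cost of breaking if a column is ever empty.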
Before pulling the disk, also check whether its data has already been relocated to other parts of the array.
    Nodename cli% servicemag status
    Cage 0, magazine 2:
    The magazine was successfully brought offline by a servicemag start command.
    The command completed at Fri Jan 21 10:00:00 2022.
    The command started at Fri Jan 21 09:00:50 2022
    servicemag start -wait -pdid 2 -- Succeeded
Look at the last line: once the command reports Succeeded, the disk can be replaced. Adding the -d option to this command gives more detailed information.
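The "look at the last line" rule can be expressed as a small check. This is only a sketch under stated assumptions: the status text has already been captured, and the result line has the form `servicemag <arguments> -- <Result>` as in the output above. The names `last_servicemag_result` and `safe_to_replace` are hypothetical helpers, not part of the 3PAR CLI, and the Failed case is assumed to use the same line format as Succeeded.

```python
import re
from typing import Optional

# Matches a line such as "servicemag start -wait -pdid 2 -- Succeeded".
RESULT_RE = re.compile(r"^servicemag\s+(?P<args>.+?)\s+--\s+(?P<result>\w+)\s*$")


def last_servicemag_result(output: str) -> Optional[str]:
    """Return the result word of the last result line, or None if absent."""
    result = None
    for line in output.splitlines():
        match = RESULT_RE.match(line.strip())
        if match:
            result = match.group("result")
    return result


def safe_to_replace(output: str) -> bool:
    """True only when the last reported servicemag result is Succeeded."""
    return last_servicemag_result(output) == "Succeeded"


# Sample text copied from the servicemag status output in this article.
SAMPLE = """\
Cage 0, magazine 2:
The magazine was successfully brought offline by a servicemag start command.
The command completed at Fri Jan 21 10:00:00 2022.
The command started at Fri Jan 21 09:00:50 2022
servicemag start -wait -pdid 2 -- Succeeded
"""

if __name__ == "__main__":
    print("safe to replace:", safe_to_replace(SAMPLE))
```

Treating a missing result line as "not safe" is deliberate: if the text cannot be parsed, a person should read it before a disk is pulled.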
After the replacement
Once the new disk has been inserted, it should normally show up in the showpd output.
    Nodename cli% showpd
    Id CagePos Type RPM State      Total   Free A      B      Capacity(GB)
     0 0:0:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
     1 0:1:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
     2 0:2:0   FC    10 degraded 1142784      0 0:1:1- 1:1:1-         1200
     3 0:3:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
If the output looks like the above, the new disk has to be added to the array manually:
    Nodename cli% admitpd
    Checking for drive table upgrade packages.
    Package check completed
    Warning: The following disks are marked as being part of another system.
    Admitting these disks to this system will cause any existing data on them to be erased.
    -Disk_WWN-       System_ID System_Name
    5000C0000A2F0000 0000      0
    Are you sure?
    select y=yes n=no: y
    1 disks admitted

    Nodename cli% showpd
    Id CagePos Type RPM State      Total   Free A      B      Capacity(GB)
     0 0:0:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
     1 0:1:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
     2 0:2:0   FC    10 degraded 1142784      0 0:1:1- 1:1:1-         1200
     3 0:3:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
     4 0:2:0   FC    10 normal   1142784 333824 0:1:1* 1:1:1*         1200
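After the manual admit, one more disk (Id 4) appears in the list. Once you have confirmed that the disk was added, run servicemag status; normally the data migration should already have started. If the status is Failed instead, and you have confirmed that the disk itself is healthy, you can start the copy-back task manually: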
    Nodename cli% servicemag resume 0 2
    Are you sure you want to run servicemag?
    select q=quit y=yes n=no: y
    servicemag resume 0 2
    ... mag 0 2 already onlooped
    ... firmware is current on pd WWN [] Id [4]
    ... firmware is current on pd WWN [] Id [2]
    ... checking for valid disks...
    ... disks in mag : 0 2
    ... normal disks: WWN[] Id [4] diskpos[0]
    ... not normal disks: WWN[] Id [2]
    ... verifying spare space for disks 2 and 4
    ... playback chunklets from pd WWN [] Id [4]
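When the command returns, run servicemag status again; the task should now be in progress and report how far along it is. Once the migration completes, the disk replacement is finished.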
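Rather than re-running servicemag status by hand, the waiting can be sketched as a polling loop. The Python below is a rough illustration with several assumptions: `fetch_status` is a hypothetical callable that you supply and that returns the current servicemag status text (how it reaches the array is left out), and a finished task is assumed to end with a line of the form `... -- Succeeded` or `... -- Failed`, by analogy with the servicemag start result shown earlier. The in-progress text in the demo is a placeholder, not real CLI output.

```python
import time
from typing import Callable


def final_result(status_text: str) -> str:
    """Return "Succeeded" or "Failed" if a result line is present, else ""."""
    for line in status_text.splitlines():
        line = line.rstrip()
        for word in ("Succeeded", "Failed"):
            if line.endswith("-- " + word):
                return word
    return ""


def wait_for_servicemag(
    fetch_status: Callable[[], str],
    interval_s: float = 60.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until a result appears or max_polls is reached.

    Returns "Succeeded", "Failed", or "Timeout".
    """
    for attempt in range(max_polls):
        result = final_result(fetch_status())
        if result:
            return result
        # Do not sleep after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval_s)
    return "Timeout"


if __name__ == "__main__":
    # Canned responses stand in for a real status source.
    canned = iter([
        "copy-back still running",              # placeholder text
        "servicemag resume 0 2 -- Succeeded",   # assumed result format
    ])
    print(wait_for_servicemag(lambda: next(canned), sleep=lambda s: None))
```

Passing `sleep` in as a parameter keeps the loop easy to test without real delays; in actual use the default `time.sleep` applies.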