Hey all, thanks for taking the time to read through my problem.
I'm a new admin for this company and haven't had very much experience with the MD3000i.
According to my boss, the admin before me tried to change out the battery on one of the controllers; the steps he took are unknown. About 15 minutes after that, the controller became unresponsive and the fault lights on both the SAN and the controller turned orange.
This sat for a while because that admin left abruptly. Now that I've had some time to work on it, I've tried a few things:
- Moved the controller to a different MD3000i chassis.
- Changed the cache on the controller.
- Changed the battery back to the original one.
None of these got the controller functioning again. However, changing the chassis did stop the controller's orange light from coming on. That prompted me to plug in the serial cable to see what's going on:
Reset, Power-Up Diagnostics - Loop 1 of 1
3600 Processor DRAM
01 Data lines Passed
02 Address lines Passed
3300 NVSRAM
01 Data lines Passed
5900 Ethernet 91c111 #1
01 Register read Passed
02 Register test Passed
3A00 NAND Flash
06 Bad Blocks Test Passed
2310 Application Accelerator Unit
01 AAU Register Test Passed
6D00 LSI SAS 1068 IOC--Base Board
01 IOC Register Read Test Passed
02 IOC Register Address Lines Test Passed
03 IOC Register Data Lines Test Passed
6F01 QLOGIC EP4032 CHIP 0
01 Register Read Test Passed
02 Register Address Lines Test Passed
03 Register Data Lines Test Passed
3900 Real-Time Clock
01 RT Clock Tick Passed
Diagnostic Manager exited normally.
Current date: 02/13/14 time: 13:03:30
Send <BREAK> for Service Interface or baud rate change
02/13/14-20:59:13 (GMT) (tRAID): NOTE: Set Powerup State
02/13/14-20:59:13 (GMT) (tRAID): NOTE: SOD Sequence is Normal, 0
02/13/14-20:59:13 (GMT) (tRAID): NOTE: SOD: removed SAS host from index 0
02/13/14-20:59:13 (GMT) (tRAID): NOTE: In iscsiIOQLIscsiInitDq. iscsiIoFstrBase = 0x0
02/13/14-20:59:13 (GMT) (tRAID): NOTE: Turning on tray summary fault LED
02/13/14-20:59:15 (GMT) (tRAID): NOTE: SYMBOL: SYMbolAPI registered.
02/13/14-20:59:15 (GMT) (tRAID): NOTE: lost persistent dq data because buffer was modified or size changed.
esmc0: LinkUp event
02/13/14-20:59:16 (GMT) (tNetCfgInit): NOTE: Network Ready
02/13/14-20:59:19 (GMT) (tRAID): NOTE: Initiating Drive channel: ioc:0 bringup
02/13/14-20:59:21 (GMT) (tRAID): NOTE: IOC Firmware Version: 00-24-63-00
02/13/14-20:59:29 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: chan:0 phy:0 prevNumActivePhys:2 numActivePhys:2
02/13/14-20:59:29 (GMT) (tSasEvtWkr): NOTE: sasIocPhyUp: chan:0 phy:1 prevNumActivePhys:2 numActivePhys:2
02/13/14-20:59:39 (GMT) (tRAID): NOTE: IonMgr: Drive Interface Enabled
02/13/14-20:59:39 (GMT) (tRAID): NOTE: SOD: Instantiation Phase Complete
02/13/14-20:59:39 (GMT) (tRAID): WARN: No attempt made to open Inter-Controller Communication Channels
02/13/14-20:59:39 (GMT) (tRAID): NOTE: LockMgr Role is Master
02/13/14-20:59:39 (GMT) (tRAID): WARN: FBM:validateSubModel: Exception - Alt controller not ready
02/13/14-20:59:39 (GMT) (tSasDiscCom): NOTE: SAS Discovery complete task spawned
02/13/14-20:59:40 (GMT) (tRAID): NOTE: spmEarlyData: No data available
02/13/14-20:59:45 (GMT) (tSasDiscCom): WARN: SAS: Initial Discovery Complete Time: 30 seconds
02/13/14-21:00:09 (GMT) (sasCheckExpanderSet): WARN: sasCheckExpanderSetup: No bus target for local expander available for 30 seconds after sasChannelInitialize
02/13/14-21:01:15 (GMT) (tRAID): NOTE: WWN baseName 0004a4ba-db34f70b (valid==>NoPrevAlt)
02/13/14-21:01:15 (GMT) (tRAID): NOTE: IonMgr: Host Interface Enabled
02/13/14-21:01:15 (GMT) (tRAID): NOTE: SOD: Pre-Initialization Phase Complete
02/13/14-21:01:16 (GMT) (tRAID): WARN: BID: initialize(): Power latched!
02/13/14-21:01:30 (GMT) (tRAID): WARN: dbm::RWFileSystem::initialize: Exception caught, ConstructorIOException: -16
02/13/14-21:01:31 (GMT) (tRAID): WARN: Exception caught in intentLogStartOfDay, DbmNoFileSystemException: recType: 11
02/13/14-21:01:32 (GMT) (tRAID): WARN: DbmNoFileSystemException: recType: 59 Line 1375 File cmgrControllerMgr.cc
02/13/14-21:01:32 (GMT) (tRAID): WARN: DbmNoFileSystemException: recType: 34 Line 605 File cmgrControllerMgr.cc
02/13/14-21:01:32 (GMT) (tRAID): NOTE: ACS: Icon ping to alternate failed: -2, resp: 0
02/13/14-21:01:32 (GMT) (tRAID): NOTE: ACS: autoCodeSync(): Process start. Comm Mode: 0, Status: 0
02/13/14-21:01:32 (GMT) (tRAID): WARN: ACS: autoCodeSync(): Skipped since alt not communicating.
02/13/14-21:01:32 (GMT) (tRAID): NOTE: SOD: Code Synchronization Initialization Phase Complete
02/13/14-21:01:32 (GMT) (tRAID): NOTE: Caught IconSendInfeasibleException Error in iop::requestAltIopDelay
02/13/14-21:01:32 (GMT) (tRAID): NOTE: CheckInMonitor: Check-in failed (IconSendInfeasibleException Error)
02/13/14-21:01:32 (GMT) (tRAID): WARN: USM Exception caught in processUsmHeader - DbmNoFileSystemException: recType: 30
02/13/14-21:01:32 (GMT) (tRAID): WARN: USM Error allocating UsmStableStorageHeader in processUsmHeader() - DbmNoFileSystemException: recType: 30
02/13/14-21:01:32 (GMT) (tRAID): NOTE: USM Mgr initialization complete with 0 records.
02/13/14-21:01:32 (GMT) (tRAID): NOTE: SOD failure in evf::VolumeCfgManager::initialize
02/13/14-21:01:32 (GMT) (tRAID): NOTE: DbmNoFileSystemException in evf::VolExtentMgr::initialize
02/13/14-21:01:32 (GMT) (tRAID): WARN: Received IconSendInfeasibleException Error adding small edr records from alt controller
02/13/14-21:01:32 (GMT) (tRAID): WARN: edrSOD: No Config File System
02/13/14-21:01:32 (GMT) (tRAID): NOTE: DbmNoFileSystemException in safe::initialize
02/13/14-21:01:32 (GMT) (tRAID): WARN: snrProcessDatabase: No File System Found
02/13/14-21:01:32 (GMT) (tRAID): WARN: spm: unable to exchange features, assuming none
02/13/14-21:01:32 (GMT) (tRAID): WARN: spm::SPMManager::initialize NoFileSystem
02/13/14-21:01:32 (GMT) (tRAID): NOTE: sas: Peering Disabled (Alt Unavailable)
02/13/14-21:01:32 (GMT) (tRAID): ERROR: CopyTargetname: target name length = 0
02/13/14-21:01:32 (GMT) (tRAID): WARN: ConfigManager::processArrayDatabase, No Config File System
02/13/14-21:01:32 (GMT) (tRAID): WARN: ConfigManager::processPortDatabase, No Config File System
02/13/14-21:01:33 (GMT) (tRAID): NOTE: QLStartFw: Downloading Driver's FW image 03.00.01.47 from 0058c3a0 4c0c8 bytes , result 0
02/13/14-21:01:33 (GMT) (NvpsPersistentSyncM): WARN: nvps Exception DbmNoFileSystemException: recType: 19 Line 354 File nvpsPersistentSyncMgr.cc
02/13/14-21:01:33 (GMT) (NvpsPersistentSyncM): WARN: nvps Exception DbmNoFileSystemException: recType: 19 Line 354 File nvpsPersistentSyncMgr.cc
02/13/14-21:01:33 (GMT) (NvpsPersistentSyncM): WARN: nvps Exception DbmNoFileSystemException: recType: 19 Line 354 File nvpsPersistentSyncMgr.cc
02/13/14-21:02:00 (GMT) (tRAID): WARN: QLMailboxCommand: Cmd = 0069, completion timeout
02/13/14-21:02:00 (GMT) (tRAID): WARN: QLMailboxCommand: command completion timeout, cmd = 0x69
02/13/14-21:02:01 (GMT) (tRAID): NOTE: Qlogic coredump file written to 'G2GJ7M1:/tmp/QLogic_Coredump_port_0_G2GJ7M1',rc 204E50, expected 204E50
02/13/14-21:02:01 (GMT) (tRAID): WARN: Qlogic coredump file write failed.fclose returned -1
02/13/14-21:02:01 (GMT) (tRAID): NOTE: QLProcessSystemError: Restart RISC
02/13/14-21:02:01 (GMT) (tRAID): ERROR: QLGetFwState: MBOX_CMD_GET_FW_STATE failed. Stat f000
02/13/14-21:02:01 (GMT) (tRAID): NOTE: QLRebootTimer: Status after Get FW State 4543
02/13/14-21:02:01 (GMT) (tRAID): NOTE: QLRebootTimer: QLGetFwState failed
02/13/14-21:02:02 (GMT) (tRAID): NOTE: QLStartFw: Downloading Driver's FW image 03.00.01.47 from 0058c3a0 4c0c8 bytes , result 0
02/13/14-21:02:29 (GMT) (tRAID): WARN: QLMailboxCommand: Cmd = 0069, completion timeout
02/13/14-21:02:29 (GMT) (tRAID): WARN: QLMailboxCommand: command completion timeout, cmd = 0x69
02/13/14-21:02:30 (GMT) (tRAID): NOTE: Qlogic coredump file written to 'G2GJ7M1:/tmp/QLogic_Coredump_port_0_G2GJ7M1',rc 204E50, expected 204E50
02/13/14-21:02:30 (GMT) (tRAID): WARN: Qlogic coredump file write failed.fclose returned -1
02/13/14-21:02:30 (GMT) (tRAID): NOTE: QLProcessSystemError: Restart RISC
02/13/14-21:02:30 (GMT) (tRAID): ERROR: QLGetFwState: MBOX_CMD_GET_FW_STATE failed. Stat f000
02/13/14-21:02:30 (GMT) (tRAID): NOTE: QLRebootTimer: Status after Get FW State 4543
02/13/14-21:02:30 (GMT) (tRAID): NOTE: QLRebootTimer: QLGetFwState failed
02/13/14-21:02:31 (GMT) (tRAID): NOTE: QLStartFw: Downloading Driver's FW image 03.00.01.47 from 0058c3a0 4c0c8 bytes , result 0
02/13/14-21:02:58 (GMT) (tRAID): WARN: QLMailboxCommand: Cmd = 0069, completion timeout
02/13/14-21:02:58 (GMT) (tRAID): WARN: QLMailboxCommand: command completion timeout, cmd = 0x69
02/13/14-21:02:59 (GMT) (tRAID): NOTE: Qlogic coredump file written to 'G2GJ7M1:/tmp/QLogic_Coredump_port_0_G2GJ7M1',rc 204E50, expected 204E50
02/13/14-21:02:59 (GMT) (tRAID): WARN: Qlogic coredump file write failed.fclose returned -1
02/13/14-21:02:59 (GMT) (tRAID): NOTE: QLProcessSystemError: Restart RISC
02/13/14-21:02:59 (GMT) (tRAID): ERROR: QLGetFwState: MBOX_CMD_GET_FW_STATE failed. Stat f000
02/13/14-21:02:59 (GMT) (tRAID): NOTE: QLRebootTimer: Status after Get FW State 4543
02/13/14-21:02:59 (GMT) (tRAID): NOTE: QLRebootTimer: QLGetFwState failed
02/13/14-21:03:00 (GMT) (tRAID): NOTE: QLStartFw: Downloading Driver's FW image 03.00.01.47 from 0058c3a0 4c0c8 bytes , result 0
02/13/14-21:03:27 (GMT) (tRAID): WARN: QLMailboxCommand: Cmd = 0069, completion timeout
02/13/14-21:03:27 (GMT) (tRAID): WARN: QLMailboxCommand: command completion timeout, cmd = 0x69
02/13/14-21:03:28 (GMT) (tRAID): NOTE: Qlogic coredump file written to 'G2GJ7M1:/tmp/QLogic_Coredump_port_0_G2GJ7M1',rc 204E50, expected 204E50
02/13/14-21:03:28 (GMT) (tRAID): WARN: Qlogic coredump file write failed.fclose returned -1
02/13/14-21:03:28 (GMT) (tRAID): NOTE: QLProcessSystemError: Restart RISC
02/13/14-21:03:28 (GMT) (tRAID): ERROR: QLGetFwState: MBOX_CMD_GET_FW_STATE failed. Stat f000
02/13/14-21:03:28 (GMT) (tRAID): NOTE: QLRebootTimer: Status after Get FW State 4543
02/13/14-21:03:28 (GMT) (tRAID): NOTE: QLRebootTimer: QLGetFwState failed
02/13/14-21:03:30 (GMT) (tRAID): NOTE: QLStartFw: Downloading Driver's FW image 03.00.01.47 from 0058c3a0 4c0c8 bytes , result 0
02/13/14-21:03:57 (GMT) (tRAID): WARN: QLMailboxCommand: Cmd = 0069, completion timeout
02/13/14-21:03:57 (GMT) (tRAID): WARN: QLMailboxCommand: command completion timeout, cmd = 0x69
02/13/14-21:03:58 (GMT) (tRAID): NOTE: Qlogic coredump file written to 'G2GJ7M1:/tmp/QLogic_Coredump_port_0_G2GJ7M1',rc 204E50, expected 204E50
02/13/14-21:03:58 (GMT) (tRAID): WARN: Qlogic coredump file write failed.fclose returned -1
02/13/14-21:03:58 (GMT) (tRAID): NOTE: QLProcessSystemError: Restart RISC
02/13/14-21:03:58 (GMT) (tRAID): ERROR: QLGetFwState: MBOX_CMD_GET_FW_STATE failed. Stat f000
02/13/14-21:03:58 (GMT) (tRAID): NOTE: QLRebootTimer: Status after Get FW State 4543
02/13/14-21:03:58 (GMT) (tRAID): NOTE: QLRebootTimer: QLGetFwState failed
02/13/14-21:03:59 (GMT) (tRAID): WARN: QLStartAdapter: ControllerErrorCount exceeds threshold.
02/13/14-21:03:59 (GMT) (tRAID): ERROR: QLInitializeDevice: QLStartAdapter failed
02/13/14-21:03:59 (GMT) (tRAID): ERROR: QLAddDevice: controller/device/chip init
-=<###>=-
Attaching interface lo0... done
Adding 9768 symbols for standalone.
Error
02/13/14-21:04:12 (GMT) (tRootTask): NOTE: I2C transaction returned 0x0423fe00
WARNING: Restart by watchdog time out
Annnnd repeat. Everything from "SOD Sequence is Normal, 0" to "WARNING: Restart by watchdog time out" repeats endlessly. The company bought this unit new, but it has been out of warranty for about 6 months. So I'm posting this to the community in the hope that someone can point me to a fix and that the controller isn't dead (I don't have much of a budget to work with).
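For anyone who wants to skim a saved copy of a capture like this, here is a rough Python sketch that collapses the repeated WARN and ERROR lines and keeps the first timestamp seen for each message. The function name `summarize` and the line pattern are assumptions based only on the format of the log above; this is not a Dell tool, and it ignores lines that don't carry a timestamp (such as the watchdog warning).

```python
import re

# Assumed line shape, taken from the capture above, e.g.:
# 02/13/14-21:01:30 (GMT) (tRAID): WARN: dbm::RWFileSystem::initialize: ...
LINE_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}) \(GMT\) "
    r"\((?P<task>[^)]+)\): (?P<level>NOTE|WARN|ERROR): (?P<msg>.*)$"
)


def summarize(lines):
    """Return (first_stamp, level, count, message) for each distinct
    WARN/ERROR message, in the order the messages first appear."""
    summary = {}  # (level, message) -> [first_stamp, count]
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None or m["level"] == "NOTE":
            continue  # skip NOTE lines and anything without a timestamp
        key = (m["level"], m["msg"])
        if key in summary:
            summary[key][1] += 1
        else:
            summary[key] = [m["stamp"], 1]
    return [
        (stamp, level, count, msg)
        for (level, msg), (stamp, count) in summary.items()
    ]


# A few lines copied from the capture above, used as sample input.
sample = [
    "02/13/14-20:59:13 (GMT) (tRAID): NOTE: Set Powerup State",
    "02/13/14-21:01:30 (GMT) (tRAID): WARN: dbm::RWFileSystem::initialize: "
    "Exception caught, ConstructorIOException: -16",
    "02/13/14-21:02:00 (GMT) (tRAID): WARN: QLMailboxCommand: Cmd = 0069, completion timeout",
    "02/13/14-21:02:29 (GMT) (tRAID): WARN: QLMailboxCommand: Cmd = 0069, completion timeout",
    "esmc0: LinkUp event",
]

for stamp, level, count, msg in summarize(sample):
    print(f"{stamp} {level:5} x{count}  {msg}")
```

Because the summary keeps messages in first-seen order, the earliest warning in a boot cycle ends up at the top, which makes it easier to see what failed first before the QLogic retries start.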
Thanks ahead of time for any help that you can offer.
Josh.