Troubleshooting

Dell OpenManage™ Array Manager 3.4


This chapter contains status message information, troubleshooting procedures, and common problems and solutions. It also has a separate section for troubleshooting the Dell PowerVault™ 660F and 224F storage systems.

Disks and Volumes

If a disk or volume fails, it is important to repair the disk or volume as quickly as possible to avoid data loss. Because time is critical, Array Manager makes it easy for you to locate problems quickly. In the Status column of the list view, you can view the status of a disk or volume. The status also appears in the graphical view of each disk or volume. If the status is not Healthy for volumes or Online for disks, use the status information to determine the problem and then fix it.

There are also various troubleshooting procedures for disks, volumes, and arrays.

Topics include:

Disk Status Descriptions

One of the following disk status descriptions will always appear in the Status column of the disk in the right pane of the console window. If there is a problem with a disk, you can use this troubleshooting chart to diagnose and correct the problem.

Status

Meaning

Online
The disk is accessible and has no known problems. This is the normal disk status. No user action is required. Both dynamic disks and basic disks display the Online status.

Online (Errors)
This status indicates that the disk is in an error state or that I/O errors have been detected on a region of the disk. All the volumes on the disk will display Failed or Failed Redundancy status, and you may not be able to create new volumes on the disk. Only dynamic disks display this status.
Right-click the failed disk and select Reactivate Disk to bring the disk to an Online status and bring all the volumes to a Healthy status.

Offline
The disk is not accessible. The disk may be corrupted or intermittently unavailable. An error icon appears on the offline disk. Only dynamic disks display the Offline status.
If the disk status is Offline and a separate corresponding icon titled Missing Disk appears, the disk was recently available on the system but can no longer be located or identified. The Missing disk may be corrupted, powered down, or disconnected, or the disk may be a virtual disk that has been deleted.

Unreadable
The disk is not accessible. The disk may have experienced hardware failure, corruption, or I/O errors. The disk's copy of the system's disk configuration database may be corrupted. An error icon appears on the Unreadable disk. Both dynamic and basic disks display the Unreadable status.
Disks may display the Unreadable status while they are spinning up or when Array Manager is rescanning all the disks on the system. In some cases, an Unreadable disk has failed and is not recoverable. For dynamic disks, the Unreadable status usually results from corruption or I/O errors on part of the disk, rather than failure of the entire disk. You can rescan the disks (using the Rescan Disks command) or reboot the computer to see if the disk status changes.

Unrecognized
The disk has an original equipment manufacturer's (OEM) signature and Array Manager will not allow you to use this disk. For example, a disk from a UNIX system displays the Unrecognized status. Only Unknown disk types display the Unrecognized status.

Foreign Disk
The disk has been moved to your computer from another Microsoft® Windows NT® or Windows® 2000 computer and has not been set up for use. Only dynamic disks display this status. To add the disk so that it can be used, right-click the disk and select Merge Foreign Disk. All existing volumes on the disk will be visible and accessible.

Because a volume can span more than one disk (e.g., a mirrored volume), it is important that you first verify your disk configurations and then move the entire disk set that the volume is on. If only part of the disk set is moved, some of the volumes will show the Failed Redundancy or Failed error condition.
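The chart above can be summarized as a simple lookup from disk status to a first troubleshooting step. The following is a minimal sketch only: Array Manager shows these statuses in its console, and the dictionary and function names here are hypothetical and not part of any Array Manager interface. The wording of the Offline and Unrecognized entries paraphrases the chart.

```python
# Illustrative lookup only; the names below are hypothetical and do not
# correspond to any Array Manager API.
DISK_STATUS_ACTIONS = {
    "Online": "No user action is required.",
    "Online (Errors)": "Right-click the disk and select Reactivate Disk.",
    "Offline": "Check whether the disk is corrupted, powered down, or disconnected.",
    "Unreadable": "Rescan the disks or reboot, then check whether the status changes.",
    "Unrecognized": "Array Manager will not allow you to use this disk.",
    "Foreign Disk": "Right-click the disk and select Merge Foreign Disk.",
}


def suggest_disk_action(status):
    """Return the first step suggested by the chart for a disk status."""
    try:
        return DISK_STATUS_ACTIONS[status]
    except KeyError:
        raise ValueError("unknown disk status: %r" % (status,)) from None
```

For example, `suggest_disk_action("Foreign Disk")` returns the Merge Foreign Disk step, and an unlisted status raises `ValueError` rather than guessing.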


Array Disk Status Information

These definitions appear in the Status line and indicate the condition of array disks.

Status line entry

Status indication

Unknown
May signify a problem or indicate a transitional state. Additionally, a new disk that had previously been formatted or initialized by another type of RAID controller may show this state.

Ready
Means the array disk is operational. For PERC 2/SC, 3/SC, 2/DC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, and CERC ATA100/4ch controllers, Ready status applies to operational array disks that are not part of a virtual disk.
For the PERC 2, PERC 2/Si, PERC 3/Si, and PERC 3/Di controllers, operational array disks display Ready status regardless of whether they are a part of a virtual disk or not.

Failed
Not operational. A disk needs repair, has been removed, or has another problem that prevents operation.

Online
Operational. Applies to array disks contained in a virtual disk on PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, and CERC ATA100/4ch controllers.

Offline
The drive is not available to the RAID controller.

Degraded
Refers to a fault-tolerant array/virtual disk that has a failed disk. This state definition may also appear when resynching the array/virtual disk, since the array/virtual disk is not fault-tolerant during the resynchronization.

Recovering
Refers to the state of recovering from bad blocks on disks.

Removed
Indicates that the array disk has been removed.

Resynching
This state definition appears during the following types of disk operations: Transform Type, Reconfiguration, and Check Consistency.

Rebuilding
Refers to part of a virtual disk being rebuilt.

No Media
CD-ROM or removable disk has no media.

Formatting
Refers to an array disk that is in the process of formatting.

Diagnostics
Indicates that diagnostics are running.

Reconstructing
The configuration of a virtual disk has been changed. The individual array disks within the virtual disk are being modified to support the changes. The data on the virtual disk will be saved. You cannot cancel a virtual disk reconstruction.

Initializing
Applies only to virtual disks on PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, and CERC ATA100/4ch controllers. This prepares the virtual disk for use by Array Manager by deleting the configuration information on this virtual disk. The data on the virtual disk will be lost.
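The Ready and Online entries above depend on the controller family, which is easy to misread. The sketch below restates that rule, assuming the two controller groupings given in the table; the set names and the function are hypothetical and exist only for illustration.

```python
# Controller groupings as listed in the Ready and Online entries above.
# GROUP_A and GROUP_B are hypothetical labels used only in this sketch.
GROUP_A = {
    "PERC 2/SC", "PERC 2/DC", "PERC 3/SC", "PERC 3/DCL", "PERC 3/DC",
    "PERC 3/QC", "PERC 4/SC", "PERC 4/DC", "PERC 4/Di", "CERC ATA100/4ch",
}
GROUP_B = {"PERC 2", "PERC 2/Si", "PERC 3/Si", "PERC 3/Di"}


def expected_operational_status(controller, in_virtual_disk):
    """Return the status an operational array disk is expected to show."""
    if controller in GROUP_B:
        # These controllers report Ready whether or not the disk
        # belongs to a virtual disk.
        return "Ready"
    if controller in GROUP_A:
        # These controllers report Online for members of a virtual disk
        # and Ready for disks that are not part of one.
        return "Online" if in_virtual_disk else "Ready"
    raise ValueError("controller not covered by the table: %r" % (controller,))
```

An operational disk on a PERC 3/Di therefore shows Ready even when it is part of a virtual disk, whereas the same disk on a PERC 4/DC would show Online.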

Disk Troubleshooting Procedures

The following sections describe disk troubleshooting procedures:

See also the following sections for these and other troubleshooting procedures:

Volume Status Descriptions

One of the following volume status descriptions will always appear in the graphical view of the volume and in the Status column of the volume in list view. If there is a problem with a volume, you can use this troubleshooter to diagnose and correct the problem.

Status

Meaning

Healthy
The volume is accessible and has no known problems. This is the normal volume status. No user action is required. Both dynamic volumes and basic volumes display the Healthy status.

Healthy (At Risk)
The volume is currently accessible, but I/O errors have been detected on the underlying disk. If an I/O error is detected on any part of a disk, all volumes on the disk display the Healthy (At Risk) status. A warning icon appears on the volume. Only dynamic volumes display the Healthy (At Risk) status.
When the volume status is Healthy (At Risk), an underlying disk's status is usually Online (Errors). To return the underlying disk to the Online status, reactivate the disk (using the Reactivate Disk command). Once the disk is returned to Online status, the volume should return to the Healthy status.

Initializing
The volume is being initialized. Dynamic volumes display the Initializing status.
No user action is required. When initialization is complete, the volume's status becomes Healthy. Initialization should be completed very quickly.

Resynching
The volume's mirrors are being resynchronized so that both mirrors contain identical data. Both dynamic and basic mirrored volumes display the Resynching status.
No user action is required. When resynchronization is complete, the mirrored volume's status returns to Healthy. Resynchronization may take some time, depending on the size of the mirrored volume. Although you can access a mirrored volume while resynchronization is in progress, you should avoid making configuration changes (such as breaking a mirror) during resynchronization.

Regenerating
Data and parity are being regenerated for the RAID-5 volume. Both dynamic and basic RAID-5 volumes display the Regenerating status.
No user action is required. When regeneration is complete, the RAID-5 volume's status returns to Healthy. You can access a RAID-5 volume while data and parity regeneration is in progress.

Failed Redundancy
The data on the volume is no longer fault tolerant because one of the underlying disks is not online. A warning icon appears on the volume with Failed Redundancy. The Failed Redundancy status applies only to mirrored or RAID-5 volumes. Both dynamic and basic volumes display the Failed Redundancy status.
You can continue to access the volume using the remaining online disks, but if another disk that contains the volume fails, you will lose the volume and its data. To avoid such loss, you should attempt to repair the volume as soon as possible.
A Failed Redundancy status will also display if a disk was moved and the volume on it spanned more than the single disk. To correct the problem, you must move the entire disk set that contains all the appropriate volumes.

Failed Redundancy (At Risk)
The data on the volume is no longer fault tolerant, and I/O errors have been detected on the underlying disk. If an I/O error is detected on any part of a disk, all volumes on the disk display the (At Risk) status. A warning icon appears on the volume. Only dynamic mirrored or RAID-5 volumes display the Failed Redundancy (At Risk) status.
When the volume status is Failed Redundancy (At Risk), the underlying disk's status is usually Online (Errors). To return the underlying disk to the Online status, reactivate the disk (using the Reactivate Disk command). Once the disk is returned to the Online status, the volume status should change to Failed Redundancy.

Failed
The volume cannot be started automatically. An error icon appears on the failed volume. Both dynamic and basic volumes display the Failed status.

Formatting
The volume is being formatted using the specifications you chose for formatting.

No Media
No media has been inserted into the CD-ROM or removable drive. The volume status will become Online when you insert the appropriate media into the CD-ROM or removable drive. Only CD-ROM or removable disk types display the No Media status.
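The (At Risk) statuses above are tied to the status of the underlying disk. As a rough sketch of that relationship, the following maps a volume status to the status it should show once the underlying disk has been reactivated and returned to Online. The names are hypothetical and the mapping covers only the transitions stated in the descriptions above.

```python
# Statuses for which the descriptions above say no user action is required.
NO_ACTION_REQUIRED = {"Healthy", "Initializing", "Resynching", "Regenerating"}

# Expected volume status after the underlying disk returns to Online.
AFTER_DISK_REACTIVATION = {
    "Healthy (At Risk)": "Healthy",
    "Failed Redundancy (At Risk)": "Failed Redundancy",
}


def status_after_disk_reactivation(volume_status):
    """Return the expected volume status after Reactivate Disk succeeds.

    Statuses that are not (At Risk) are returned unchanged, because the
    descriptions above do not state a transition for them.
    """
    return AFTER_DISK_REACTIVATION.get(volume_status, volume_status)
```

Note that a Failed Redundancy (At Risk) volume only drops the (At Risk) qualifier; it still needs to be repaired before it is fault tolerant again.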


Volume Troubleshooting Procedures

The following sections describe common volume troubleshooting procedures:

See also the following sections for these and other troubleshooting procedures:

Common Troubleshooting Procedures

This section describes commands and procedures that can be used in troubleshooting. Topics covered include:

Cables attached correctly

Verify that the power-supply cord and adapter cables are attached correctly. If the system is having trouble with read and write operations to a particular array (if the system hangs, for example), then make sure that the SCSI cables attached to the array are secure. If the connection is secure but the problem persists, you may need to replace a cable. See also the "Isolate SCSI device problems" section.

System Requirements

Make sure that the system meets all system requirements as described in the readme.txt file located in the installation directory. In particular, verify that the correct levels of firmware and drivers are installed on the system. For more information on drivers and firmware, see the "Drivers and Firmware" section.

Drivers and Firmware

Array Manager is tested with the supported controller firmware and drivers. The supported controllers and firmware are listed in the readme.txt file. To avoid possible conflicts or inconsistencies between the controller firmware and drivers, it is recommended that you only use the supported versions. The most current versions can be obtained from the Dell support site at http://support.dell.com.

In a SAN environment, all LS modules in an array should have the same firmware version. When upgrading the firmware on an LS module, make sure to upgrade the firmware on the other LS modules at the same time.
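If you keep an inventory of the firmware versions in an array, the rule above reduces to checking that only one version is present. The sketch below assumes you have collected the versions yourself; the module names and version strings are made up for illustration, and nothing here queries an LS module.

```python
def firmware_is_consistent(firmware_by_module):
    """Return True if every LS module reports the same firmware version.

    firmware_by_module maps a module name to its firmware version string.
    """
    return len(set(firmware_by_module.values())) <= 1


def modules_by_version(firmware_by_module):
    """Group module names by firmware version to show which ones differ."""
    groups = {}
    for module, version in firmware_by_module.items():
        groups.setdefault(version, []).append(module)
    return groups
```

With the hypothetical inventory `{"LS-0": "1.0", "LS-1": "1.1"}`, `firmware_is_consistent` returns `False` and `modules_by_version` shows which module is on which version.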

It is also recommended to obtain and apply the latest Dell PowerEdge™ Server System BIOS on a periodic basis to benefit from the most recent improvements. Please refer to the Dell PowerEdge system documentation for more information.

Isolate SCSI device problems

If you receive a "timeout" event related to a SCSI device or if you otherwise suspect that one of the SCSI devices is experiencing a hardware failure, then do the following to confirm the problem:

Verify that the cables are correctly attached.
If the cables are correctly attached and you are still experiencing the problem, then disconnect the device cables and reboot the system. If the system reboots successfully, then one of the SCSI devices may be defective. Refer to the SCSI device documentation for more information.

Rescan to Update Information

Use Rescan to update disk information. This operation may take a few minutes if there are a number of devices attached to the system. You will see a message "Getting hardware configuration. Please wait." while the rescan is occurring.

If this does not properly update the disk information, you may need to reboot your system.

Maintain integrity of redundant (mirrored and parity) information

The Check Consistency function determines the accuracy of mirrored data and parity information. When necessary, this feature rebuilds the parity information. For more information, see the following sections:

"Check Consistency" for PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, and CERC ATA100/4ch controllers
"Check Consistency" for PERC 2, 2/Si, 3/Si, and 3/Di controllers
"Check Consistency" for the PowerVault 660F storage system

Reactivate a Disk

Reboot your machine to update the list of existing disks.
Right-click the disk marked Missing or Offline dynamic disk.
Use Rescan to change the disk status to Online (Errors).
Right-click the disk marked Missing or Offline dynamic disk. Select Reactivate Disk from the context menu. The disk should be marked Online after the disk is reactivated.
For any volumes that are not Healthy, right-click the volume and select Reactivate Volume from the context menu.

Caution: When reactivating a volume, be aware that the volume's data is restored, even if it is stale, corrupt, or out-of-date. See "Reactivate a Dynamic Volume" for more information on the consequences of reactivating a volume.

Bring a Dynamic RAID-5 or Mirrored Volume Back Online

A RAID-5 volume's status can appear as Failed Redundancy while the underlying disk's status is Offline. The disk's name may be Missing, and an error icon (X) appears on the missing or offline disk. In this case, do the following:

Rescan the disk to make sure the disk, controller, or cable problem is fixed.
Try to reactivate the disk by right-clicking on the disk and selecting Reactivate Disk.
If the volume remains as Failed Redundancy or Failed, right-click the volume, then select Reactivate Volume. If all disks on this volume are Online, the volume should be brought back to a healthy state.

Reactivate a Dynamic Volume

Reactivating a volume attempts to restart all volumes regardless of the volume's state. If data corruption exists, you can reactivate the volume and then run the chkdsk utility. However, in the case of a mirrored or RAID-5 volume, reactivating a volume with stale data can cause that data to be used when it is inaccurate.

Reactivating a volume should be done only if you understand that the volume's data, which might be corrupted, will be restored. For example, if one mirror in a mirrored volume fails and data is written to the remaining mirror, the data is now out of sync. Then, if the remaining mirror (the one with accurate data) fails and the first mirror is reactivated, the stale data becomes "real" data.

For this reason, it is important to act on data failures as soon as possible. You should use care when reactivating volumes.
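The mirrored-volume example above can be traced with a toy model. This is a minimal sketch that only mimics the sequence of events described; the `Mirror` class and the `"v1"`/`"v2"` values are hypothetical and do not represent how Array Manager stores data.

```python
class Mirror:
    """One half of a mirrored volume (toy model, not an Array Manager object)."""

    def __init__(self, name):
        self.name = name
        self.data = "v1"      # both mirrors start in sync
        self.online = True


def write(mirrors, value):
    """Write to every mirror that is currently online."""
    for mirror in mirrors:
        if mirror.online:
            mirror.data = value


def simulate_stale_reactivation():
    """Replay the failure sequence and return (first.data, second.data)."""
    first, second = Mirror("first"), Mirror("second")
    first.online = False            # one mirror fails
    write([first, second], "v2")    # new data reaches only the remaining mirror
    second.online = False           # the mirror with accurate data fails
    first.online = True             # the first mirror is reactivated
    # The volume now serves whatever the reactivated mirror holds.
    return first.data, second.data
```

After the sequence, the reactivated mirror still holds the old value while the newer write exists only on the mirror that is now offline, which is the situation the paragraph above warns about.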

Repair Basic Volumes

Make sure that the underlying physical disk is turned on, plugged in, and attached to the computer. No other user action is possible for basic volumes unless the volumes are mirrored or RAID-5 volumes that were originally created in NT Disk Administrator. The repair of these volumes is covered in the next topic.

Repair Dynamic Volumes

If the disks are not online, use the Rescan and then the Reactivate Disk commands to return the disk to the Online status. If this succeeds, the volume automatically restarts and returns to the Healthy status. A mirrored volume repairs itself by resynchronizing the data in its mirrors. A RAID-5 volume repairs itself by regenerating its parity and data.
If the disk returns to the Online status but the volume does not return to the Healthy status, you can reactivate the volume manually (using the Reactivate Volume command).

If the volume is a mirrored or RAID-5 volume with stale data, bringing the underlying disk online will not automatically restart the volume. If the disks that contain non-stale data are disconnected, you should bring those disks online first (to allow the data to become synchronized). Otherwise, restart the mirrored or RAID-5 volume manually (using the Reactivate Disk command), and then run Chkdsk.exe. To run Chkdsk.exe, click Start, click Run, type chkdsk, and then click OK.
If the disk does not return to the Online status and the volume does not return to the Healthy status, there may be something wrong with the disk. You should replace the failed mirror or RAID-5 disk region. To replace the failed mirror in a mirrored volume, use the Remove Mirror command to remove the failed mirror, then use the Add Mirror command to create a new mirror on another disk. To replace the failed disk region in a RAID-5 volume, use the Repair RAID-5 Volume command.
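The repair steps above form a small decision tree. The following sketch restates it under simplifying assumptions: it ignores the stale-data case, it treats "reactivation attempted" as a flag you track yourself, and the function name, parameters, and layout labels are hypothetical.

```python
def next_repair_step(disk_online, volume_healthy, reactivation_attempted, layout):
    """Suggest the next command for a dynamic mirrored or RAID-5 volume.

    layout is "mirrored" or "raid5" (labels chosen for this sketch).
    """
    if disk_online and volume_healthy:
        return "No action needed"
    if disk_online:
        # Disk is back, but the volume did not restart by itself.
        return "Reactivate Volume"
    if not reactivation_attempted:
        return "Rescan, then Reactivate Disk"
    # The disk did not come back: replace the failed mirror or disk region.
    if layout == "mirrored":
        return "Remove Mirror, then Add Mirror on another disk"
    if layout == "raid5":
        return "Repair RAID-5 Volume"
    raise ValueError("unsupported layout: %r" % (layout,))
```

The ordering reflects the text: try to bring the disk online first, reactivate the volume only if it does not recover automatically, and replace the failed mirror or region only when the disk cannot be brought back.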

There are particular considerations regarding dynamic disks and volumes on NetWare, Windows Server 2003, and Linux. See "Dynamic Disk and Volume Support on NetWare, Windows Server 2003, and Linux" for more information.

Repair a Dynamic RAID-5 Volume

Right-click the volume, then click Repair RAID-5 Volume.
A message appears that indicates that the repair will be attempted if there is another dynamic disk with adequate unallocated space. Click Yes to confirm the repair.
The volume should be brought back to a healthy state.

You should be able to repair a RAID-5 volume if it is in a state of Failed Redundancy, and if there is unallocated space on another dynamic disk available. To avoid data loss, you should attempt to repair the volume as soon as possible.

Repair Basic Mirrored or RAID-5 Volumes

Use Microsoft Windows NT Disk Administrator to repair basic mirrored or RAID-5 volumes if you are running Windows NT 4.0. For Windows 2000, there is a command available from the context menu for repairing basic mirrored or RAID-5 volumes.

Caution: In Windows NT 4.0, Disk Administrator should never be used while Array Manager is running, especially if there are tasks running on the controller at the time. Data loss can occur if both applications are running simultaneously.

Drive Letters and Drive Mapping

A Drive Letter is Unavailable

After deleting a basic disk, the drive letter used by that disk may no longer be available. To correct this problem, reboot the server.

Drive Mapping is Not Working

Drive mapping may not work properly on Windows NT and Windows 2000 systems with PERC 3/DC, PERC 3/DCL, PERC 3/QC, PERC 2/DC, PERC 3/SC, PERC 2/SC, PERC 4/SC, PERC 4/DC, PERC 4/Di, PERC 4/IM, and CERC ATA100/4ch controllers. After creating a virtual disk on these controllers, the disk may not be visible in the disk folder until the system is rebooted. After rebooting the system, the mapping between the newly created disk and the corresponding Windows NT or Windows 2000 disk may not be displayed in the Array Manager console.

Solution for Windows NT:

After creating a virtual disk and rebooting the system, do a console rescan by either clicking the Rescan button or selecting Rescan from the View pull-down menu.

Solution for Windows 2000:

When using a PERC 2/SC or 2/DC controller, upgrade your driver to MRAID35X.SYS version 2.68 or later.

Recovering from Removing the Wrong Drive

If the drive that you mistakenly removed is part of a redundant virtual disk that also has a hot spare, then the virtual disk rebuilds automatically either immediately or when a write request is made. After the rebuild has completed, the virtual disk will no longer have a hot spare since data has been rebuilt onto the disk previously assigned as a hot spare. In this case, you should assign a new hot spare.

If the drive that you removed is part of a redundant virtual disk that does not have a hot spare, then replace the drive and do a rebuild.

See the following sections for information on rebuilding drives and assigning hot spares:

For PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, and CERC ATA100/4ch controllers, see "Rebuild" and "Assign Global Hot Spare."
For PERC 2, 2/Si, 3/Si, and 3/Di controllers, see "Configure Dedicated Hot Spare."
For the PowerVault 660F storage system, see "Rebuild" and "Make Global Hot Spare."

You can avoid removing the wrong drive by blinking the LED display on the drive that you intend to remove. See the following sections for information on blinking the LED display:

"Blink" for the PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, and 4/Di controllers
"Blink" for the PERC 2, 2/Si, 3/Si, and 3/Di controllers
"Blink Disk" for the PowerVault 660F storage system

Resolving Windows Upgrade Problems

If you upgrade the Windows operating system on a server, you may find that Array Manager no longer functions after the upgrade. The installation process installs files and makes registry entries on the server that are specific to the operating system. For this reason, changing the operating system can disable Array Manager.

To avoid this problem, you should uninstall Array Manager before upgrading. If you have already upgraded without uninstalling Array Manager, however, you should uninstall Array Manager after the upgrade.

After you have uninstalled Array Manager and completed the upgrade, reinstall Array Manager using the Array Manager install media. You can download Array Manager from the Dell support site at http://support.dell.com.

Problem Situations and Solutions

This section contains additional troubleshooting problem areas. Topics include:

Note: If you are using the Dell PowerVault 660F storage system and the PowerVault 224F enclosure, see "Dell PowerVault 660F and 224F Storage Systems Troubleshooting" for additional issues specific to the PowerVault 660F storage system and PowerVault 224F enclosure.

Rebuild does not work

A rebuild will not work in the following situations:

The virtual disk is non-redundant. For example, a RAID 0 virtual disk cannot be rebuilt because RAID 0 does not provide data redundancy.
There is no hot spare assigned to the virtual disk. As long as the virtual disk is redundant, you can do the following to rebuild it:

Pull out the failed array disk and replace it. A rebuild will automatically start on the new disk.
Assign a hot spare to the virtual disk and then perform a rebuild.

An array group contains both redundant and non-redundant virtual disks. On the PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, 4/IM, and CERC ATA100/4ch controllers, a rebuild is not performed for an array disk that is used by both redundant and non-redundant virtual disks. In order to rebuild the redundant virtual disk, you need to delete the non-redundant virtual disk. Before deleting this disk, however, you can attempt to recover data from the failed array disk by forcing it back online.
An array disk has been removed, and the system has not yet attempted to write data to the removed disk. In this case, the system will not recognize the removal of an array disk until it attempts a write operation to the disk. If the array disk is part of a redundant virtual disk, then the system will rebuild the disk after attempting a write operation. This situation applies to PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, 4/IM, and CERC ATA100/4ch controllers.
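The situations above can be read as a checklist. The sketch below collects the reasons a rebuild would not start, given facts you have already determined by inspection. The function and parameter names are hypothetical, and the last two checks apply only to the controllers named in those items.

```python
def rebuild_blockers(redundant, has_hot_spare,
                     shares_disk_with_nonredundant,
                     write_attempted_since_removal):
    """Return the reasons (possibly none) that a rebuild will not start."""
    reasons = []
    if not redundant:
        reasons.append("virtual disk is non-redundant (for example, RAID 0)")
    if not has_hot_spare:
        reasons.append("no hot spare assigned: replace the failed disk "
                       "or assign a hot spare")
    if shares_disk_with_nonredundant:
        reasons.append("array disk is also used by a non-redundant "
                       "virtual disk")
    if not write_attempted_since_removal:
        reasons.append("removal not yet detected: no write has been "
                       "attempted to the removed disk")
    return reasons
```

An empty list means none of the listed situations applies; it does not by itself guarantee that a rebuild will succeed.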

Cannot create a virtual disk (option is inactive)

Check:

How many virtual disks exist? You can create a maximum of 8 virtual disks on one PERC 2/SC or PERC 2/DC controller and a maximum of 16 virtual disks on a CERC ATA100/4ch controller. You can create a maximum of 24 virtual disks on a PERC 2, PERC 2/Si, PERC 3/Si, or PERC 3/Di controller, and 40 virtual disks on a PERC 3/SC, 3/DCL, PERC 3/DC, PERC 3/QC, 4/SC, 4/DC, or PERC 4/Di controller.
Is there adequate unallocated space on the disk? You must have adequate available disk space to create a virtual disk.
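The per-controller limits above are easier to compare as a table. The following sketch encodes the maximums stated in the first check; the dictionary and function names are hypothetical, and the check covers only the virtual disk count, not available disk space.

```python
# Maximum number of virtual disks per controller, as stated above.
MAX_VIRTUAL_DISKS = {
    "PERC 2/SC": 8, "PERC 2/DC": 8,
    "CERC ATA100/4ch": 16,
    "PERC 2": 24, "PERC 2/Si": 24, "PERC 3/Si": 24, "PERC 3/Di": 24,
    "PERC 3/SC": 40, "PERC 3/DCL": 40, "PERC 3/DC": 40, "PERC 3/QC": 40,
    "PERC 4/SC": 40, "PERC 4/DC": 40, "PERC 4/Di": 40,
}


def can_create_virtual_disk(controller, existing_count):
    """Return True if the controller is below its virtual disk limit."""
    try:
        limit = MAX_VIRTUAL_DISKS[controller]
    except KeyError:
        raise ValueError("no limit listed for %r" % (controller,)) from None
    return existing_count < limit
```

For example, a PERC 2/DC that already has 8 virtual disks is at its limit, so the create option would be expected to be inactive.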

Cannot create a RAID-5 volume

Check:

Is there adequate unallocated space on three or more disks? You must have at least three disks to create a RAID-5 volume.
Are you using an NT Workstation or Windows 2000 Professional machine? You cannot create a RAID-5 volume on those machines. This restriction applies to software RAID only (that is, dynamic volumes). You need to use an NT Server or Windows 2000 Server machine. Another option is that you can implement a hardware RAID-5 virtual disk and then create a volume on the virtual disk.

Cannot create a mirror

Check:

Is there adequate unallocated space on two different disks? You must have two disks to create a mirrored volume. You also can create a mirror only on a simple or spanned volume.
Are you using an NT Workstation or Windows 2000 Professional machine? You cannot create a RAID-1 volume on those machines. This restriction applies to software RAID only (that is, dynamic volumes). You need to use an NT Server or Windows 2000 Server machine. Another option is that you can implement a hardware RAID-1 virtual disk and then create a volume on the virtual disk.
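
The checks for software RAID-5 and mirrored volumes follow the same pattern: a minimum number of disks with unallocated space, and a server edition of the operating system. The following Python sketch models both checks; the function name, the "raid5" and "mirror" keys, and the edition strings are assumptions chosen for illustration.

```python
# Minimum number of disks with unallocated space, per the checks above.
MIN_DISKS = {"raid5": 3, "mirror": 2}

# Editions on which software RAID-5 and RAID-1 volumes cannot be created.
WORKSTATION_EDITIONS = {"NT Workstation", "Windows 2000 Professional"}


def software_raid_problems(kind, disks_with_free_space, os_edition):
    """Return a list of reasons why the dynamic volume cannot be created."""
    problems = []
    needed = MIN_DISKS[kind]
    if disks_with_free_space < needed:
        problems.append(f"unallocated space is needed on at least {needed} disks")
    if os_edition in WORKSTATION_EDITIONS:
        problems.append("use NT Server or Windows 2000 Server, or hardware RAID")
    return problems
```

The edition restriction applies to software RAID only, so a hardware RAID virtual disk is not subject to the second check.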

When expanding the Disks object, error icons appear

Situation:

Windows is not aware of the status of these disks. Most likely, the virtual disks that were associated with these have been deleted.

Check:

To remove these error status icons from the Disks object, the computer must be restarted to allow Windows to find the current information.

Situation:

If the type of disk shows No Signature, you need to write a signature to the disk. When creating a new virtual disk, the software must write a signature to the virtual disk that prepares it for use. This signature is not written automatically in case this disk has been merged from another operating system and the configuration information needs to be kept intact.

Check:

For instructions on writing a disk signature, see the section "Write a Disk Signature" in the "Disk Management" chapter.

Missing Disk displays error icon

The corresponding virtual disk has been removed, or the disk has been rendered inactive because of a problem.

Check:

If the corresponding virtual disk for this disk has been deleted, select Remove Disk and remove the disk from the list of disks.
If the disk has been rendered inactive because of some problem, see "To bring a disk that is Offline and Missing back online."

Once you have repaired the disk, controller, or cable problem, you need to:

Rescan to see the disk within Array Manager. If Array Manager finds the disk, this should bring the disk Online. If Array Manager does not find the disk, a reboot may be required.
Reactivate Disk to bring all the volumes on the disk to the Healthy status.

Read and Write Operations Experience Problems

If the system is hanging, timing out, or experiencing other problems with read and write operations, then there may be a problem with the adapter cables or a SCSI device. For more information, see the "Cables attached correctly" and "Isolate SCSI device problems" sections.

Problems after Installing a PERC 2/SC or 2/DC Controller

If you install a PERC 2/SC or 2/DC controller after you have already installed Array Manager, you may experience problems with drive mapping, system hangs, and other performance problems. Reinstall Array Manager to resolve these problems.

I/O Stops on a Redundant Channel

If you have implemented channel redundancy on a PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, or 4/Di controller, a failure of one channel causes I/O to stop on the other channels included in the channel redundant configuration. The resolution to this problem is described in the "Considerations for Implementing Channel Redundancy" section.

Error message that the connection to remote computer has terminated

The full message is: "The connection to the remote computer has terminated. Remote computer will be removed from view." The remote computer that you were connected to has been disconnected from your console. Most often, there is a problem with the network connection and the transmissions timed out. This can also occur if the remote machine was restarted or the service on the remote machine was stopped.

Check:

Make sure that the remote machine is turned on and is available to the network, and that the service is started. Reconnect to the resource.

Error Message: The stripe depth is out of range

Array Manager displays the "The stripe depth is out of range" error message when you attempt to apply RAID 0 or RAID 5 to more array disks than the controller can support in a single virtual disk. For example, the PERC 4/SC and 4/DC controllers support up to 32 array disks in a virtual disk when using RAID 0 or RAID 5. Attempting to create a RAID 0 or RAID 5 virtual disk using more than 32 array disks on these controllers causes this error message to be displayed.
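
The disk-count limit behind this message can be checked before the virtual disk is created. The following Python sketch is hypothetical and covers only the two controllers named in the example above; limits for other controllers are not given here and are deliberately left out.

```python
# Documented example: up to 32 array disks per RAID 0 or RAID 5 virtual disk
# on PERC 4/SC and 4/DC. No other controllers are listed in this section.
MAX_DISKS_RAID_0_OR_5 = {"PERC 4/SC": 32, "PERC 4/DC": 32}


def exceeds_disk_limit(controller, raid_level, disk_count):
    """Return True if the request would trigger the stripe depth error."""
    if raid_level not in (0, 5):
        return False  # the documented limit covers RAID 0 and RAID 5 only
    limit = MAX_DISKS_RAID_0_OR_5.get(controller)
    if limit is None:
        raise KeyError(f"no documented limit for controller {controller!r}")
    return disk_count > limit
```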

Tree view object for PowerEdge RAID controller cannot be expanded after the software and driver are installed

The installation detects any drivers that you have installed for PowerEdge RAID controllers. If these drivers (and/or the card itself) are installed after the software is installed, support for the controller will need to be added.

Check:

Close the console. Open the Array Manager Utilities and check the box next to the appropriate controller. This action will restart the service, and the disks should be available the next time you launch the console.

An option is inactive

When an operation is inactive or dimmed in a menu, the task cannot be performed on the object at this time. Certain operations are valid only for certain types of objects. (For example: RAID levels that are not fault tolerant will not allow you to check the consistency of the virtual disk.) If there is a task currently running on that object, wait until it has finished and try again. Otherwise, the operation may not be appropriate at this time.

To bring a disk that is Offline and Missing back online

If this was a virtual disk, then check that the virtual disk still exists. If it no longer exists, use the Remove Disk command to remove the disk from the list of disks.

Repair any disk, controller, or cable problems and make sure that the physical disk is turned on, plugged in, and attached to the computer. From the View pull-down menu, select Rescan. The disk should change from Offline to Online, but the volumes remain Failed. (If the disk does not change to Online, you may need to reboot.) Right-click the disk and select Reactivate Disk. The volume status changes to Healthy. (You can also select each volume one at a time and select Reactivate Volume.) It is recommended that you run chkdsk afterward.

If the disk status remains Offline and Missing and you determine that the disk has a problem that cannot be repaired, you can remove the disk from the system (using the Remove Disk command). However, before you can remove the disk, you must delete all volumes on the disk. You can save any mirrored volumes on the disk by removing the mirror that is on the Missing disk instead of the entire volume. Deleting a volume destroys the data in the volume, so you should remove a disk only if you are absolutely certain that the disk is permanently damaged and unusable.

To bring a disk that is Offline (not Missing) and is still named Disk # back online

Use the Reactivate Disk command to bring the disk back online. If the disk status remains Offline, check the cables and disk controller, and make sure that the physical disk is healthy. Correct any problems and try to reactivate the disk again. If the disk reactivation succeeds, any volumes on the disk should automatically return to the Healthy status.

A disk on a PERC 4/Di controller does not return online after a Prepare to Remove

When you run the Prepare to Remove command on an array disk attached to a PERC 4/Di controller, you may find that the disk does not display in the Array Manager tree view even after a rescan or a reboot.

In this case, do the following to redisplay the disk in the Array Manager tree view:

Manually remove and then replace the array disk.
Either do a Rescan or reboot the system.

A disk is marked as Foreign

The disk has been moved to your computer from another Microsoft Windows NT/2000 computer and has not been set up for use. Only dynamic disks display this status. To add the disk so that it can be used, right-click the disk and select Merge Foreign Disk. All existing volumes on the disk will be visible and accessible.

Because a volume can span more than one disk (e.g., a mirrored volume), it is important that you first verify your disk configurations and then move the entire disk set that the volume is on. If only part of the disk set is moved, some of the volumes will show Failed Redundancy or Failed error condition.

A Disk is Marked as Offline or Foreign after Upgrading to Dynamic with a PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, or CERC ATA100/4ch Controller

If you initialize a virtual disk that has been upgraded to dynamic, the status of the dynamic disk may change to "offline" or "foreign." You can view a disk's status by selecting the disk's General tab. When using a PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, or CERC ATA100/4ch controller, you can resolve this problem by reverting the "offline" or "foreign" disk to a basic disk. See "Reverting a Dynamic Disk to Basic."

Virtual Disk Initialization Causes a Foreign, Offline, or Missing Disk

Because initializing a virtual disk destroys the data on the virtual disk, you may find that after initializing a virtual disk on a Windows system, a disk marked as "foreign" or "missing" is displayed under the Disks folder. In addition, initializing a virtual disk containing a dynamic disk changes the status of the dynamic disk to foreign or offline.

To reuse a Windows disk that is set to foreign or offline, right-click the disk and select Merge Foreign Disk or Revert to Basic Disk from the pop-up menu.

If the disk is marked as missing, right-click the disk and select Remove Disk.

A Disk's Functions become Inactive or It is Marked as Foreign or Offline after Upgrading to Dynamic with a PERC 2, 2/Si, 3/Si, or 3/Di Controller

If you format a virtual disk that has been upgraded to dynamic, the disk functions may become inactive or the status of the disk may change to "foreign" or "offline." You can view a disk's status by selecting the disk's General tab. When using a PERC 2, 2/Si, 3/Si, or 3/Di controller, you can resolve these problems by doing a global rescan.

The Online Help behaves strangely, or will not come up at all

The Help file uses a technology known as HTML Help, a Microsoft standard. Some software will attempt to update the core files with an older version of HTML Help and make Array Manager's Help file unusable. The required HTML Help update is located on the Array Manager CD-ROM in the Help Update folder. Double-click HHUPD.EXE and follow the instructions.

When attempting to bring up the Help file, Dr. Watson reports an Access Violation in HH.EXE

HH.EXE is the Microsoft HTML Help viewer, which reads the precompiled HTML files for Array Manager's Help sections.

Check:

Delete the HH.DAT file in your Windows directory. Deleting this file will remove any customizations that have been made to your HTML help files.

During reboot, a message displays about a "corrupt drive," suggesting that you run autocheck

Let autocheck run, but do not worry about the message. Autocheck will finish and the reboot will be complete. If you have a large system (more than 1 gigabyte), this may take about 10 minutes.

When attempting to access a remote computer, you are denied access or get an error message

There are several situations where this occurs.

You are denied access and do not even get a connection login box

This occurs when you log in to the local computer originally as a local user, local administrator, or domain user and the remote computer is not in your domain or a trusted domain. The Windows security model does not allow you to have access under these circumstances. The workaround is to log in to your local computer with an account that has the same user name and password as an administrator account on the remote computer.

You are denied access after typing the login information in the connection box

Access can be denied here if you do not type in a user name and password that match a local or domain administrator account on the remote computer or if you mistype the login information.

"Connection Failed" message

If the remote computer is not on or there are network problems, you will get the message "Connection Failed."

For a NetWare system, see the section "'The Connection Failed' message displays when connecting to the NetWare server."

You are unable to connect to a Windows 2000 server with Disk Management after a client-only installation

Another situation where you may get an error message is when you have just done a client-only installation of Array Manager and you bring up the Array Manager client and attempt to connect to a remote server that has Windows 2000 Disk Management.

Array Manager assumes that its client will connect first to a remote server running Array Manager before connecting to a system running Windows 2000 Disk Management.

Once you connect to a server with Array Manager, you will then be able to connect successfully to a remote system running Disk Management.

Windows 2000 Disk Management is the disk and volume management program that comes with Windows 2000. Because Array Manager and Disk Management are related programs, Array Manager is able to remotely manage the storage on a Windows 2000 computer with Disk Management.

You are unable to connect to a NetWare server

If you are having problems connecting to a NetWare® server from a local machine, use the ping and nslookup TCP/IP network diagnostic tools to determine whether the NetWare server is accessible from the local machine and whether the system running the NetWare server has a valid DNS name. If the system running the NetWare server does not have a valid DNS name, then you can edit the Hosts file on the local machine with an entry for the system running the NetWare server. The Hosts file is located in the winnt/system32/drivers/etc directory. The entry in the Hosts file should consist of the IP address and the host or server name of the system running the NetWare server.
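
A Hosts file entry is one line containing the IP address followed by the server name. The following Python sketch formats such a line. The function name hosts_entry, the address 192.0.2.10, and the name nwserver1 are placeholders for illustration and do not come from Array Manager.

```python
import ipaddress


def hosts_entry(ip, name):
    """Return one Hosts file line: the IP address, a tab, then the name."""
    ipaddress.ip_address(ip)  # raises ValueError if the address is malformed
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("the server name must be non-empty and contain no whitespace")
    return f"{ip}\t{name}"


# Placeholder values; substitute the real address and name of the NetWare server.
print(hosts_entry("192.0.2.10", "nwserver1"))
```

Append the resulting line to the Hosts file on the local machine that runs the Array Manager console.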

If you do not connect by using a valid DNS name or an entry in the Hosts file, then you will need to use the IP address.

When you want to connect to a NetWare server, Array Manager expects the server to be identified by one of three types of entries:

A DNS entry name
An IP address
A host name from a Hosts file listing

If you identify the machine by a NetWare server name that is not one of the three entry types above, the connection will fail. It is suggested that the name assigned to the NetWare server be the same as its DNS or Hosts file entry.

Note that the DNS and Hosts file entries do not allow for a computer name that consists of all numbers. In addition, the DNS name does not allow a computer name that starts with a number. If the NetWare server has a numeric name or a name that starts with a number, you can use the IP address to identify that server. You can also put quotation marks around the computer's name for the entry in DNS or the Hosts file (such as "12345").
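
The naming restrictions above can be tested before you attempt a connection. The following Python sketch is a hypothetical helper, not an Array Manager function; it flags only the two cases described in the preceding paragraph.

```python
def name_issues(name):
    """Return naming problems for a NetWare server name, per the rules above."""
    issues = []
    if name.isdigit():
        # An all-numeric name is rejected for both DNS and Hosts file entries.
        issues.append("all-numeric name: not allowed in DNS or Hosts file entries")
    elif name[:1].isdigit():
        # A leading digit is a problem for the DNS name only.
        issues.append("name starts with a number: not allowed as a DNS name")
    return issues
```

If the list is not empty, connect by IP address instead, or use the quotation mark workaround described above.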

The Hosts file has to be on the client computer (local machine) that has the Array Manager console.

In addition, connecting to a remote system requires that you have administrator authority on both the local and remote system.

Note: Dell does not offer NetWare in Japan.

After creating a virtual disk with a PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, or CERC ATA100/4ch controller, the virtual disk does not appear under the Disks storage object

If there are no virtual disks configured at boot time on a PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, or CERC ATA100/4ch controller on Windows 2000, the Windows disk driver may not be loaded. The solution is to reboot after creating the first virtual disk, or to create the first virtual disk in the BIOS (press Ctrl-M to invoke the BIOS utility).

"The Connection Failed" message displays when connecting to the NetWare server

If you are trying to connect to the NetWare server with the Array Manager console, you may receive a "The Connection Failed" error message. There are several reasons why the connection between the Array Manager console and the NetWare managed system can fail. (See also "Connection Failed" message.)

To identify why the connection failed, perform the following steps:

Ping the NetWare server from the system running the Array Manager console. If this fails, then you are experiencing network problems.
Verify that the NDS tree, user ID, and password are correct for the target NetWare server. Also verify that the user ID has administrator rights.
Verify that the server name is included in some form of DNS (DNS server, hosts file, and so forth).
Verify that the server name does not start with a number. This can cause problems with DNS.
Restart the console if an entry for the NetWare server was just added to the hosts file.
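
The name resolution step in this list can be scripted. The following Python sketch uses the standard socket.gethostbyname call, which consults the system resolver (normally the hosts file and DNS). The function name check_name_resolution and the injectable resolve parameter are assumptions added so that the check can be exercised without a network.

```python
import socket


def check_name_resolution(server, resolve=socket.gethostbyname):
    """Return (ok, detail) for the DNS or hosts file step of the checklist."""
    try:
        address = resolve(server)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"{server} does not resolve: {exc}"
    return True, f"{server} resolves to {address}"
```

A successful result shows only that the name resolves on the console system. Ping and the credential checks remain separate steps.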

Note: You may be able to avoid this connection problem by using the NetWare server's IP address instead of the server name.

Note: Dell does not offer NetWare in Japan.

A Disk is Marked as Failed when Rebuilding in a Cluster Configuration

When a system in a cluster attempts to rebuild a failed disk and the rebuild fails, another system takes over the rebuild. In this situation, you may notice that the rebuilt disk continues to be marked as failed on both systems even after the second system has completed the rebuild successfully. To resolve this problem, perform a rescan on both systems after the rebuild completes successfully.

Erroneous Status and Error Messages after a Windows Hibernation

Activating the Windows hibernation feature may cause Array Manager to display erroneous status information and error messages. This problem resolves itself when the Windows operating system recovers from hibernation.

Cannot Connect to Remote System from Windows Server 2003

Certain conditions must be met before you can connect to a remote system from Windows Server 2003. For a description of these conditions, see "Remote Connection and Windows Server 2003."

System Performance Problems

This section describes problems that may deteriorate system performance.

Unusual CPU Usage

You may notice unusual surges in your system's CPU usage. These surges may be caused by Array Manager's volume capacity monitoring. This function monitors NTFS volumes on the local server for the amount of space used. When the space used on an NTFS volume reaches 90%, a warning event is logged in the Array Manager and Windows event logs. When the space used reaches 98%, an error event is logged.
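
The two thresholds can be expressed as a small classification function. The following Python sketch illustrates only the 90% and 98% rules described above and is not how Array Manager implements its monitoring; the function and constant names are hypothetical.

```python
WARNING_PERCENT = 90  # a warning event is logged at this level
ERROR_PERCENT = 98    # an error event is logged at this level


def capacity_event(used_bytes, total_bytes):
    """Return "error", "warning", or None for an NTFS volume's space usage."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    percent_used = 100 * used_bytes / total_bytes
    if percent_used >= ERROR_PERCENT:
        return "error"
    if percent_used >= WARNING_PERCENT:
        return "warning"
    return None
```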

If the surges in CPU usage pose a problem, you can disable volume capacity monitoring.

To disable volume capacity monitoring:

Launch the Array Manager Utilities by clicking Start | Programs | Dell OpenManage Applications | Array Manager and selecting Array Manager Utilities.
Deselect the Volume Capacity Monitoring check box on the Windows tab.
Click Apply and then Close to exit the Array Manager Utilities.

For information related to volume capacity monitoring, see the following:

Dell PowerVault 660F and 224F Storage Systems Troubleshooting

This section presents possible problem situations with accompanying solutions for the Dell PowerVault 660F and 224F storage systems. The problem situations are organized as follows:

The situations in the first three topics are categorized by their event number. A brief discussion of event messages is included at the beginning of this section in the topic "Event Monitoring and Logging." The fourth topic describes general problems not related to a specific event.

You will also find a full listing of the events associated with the Dell PowerVault 660F Fibre Channel RAID controller at the end of this section in the topic "Events Generated by the PowerVault 660F Storage System."

Event Monitoring and Logging

Event messages help identify significant incidents such as an array disk failure or an array disk addition. Event monitoring and logging starts when the Array Manager managed system starts up. If the managed system service (Disk Management Service) stops in Microsoft Windows NT or the Array Manager Service stops in NetWare, then event monitoring and logging stops. If array disks are S.M.A.R.T. (Self Monitoring Analysis and Reporting Technology) enabled, the RAID controllers check array disks for failure predictions, and if found, pass this information on to the Array Manager console. Array Manager immediately displays an alert icon on the array disk and also raises an alert under the Events tab and in the Windows NT event log. Windows NT has three event logs; Array Manager uses the application log.

Note: When a controller's I/O is paused, Array Manager does not receive S.M.A.R.T. events.

Fibre Channel RAID Controller Status Events

The following incidents are included in this topic:

Status line entry	Status indication
Unknown	May signify a problem or indicate a transitional state. Additionally, a new disk that had previously been formatted or initialized by another type of RAID controller may show this state.
Ready	Means the array disk is operational. For PERC 2/SC, 3/SC, 2/DC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, and CERC ATA100/4ch controllers, Ready status applies to operational array disks that are not part of a virtual disk. For the PERC 2, PERC 2/Si, PERC 3/Si, and PERC 3/Di controllers, operational array disks display Ready status regardless of whether they are a part of a virtual disk or not.
Failed	Not operational. A disk needs repair, has been removed, or has another problem that prevents operation.
Online	Operational. Applies to array disks contained in a virtual disk on PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, and CERC ATA100/4ch controllers.
Offline	The drive is not available to the RAID controller.
Degraded	Refers to a fault-tolerant array/virtual disk that has a failed disk. This state definition may also appear when resynching the array/virtual disk, since the array/virtual disk is not fault-tolerant during the resynchronization.
Recovering	Refers to state of recovering from bad blocks on disks.
Removed	Indicates that array disk has been removed.
Resynching	This state definition appears during the following types of disk operations: Transform Type, Reconfiguration, and Check Consistency.
Rebuilding	Refers to part of a virtual disk being rebuilt.
No Media	CD-ROM or removable disk has no media.
Formatting	Refers to array disk in process of formatting.
Diagnostics	Indicates that diagnostics are running.
Reconstructing	The configuration of a virtual disk has been changed. The individual array disks within the virtual disk are being modified to support the changes. The data on the virtual disk will be saved. You cannot cancel a virtual disk reconstruction.
Initializing	Applies only to virtual disks on PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4/Di, and CERC ATA100/4ch controllers. This prepares the virtual disk for use by Array Manager by deleting the configuration information on this virtual disk. The data on the virtual disk will be lost.