Analyzing SCSI Reservation conflicts on VMware Infrastructure 3.x and vSphere 4.x
Details
•ESX 3.0.x, ESX 3.5, or ESX 4.0 VMkernel logs contain the following messages:
SCSI: vm 1043: 5522: Sync CR at 64
SCSI: vm 1043: 5522: Sync CR at 48
SCSI: vm 1043: 5522: Sync CR at 32
SCSI: vm 1043: 5522: Sync CR at 16
SCSI: vm 1043: 5522: Sync CR at 0
WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts
WARNING: SCSI: 5628: status SCSI reservation conflict, rstatus 0xc0de01 for vmhba1:0:7. residual R 919, CR 0, ER 3
WARNING: J3: 1970: Error committing txn to slot 0: SCSI reservation conflict
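When checking a host for these symptoms, it can help to count how often each message type appears in a saved copy of the VMkernel log. The following is a minimal sketch, not a VMware tool: the function name classify_lines and the pattern labels are illustrative assumptions, and the patterns are derived only from the sample messages above, so other ESX builds may word them differently.

```python
import re
from collections import Counter

# Patterns derived from the sample messages in this article (assumption:
# the wording is the same on the host being checked).
PATTERNS = {
    "sync_cr": re.compile(r"Sync CR at (\d+)"),
    "failing_io": re.compile(r"Failing I/O due to too many reservation conflicts"),
    "reservation_conflict": re.compile(r"SCSI reservation conflict"),
}

# The sample messages from the Details section.
SAMPLE = [
    "SCSI: vm 1043: 5522: Sync CR at 64",
    "SCSI: vm 1043: 5522: Sync CR at 48",
    "SCSI: vm 1043: 5522: Sync CR at 32",
    "SCSI: vm 1043: 5522: Sync CR at 16",
    "SCSI: vm 1043: 5522: Sync CR at 0",
    "WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts",
    "WARNING: SCSI: 5628: status SCSI reservation conflict, rstatus 0xc0de01 "
    "for vmhba1:0:7. residual R 919, CR 0, ER 3",
    "WARNING: J3: 1970: Error committing txn to slot 0: SCSI reservation conflict",
]


def classify_lines(lines):
    """Count how many lines match each reservation-related pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    for name, count in classify_lines(SAMPLE).items():
        print(name, count)
```

Run against the sample messages, this counts five Sync CR lines, one failed I/O, and two reservation conflict lines. To check a real host, pass the lines of a saved log file instead of SAMPLE.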
Solution
There are two main categories of operation under which VMFS makes use of SCSI reservations.
The first category is for VMFS datastore-level operations. These include opening, creating, resignaturing, and expanding or extending a VMFS datastore.
The second category involves acquisition of locks. These are locks related to VMFS-specific metadata (called cluster locks) and locks related to files (including directories). Operations in the second category occur much more frequently than operations in the first category. The following are examples of VMFS operations that require locking metadata:
•Creating a VMFS datastore
•Expanding a VMFS datastore onto additional extents
•Powering on a virtual machine
•Acquiring a lock on a file
•Creating or deleting a file
•Creating a template
•Deploying a virtual machine from a template
•Creating a new virtual machine
•Migrating a virtual machine with VMotion
•Growing a file, for example, a Snapshot file or a thin provisioned Virtual Disk
If the VMkernel log contains the messages described in the Details section, follow this procedure:
1.If the VMware ESX version is 3.0.1, install Patch ESX-1002960: Fix for SCSI Reservation Conflict Issue. For more information, see ESX Server 3.0.1, Patch ESX-1002960: Fix for SCSI Reservation Conflict Issue (1002960).
If the VMware ESX version is 3.0.2, install Patch ESX-1002974: Fix for SCSI Reservation Conflicts. For more information, see ESX Server 3.0.2, Patch ESX-1002974: Fixes for SCSI Reservation Conflicts; Support for EMC Invista (1002974).
2.If the log messages persist and the ESX host is running on an HP Server using Insight Manager Agents, see Insight Manager may cause SCSI reservation conflicts (1004771).
3.If the log messages persist, see the article that applies to your environment:
SCSI Reservation Failures on HDS USP and NSC Arrays (1005010)
SCSI Reservation Failures on HP XP Storage Arrays (1005011)
SCSI Reservation Failures on Hitachi USP and NSC Arrays (1006001)
SCSI Reservation Failures on SUN StorageTek 9985 and 9990 Arrays (1006002)
SCSI Reservation Failures on Nihon Unisys SANARENA 5200 and 5800 Arrays (1006003)
Virtual machines may experience I/O failures due to too many SCSI reservation conflicts on some 3PAR arrays (1020366)
Note: The list of arrays is not exhaustive and will be revised when other arrays are identified reporting these errors.
4.Follow these steps to resolve potential sources of the reservation:
a.Try to serialize operations on the shared LUNs. If possible, limit the number of operations on different hosts that require a SCSI reservation at the same time.
b.Increase the number of LUNs and try to limit the number of ESX hosts accessing the same LUN.
c.Reduce the number of snapshots, as they cause a lot of SCSI reservations.
d.Do not schedule backups (VCB or console based) in parallel from the same LUN.
e.Try to reduce the number of virtual machines per LUN. See vSphere 4.0 Configuration Maximums and ESX 3.5 Configuration Maximums.
f.Check which targets are being used to access the LUNs.
g.Check if you have the latest HBA firmware across all ESX hosts.
h.Check that the ESX host is running the latest BIOS (to avoid conflicts with HBA drivers).
i.Contact your SAN vendor for information on SP timeout values and performance settings and storage array firmware.
j.Turn off third-party agents (storage agents) and RPMs not certified for ESX.
k.Check for MSCS RDMs (the active node holds a permanent reservation). For more information, see ESX servers hosting passive MSCS nodes report reservation conflicts during storage operations (1009287).
l.Ensure correct Host Mode setting on the SAN array.
m.LUNs removed from the system without rescanning can appear as locked.
n.When SPs fail to release the reservation, either the request did not come through (hardware, firmware, or pathing problems), third-party applications running on the service console did not send the release, or busy virtual machine operations are still holding the lock.
Note: Using SATA disks is not recommended in high I/O configurations, or when SATA disks are in use and the above changes do not resolve the problem.
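To decide which LUNs to look at first when applying the steps above, one option is to tally reservation conflict warnings by device path. The sketch below is an illustration only: the function name conflicts_per_device is hypothetical, and the regular expression assumes the "for vmhbaA:T:L" wording seen in the sample warning in the Details section.

```python
import re
from collections import Counter

# Assumes the device path follows "for" in the warning, as in the sample
# message "... SCSI reservation conflict, rstatus 0xc0de01 for vmhba1:0:7."
DEVICE_RE = re.compile(r"SCSI reservation conflict.*?\bfor (vmhba\d+:\d+:\d+)")

# Two of the sample messages from the Details section; only the first
# names a device path.
EXAMPLE_LINES = [
    "WARNING: SCSI: 5628: status SCSI reservation conflict, rstatus 0xc0de01 "
    "for vmhba1:0:7. residual R 919, CR 0, ER 3",
    "WARNING: J3: 1970: Error committing txn to slot 0: SCSI reservation conflict",
]


def conflicts_per_device(lines):
    """Count reservation conflict warnings per vmhba device path."""
    counts = Counter()
    for line in lines:
        match = DEVICE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Devices with the most conflicts are listed first.
    for device, count in conflicts_per_device(EXAMPLE_LINES).most_common():
        print(device, count)
```

Device paths that dominate the tally are reasonable first candidates for reducing the number of virtual machines, snapshots, or hosts per LUN.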
If your array is not listed above and none of the above points eliminate the log messages, file a support request with VMware Support and note this KB Article ID in the problem description. For more information, see How to Submit a Support Request.