SCSI3 PGR: "Want support on Symmetrix? Reboot 500 Windows servers. Continued.."

By Renegade on Monday 31 August 2009 09:27
Categories: SAN, Storage, Symmetrix, Windows

Alan Shugart introduced something called the "Shugart Associates System Interface" or in short "SASI" in 1981, and created something that can now be called a commodity. He probably didn't realize back then what an impact his new product would have later on.

You can find the SASI standard, or SCSI as it is now called, in a lot of the hardware produced for the storage-oriented market today. Among others, you will find it in disks used in servers, in Fibre Channel SANs, and in high availability cluster environments.

The part about the high availability clusters is the part I want to talk to you about today.

I wrote about HA clustering before and one of the parts that is important when it comes to clustering is consistency in the files used in the cluster.

Luckily for us, later versions of the standard that grew out of Mr. Shugart's protocol implemented something called SCSI reservations. Basically, you can send out SCSI commands such as the 6-byte reserve command. The earlier form of this mechanism, SCSI-2 reservations, gave us something that a lot of people in the cluster world call "disk fencing".
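To make the "6 byte reserve command" a bit more tangible, here is a minimal sketch of what such a command descriptor block (CDB) looks like. The opcodes are the ones I know from the SCSI Primary Commands standard (0x16 for RESERVE(6), 0x17 for RELEASE(6)); the helper name build_cdb6 is my own, and the sketch leaves all optional fields zeroed, so treat it as an illustration and not as something to send to a production array.

```python
# Opcodes as listed in the SCSI Primary Commands standard (SPC-2).
RESERVE_6 = 0x16
RELEASE_6 = 0x17


def build_cdb6(opcode: int) -> bytes:
    """Return a 6-byte CDB: the opcode followed by five zeroed bytes.

    Hypothetical helper for illustration only. Bytes 1-4 (obsolete or
    optional fields) and the control byte are simply left at zero here.
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    return bytes([opcode, 0, 0, 0, 0, 0])


reserve_cdb = build_cdb6(RESERVE_6)
release_cdb = build_cdb6(RELEASE_6)
print(reserve_cdb.hex())  # 160000000000
```

Actually sending such a CDB needs an OS-specific pass-through interface, which is out of scope here.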

SCSI-2 is based on exclusive reservations, meaning that only one node owns the disk. This also means that the other nodes can't reserve the disk, which can lead to some "undesired" behavior. For example, SCSI-2 reservations are not reboot persistent, meaning that a node that rebooted and came back up could simply claim the disk and would be allowed read/write access to it. Not the most elegant solution, I would say? :+
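The behavior described above can be sketched with a small toy model. This is purely illustrative: the class Scsi2Disk and its methods are made up for this post, and the model assumes that a reset or power cycle clears the reservation, which is the non-persistent behavior I am complaining about.

```python
class Scsi2Disk:
    """Toy model of a disk with a SCSI-2 style exclusive reservation."""

    def __init__(self) -> None:
        self.holder = None  # name of the node holding the reservation

    def reserve(self, node: str) -> bool:
        # Succeeds if the disk is free or already held by the same node.
        if self.holder in (None, node):
            self.holder = node
            return True
        return False  # a real target would report a reservation conflict

    def can_write(self, node: str) -> bool:
        # Only the holder may write while a reservation exists.
        return self.holder in (None, node)

    def reset(self) -> None:
        # Assumption: a reset or power cycle wipes the reservation.
        self.holder = None


disk = Scsi2Disk()
disk.reserve("node-a")
blocked = disk.can_write("node-b")      # node-b is fenced off
disk.reset()                            # reservation is gone
after_reset = disk.can_write("node-b")  # node-b can write again
print(blocked, after_reset)             # False True
```

After the reset nobody owns the disk anymore, so whichever node gets there first has full access.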

Now, SCSI-3 PGR (Persistent Group Reservation) works with group reservations, meaning that every node registers a key in a dedicated area on the disk, and other nodes can simply remove a node's key to remove that node's reservation. It also means that a host will need to register after a reboot, and that it has the option of checking the reservation state first. This should prevent multiple hosts from having read/write access at the same time if we don't want them to.
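As a rough sketch of the idea (not of any real array or cluster implementation), the key handling can be modelled like this. PgrDisk and all of its method names are hypothetical, and write access is simplified to "you may write if your key is registered", loosely modelled on a registrants-only reservation.

```python
class PgrDisk:
    """Toy model of SCSI-3 persistent group reservations."""

    def __init__(self) -> None:
        self.keys = {}      # node name -> registered key
        self.holder = None  # node currently holding the reservation

    def register(self, node: str, key: int) -> None:
        self.keys[node] = key

    def reserve(self, node: str) -> bool:
        # Only a registered node may take the reservation.
        if node in self.keys and self.holder in (None, node):
            self.holder = node
            return True
        return False

    def preempt(self, node: str, victim_key: int) -> bool:
        """A registered node removes every registration using victim_key."""
        if node not in self.keys:
            return False
        victims = [n for n, k in self.keys.items() if k == victim_key]
        for name in victims:
            del self.keys[name]
            if self.holder == name:
                self.holder = node  # the preempting node takes over
        return bool(victims)

    def read_state(self):
        # Lets a host check the reservation state before touching the disk.
        return self.holder, dict(self.keys)

    def can_write(self, node: str) -> bool:
        # Simplification: registrants only.
        return node in self.keys


disk = PgrDisk()
disk.register("node-a", 0xA)
disk.register("node-b", 0xB)
disk.reserve("node-a")
disk.preempt("node-b", 0xA)       # node-b fences node-a
print(disk.can_write("node-a"))   # False
print(disk.read_state())          # ('node-b', {'node-b': 11})
```

The fenced node has to register again before it gets access, and it can inspect the state first, which is exactly the check that SCSI-2 does not give you.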

Sounds like a useful feature? It is! :)

Now then, back to our problem with the reboot of 500 Windows hosts. After opening a case with EMC, things went a little dormant. Our host base was verified, and as usual we were asked for emcgrabs/emcreports from every attached Windows host in our environment... 8)7

We checked the Enginuity versions on our DMXs and the dreaded support matrices from EMC, and found that we really did not have an option, except not upgrading and running the risk of falling out of support.

Right now, the situation is even more tense, since Microsoft has released a new version of the storport driver in a new hotfix. More info can be found under Microsoft hotfix 950903. The problem is that this hotfix is recommended by Microsoft when you run a HEAT report, but if the FA flags are not set up properly, you are bound to run into problems.

Now, here's a small list of currently required flags for the various operating systems:

Windows Server 2003
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
Windows Server 2003 with failover clustering
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
Windows Server 2008
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
Windows Server 2008 with failover clustering
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
  • PER bit for each clustered device (attribute=SCSI3_persist_reserv)
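
If you want to compare what is actually set on a port against the list above, a small check along these lines can help. The flag abbreviations are the ones from the list; everything else (the function names, and the assumption that you already have the current flags and device attributes as plain Python collections) is hypothetical, since how you extract them from your array depends on your tooling.

```python
BASE_FLAGS = frozenset({"C", "EAN", "PP", "OS2007", "SPC-2", "UWN", "SC3"})

# Required FA flags per operating system, as listed above.
REQUIRED_FLAGS = {
    "Windows Server 2003": BASE_FLAGS,
    "Windows Server 2003 with failover clustering": BASE_FLAGS,
    "Windows Server 2008": BASE_FLAGS,
    "Windows Server 2008 with failover clustering": BASE_FLAGS,
}

# Only this configuration also needs the PER bit on each clustered device.
NEEDS_PER_BIT = {"Windows Server 2008 with failover clustering"}
PER_ATTRIBUTE = "SCSI3_persist_reserv"


def missing_flags(os_name, port_flags):
    """Return the required flags that are not set on the port."""
    return sorted(REQUIRED_FLAGS[os_name] - set(port_flags))


def devices_missing_per(os_name, device_attrs):
    """Return clustered devices lacking the PER attribute, if it is required.

    device_attrs maps a device name to the attributes set on it.
    """
    if os_name not in NEEDS_PER_BIT:
        return []
    return sorted(
        dev for dev, attrs in device_attrs.items() if PER_ATTRIBUTE not in attrs
    )


print(missing_flags("Windows Server 2008", ["C", "EAN", "PP", "UWN"]))
# ['OS2007', 'SC3', 'SPC-2']
```

The device names and attribute sets you feed into devices_missing_per are up to you; the function only tells you which clustered devices still lack the attribute.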

As stated before, these flags are an absolute requirement to get support from EMC, but unfortunately the situation is still more or less the same. It's amazing how slowly things can move sometimes.

All I can recommend right now is to talk to your EMC representative and explain the situation, and ask for a solution. This will affect people more and more, and in my opinion, this needs to be solved.

Again, I will try to post an update as soon as I have newer information that I can share. Until then, I for one am keeping my fingers crossed that we don't run into problems.
