SCSI3 PGR: "Want support on Symmetrix? Reboot 500 Windows servers. Continued.."

By Renegade on Monday 31 August 2009 09:27
Categories: SAN, Storage, Symmetrix, Windows

Alan Shugart introduced something called the "Shugart Associates System Interface" or in short "SASI" in 1981, and created something that can now be called a commodity. He probably didn't realize back then what an impact his new product would have later on.

You can find the SASI standard, or SCSI as it is now called, in a lot of the hardware produced for the storage-oriented market today. Among other places, you will find it in the disks used in servers, in fibre channel SANs, and in high availability cluster environments.

The part about the high availability clusters is the part I want to talk to you about today.

I wrote about HA clustering before and one of the parts that is important when it comes to clustering is consistency in the files used in the cluster.

Luckily for us, later versions of the standard that grew out of Mr. Shugart's design implemented something called SCSI reservations. Basically, a host can send SCSI commands, for example the 6-byte RESERVE command, to claim a disk. The earlier form of this, SCSI-2 reservations, gave us what a lot of people who work with clusters call "disk fencing".

SCSI-2 is based on exclusive reservations, meaning that only one node owns the disk. This also means that the other nodes can't reserve the disk, which can lead to some "undesired" behavior. For example, SCSI-2 reservations are not reboot persistent, meaning that a node that rebooted and came back up could register with the disk and would be allowed read/write access to it. Not the most elegant solution, I would say. :+

Now, SCSI-3 PGR works with group reservations, meaning that every node has a key on a dedicated area of the disk, and other nodes can simply remove a node's key to remove that node's reservation. It also means that a host will need to register again after a reboot, and it will have the option of checking the reservation state. This should keep multiple hosts from having read/write access at the same time if we don't want them to.
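
To make the register/reserve/remove-key idea a bit more concrete, here is a minimal Python sketch of it. This is a toy in-memory model and not a real SCSI API: the class `PgrDisk`, its methods and the node names are all made up for illustration, and the access rule (only registered nodes may write while a reservation exists) is a simplifying assumption.

```python
class PgrDisk:
    """Toy model of a disk's persistent reservation state (hypothetical, not a real SCSI API)."""

    def __init__(self):
        self.keys = {}       # node name -> registration key
        self.holder = None   # node that currently holds the reservation

    def register(self, node, key):
        # A node registers its key; after a reboot it has to do this again.
        self.keys[node] = key

    def reserve(self, node):
        # Only a registered node may reserve, and only if the disk is free
        # or already reserved by that same node.
        if node not in self.keys:
            raise PermissionError(f"{node} is not registered")
        if self.holder not in (None, node):
            raise PermissionError(f"reservation held by {self.holder}")
        self.holder = node

    def preempt(self, node, victim):
        # A registered node removes another node's key. If the victim held
        # the reservation, it moves to the preempting node.
        if node not in self.keys:
            raise PermissionError(f"{node} is not registered")
        self.keys.pop(victim, None)
        if self.holder == victim:
            self.holder = node

    def can_write(self, node):
        # Assumption: with no reservation anyone may write; with one,
        # only nodes that still have a key registered may write.
        return self.holder is None or node in self.keys


disk = PgrDisk()
disk.register("node-a", 0xA1)
disk.register("node-b", 0xB2)
disk.reserve("node-a")
disk.preempt("node-b", "node-a")   # node-b fences node-a off the disk

print(disk.can_write("node-a"))    # False
print(disk.can_write("node-b"))    # True
```

In this sketch the fenced node only gets write access back once it registers a key again, which is the behaviour described above for a host coming back from a reboot.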

Sounds like a useful feature? It is! :)

Now then, back to our problem with the reboot of 500 Windows hosts. After opening a case with EMC, things went a little dormant. Our host base was verified, and as usual we were asked for emcgrabs/emcreports from every attached Windows host in our environment... 8)7

We checked the Enginuity versions on our DMXs and the dreaded support matrices from EMC, and found that we really did not have an option, other than not upgrading and running the risk of falling out of support.

Right now, the situation is even more tense, since Microsoft came out with a new version of the storport driver in a new hotfix. You can find more info on hotfix 950903 here. The problem is that when you run a HEAT report, this hotfix is recommended by Microsoft. But if the FA flags are not set up properly, you are bound to run into problems.

Now, here's a small list of currently required flags for the various operating systems:

Windows Server 2003
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
Windows Server 2003 with failover clustering
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
Windows Server 2008
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
Windows Server 2008 with failover clustering
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
  • PER bit for each clustered device (attribute=SCSI3_persist_reserv)

As stated before, these flags are an absolute requirement to get support from EMC, but unfortunately the situation is still more or less the same. It's amazing how slowly things can move sometimes.
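
If you keep an inventory of which flags are enabled on which port, a check against the list above is easy to script. The following is only a sketch: the function `missing_flags` and the example port are hypothetical, the flag names are just the abbreviations from the list, and nothing here talks to an actual array. The PER bit is set per clustered device rather than per port, so it is left out.

```python
# Port-level flags from the list above; they are identical for all four
# Windows variants. The per-device PER bit for Windows Server 2008
# failover clusters is not covered by this sketch.
BASE_FLAGS = {"C", "EAN", "PP", "OS2007", "SPC-2", "UWN", "SC3"}

REQUIRED_FLAGS = {
    "Windows Server 2003": BASE_FLAGS,
    "Windows Server 2003 with failover clustering": BASE_FLAGS,
    "Windows Server 2008": BASE_FLAGS,
    "Windows Server 2008 with failover clustering": BASE_FLAGS,
}


def missing_flags(os_name, enabled_flags):
    """Return the required flags that are not enabled, sorted by name."""
    return sorted(REQUIRED_FLAGS[os_name] - set(enabled_flags))


# Hypothetical port that was set up before the requirements changed.
example_port = {"C", "EAN", "PP", "UWN"}
print(missing_flags("Windows Server 2008", example_port))
# ['OS2007', 'SC3', 'SPC-2']
```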

All I can recommend right now is to talk to your EMC representative and explain the situation, and ask for a solution. This will affect people more and more, and in my opinion, this needs to be solved.

Again, I will try to post an update as soon as I have newer information that I can share. Until then I, for one, am keeping my fingers crossed that we don't run into problems.

"Want support on Symmetrix? Reboot 500 Windows servers."

By Renegade on Thursday 18 June 2009 10:57
Categories: SAN, Storage, Symmetrix, Windows

I would dare to say that we have one of the bigger SAN environments here at our company. We have well over 1000 hosts connected to our SAN and use storage from different vendors.

Now, you have your everyday problems when your environment is big. Some problems are smaller, some are bigger; it comes with the territory. But sometimes you just run into things that will make you think "uhuh, you didn't just say that 8)7 ".

This is the case with a service request that I opened with EMC. I was reading in the EMC Forum when someone made a short mention that the required flags for the Symmetrix front-end ports had changed. I decided to do some checks myself, and found the following document in EMC's Powerlink:
Powerlink ID: emc200609 / "What Symmetrix director flags / bits are required for Microsoft Windows Server 2008?"
Nothing out of the ordinary so far. New settings for a new OS are fine. So just to make sure I also checked for Windows 2003. And I did find something:
Powerlink ID emc201305 / "PowerPath showing loss of connectivity to server down all paths.":
...
"The SPC2, SC3, and OS2007 flags are required flags on all Windows 2003 and 2008 servers connected to Symmetrix arrays."
Now that's something new to me. These were not mandatory before. So I opened a case with EMC and asked them to verify this for me and to confirm that we need these settings to get some form of support from EMC. I received a longer mail back, but the third sentence in the mail stated the following:
"Your observation is correct."
If you check the current ESM document, you will find that these settings are mandatory. To top it off, the mail stated that:
"Please note, that setting the flags changes the inquiry page and hence the PnP id. You will need to reboot."
This means that we can make the change on the FE port, or we could set the flags on a per-initiator basis, but either way we would need to reboot about 500 hosts connected to the various Symmetrix arrays.

We are still talking to EMC to find another solution, but this issue is not an easy one. In all fairness it should be said that this decision is not EMC's fault. Microsoft changed the requirements for Windows 2008 hosts, and just said they want vendors to use the same flags for Windows 2008 and Windows 2003. The result being the issue described above.

I'll update this post or write a new post as soon as we hear anything more, but this is turning out to be an interesting change. Let's see what happens.