GestaltIT TFD - Day 2: Top secret at Data Robotics?

By Renegade on Tuesday 24 November 2009 12:13 - Comments are closed
Categories: Gestalt IT, NAS, Views: 5.592

So, during the afternoon of the GestaltIT Tech Field Day, we were invited to join Data Robotics for something that would turn out to be quite interesting. Since pictures say more than words, I didn't want to hold back this picture from you guys, just to show that we are not all work and no play:

The Drobo sign

So, once we pulled Greg Knieriemen off of the sign, we went inside and entered a meeting room where we had the next issue with some signs:

Top Secret at Data Robotics

How's that for a greeting....? Yeah, I thought so.

So, once everyone settled down things got a little more interesting.

Now, in case you don't know Data Robotics yet, they have built quite a name for themselves with two products called the Drobo and the DroboPro.
Basically, the Drobo is a small direct-attached storage device that holds up to four drives and offers you a FireWire 800 and a USB 2.0 interface. Besides that, you get a connector for your power supply and a hole to plug in your Kensington lock.
The DroboPro offers you a bit more. It has bays for eight drives and a Gbit Ethernet interface that allows you to use iSCSI. The other features are more or less the same, although this unit can be rack-mounted and even supports a dual parity setup (RAID6) and smart volumes.

So, one of the features that Data Robotics advertises is something called "BeyondRAID", or as Data Robotics CEO Geoff Barrall states: "The core differentiator for Drobo is BeyondRAID. BeyondRAID is what Drobo think RAID would be if RAID were designed today."

That's a bold statement to make, but the numbers that were presented seem to show that this product is in high demand, and it's gaining momentum quite rapidly.
Data Robotics actually had 100% growth in 2009 over 2008, with over 85,000 units shipped in just two years. Of those 85,000, more than 5,000 were DroboPros, and that's just since April.

Now, they also mentioned that the future market for the Drobo is seen in the SMB storage market, or to be more specific, they will focus on the sub-$15,000 DAS and SAN-attached storage market.
This brought up questions about what will happen with the DroboShare, which didn't receive as warm a welcome from customers as the Drobo itself. No real statement was made about the future of the DroboShare, but with a focus on the SMB market, one can only assume that its future is uncertain.

So, after a quick introduction we finally got a clearer view of what was so top secret: two new units that were actually introduced yesterday, the Drobo S and the DroboElite.

Drobo S

The Drobo S has some small but welcome changes. The number of drives has now been upgraded to a total of five. Besides the FireWire 800 and the USB 2.0 interface, you can also hook up your Drobo S via eSATA, which should make a lot of people happy, even though eSATA is not available on any of the Apple Macs that have been released to date. Supposedly you will get up to 50% more performance when compared to the regular Drobo.

The new unit also increases its redundancy, so you can actually lose (or pull out) two drives at the same time and continue to work with the data that is stored on it.
I should note that pulling all drives at the same time will actually stop you from accessing the data on the unit, as tested by Devang Panchigar, but since the disk layout and parity are stored on the disks, you can just power off the unit, insert all the disks back in, and your data will be back once you power it back on.

The amount of storage is theoretically limited only by the size of the drives that are currently for sale, but the number of volumes also changed, from just one 16TB volume on the Drobo to up to 15 on the Drobo S.

The DroboElite has some nice new changes that include a dual Gb Ethernet port with iSCSI support that will allow up to 16 hosts to connect to the unit. The number of volumes has been increased from 16 on the DroboPro to 255 on the DroboElite. All in all nothing to really shock anyone on this unit, but the dual interface is something that a lot of people will probably be quite happy about.

Pricing will start at $799.- RSP for the Drobo S and at $3499.- RSP for the DroboElite, but you will probably find other prices through various other channels.

I will do a deep dive into the technology behind BeyondRAID, as this is probably something that is interesting to a lot of people, and I will make sure to address a comparison that comes up quite regularly: "Can't I do the same much cheaper and easier with Linux and LVM?" The short answer is just a simple "No."; the longer answer will be contained in the BeyondRAID post, so stay tuned!

Symmetrix access control: When unique is everything but unique

By Renegade on Friday 2 October 2009 10:22 - Comments (2)
Categories: SAN, Storage, Symmetrix, Views: 4.549

So, you've got a million dollar storage box standing there and want to make sure that it's secure? Sure thing you want to do that! And you ask your vendor "What can I do?". One of the replies could be to use access control lists, or ACLs. And all is great. Or is it?

From what I have heard, very few EMC customers in Europe tend to use ACLs on their Symmetrixes. Perhaps even for a good reason?

If you take a look at the documentation on Powerlink you can find some technical papers on Symmetrix Access Control, and the papers will state (among others) the following:
Today, anyone with access to Symmetrix-based management software can execute any function on any Symmetrix device. Many product applications such as EMC® ControlCenter™, TimeFinder®, SRDF®, Optimizer®, Resource View, Database Tuner, and various ISV products can issue management commands to any device in a Symmetrix® complex. Open systems hosts can manipulate mainframe devices, Windows hosts can manipulate UNIX data, and vice versa.

Shared systems, such as these, may be vulnerable to one host, accidently or intentionally, tampering with another’s devices. To prevent this, the symacl command can be used by an administrator of the Symmetrix storage site to set up and restrict host access to defined sets of devices (access pools) across the various Symmetrix arrays.
Now, I have to admit that this info is from an older version of this guide, but the same is still true for the most part. You can change to in-band or out-of-band management, you can use the Symmetrix Management Console, but as soon as you install Solutions Enabler on a client connected to the storage box, you more or less open up a world of possibilities on said client.

Usually you don't want that, so why not implement some restrictions? symacl is just the thing for that! Normally I would start by creating access pools. An access pool is a specified set of devices, and for each pool I define which Solutions Enabler functionality or commands a host is permitted to perform on those devices.

Now, once I have set up these access pools, I can assign single clients or groups of clients to these pools. I do that by creating access control groups. These groups contain the unique access IDs and names of the hosts that are assigned to them.

So now I have one (or more) clients that I allow a certain piece of functionality or a certain (set of) command(s). In order to uniquely identify my client, I can run the following Solutions Enabler command:

symacl -unique

and will receive an output similar to this:

The unique id for this host is: 254A30A9-54319DC0-8A476069

Now that we have the unique host id, we can add the id to the configured access group via a command file, using the normal preview, prepare and commit routine. After that, you should be good to go.

And that is where things can get nasty.

As we have found out the hard way, a unique host id is not necessarily unique. We have had occasions where we had multiple hosts with the same unique host id on the same Symmetrix. Fortunately, the DMX is so confused at that point that it won't allow any of the hosts to access the configured devices - and normally your masking and zoning provide some extra protection - but it is still a nasty thing that can happen.

That brings us to the second point. The unique host id can change. EMC will not tell you what changes influence the generation of the unique host id, but for example a change of FC-HBA will cause the unique host id to be changed. On Windows, there are versions of Solutions Enabler where a change in the NetBIOS stack seems to cause this change. Now you might think that you can check what unique host id was configured in the access group, but you would be wrong.

Unfortunately, all the unique host id's that are entered into an access group will be encrypted/hashed by the Symmetrix, and you won't be able to retrieve the unique host id. So my advice: if you want to compare the values you entered, store them somewhere so that you at least have the option to compare them. It can make troubleshooting a bit easier.
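To make the "store them somewhere" advice a bit more concrete, here is a minimal sketch in Python. It only parses the output line of symacl -unique in the format shown earlier in this post and keeps the ids in a plain dictionary that can be saved as a JSON file. The function names, the host names and the file layout are my own assumptions for illustration and are not part of Solutions Enabler.

```python
import json
import re
from pathlib import Path

# Matches the output line of `symacl -unique` in the format shown in this post.
UNIQUE_ID_LINE = re.compile(r"The unique id for this host is:\s*([0-9A-Fa-f-]+)")


def parse_unique_id(output: str) -> str:
    """Extract the unique host id from the command output."""
    match = UNIQUE_ID_LINE.search(output)
    if match is None:
        raise ValueError("no unique host id found in output")
    return match.group(1).upper()


def record_id(store: dict, host: str, unique_id: str) -> list:
    """Record the id for a host and return a list of warnings.

    Warns when the id of a known host has changed, and when another
    host is already recorded with the same id.
    """
    warnings = []
    previous = store.get(host)
    if previous is not None and previous != unique_id:
        warnings.append(f"{host}: id changed from {previous} to {unique_id}")
    for other, other_id in store.items():
        if other != host and other_id == unique_id:
            warnings.append(f"{host}: id {unique_id} is also used by {other}")
    store[host] = unique_id
    return warnings


def load_store(path: Path) -> dict:
    """Load the recorded ids, or start empty if the file does not exist."""
    return json.loads(path.read_text()) if path.exists() else {}


def save_store(path: Path, store: dict) -> None:
    """Write the recorded ids back to disk as readable JSON."""
    path.write_text(json.dumps(store, indent=2, sort_keys=True))


if __name__ == "__main__":
    # Hypothetical host names; the id is the sample from this post.
    ids = {}
    sample = "The unique id for this host is: 254A30A9-54319DC0-8A476069"
    for host in ("hosta", "hostb"):
        for warning in record_id(ids, host, parse_unique_id(sample)):
            print(warning)
```

Running the recording step every time you add a host to an access group gives you both a history to compare against and an early warning when two hosts report the same id.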

Just as a hint, there is also a way to create static unique host id's, which are unaffected by hardware and software changes. Should you need it, ask your EMC support and refer to Powerlink ID emc198823. They should be able to give you a solution with that ID number. :)

A last word of advice: if you are working with ACLs and changing stuff, please make sure you back up your access control database before you start with the changes. It might be a good idea to implement that as the first step in any scripts you might create.

ACLs are not a bad thing. They can increase your (sense of) security. However, the way they were implemented in the Symmetrix environment leaves a bit to be desired, and troubleshooting issues can be a pain if you are not aware of the fact that the unique host id's aren't always unique.

Storage provisioning: Do you really really really need that much?

By Renegade on Wednesday 9 September 2009 09:46 - Comments (6)
Categories: EMC, General, Storage, Views: 2.954

I received a link to an article containing an interview with Symantec's Mathew Lodge about their view on data deduplication. I couldn't help but notice the following quote:
According to a recent survey by Applied Research, more than half of all organizations expect to spend more on storage in 2009 than they did in 2008. But at the same time, the latest Symantec State of the Data Center Report indicates that storage utilisation hovers at just 50%.
Now, that got me thinking on a couple of things. First off, I tried to look up this survey. Unfortunately, the results from the Applied Research-West seem to be beyond my Google skills. On the other hand they seem to be the standard company used by Symantec for surveys that somehow seem to have results that are aligned with Symantec's product portfolio. Talk about a coincidence!

Anyway, as they said that "more than half of all organizations expect to spend more on storage in 2009 than they did in 2008", I was pondering how this could be. We are seeing technologies like the deduplication mentioned in the article. Almost all vendors are able to offer something similar. The same can be said about thin or virtual provisioning. Heck, thanks to the effort in the blogosphere and feedback from partners and customers, EMC even decided to change its policy and make virtual provisioning free for the V-Max, DMX4 and DMX3.

It seems a bit odd that almost all storage vendors are delivering methods to reduce the disk space footprint in their SAN and NAS, but we still see an increase in expenditure. Sure enough, the licensing costs for such new features have to be included. And perhaps you even need to buy new hardware to fully utilize such new features. But all of the big vendors are quick enough to tell us the return on investment when we purchase new stuff. So that can't be it, right?

And you know what? They are right!

Simple enough, we don't know how much disk space our users need! Hell, most of the time, the user himself doesn't even know! And then there's the fact that it's too easy to get new storage.

We provision like there's no tomorrow. Not just disk space, but also computational power. You need to test something? Here, have a VM and go right ahead. What? You're on Solaris? No problem, here's a brand new sparkling zone, just for you. How much disk space do you need? Two Tera? No wonder we called it Terabyte, those are monstrous amounts of disk space.

I know the dilemma, and when you ask your users if they really need all of that, you usually get a blank look on their faces, outrage - How dare you ask me that, isn't it obvious? -, or perhaps even an educated guess. Some will even give you forecasts... If you are lucky.

Things will get better with technology like TP and dedupe. And things will get worse when we go for new technologies like cloud. But the fact of the matter is, we have made provisioning too easy, and we've somehow lost the art of asking if they really really really need it. Usually the answer to anyone provisioning is a simple "no".

SCSI3 PGR: "Want support on Symmetrix? Reboot 500 Windows servers. Continued.."

By Renegade on Monday 31 August 2009 09:27 - Comments are closed
Categories: SAN, Storage, Symmetrix, Windows, Views: 7.636

Alan Shugart introduced something called the "Shugart Associates System Interface" or in short "SASI" in 1981, and created something that can now be called a commodity. He probably didn't realize back then what an impact his new product would have later on.

You can find the SASI, or SCSI as it is now called, standard in a lot of hardware that is being produced in the storage oriented market today. Among others, you will find the standard in disks used in servers, you will find the protocol in fibre channel SANs and you will find it being used in high availability cluster environments.

The part about the high availability clusters is the part I want to talk to you about today.

I wrote about HA clustering before and one of the parts that is important when it comes to clustering is consistency in the files used in the cluster.

Lucky for us, later versions of the protocol designed by Mr. Shugart implemented something called SCSI reservations. Basically, you can send out SCSI commands such as the 6-byte reserve command. Earlier versions of the SCSI protocol gave us something that a lot of people working with clusters call "disk fencing", or SCSI-2 reservations.

SCSI-2 is based on exclusive reservations, meaning that only one node owns the disk. This also means that the other nodes can't reserve the disk, which can lead to some "undesired" behavior. For example, SCSI-2 reservations are not reboot persistent, meaning that a node that rebooted and came back up could register the disk and would be allowed read/write access to it. Not the most elegant solution, I would say. :+

Now, SCSI-3 PGR works with group reservations, meaning that every node has a key on a dedicated area of the disk, and other nodes can simply remove a node's key to remove that node's reservation. It also means that a host will need to register after a reboot, and it will have the option of checking the reservation state. This should avoid multiple hosts having read/write access at the same time, if we don't want them to.

Sounds like a useful feature? It is! :)
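For those who prefer to see the idea in code, here is a toy in-memory model of the group reservation behavior described above. It is only a sketch of the concept: it does not send any real SCSI commands, and the class, its method names and the example keys are made up for illustration.

```python
class SharedDisk:
    """Toy model of a disk that keeps one registration key per node."""

    def __init__(self):
        # node name -> registration key, standing in for the
        # dedicated key area on the disk
        self.keys = {}

    def register(self, node, key):
        """A node registers its key with the disk."""
        self.keys[node] = key

    def preempt(self, by_node, victim):
        """A registered node removes another node's key (fencing it off)."""
        if by_node not in self.keys:
            raise PermissionError(f"{by_node} is not registered")
        self.keys.pop(victim, None)

    def registered_nodes(self):
        """Check the reservation state: which nodes hold a key right now?"""
        return sorted(self.keys)

    def can_write(self, node):
        """Only nodes with a registered key get read/write access."""
        return node in self.keys


if __name__ == "__main__":
    disk = SharedDisk()
    disk.register("node_a", "0xA")
    disk.register("node_b", "0xB")

    # node_b decides node_a is unhealthy and removes its key.
    disk.preempt("node_b", "node_a")
    print(disk.registered_nodes())

    # After its reboot node_a has no access until it registers again.
    print(disk.can_write("node_a"))
    disk.register("node_a", "0xA")
    print(disk.can_write("node_a"))
```

The point of the model is that the key lives with the disk and not with the node: a node that was fenced off stays fenced off, reboot or not, until it registers again.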

Now then, back to our problem with the reboot of 500 Windows hosts. After opening a case with EMC, things went a little dormant. Our host base was verified, and as usual we were asked for emcgrabs/emcreports from every attached Windows host in our environment... 8)7

We checked Enginuity versions on our DMX's and the dreaded support matrices from EMC and found that we really did not have an option, except not upgrading and running the risk of falling out of support.

Right now, the situation is even more tense, since Microsoft came out with a new version of the storport driver in a new hotfix. You can find more info on hotfix 950903 here. The problem is that when you run a HEAT report, this hotfix is recommended by Microsoft. But if the FA flags are not set up in a proper manner, you are bound to run into problems.

Now, here's a small list of currently required flags for the various operating systems:

Windows Server 2003
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
Windows Server 2003 with failover clustering
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
Windows Server 2008
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
Windows Server 2008 with failover clustering
  • Common Serial Number (C)
  • Enable Auto Negotiation (EAN)
  • Enable Point-to-point (PP)
  • Host SCSI Compliance 2007 (OS2007)
  • SCSI-3 SPC-2 Compliance (SPC-2)
  • Unique World Wide Name (UWN)
  • SCSI-3 compliance (SC3)
  • PER bit for each clustered device (attribute=SCSI3_persist_reserv)
As stated before, these flags are an absolute requirement to get support from EMC, but unfortunately the situation is still more or less the same. It's amazing how slow things can go along sometimes.
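If you want to check a larger number of hosts against this list, a small script can help. The sketch below hard-codes the flag abbreviations from the list above and reports which ones are missing from a given set. How you collect the configured flags per host or port is up to you and is not shown here; the "PER" shorthand for the persistent reservation bit and the function name are my own assumptions.

```python
# Flag abbreviations taken from the list in this post.
BASE_FLAGS = frozenset({"C", "EAN", "PP", "OS2007", "SPC-2", "UWN", "SC3"})

# "PER" is shorthand used here for the per-device SCSI3_persist_reserv
# attribute; it is set on each clustered device, not on the port.
REQUIRED_FLAGS = {
    "Windows Server 2003": BASE_FLAGS,
    "Windows Server 2003 with failover clustering": BASE_FLAGS,
    "Windows Server 2008": BASE_FLAGS,
    "Windows Server 2008 with failover clustering": BASE_FLAGS | {"PER"},
}


def missing_flags(os_name, configured):
    """Return the required flags that are not in the configured set."""
    required = REQUIRED_FLAGS[os_name]
    return sorted(required - set(configured))


if __name__ == "__main__":
    # Hypothetical example: a host that only has the older settings.
    configured = {"C", "EAN", "PP", "UWN"}
    print(missing_flags("Windows Server 2003", configured))
```

Feed it whatever your own inventory says is configured, and you get a quick list of the hosts that would need a change (and a reboot).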

All I can recommend right now is to talk to your EMC representative and explain the situation, and ask for a solution. This will affect people more and more, and in my opinion, this needs to be solved.

Again, I will try to post an update as soon as I have newer information that I can share. Until then I, for one, am keeping my fingers crossed that we don't run into problems.

"Want support on Symmetrix? Reboot 500 Windows servers."

By Renegade on Thursday 18 June 2009 10:57 - Comments (13)
Categories: SAN, Storage, Symmetrix, Windows, Views: 6.807

I would dare to say that we have one of the bigger SAN environments here at our company. We have well over 1000 hosts connected to our SAN and use storage from different vendors.

Now, you have your everyday problems when your environment is big. Some problems are smaller, some are bigger; it comes with the territory. But sometimes you just run into things that will make you think "uhuh, you didn't just say that 8)7 ".

This is the case with a service request that I opened with EMC. I was reading in the EMC Forum when someone made a short mention that the required flags for the Symmetrix front-end ports had changed. I decided to do some checks myself, and found the following document in EMC's Powerlink:
Powerlink ID: emc200609 / "What Symmetrix director flags / bits are required for Microsoft Windows Server 2008?"
Nothing out of the ordinary so far. New settings for a new OS are fine. So just to make sure I also checked for Windows 2003. And I did find something:
Powerlink ID emc201305 / "PowerPath showing loss of connectivity to server down all paths.":
The SPC2, SC3, and OS2007 flags are required flags on all Windows 2003 and 2008 servers connected to Symmetrix arrays.
Now that's something new to me. These were not mandatory before. So I opened a case with EMC and asked them to verify this for me and have them confirm that we need these settings to get some form of support from EMC. I received a longer mail back, but the third sentence in the mail stated the following:
Your observation is correct.
If you check the current ESM document, you will find that these settings are mandatory. To top it off, the mail stated that:
Please note, that setting the flags changes the inquiry page and hence the PnP id. You will need to reboot.
This means that we can make the change on the FE-port, or we could set the flags on a per-initiator basis, but either way we would need to reboot about 500 hosts connected to the various Symmetrixes.

We are still talking to EMC to find another solution, but this issue is not an easy one. In all fairness it should be said that this decision is not EMC's fault. Microsoft changed the requirements for Windows 2008 hosts, and just said they want vendors to use the same flags for Windows 2008 and Windows 2003. The result being the issue described above.

I'll update this post or write a new post as soon as we hear anything more, but this is turning out to be an interesting change. Let's see what happens.