Sunday, March 14, 2010

Zen and the Art of Converged and Efficient Data Centers

What a few weeks it has been. Over the last month I have been fortunate enough to meet with the CTO of Frito-Lay, the CTO of NetApp, attend a joint SAP and NetApp executive briefing at SAP’s North American HQ in Philadelphia, and tour two world-class IT support centers (PepsiCo in Dallas and Dimension Data in Boston). I have spent days poring through technical documentation geared toward architecting my organization’s next-generation data center centered around 10Gb Ethernet, virtualization, blade systems and efficient energy practices. So this post is probably as much for myself as anyone else, meant simply to document a few of the key learnings I have taken away from the flurry of activity over the last few weeks. Hey, maybe someone else will find it interesting too?

PUE & The Green Grid

In a one-on-one conversation with Dave Robbins, the CTO of NetApp Information Technology, he asked what my data center space provider’s PUE was. My response was an inquisitive “what?” PUE stands for Power Usage Effectiveness and is a measure of how effectively a data center uses its energy resources. Essentially, PUE is the total power delivered to the facility (the IT load plus cooling, power distribution and other overhead) divided by the actual IT load, so a perfect score is 1.0. Efficient data centers run at around 1.6. The concept of PUE and its measurement was created by an organization known as The Green Grid, and you can find all kinds of great resources at their web site. This is an excellent tool for you to use when negotiating power costs with a hosting provider. You should know their PUE and insist that you will not pay for their inefficiency. You can also find a cool tool for PUE calculation at 42U.com.
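To make the math concrete, here is a minimal sketch of the calculation. The wattage figures and utility rate are made-up numbers purely for illustration, not measurements or quotes from any facility mentioned above:

```python
# Minimal PUE sketch: total facility power divided by IT equipment power.
# All numbers below are hypothetical, purely for illustration.

it_load_kw = 500.0      # power drawn by servers, storage and network gear
cooling_kw = 250.0      # chillers, CRAC units, fans
overhead_kw = 50.0      # UPS losses, PDUs, lighting, other mechanicals

total_facility_kw = it_load_kw + cooling_kw + overhead_kw
pue = total_facility_kw / it_load_kw

print(f"PUE = {pue:.2f}")  # 1.60 -- every IT watt costs 1.6 watts at the meter

# Why this matters when negotiating with a hosting provider: at a PUE of 2.0
# you effectively pay for twice your IT load; at 1.6 you pay for 1.6x of it.
rate_per_kwh = 0.10        # hypothetical utility rate
hours_per_year = 24 * 365
for provider_pue in (2.0, 1.6):
    annual_cost = it_load_kw * provider_pue * hours_per_year * rate_per_kwh
    print(f"Annual power bill at PUE {provider_pue}: ${annual_cost:,.0f}")
```

Run against those placeholder numbers, the difference between a 2.0 and a 1.6 facility is the difference you should not be paying for.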

It is Time to Converge

The introduction of 10Gb Ethernet in data centers (and perhaps even more important, lossless Ethernet) has truly created an opportunity to collapse Ethernet and Fibre Channel networks in the data center backbone, cutting huge costs in Fibre Channel infrastructure. 10Gb Ethernet and lossless Ethernet serve as enablers for protocols such as FCoE and FIP, which allow Fibre Channel frames to be encapsulated and carried across Ethernet backbones. There are a few things to watch out for when adopting FCoE. First, make sure your storage vendor has a CNA (Converged Network Adapter) that supports BOTH FCoE and other IP-based traffic. Some of the early “converged” adapters only support FCoE; there is not much real convergence in that. Put some effort into understanding Cisco’s current support of FCoE and the FCoE Initialization Protocol (FIP) in their Nexus line of switches. You will find some good resources here. The details are too complex for me to go into here, but suffice it to say, you need to think long and hard about your data center switch layout in order to get full FCoE support across your 10Gb backbone. Also, remember that lossless Ethernet (Data Center Bridging) is key to FCoE success but is fairly new. So, when you hear people tell you they knew someone who tried FCoE a couple of years ago but found it lacking, take it with a grain of salt.
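To give a feel for what “encapsulated” means here, below is a deliberately simplified sketch of wrapping a Fibre Channel frame in an Ethernet frame with the FCoE EtherType (0x8906). A real FCoE frame carries additional fields (version, SOF/EOF delimiters, padding, CRC) and is built entirely in hardware by the CNA; the MAC addresses and payload below are made up for illustration:

```python
import struct

FCOE_ETHERTYPE = 0x8906   # EtherType assigned to FCoE
FIP_ETHERTYPE = 0x8914    # EtherType for the FCoE Initialization Protocol

def encapsulate_fc_frame(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    """Wrap a raw Fibre Channel frame inside an Ethernet frame.

    Toy sketch only: it builds the basic Ethernet header plus payload and
    skips the FCoE version field, SOF/EOF delimiters, padding and CRC
    defined by the real encapsulation.
    """
    eth_header = dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE)
    return eth_header + fc_frame

# A native FC frame can carry up to a 2112-byte payload, which is why FCoE
# needs "baby jumbo" frames (roughly 2.5 KB) end to end across the backbone.
fc_frame = bytes(2112)                                    # placeholder FC frame
frame = encapsulate_fc_frame(b"\x0e\xfc\x00\x00\x00\x01",  # made-up FCoE MAC
                             b"\x00\x1b\x21\x00\x00\x02",  # made-up CNA MAC
                             fc_frame)
print(f"Encapsulated frame is {len(frame)} bytes -- too big for a 1500-byte MTU")
```

The point of the sketch is the size math: because the Fibre Channel frame rides inside the Ethernet frame, every switch in the FCoE path has to support larger-than-standard frames as well as the lossless behavior discussed above.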


The FUD around Cisco UCS

Let me get one thing out of the way up front: the Cisco Unified Computing System (UCS) is sexy. Cisco’s tight relationship with VMware, stateless computing and a seemingly end-to-end vision for the data center combine for a powerful allure. Competitors such as IBM and HP are quick to point out that their blade center products perform the same functions as Cisco’s UCS but with a proven track record. In general, these claims are true. I have also been exposed to competitive claims against the UCS that were simply meant to plant the seed of Fear, Uncertainty and Doubt (FUD) in the minds of technology managers. What if Cisco changes their chassis design, is your blade investment covered? UCS is meant for VMware only (not true). The list goes on.

I have been heavily comparing the Cisco UCS to IBM’s BladeCenter H. I had originally convinced myself that the difference between these two offerings was all about the network. Cisco’s UCS does offer some interesting ways to scale across chassis and provides some great management tools. For a mid-sized organization, however, the ability to scale across chassis becomes less important when you can get a concentrated amount of compute power inside one or maybe two chassis. Some new technology coming from IBM in the form of their MAX5 blades is going to allow for massive compute power inside a two-socket blade. If you are a large organization planning on adding many UCS chassis, the networking innovations in the UCS will likely fit your needs well. For a mid-sized company, consider getting more compute power inside fewer chassis by using some hefty blades. This not only reduces your need to scale across many chassis, it also helps lower your VMware costs: VMware is licensed by the socket, so fewer sockets with more cores on blades with higher memory capacity ultimately drive down your VMware licensing needs (a quick comparison is sketched below).

Also, before you completely convince yourself that the Cisco UCS has a strong hold on the networking space in the data center, spend some time understanding IBM’s Virtual Fabric technology. It offers features similar to the VIC cards in the Cisco UCS. The point is this: don’t be immediately sucked in by the sexy UCS. Cisco has come to the blade market with some cool innovation, and in some circumstances it will be exactly what you need. Make the investment in time to really understand competing products. Avoid FUD in all directions.
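Here is a back-of-the-envelope sketch of that licensing argument. The per-socket license price and VM-per-core density are placeholder assumptions, not actual VMware pricing or sizing guidance; the point is only that, for the same total VM count, fewer and denser sockets means fewer licenses:

```python
# Hypothetical comparison of VMware licensing cost for the same VM capacity
# on fewer, denser sockets. Prices and densities are placeholder assumptions.

LICENSE_PER_SOCKET = 3000   # assumed cost of one per-socket license
VMS_NEEDED = 240            # total virtual machines we want to host

def licensing_cost(cores_per_socket, vms_per_core):
    """Return (sockets required, license cost) for a given blade design."""
    vms_per_socket = cores_per_socket * vms_per_core
    sockets = -(-VMS_NEEDED // vms_per_socket)   # ceiling division
    return sockets, sockets * LICENSE_PER_SOCKET

for label, cores in (("4-core sockets", 4), ("8-core sockets", 8)):
    sockets, cost = licensing_cost(cores_per_socket=cores, vms_per_core=5)
    print(f"{label}: {sockets} sockets licensed, ${cost:,} in licenses")
```

With these made-up numbers the denser sockets cut the license count in half for the same workload, which is exactly the effect big-memory, high-core-count blades like the MAX5 are chasing; just remember that memory per blade, not cores, is often the real ceiling on VM density.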
