Monday, January 25, 2010

Cisco UCS vs IBM and HP - Where are the Brains?

UPDATE: Thank you to everyone for the great comments!  Please look for the updated sections that I have highlighted below.  I have learned a lot from everyone and I will continue to update this as more information rolls in.  I welcome any and all comments.  Thank you!

As many of you know, my company recently acquired some very nice lab gear for customer demonstrations and proof of concept work.  Many of my peers already know the UCS systems inside and out but I really need hands on to "get it".

As I learn the UCS system I will share my experiences here.  My perspective is to share what is different (good and bad) about UCS compared to the IBM and HP Blade products.  Before anyone asks, I will only be covering IBM and HP.  If you have additional experiences, please share them in the comments.  I also have no intention of picking sides.  At the end of the day I sell and support all of the above systems and I can get the job done with all of them.  They all have their own unique strengths and weaknesses that I intend to highlight.

In case you aren't familiar with what UCS is, I suggest you take a look at Colin's post over on his blog.  He does a great job putting all the pieces together.  Plus, I'm going to steal a few of his graphics. (thanks Colin!!)

A UCS system consists of one or more chassis and a pair of Cisco 6120 switches that provide both the 10Gb bandwidth to the blades and the management of the system.  The last part of that statement is the key to understanding how UCS is currently different from the competition.  I define management in this example as the control of the blade hardware state.  This includes identification, power on, power off, remote control, remote media, and the virtual I/O assignments for MACs and WWPNs.

By moving the management from the chassis level to the switch level, the solution can now take advantage of a multi-chassis environment.  Here's a simple modification of Colin's diagram to illustrate this point.


(UPDATED!) What are the limitations to the Cisco UCS model?
Someone asked in the comments how this scales.  Honestly that was a great question.  I'm still learning Cisco and I was wrapped up in making it work.  Let's take a look at that.  Currently you can have up to 8 chassis per pair of UCS Managers (Cisco 6100's).  That number will increase in the upcoming weeks and eventually the limit will top out at 40.  But, the more realistic limitation is either 10 or 20 depending on the number of FEX uplinks from the chassis to the 6100's unless you are using double wide blades.  If you don't understand what that means right now, don't sweat it.  I'll be posting about that shortly.
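The 10-or-20 figure is really just port arithmetic.  Here is a rough sketch of it in Python, assuming a 6120 has 20 fixed ports (the number a commenter mentions below), that every port is used for chassis uplinks (in practice some are needed to connect to the rest of the network), and that every chassis uses the same number of uplinks to each 6120.  The function name and parameters are made up for illustration; the cap of 40 is the eventual limit mentioned above.

```python
def max_chassis(ports_per_interconnect: int,
                uplinks_per_chassis: int,
                software_limit: int = 40) -> int:
    """Estimate how many chassis one pair of fabric interconnects can cable up.

    Assumption: each chassis consumes `uplinks_per_chassis` ports on EACH
    interconnect, and no ports are held back for northbound connectivity.
    """
    if ports_per_interconnect < 0 or uplinks_per_chassis < 1:
        raise ValueError("need a non-negative port count and at least one uplink")
    limit_by_ports = ports_per_interconnect // uplinks_per_chassis
    # The management software imposes its own ceiling on top of the port math.
    return min(limit_by_ports, software_limit)


for uplinks in (1, 2, 4):
    print(f"{uplinks} uplink(s) per chassis -> {max_chassis(20, uplinks)} chassis")
```

With 20 ports this gives 20 chassis at one uplink each and 10 chassis at two, which lines up with the "10 or 20" above.  Treat it as a back-of-the-envelope estimate, not a sizing tool.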


(UPDATED) What if you need to manage more than the chassis limitations today?
If you need to go above the limit, then you have two options.  The first option is to purchase another pair of 6100's to create another UCS System; the two systems will be independent of each other.  The second option is provided by BMC software.  This will allow you to manage more chassis and the solution also provides additional enhancements.  I admit I know next to nothing about the product so I'll just post the link from the comments and you can take a look.  The brain mapping for that would look like this.



How do you get into the brains?
Each 6120 has an IP address and both 6120's are linked together to create a clustered IP address.  The clustered IP is the preferred way to access the software.  The clustering is handled over dual 1Gb links labeled L1 and L2 on each switch.  They are connected together like this:



Cisco uses a program to manage this environment called, creatively enough, Cisco UCS Manager or UCSM.  To access UCSM, point a browser at the clustered IP address.  Once authenticated, you will be prompted to download a 20MB Java package (yes it is Java, yuck!).  Here is a pic of ours with both chassis powered up.



Notice that both chassis are in the same "pane of glass".  This allows for management of all the blades from one interface and the movement of server profiles (covered later) from one chassis to another within the same management tool.


How does this compare to IBM? 

IBM is a two part answer.

IBM Part One - Single Chassis Interface in AMM

IBM uses a module in each BladeCenter chassis called the Advanced Management Module (AMM).  There can be up to two AMMs in each chassis.  If there are two AMMs, one is active and the other is passive.  They share the configuration and a single IP address on the network.  If the active module fails, the passive module becomes active and communication resumes on the original IP address.  The AMM will control power state, identification, virtual media and remote control out of the box.  Virtual I/O (both WWPN and MAC) is an additional purchased license in the AMM.  The product is called the BladeCenter Open Fabric Manager (BOFM).  I don't know if BOFM supports 10Gb but I know it supports 1Gb Ethernet and 2/4Gb FC.  This is what it would look like with brains in place:


As you can see, each chassis is managed individually.  In my experience, this is the most common configuration I have seen.
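To make the active/passive behavior of the two AMMs concrete, here is a toy model in Python.  It is only a sketch of the idea described above (two modules, one shared address, the standby takes over on failure).  The class names, the module names, and the 192.0.2.10 address are all invented for illustration and have nothing to do with how the AMM firmware actually works.

```python
from dataclasses import dataclass


@dataclass
class ManagementModule:
    name: str
    healthy: bool = True


class ManagementPair:
    """Two modules behind one shared IP; only the active one answers on it."""

    def __init__(self, shared_ip: str,
                 primary: ManagementModule,
                 standby: ManagementModule) -> None:
        self.shared_ip = shared_ip
        self.active = primary
        self.standby = standby

    def fail_active(self) -> None:
        """Mark the active module as failed and promote the standby."""
        self.active.healthy = False
        if not self.standby.healthy:
            raise RuntimeError("no healthy module left to take over")
        # The roles swap; the shared IP itself never changes.
        self.active, self.standby = self.standby, self.active

    def who_answers(self) -> str:
        return f"{self.active.name} answers on {self.shared_ip}"


pair = ManagementPair("192.0.2.10",
                      ManagementModule("AMM-1"),
                      ManagementModule("AMM-2"))
print(pair.who_answers())
pair.fail_active()
print(pair.who_answers())
```

The point of the model is that administrators keep using the same address before and after the failure; only the module behind it changes.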

IBM Part Two - Multiple Chassis Management with IBM Director

IBM does have a free management product called IBM Director that can pull all this together into a single pane of glass.  The blade administration tasks are built into the interface and virtualized I/O is handled through the Advanced BladeCenter Open Fabric Manager.  Advanced BOFM is a Director plug-in and is a fee based product.  Logically it would look something like this:




The downside to this solution is you now have another server in your environment to manage.  In my experience Director is a little flaky at times but I also haven't tried the newest version which is a redesign to address many of the issues.

How does this compare to HP?


HP is a two part answer as well.  I haven't implemented HP's Virtual Connect over multiple chassis, so if you know the answer and can throw some links my way, please do and I will update this section.


(UPDATED!) HP Part One - Single Chassis Interface in Onboard Administrator (OA)


HP's approach is very similar to IBM.  HP's management modules are called the Onboard Administrator and there can be a maximum of two in each chassis.  HP is different from IBM because each module requires an IP address.  At any given time one IP address is active and one IP address is passive.  If you access the passive module on the network, it will tell you that you are on the passive module and instruct you to connect to the active module.  Like the IBM AMM, the OA will control all basic functions such as power state, identification, virtual media, and remote control.  Like IBM, HP has a separate product for virtual I/O called Virtual Connect.  Unlike the IBM and Cisco products, HP's Virtual Connect is implemented at the I/O module level.  The only way to achieve virtual I/O is to purchase the HP I/O modules.  HP's brain mapping is a little different than IBM because you can connect up to four chassis into one interface.  Since you probably won't be able to power more than four chassis in a rack, think of it as consolidation at the rack level.
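As a quick illustration of what a four-chassis limit means in practice, the sketch below splits a list of chassis into groups of at most four, each of which would be its own management interface.  The function name and the chassis labels are hypothetical, and the limit of four is simply the number quoted above passed in as a default.

```python
def split_into_domains(chassis, max_per_domain=4):
    """Group chassis into consecutive batches of at most `max_per_domain`."""
    if max_per_domain < 1:
        raise ValueError("a domain must hold at least one chassis")
    return [chassis[i:i + max_per_domain]
            for i in range(0, len(chassis), max_per_domain)]


# Six hypothetical chassis end up in two separate groups.
all_chassis = [f"chassis-{n}" for n in range(1, 7)]
for number, domain in enumerate(split_into_domains(all_chassis), start=1):
    print(f"domain {number}: {domain}")
```

Anything past the fourth chassis lands in a second group, which is where a higher-level management tool has to come in.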


(UPDATED!) HP Part Two - Multiple Chassis Interface in HP Insight Tools


After you get to four chassis, HP Insight Tools need to be brought in to fulfill the needs.  Based on the comments below it appears that two products will fit the bill.  To manage the chassis and blade functions you will need Insight Dynamics VSE Server Suite and to manage the virtual I/O you will need the Virtual Connect Enterprise Manager product.  Both the Insight Dynamics VSE Server Suite and the Virtual Connect Enterprise Manager are fee based.



Summary

(If you made it this far, I'm impressed!)  Cisco's approach feels very "up to date".  I really like the idea of not having to add another server (and additional fees for virtualized I/O) to the environment for management of the products.  By moving all of the management centrally to the switches you are better able to see the environment and implement a multi-chassis/multi-rack solution.  IBM and HP offer a similar solution that has grown over time but the roots of the interface are in single chassis/rack management.  But, at the end of the day both IBM and HP offer a centralized management solution.

Thoughts?  Concerns?  Please leave a comment!

26 comments:

Ken Oestreich said...

Aaron - nice summary of the Converged Infrastructure solution 'brain locations' out there. I've also attempted a comparison - not at the brain/networking level, but more at the services-provided level. http://bit.ly/jE9Yp

The observation with most of these products is that most of the 'brains' focus on manipulating the networking and IO guts rather than on providing higher-level services Admins care about. Thus, you'd still have to pair with products like BMC or Symantec to get HA/DR services.

In full disclosure, I work for Egenera... and tried to do a comparison to illustrate the 'stack' of products you'd need to use to get equivalent HP, CSCO and Egenera services at http://bit.ly/KrjsH

BTW, Egenera's "brains" live outside the blade enclosure on dual-redundant servers, in software called PAN Manager. That SW also provides all of the IO virtualization and converged network services. Plus HA, DR and provisioning services too.

Kevin said...

Aaron - good post. HP does have software that manages the physical, virtual and virtual connect infrastructure. It is called "Insight Dynamics – VSE Suite". Basically it's a suite of products that includes a capacity advisor, hypervisor integration, Insight Control suite, Server Migration software and integration with Virtual Connect Manager and Insight Orchestration. While it seems like a lot of software, HP does a great job of integrating it all together under a single pane of glass (Insight Control).

In regards to Cisco's UCS, I read in your post the 6120 provides a java based console for management of the hardware, and profile info, but I also know that Cisco resells BMC BladeLogic CM. Do you know what the BladeLogic CM software adds to the mix that doesn't come standard with UCS Manager?

Aaron Delp said...

@Ken - Excellent article! Thank you for sharing the information!

@Kevin - Thank you for the information! So, are you saying that I could/should put in a picture of an Insight Dynamics Server to manage the blades like I did for the Director server? I would be happy to do that if that is the correct architecture. Please confirm when you can.

I'm sorry but I can't comment on the BladeLogic products. I'll ask around. Thank you!

Lane said...

Hey Aaron, as usual a really great summary and I look forward to your more in depth analysis. At my company, MSI, we also sell all three product lines (HP, IBM, and Cisco UCS) and recently got a couple of UCS chassis which I plan to dig into in the months ahead. Also I recently went through HP's Matrix training, which is basically their response to Cisco's UCS. It is a single SKU product that you can purchase which includes a c7000 chassis, a management blade (for running all their "Insight" software to manage the Matrix) - no redundancy for this management blade, which really sucks, and can optionally include an HP EVA for storage, and HP services to tie all the pieces together. The software included on the management blade includes HP Insight Control (iLO), Virtual Connect Enterprise Manager (for managing multiple VirtualConnect Interconnects across multiple chassis), Insight Dynamics, Insight Recovery (for DR - still needs some work), RDP (Rapid Deployment Pack), SMP (Server Migration Pack), Insight Power Manager, and the real "Brains" of the Matrix piece, the Insight Orchestration Suite (which includes an Admin, Designer, and Self Services Portal - all different panes of glass with one that plugs into Systems Insight Manager). I see promise for the product, but now it still feels like a bunch of software cobbled together, but given time it may pan out. The real Achilles heel for both HP and UCS in my opinion is the lack of automated storage provisioning. I think Cisco has a leg up with their scriptable API, but I need to dig a little deeper into that one. Hope that helps, and feel free to email me if you need any more details.

Kenneth Fingerlos said...

A couple thoughts...

First, HP can link chassis in a single rack, which extends the management of a single chassis (OA) to cover all the chassis in a rack, from a single view.

Likewise, a virtual connect domain can include 4 chassis, and manages as a single unit inclusive of all the servers.


The OA provides a portal link into the VC Domain, but it is a separate application.

There is integration of both the OA and Virtual Connect (VCEM) into HP SIM - the latter being a paid-for option.

Djordje said...

With HP Virtual Connect Enterprise Manager you have a central Virtual Connect (VC) management solution, up to 250 VC domains, up to 1000 C7000 blade enclosures and up to 16000 blade servers. :)

Kind regards
Tschokko

richard_paradis said...

HP allows up to four C-level chassis to be interconnected and viewed through Insight Manager in a single pane. At some point in the future they may allow more. VSE may allow more, but we currently don't use/own it so Kevin would probably know more on it.

Aaron Delp said...

Thank you everyone for the comments. Working on an update to the HP information right now.

Aaron Delp said...

The post is now up to date with information based on user comments including new diagrams to better reflect the IBM and HP solutions. Thank you very much to everyone for filling in the gaps!!!

Anonymous said...

Nice summary.

What happens when UCS needs another pair of 6120/6140, because of bandwidth ? (with 20ports in 6120, how many UCS 5108 can it support in real world?)

does UCS have a "central management server" for multiple pairs of 6120/6140? can service profiles fail over from blades controlled by different pairs of 6120/6140?

Anonymous said...

Good stuff. You can see more re BMC and Cisco UCS here http://tinyurl.com/ye9pr2j

Aaron Delp said...

UCS supports 5 chassis per pair of 6120's today. I hear this will be increased in the near future to a greater number and eventually a limit of 40 is the Cisco goal.

I don't know the answer of what happens once that limit is reached but I will reach out to Cisco folks right now to find out. Very good question!

Brad Hedlund said...

Aaron,
You should also depict the Network switch the blade chassis connect to, as well as FC switch.
For IBM & HP you need to depict a separate brain for these elements.
For Cisco UCS, the same brain that controls the server configuration is the same brain that controls the Network & FC SAN, which is extremely powerful for automating total service provisioning.

By the way, HP's Insight Dynamics VSE costs an extra $20K per chassis! The capabilities Insight VSE brings are already included in the standard Cisco UCS architecture at the standard price.

Cheers,
Brad

Aaron Delp said...

Brad - Awesome comments! Thank you very much! I agree completely but I wanted to keep the scope of this post for the brains to just be the management of the blade state and the virtual I/O. I will get into the "other brains" with FC and network in future posts.

I didn't know that about the HP software. I will update that as well shortly.

Thank you!

Kelvin said...

Aaron, your post is getting very interesting!

UCS + BMC solution architecture and "brains" look very much like IBM's and HP's. What's different - BMC is not Cisco's, so someone else's Bigger Brain is controlling the UCS brains, while Director and SIM are IBM and HP products respectively. That should have some impact on deep integration.

I fail to see how Cisco UCS have any features like HP Insight Dynamics.
Can UCS Manager do real-time capacity planning and workload placement across all 3 hypervisors (ESX/XEN/Hyper-V) _AND_ physical servers? This is one of the Insight Dynamic feature. It also provides Capacity Advisor to generate "what-if" analysis for server consolidation to virtual machines.
(ref: http://bit.ly/aqMS7k) This is extremely useful if someone wants to take multi-phase migration to virtual machines.

UCS Manager does not seem to have any of these feature, and will rely on vCenter for only vSphere related performance/capacity monitoring.

Also, UCS provides service profiles for physical blade servers, much like what hp virtual connect provides, but can UCS provide the same abstraction for VIRTUAL machines too? With Insight Dynamics, hp provides an abstraction across both Physical and all Virtual Machines called "Logical Server"; extending that feature, it allows u to automate a DR solution using Insight Recovery for both physical and virtual machines.
ref: http://bit.ly/9qZNL7

Finally, a note on the "same brain" that does network I/O and server config/firmware etc: what happens when there is a bad firmware patch/human error/firmware binaries corrupted, and that brings down both 6120/6140? Regardless of best practices during a firmware patch, Murphy's law applies and bad things happen. Does this mean that 6140/6120 firmware maintenance needs to be scheduled during downtime, with the possible risk of losing control and data access?
Will all the UCS blade servers lose management control AND data access?
Is it risky to depend on 2 FI modules for the control and data access to all 5 or 40 UCS enclosures?
For IBM/HP, if a bad patch goes to the AMM/OA pairs, the network I/O still works...

Aaron Delp said...

Kelvin - Thanks for your comments. You are obviously an HP fan, I get that. I'm not looking to change anybody's viewpoint and I'm not looking to take sides here, just presenting facts.

I will ask you this about the advanced Insight products. Are you, or do you have any customers that are, actually running them in production and happy with them?

The reason I ask is because I have many HP customers and no one using the advanced features. I believe this to be common to both HP and IBM. I'm not saying they are bad or that they don't work. What I am saying is I just don't see anybody that actually uses them. Many people just never make it past the trial stage with them in my experience.

When you ask if UCS provides resources to virtual machines, I believe you are talking about Cisco's Palo adapter for UCS, so I do believe they have an answer for that offering.

On your last point, I haven't tried it yet but I would think you would apply firmware to one 6120, reboot, then apply it to the other. That is the same way IBM applies and the same way you would apply it to the OA's on HP. I have to disagree with you on that point.

I'm not saying UCS is better than HP by any means, but I also don't want FUD in this comments section.

Kevin said...

While you discussed IBM's offering using Open Fabric Manager, you didn't discuss IBM's ability to connect to a Cisco Nexus 5000. I posted a few months ago (http://bladesmadesimple.com/2009/10/how-ibms-bladecenter-works-with-cisco-nexus-5000/) how a user could have a "pass-thru" module to connect into the Nexus 5000, where the Nexus would be the brains - at least for the I/O. Since that blog was written, IBM and Cisco came out with the Nexus 4000i for the IBM BladeCenter (http://bladesmadesimple.com/2009/10/officially-announced-ibm%e2%80%99s-nexus-4000-switch-4001i-part-2/) which still connects to the Nexus 5000, it just uses fewer ports. I'd like to know your thoughts about these two solutions.

Aaron Delp said...

Hey Kevin - Yes, I am making the "Network brains" different from the "management brains". What I mean by management brains is the power state, remote control, and virtual I/O (presenting a virtual MAC and WWPN) to the outside world.

I did an article on the 4001i (terrible name BTW) a ways back and I linked to your articles.

"Sometime" in the future I will be doing an article comparing the network/FC side and in that I will make sure I compare the 4001i to UCS and also the IBM/BNT 10Gb switch which allows IBM to vNIC like Flex-10 and in some ways like Palo.

I really like the 4001i and we are actually proposing one to a customer right now. WAY better solution than a pass-thru unless you just need dedicated 10GB line rate speed to the blade.

Thank you for the comment!

Sal said...

The one thing that has not been discussed is the standards-based, open XML API that comes with UCS. I've been writing applications that interact with it, and it's a very straightforward process to manage and configure multiple UCS pods using just about any programming language. Yes, it's build your own, but it's free, and you can do whatever you want.

Casper42 said...

2 quick notes on the HP c7000.

1) The OAs do NOT have to have 2 separate IP Addresses. With a decent firmware level (its been available for at least a year) you can set the OA to "Enclosure IP mode" and it will take the IP on the Active OA and automatically fail that IP over to the passive OA when there is a change in role.
Look under Enclosure Settings \ TCPIP Settings for a checkbox [x]

2) You can stack more than 4 chassis for the single pane of glass view. It was meant to show you up to 4 Chassis in a single rack, but since its just a Network cable, you can stack horizontally as well. I dont know if there is an upper limit (aside from your patience in watching the inventory process on login) but we have 6 chassis in a stack right now and it works just fine.
The ability to pick and choose which chassis you want to login to at the login screen is great as well. If you have 2 clustered blades, 1 in Rack A and one in Rack B, and you have 4, 6, 8, etc chassis in your stack, you can select only the 2 chassis your blades are in at the login screen and avoid alot of (based on the task at hand) unnecessary detail.
I do really wish there was a way to get a single pane of glass for the Blade Servers though, as opposed to having to hit the + on Device Bays on each chassis and work around the other sub items in each chassis I may not be interested in. Just from your UCS screen shots it looks like the Servers tab does exactly that.

Casper42 said...

PS: Virtual Connect domains, the aggregate for the I/O Layer you mentioned, ARE in fact limited to 4 chassis though. So while you can single pane of glass 8 chassis for server/chassis details, you would need 2 separate VC Domains to manage those 8 chassis when it comes to network and SAN.

On the flip side, if you loaded up those same Chassis with Cisco 3120X switches for the network stack, you can stack up to 9 switches together for something that looks like a virtual 6509.
If you only have 1 layer of Network I/O in each chassis, this means you can have 9 chassis stacked together at the Cisco level with what looks like 2 switches. Very similar to the old End-of-Row design with a pair of CAT-6509s or the Nexus equivalent.

Aaron Delp said...

Casper - Great comments! Thank you for your insight!!

Jeff Allen said...

Full Disclosure: I worked at HP for 8 years as an Infrastructure Architect specializing in Bladesystem and Virtual Connect and I now work at Cisco as a UCS Architect.

Casper 42 said: "The OAs do NOT have to have 2 separate IP Addresses. With a decent firmware level (its been available for at least a year)"

When I last saw this feature, the 2 modules do have different IP addresses, but they are able to swap the "active" address to either module. You could set the standby module to DHCP if you like however (might even be required).

Casper 42 said "You can stack more than 4 chassis for the single pane of glass view. It was meant to show you up to 4 Chassis in a single rack, but since its just a Network cable, you can stack horizontally as well"

Not exactly the intention. You can stack up to 7 (again last time I checked), but that was because you can fit 7 C3000 chassis in the same rack. Since you download the same physical file on HP.COM for the c3000 and c7000, you get some crossover features like this where you could physically arrange 7 c7000 chassis' horizontally. However, the requirement is that the "rack name" be the same in all of the OA's. It would not make good practice to give the same rack name to different chassis residing in physically different racks IMO. If you violate this, OA will complain about a rack name mismatch constantly.

A word on the feature itself though:
What this feature gives you is the ability to "see" all the chassis in a single view. It does not manage them as such. For instance, if I add an SNMP trap setting to Chassis 1, it does not carry over to the other chassis in the view. Similarly, if I add a user to Chassis 1, it does not add it to any other chassis in the view.

Aaron Delp said...

Jeff - Great information! Thank you for the comment!

Anonymous said...

How about UCS features like extended memory? Seems like one of the use cases for UCS is VDI. Thank you.

Aaron Delp said...

I agree with the extended memory comment. Actually using the extended memory for VDI has come up a lot lately. Users are more comfortable with a memory dense blade for VDI than for servers in my experience right now.