Shall I use Zones or LDOMs?
Recently (especially since the SPARC T4 release) I have been asked this question several times: "We are running on, or migrating to, T2/T3/T4 servers and are considering the virtualization options for our setup. What should we go for, zones or ldoms?"
Of course one can't answer this question without talking about the platform requirements and the reasons for picking each technology, but before we go into details, let me get the most important statement straight:
Zones and LDOMs are not rival technologies but complementary ones. If you need kernelspace separation, use ldoms. But run your applications in zones within those ldoms anyway!
Let's get some terminology clear first:
- LDOMs are now called Oracle VM Server for SPARC. I will use these terms interchangeably.
- Zones started their lives as project Kevlar, were then named zones, then marketed as containers, and now we are back to zones again.
- LDOMs are the HW-Virtualization technology of the SPARC-T (CMT, Chip MultiThreading, CoolThreads, sun4v, etc.) server series. It is the ability of these servers to be carved up into Logical DOMains, running on a hypervisor that runs in the firmware.
- Zones are the featherweight OS-Virtualization technology of Solaris on all of its platforms (SPARC-T, SPARC-M and x86 too).
- Every T server is running ldoms. Even if you don't partition your box into domains, you are still running one single large ldom, called the primary domain, which encapsulates the complete server.
- Every Solaris 10+ OS installation has one zone, the global zone (GZ). This is where the [shared] kernel[space] runs, and the non-global zones (NGZ) are the containers separating applications in the userspace.
Now, why would you want to run zones?
- Container principle: They cleanly separate your applications from each other, by maintaining for them a separate set of Solaris packages, their dedicated CPU resources, their IP-stack, their filesystems, etc.
- Clean architecture: You won't poison your OS installation in the GZ running on the HW with additional packages/settings. The GZ manages the resources between the zones, runs the kernel, does the scheduling, runs the cluster, manages the devices, etc. The NGZs run the applications.
- Flexibility: You can simply detach a zone from the GZ and attach it to another GZ on another box, including the application. You can easily clone zones too.
- Security: Should an NGZ ever get compromised, the attacker can't bother the GZ or the applications running in other NGZs.
- Resource Management: You can guarantee the amount of CPU shares a zone gets (using the Fair Share Scheduler, FSS), but as long as your CPU pool isn't 100% utilized, every zone can use more than the amount guaranteed to it - that is, you can overcommit your resources.
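To make the share arithmetic in the last point concrete, here is a minimal sketch in Python. It is a simplified proportional-share model, not the actual Solaris scheduler: the function name `fss_allocation`, the zone names and the share values are all made up for illustration. It only shows the principle: under full contention each zone gets CPU in proportion to its shares, and whatever a quiet zone doesn't use is handed to the busy ones.

```python
from typing import Dict


def fss_allocation(shares: Dict[str, int],
                   demand: Dict[str, float],
                   capacity: float = 100.0) -> Dict[str, float]:
    """Split `capacity` (percent of the CPU pool) among zones by their shares.

    Simplified model (an assumption, not the real FSS algorithm): a zone
    never gets more than it asks for, and unused capacity is redistributed
    among the still-hungry zones in proportion to their shares.
    """
    if any(s <= 0 for s in shares.values()):
        raise ValueError("shares must be positive")

    alloc = {zone: 0.0 for zone in shares}
    # Zones with no demand take no part in the distribution.
    active = {zone for zone in shares if demand.get(zone, 0.0) > 0}
    remaining = capacity

    while active and remaining > 0:
        total = sum(shares[z] for z in active)
        fair = {z: remaining * shares[z] / total for z in active}
        # Zones asking for no more than their fair share are fully served.
        satisfied = {z for z in active if demand[z] <= fair[z]}
        if not satisfied:
            # Everyone left wants more than their share: split by shares.
            alloc.update(fair)
            break
        for z in satisfied:
            alloc[z] = demand[z]
            remaining -= demand[z]
        active -= satisfied
    return alloc


shares = {"web": 30, "db": 60, "batch": 10}  # hypothetical zones and shares

# Every zone wants the whole pool: the split follows the shares.
print(fss_allocation(shares, {"web": 100, "db": 100, "batch": 100}))

# "web" is nearly idle: "db" and "batch" may use more than their guarantee.
print(fss_allocation(shares, {"web": 10, "db": 100, "batch": 100}))
```

In the first call the zones end up with 30, 60 and 10 percent. In the second call "web" only takes the 10 percent it asks for, and the remaining 90 percent is split 6:1 between "db" and "batch" (roughly 77 and 13 percent), which is more than either was guaranteed.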
And what are the reasons to run LDOMs?
- Kernel level separation:
  - You might want to run different updates of Solaris 10 within a box.
  - You might want to run Solaris 10 and Solaris 11 right next to each other within a box.
- Live migration: You can't live-migrate zones, but you can live-migrate ldoms.
- Some of your applications might need to run in the GZ, and you don't like the idea of running applications both in the GZ and in its NGZs at the same time, hence you separate them into ldoms.
- You need to reduce the number of vCPUs in a box for licensing reasons. LDOMs are now recognized by Oracle as hard partitions, that is, as license boundaries.
- You don't want your I/O to depend on a single service domain - you can build multipath groups of devices across two service domains that both provide I/O devices.
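The licensing point is easiest to see with a bit of arithmetic. The sketch below is only a back-of-the-envelope helper: `cores_in_domain` is a made-up name, it assumes 8 hardware threads (vCPUs) per core as on the T2/T3/T4 chips, and it assumes the domain gets its vCPUs core by core. How many licenses a given number of cores translates into is up to Oracle's current licensing rules, which are deliberately not modeled here.

```python
import math


def cores_in_domain(vcpus: int, threads_per_core: int = 8) -> int:
    """Number of physical cores a domain occupies.

    Assumption: vCPUs are handed out core by core, so a partially
    used core still counts as a whole core.
    """
    if vcpus <= 0 or threads_per_core <= 0:
        raise ValueError("vcpus and threads_per_core must be positive")
    return math.ceil(vcpus / threads_per_core)


# Hypothetical example: a single-socket T4 (8 cores x 8 threads = 64 vCPUs)
whole_box = cores_in_domain(64)   # application sees the complete server
app_domain = cores_in_domain(16)  # application confined to a 16-vCPU ldom

print(f"unpartitioned box: {whole_box} cores")
print(f"16-vCPU domain:    {app_domain} cores")
```

With the application confined to a 16-vCPU domain, the license boundary shrinks from 8 cores to 2 in this example.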
As you see, these two technologies fulfill different requirements and sit at different levels of your operation stack: ldoms are a HW-virtualization, a host for kernels to run on, and zones are an OS-virtualization, providing containers for your applications to run in.
To give you an idea: run S10 and S11 in ldoms next to each other within the same box, and run branded and native zones on top of them.
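If it helps to picture the layering, here is a small Python model of such a box. Everything in it is illustrative: the class names, the domain and zone names, and the labels "native" and "solaris10-branded" are my own (they are not the brand names you would type into zonecfg), and the compatibility table only contains the combinations mentioned in this post.

```python
from dataclasses import dataclass, field
from typing import List

# Zone flavours each OS release hosts in this model. Only the combinations
# named in the post are listed; this is not a complete support matrix.
MODELED_ZONE_KINDS = {
    "Solaris 10": {"native"},
    "Solaris 11": {"native", "solaris10-branded"},
}


@dataclass
class Zone:
    name: str
    kind: str  # "native" or "solaris10-branded" (illustrative labels)


@dataclass
class LDom:
    name: str
    os: str  # the kernel running in this domain's global zone
    zones: List[Zone] = field(default_factory=list)

    def add_zone(self, zone: Zone) -> None:
        if zone.kind not in MODELED_ZONE_KINDS[self.os]:
            raise ValueError(f"{zone.kind} zone on {self.os} is not modeled here")
        self.zones.append(zone)


@dataclass
class Server:
    model: str
    ldoms: List[LDom] = field(default_factory=list)


def describe(server: Server) -> List[str]:
    """Render the stack top-down: box, then ldoms, then zones."""
    lines = [server.model]
    for dom in server.ldoms:
        lines.append(f"  ldom {dom.name}: global zone running {dom.os}")
        for zone in dom.zones:
            lines.append(f"    zone {zone.name} ({zone.kind})")
    return lines


s10 = LDom("ldom-s10", "Solaris 10")
s10.add_zone(Zone("legacy-app", "native"))

s11 = LDom("ldom-s11", "Solaris 11")
s11.add_zone(Zone("new-app", "native"))
s11.add_zone(Zone("migrated-app", "solaris10-branded"))

box = Server("T4 box", [s10, s11])
print("\n".join(describe(box)))
```

The point of the model is the nesting: the hypervisor hosts the ldoms, each ldom runs exactly one kernel in its global zone, and the applications live one level further up, in the non-global zones.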
To summarize:
The question shouldn't be about zones vs. ldoms. Use zones, they are your friends. The question is whether you partition your T-series SPARC server into ldoms underneath the global zones that host your NGZs.
Especially with Solaris 11, which brings Crossbow, the new network virtualization technology (that enables all your NGZs to have a dedicated IP stack), and the possibility to run Solaris 11 native zones and Solaris 10 branded zones on top of Solaris 11, you have two quite powerful technologies to really get your server's worth. By that I mean a high server utilization: the higher that utilization is, the more you get for your costs.