VMware has recently announced a new extension for big data called vSphere Big Data Extensions. Currently in public beta, the product integrates Project Serengeti with vSphere. Big Data Extensions is the layer in between, and it shows up in vSphere as a plugin for the vSphere Web Client.
vSphere Big Data Extensions lets you provision, create, manage, and monitor Hadoop-based clusters. The VMware initiative itself is not new, since the company has been talking about Project Serengeti for about two years, but the set of tools that integrates with vSphere is. It enables very fast deployment and management of Hadoop clusters.
So on one side there is the open source Project Serengeti, on the other side there is VMware vSphere, and now vSphere Big Data Extensions sits as a layer in between.
You can check out the Community Beta page at VMware here. From there you can also download the product. Note that there are two versions: one for vSphere Enterprise or Enterprise Plus, and another for the Standard edition of vSphere.
As an additional download there is also a Big Data Extensions Management Server VM. An Administrator guide is available (find it in the links section) that shows what you need and which component to deploy first: the Serengeti vApp has to be deployed first, and then you install the Big Data Extensions plugin. Once deployed, there is a new icon in the vSphere Web Client.
vSphere Big Data Extensions Beta Features
- Deploy Hadoop clusters within vSphere, with the number of instances of your choice.
- Easily manage vSphere resources for use by Hadoop clusters.
- Manage and monitor Hadoop clusters.
- Monitor the Hadoop resource usage.
Pic from the UI (Source VMware Office of the CTO Blog)
Through the web console you are able to:
- Start and stop a Hadoop cluster
- Scale out a Hadoop cluster
- Delete a Hadoop cluster
- View resource usage and use elastic scaling
- Use disk shares to prioritize cluster VMs by setting the disk I/O shares for VMs.
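To give a feel for what prioritizing by disk shares means, here is a minimal sketch of proportional sharing. It is not BDE code: the VM names are hypothetical, the Low/Normal/High values of 500/1000/2000 are the commonly documented vSphere presets and should be treated as an assumption here, and the model is simplified (shares only matter when VMs actually contend for the same datastore).

```python
# Share presets as commonly documented for vSphere disks (an assumption here).
SHARE_PRESETS = {"low": 500, "normal": 1000, "high": 2000}


def split_iops(vm_levels, total_iops):
    """Divide total_iops among VMs in proportion to their share values.

    Simplified model: assumes every VM is competing for the disk at once.
    """
    shares = {vm: SHARE_PRESETS[level] for vm, level in vm_levels.items()}
    total_shares = sum(shares.values())
    return {vm: total_iops * s // total_shares for vm, s in shares.items()}


# Hypothetical VM names, for illustration only.
allocation = split_iops(
    {"hadoop-worker-0": "high", "hadoop-worker-1": "normal", "test-vm": "low"},
    7000,
)
print(allocation)
```

In this toy model a VM with high shares gets twice the I/O of a normal one and four times that of a low one whenever all three are busy.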
Quote from the official VMware Hadoop Page:
Virtualizing Apache Hadoop on vSphere gives users the ability to create and deploy a cluster in minutes while not sacrificing performance. Virtualizing Apache Hadoop on vSphere using BDE also frees enterprises from buying dedicated hardware for Apache Hadoop.
Hadoop and HBase clusters are composed of three different node types: master nodes, worker nodes, and client nodes. To get deeper insight into Hadoop you'll certainly want to study some documentation first. Here are some links related to Hadoop and vSphere Big Data Extensions.
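As a rough illustration of the three node types, the sketch below models a cluster as a list of node groups. The class and function names are hypothetical and this is not the Serengeti or BDE cluster specification format; it only shows how master, worker, and client groups add up to the set of VMs a deployment would create.

```python
from dataclasses import dataclass

# The three node types named in the text.
NODE_TYPES = ("master", "worker", "client")


@dataclass
class NodeGroup:
    """Hypothetical description of one group of identical VMs."""
    node_type: str
    count: int


def summarize(groups):
    """Return the VM count per node type, rejecting unknown types."""
    summary = {t: 0 for t in NODE_TYPES}
    for group in groups:
        if group.node_type not in summary:
            raise ValueError(f"unknown node type: {group.node_type}")
        summary[group.node_type] += group.count
    return summary


# Example layout: one master, several workers, one client.
cluster = [NodeGroup("master", 1), NodeGroup("worker", 5), NodeGroup("client", 1)]
summary = summarize(cluster)
print(summary, "total VMs:", sum(summary.values()))
```

Scaling out a cluster, in these terms, would amount to raising the count of the worker group.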
Links:
- Project Serengeti (Source Forge)
- VMware Hadoop
- The Main documentation set for the beta
It's certainly interesting to see VMware come out with a product here, even if it is not a final release yet. It looks like deploying and configuring a Hadoop cluster on VMware vSphere, along with the related tasks, should become really quick. If you're involved with Apache Hadoop, this is certainly a product to watch.