Bare Metal Imaging (BMI) is a core component of the Mass Open Cloud that (i) provisions large numbers of nodes as quickly as possible while preserving multitenancy through the Hardware Isolation Layer (HIL) and (ii) brings to bare metal the image management techniques that virtual machines already support, with little to no impact on application performance.
Imagine thousands of nodes in a data center that supports a multitenant bare metal cloud. We need a central image management service that can quickly image the nodes to meet each customer’s needs. When a customer’s project completes, the data center administrator should ideally be able to reallocate the resources within a few minutes for use by another customer. Today, these techniques are in use for virtual machines (VMs), but not for bare metal systems. This project aims to bridge that gap by creating a service that addresses these issues.
Bare metal systems that support Infrastructure as a Service (IaaS) solutions are gaining traction in the cloud. Some of the advantages include:
- Stronger isolation than containers or VMs provide.
- More predictable and stable performance than VMs or containers, especially on input/output (I/O) intensive workloads such as Hadoop jobs, which need predictable storage and network I/O.
- Access to the benefits of cloud services, such as economies of scale. Today it is VMs that offer scalability and elasticity, with customers paying for the resources they consume.
- Bare metal nodes can also host other IaaS services such as OpenStack, or applications such as HPC that can consume idle CPU cycles.
The main concerns with bare metal systems are the inherent slowness of provisioning nodes and the lack of features such as cloning and snapshotting.
This project proposes a system that includes all of the above advantages and also addresses the fast provisioning issue for a bare metal system. Using this system, we propose to provision and release hundreds of nodes as quickly as possible with little impact on application performance.
Current BMI (IMS) Architecture
The current design consists of a pluggable architecture that exposes an API for such features as:
- provision – Provisions a physical node with the given image
- snapshot – Snapshots the given image so that it can be used as a golden image
- rm – Removes the image from project
- list – Lists the images available in project
- upload – Uploads the image to library
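The five operations above can be pictured with a small in-memory model. This is only an illustrative sketch: the class `ImageLibrary`, its method signatures, and the project and image names are assumptions made for this example and are not BMI's actual API.

```python
class ImageLibrary:
    """Hypothetical in-memory stand-in for a per-project image library."""

    def __init__(self):
        # Maps (project, image name) -> image contents.
        self._images = {}

    def upload(self, project, name, data):
        """Add an image to the project's library."""
        self._images[(project, name)] = data

    def list(self, project):
        """Return the names of the images available in the project."""
        return sorted(n for (p, n) in self._images if p == project)

    def snapshot(self, project, name, snapshot_name):
        """Record the current state of an image under a new name."""
        self._images[(project, snapshot_name)] = self._images[(project, name)]

    def rm(self, project, name):
        """Remove an image from the project."""
        del self._images[(project, name)]

    def provision(self, project, node, name):
        """Associate a physical node with an existing image."""
        if (project, name) not in self._images:
            raise KeyError(f"no image {name!r} in project {project!r}")
        return {"node": node, "image": name}


library = ImageLibrary()
library.upload("bigdata", "hadoop", b"os-image-bytes")
library.snapshot("bigdata", "hadoop", "hadoop-golden")
assignment = library.provision("bigdata", "node-07", "hadoop-golden")
```

A real service would persist images in a storage back-end and check that the caller owns the node; the model only shows how the operations relate to one another.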
We use Ceph as a storage back-end to save OS images. For every application we support, we have a “golden image,” which acts as the source of truth. When a user logs in and requests a big data environment, we clone from this golden image and provision nodes using the cloned image and a PXE bootloader. The Hardware Isolation Layer (HIL) serves as a network isolation tool through which we achieve multitenancy. HIL also provides a service for node allocation and deallocation. For more details about HIL, please visit https://github.com/CCI-MOC/haas.
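The flow described above (allocate a node, clone the golden image, boot the node from the clone) can be sketched as follows. Everything here is a simplified assumption for illustration: `NodePool` stands in for the allocation service that HIL provides, `GoldenImageStore` stands in for the Ceph-backed image store, and none of the names or signatures are taken from the HIL or BMI code.

```python
import itertools


class NodePool:
    """Hypothetical stand-in for HIL's node allocation and deallocation."""

    def __init__(self, nodes):
        self._free = list(nodes)
        self._owner = {}  # node -> project that currently holds it

    def allocate(self, project):
        if not self._free:
            raise RuntimeError("no free nodes")
        node = self._free.pop(0)
        self._owner[node] = project
        return node

    def release(self, node):
        del self._owner[node]
        self._free.append(node)


class GoldenImageStore:
    """Hypothetical stand-in for the image store holding golden images."""

    def __init__(self):
        self._golden = {}  # application -> golden image name
        self._parent = {}  # clone name -> golden image it was cloned from
        self._ids = itertools.count(1)

    def register_golden(self, application, image):
        self._golden[application] = image

    def clone(self, application):
        # Each node gets its own clone; the golden image is never modified.
        parent = self._golden[application]
        name = f"{parent}-clone-{next(self._ids)}"
        self._parent[name] = parent
        return name

    def parent_of(self, clone):
        return self._parent[clone]


def provision(pool, store, project, application):
    """Allocate a node and pair it with a fresh clone of the golden image."""
    node = pool.allocate(project)
    image = store.clone(application)
    # In the real system the node would now network-boot (PXE) from the clone.
    return {"project": project, "node": node, "image": image}


pool = NodePool(["node-1", "node-2"])
store = GoldenImageStore()
store.register_golden("bigdata", "hadoop-golden")
first = provision(pool, store, "tenant-a", "bigdata")
```

Because every node boots from its own clone, the golden image stays untouched and a released node can be handed to another project without re-imaging a local disk.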
- Prof. Orran Krieger, Boston University
- Prof. Gene Cooperman, Northeastern University
- Prof. Peter Desnoyers, Northeastern University
Research Scientist(s), Postdoc(s) and Engineer(s)
- Dr. Jason Hennesey (Postdoc – now at NetApp Inc.)
- Nasibeh Teimouri (Ph.D. Student)
- Ugur Kaynar (Ph.D. Student)
- Sahil Tikale (Ph.D. Student)
- Ravi Santosh Gudimetla (Masters Student – now at Red Hat Inc.)
- Sourabh Bollapragada (Masters Student – now at Arista Networks)
- Daniel Finn (Masters Student – now at Wayfair)
- Pranay Surana (Masters Student – now at Lighthouse AI)
- Sirushti Murugesan (Masters Student)
- Explore the sensitivities of applications such as OpenStack and HPC when running on a network-mounted system.
- Explore moving operating systems seamlessly from physical to virtual systems.
- Integrate support for attestation infrastructure for the secure cloud.
- A functional BMI on the Engage1 cluster without modifications to existing infrastructure.
- Ongoing support for the Secure Cloud project.
- Automated install setup for Red Hat and other operating systems.
- Performance evaluation for iSCSI server: IET vs TGT
- Automated BMI install setup for Ubuntu.
- Multi-tenant iSCSI using TGT as backend.
- Built a simple scheduler for dynamically moving nodes across clusters.
- Performed experiments on improving overall datacenter utilization.
- Built a proof-of-concept CI integration using Jenkins in OpenShift.
- Built custom OpenStack and HPC scripts for dynamically adding nodes to or removing nodes from a cluster.
- Enhancements to the code base, e.g., rewrote the exception and database classes.
- Poster accepted to SC’16 with initial findings on datacenter utilization improvements using BMI and HIL (close to 20% improvement with simple scheduler policies).
- Performance analysis of BMI vs. Ironic and other provisioning systems – In progress
- Continuous integration using Red Hat OpenShift – In progress
- Publish BMI paper – In progress
- Making BMI pluggable so that it works with different network isolators, iSCSI servers, and storage back-ends.
- Testing/Deploying BMI in production MRI Cluster
- Exploring iSCSI multipathing for load balancing and fault tolerance
- Exploring security issues in providing a publicly available provisioning service using BMI
- Improving the overall user experience (UX) by providing a single interface that can talk to both HIL and BMI.
- Enhancements to the code base, e.g., complete unit tests for iSCSI, HIL, etc. – PRs on the way.
- Exploring other features of true virtualization on physical nodes like suspend and resume operations.
- Exploring other scheduler framework policies to further improve datacenter utilization.
Planning and Getting Involved
To get involved in this project, please send an email to (MOC team-list) and/or join the #moc IRC channel on Freenode.