Overview

 

The Mass Open Cloud (MOC) is a shared cloud platform operated by Boston University (BU), Northeastern University (NU), Harvard University, the University of Massachusetts, and the Massachusetts Institute of Technology in the MGHPCC data center. It currently provides an Infrastructure-as-a-Service offering based on OpenStack. Users can, in a self-service fashion, stand up virtual machines, use object storage, and create on-demand HDFS environments (with Hadoop, Spark, Pig, …).

The MOC provides its users with an alternative to public clouds such as AWS or Azure, for example to:

  • stand up long-running services that can be accessed over the internet (e.g., rich websites),
  • deploy low-level software (e.g., operating systems, specialized libraries) that is incompatible with today’s production HPC clusters, such as institutional ones,
  • stand up private on-demand Big Data environments.

The MOC is not an alternative to existing batch-scheduled institutional HPC clusters (NU Discovery, BU SCC, etc.). Instead, it serves as a complementary service that offers users long-term, interactive use of virtualized and bare-metal resources. Examples of teams and projects currently using the MOC include Dataverse, Worldmap, SESA/EbbRT, the Billion Object Project, the Conclave project on scalable MPC, and large-scale interdisciplinary NLP projects. The MOC is also used for a number of courses in which students need low-level access to virtualized environments, such as the NU/BU cloud computing course and the BU Data Mechanics course.

Services that will soon be available through the MOC include OpenShift (a self-scaling, container-based Platform-as-a-Service with support for most popular web frameworks), Cloud Dataverse (i.e., efficient object-level access to large datasets managed by Dataverse), and a new, simpler-to-use GUI for end users. The MOC provides trusted researchers who require low-level access to computers with a Hardware-as-a-Service offering, which we plan to augment with a testbed containing a number of different accelerators, such as GPUs and FPGAs. We will also provide access to the Northeast Storage Exchange (NESE) as it becomes available.

Upon request, basic MOC accounts are available at no cost to faculty and students from participating institutions. The default quota on the production “Kaizen” OpenStack cloud is 10 instances, 20 vCPUs, 50 GB of memory, 1 TB of storage, and 2 external IP addresses. Additional resources are typically granted upon PI request. To apply for an account, complete the account request form on the MOC web page.
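
As a rough illustration of how a project might check a planned deployment against the default quota before asking a PI for more resources, here is a minimal Python sketch. The `Quota` class, the `over_quota` helper, and the example plan are hypothetical names introduced for illustration; they are not part of any MOC or OpenStack API, and 1 TB is assumed to mean 1000 GB.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Quota:
    instances: int
    vcpus: int
    memory_gb: int
    storage_gb: int
    external_ips: int


# Default Kaizen quota as listed above (1 TB written as 1000 GB; the
# choice of decimal units is an assumption).
DEFAULT_QUOTA = Quota(instances=10, vcpus=20, memory_gb=50,
                      storage_gb=1000, external_ips=2)


def over_quota(request: Quota, quota: Quota = DEFAULT_QUOTA) -> list[str]:
    """Return the names of resources where the request exceeds the quota."""
    return [f.name for f in fields(Quota)
            if getattr(request, f.name) > getattr(quota, f.name)]


if __name__ == "__main__":
    # Hypothetical plan: 4 VMs with 8 vCPUs and 8 GB of memory each.
    plan = Quota(instances=4, vcpus=32, memory_gb=32,
                 storage_gb=500, external_ips=1)
    print(over_quota(plan))
```

An empty list means the plan fits within the default quota; any names returned are the resources for which an increase would need to be requested.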

While the MOC has been highly reliable, and all data is triply replicated in our storage, the service is currently still provided on an as-is basis, and we strongly encourage users to back up their data externally.

Additional Documentation

The storage infrastructure backing the main MOC cluster that Kaizen is part of currently consists of:

  • 4 Dell High Capacity – 480 TB storage (usable 160 TB)
  • 4 Fujitsu High Performance Storage – 136 TB (usable 45 TB)
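
The usable figures above are consistent with the triple replication mentioned earlier: each is roughly one third of the raw capacity. The short Python sketch below reproduces that arithmetic. It assumes the usable capacity is simply the raw capacity divided by the replication factor and rounded down to whole terabytes; the actual accounting may include other overheads.

```python
def usable_tb(raw_tb: int, replicas: int = 3) -> int:
    """Whole terabytes available for data when each object is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_tb // replicas


if __name__ == "__main__":
    # Raw capacities taken from the list above.
    for name, raw in [("Dell High Capacity", 480),
                      ("Fujitsu High Performance", 136)]:
        print(f"{name}: {raw} TB raw, about {usable_tb(raw)} TB usable")
```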

 

Monitoring

Goal

The MOC integrates a production datacenter within a research environment. This unique blend gives researchers the power to collect and use cloud-usage data that most cloud providers either do not collect or explicitly hide (e.g., for competitive reasons). The goal of this working group is to use this opportunity to understand how to simplify and automate the tasks that today’s cloud operators find most difficult or expensive. These include, but are not limited to, detecting and diagnosing problems, ensuring security and privacy, reducing power utilization, and brokering.

Key Working Group Topics

  1. Monitoring & tracing: What data needs to be collected to help admins, devops teams, and tenants with important operational tasks?  How can this data be collected in production settings without incurring noticeable overhead?
  2. Diagnosis: What techniques, such as those that use machine learning, statistics, or queueing theory, can be used to automate diagnosis tasks and automatically put in place fixes?
  3. Brokering: What mechanisms are necessary to help tenants and infrastructure providers choose the best compute, storage, and networking resources for their needs?
  4. Data sharing (cross-cutting with the above areas): How can tenants and providers share information to help with operational tasks?
  5. Security & Privacy: What mechanisms are necessary to keep cloud providers from learning too much about tenants’ applications?

Logistics

Meetings are held every Thursday at 9 am Eastern over GoToMeeting. Meetings are devoted to two topics. The first is the existing monitoring and tracing infrastructure at the MOC and how to improve it. The second is an exploration of what working group members (from both academia and industry) are working on. Members have access to a Google Drive folder with various resources, including past meeting minutes and a list of papers of interest to the group.

If you are interested in participating, please contact Raja Sambasivan <first initial.r.last initial@bu.edu>.

Working Group Members

  • Co-lead: Raja Sambasivan (BU)
  • Co-lead: Alina Oprea (Northeastern)
  • Emre Ates (BU)
  • Ayse Coskun (BU)
  • Rodrigo Fonseca (Brown)
  • Rick Friedman (Cycle Computing)
  • Ravi Gudimetla (Red Hat)
  • Laura Kamfonik (BU)
  • Ashish Kamra (Red Hat)
  • Scott Kelso (Lenovo)
  • Ugur Kaynar (BU)
  • Orran Krieger (BU)
  • Peter Portante (Red Hat)
  • Lily Sturmann (BU)
  • Shuwen (Jethro) Sun (BU)
  • Ozan Tuncer (BU)
  • Ata Turk (BU)
  • Da Yu (Brown)

Communication

December 2017

See the latest news on Kaizen and OpenShift!

April 14, 2017

Four of the talks proposed by the MOC have been accepted to the Boston OpenStack Summit:

  • Monday 8, 11:15am-11:55am, Cloud Dataverse: Data repository software for open clouds
  • Monday 8, 3:40pm-4:20pm, MOC Simple GUI – An “and” to OpenStack Horizon
  • Wednesday 10, 11:00am-11:40am, When Dataverse meets OpenStack…
  • Wednesday 10, 4:30pm-5:10pm, Per API Role Based Access Control

February 14, 2017 (Update)

We are very happy to welcome our student and colleague back home to Boston!

February 2, 2017

Since last Friday, as the world has scrambled to understand and respond to an executive order titled “Protecting the Nation From Foreign Terrorist Entry Into the United States,” the Massachusetts Open Cloud has been grappling with the direct and immediate impacts on the MOC community. A member of our community was traveling abroad when the executive order was signed; as a result, this colleague is indefinitely unable to return to the United States.

The implications of this are real. This student (due to the legal nature of this issue, we are not including a name) is blocked from working on a PhD research project, a joint project with several of our industry and university partners. The impact is personal as well as professional, as the student is currently cut off from an apartment, possessions, and an entire life here in the United States.

The MOC community is working to address the situation with our colleague in whatever capacity it can. At the moment, there is very little that is certain, but we do know that this is having a very disruptive and damaging impact on our colleague and impacting the research plans of the MOC. More broadly, the events of this past weekend are counter to the vision and ideals of the MOC and the broader academic community.

For more on how the executive order is being addressed by each of the MOC academic partners, visit the links below:

http://www.bu.edu/today/2017/trump-immigration-ban/

http://www.northeastern.edu/president/2017/01/28/embracing-our-global-community/

http://www.thecrimson.com/article/2017/1/30/faust-immigration-email/

http://news.mit.edu/2017/mit-responds-trumps-executive-order-travel-0130

http://www.masslive.com/news/index.ssf/2017/01/umass_president_vows_to_help_i.html

 

Orran Krieger, Peter Desnoyers, Piyanai Saowarattitada

2016 MOC Annual Workshop Retrospective

Thank you for attending the 2016 MOC Workshop!

The Second Annual Mass Open Cloud Workshop and MOC Hands-On Session was held December 6 – 7, 2016 at Boston University.

The Workshop on December 6 featured technical presentations, insights from MOC users, working group sessions, and an Industry Partner Roundtable. December 7 featured a limited-capacity MOC Hands-On Session.


Mass Open Cloud Workshop
Tuesday December 6, 2016
Boston University
George Sherman Union, Metcalf Hall (2nd Floor)
775 Commonwealth Ave
Boston, MA 02215

2016 MOC Fall workshop announcement page

Talks:

  • MGHPCC as a platform for MOC, Jim Culbert, MGHPCC (abstract, slides)
  • Resource Federation in a Multi-Landlord Cloud, Kristi Nikolla, MOC (abstract, slides, poster)
  • HIL: an exokernel for the data center, Jason Hennessey, MOC (abstract, slides, poster)
  • Rapid Bare-Metal Provisioning and Image Management, Ravisantosh Gudimetla and Apoorve Mohan, MOC (abstract, slides, poster)
  • BD Cache: Big Data Caching for Data Centers, MOC: Mania Abdi, Northeastern University; Peter Desnoyers, Northeastern University; Emine Ugur Kaynar, Boston University; and Mohammad Hossein Hajkazemi, Northeastern University (abstract, slides, poster)
  • Cloud Dataverse, Mercé Crosas, Harvard University (abstract, slides, poster)
  • Big Data as a Service (BDaaS) and Public Datasets Repository @MOC, Ata Turk, MOC (abstract, slides)
  • What’s in your petabyte?, Chris Hill, MIT (abstract, slides)
  • Elastic HPC on the MOC, Evan Weinberg, MOC (abstract, slides, poster)
  • Case Study of Elastic Slurm, Rajul Kumar, MOC (abstract, slides)
  • High Performance Computing with FPGA-Enhanced Clouds, Martin Herbordt, Boston University (abstract, slides)
  • The Boston Area Research Initiative: Leveraging Modern Digital Data for Research, Policy, and Practice, Dan O’Brien, Northeastern University (abstract, slides)
  • Building the BOP (Billion Object Platform), a System to Lower Barriers to the Access of Large Spatial Datasets, Ben Lewis, Harvard University (abstract, slides)
  • Secure firmware for secure clouds, Trammel Hudson, Two Sigma (abstract, slides)
  • Elastic Cyberinfrastructure for Research Computing, Glenn Bresnahan, Boston University (abstract, slides)
  • MACS and the MOC, Mayank Varia, Boston University (abstract, slides, poster)
  • Hiding communication metadata using mutually distrustful servers, Nickolai Zeldovich, MIT (abstract, slides)
  • Software Design and Development in Research Contexts across Disciplines, Andrei Lapets, Boston University (abstract, slides)
  • Analytics in Cloud Security, Alina Oprea, Northeastern University (abstract, slides, poster)
  • Data Center Power Analytics, Ayse Coskun, Boston University (abstract, slides, poster)
  • Networking Marketplace Developments, Rodrigo Fonseca, Brown University (abstract, slides, poster)
  • Enabling big data workflows over distributed, federated data, Nikolaj Volgushev, Boston University (abstract, slides)
  • SESA and the MOC, Jim Cadden, Boston University (abstract, slides)
  • Secure DevOps for Government in MOC, Peter Walsh, Jackpine Technologies (abstract, slides, poster)
  • Hanscom Academic Cloud Team (HACT) Program, Brenton Byrd-Fulbright and Tony Janeczek, US Air Force Life Cycle Management Center (abstract, slides)
  • MOC Development, Piyanai Saowarattitada, Mass Open Cloud (slides)

Additional Posters:

Current work on Cloud Dataverse

-by Anuj Thakur

This post describes ongoing work on the Cloud Dataverse project. Cloud Dataverse is a Dataverse installation running on an MOC OpenStack VM. It can harvest metadata and replicate the data files from an OAI set of any Dataverse installation. To do so, follow the guide for creating an OAI client and run a normal harvest. At the end of the harvest, the data files for every dataset are stored locally. "Locally" here means the storage area for the OpenStack VM, i.e., the Swift service endpoint of the MOC production environment.

Current work focuses on improving the performance of the upload functionality and on the concept of a “cloud-enabled dataset”. Such a dataset will be marked as cloud enabled, and the back end will reflect this. Once a dataset is cloud enabled, it will be included in an OAI set created specifically for replicating datasets to the Cloud Dataverse installation. Each time the harvest is run, the Cloud Dataverse installation will synchronize with the changes made on the source Dataverse installation.
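
For readers unfamiliar with OAI-based harvesting, the following Python sketch shows roughly what a set-restricted harvest request looks like at the protocol level. It builds an OAI-PMH `ListRecords` URL limited to one set, optionally with a `from` date so that a repeated harvest only asks for records changed since the previous run. The base URL, the set name `cloud_enabled`, and the helper `list_records_url` are hypothetical and used only for illustration; the Dataverse harvesting client issues its own requests, and this is not its implementation.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     set_spec: str,
                     metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL restricted to one set."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    }
    if from_date is not None:
        # Selective harvesting: only records changed on or after this date.
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical source installation and set name.
    base = "https://dataverse.example.edu/oai"
    print(list_records_url(base, "cloud_enabled"))
    print(list_records_url(base, "cloud_enabled", from_date="2017-01-01"))
```

The sketch only constructs the request; fetching the response, following resumption tokens, and copying the data files into Swift are left out.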