Micro-talks II

March 2, 2020, 3:30 – 4:30 PM

Harvard Data Commons

Abstract: A Data Commons brings together data with cloud computing infrastructure and software tools and services to provide an integrated solution for research data management for a community. The Harvard Data Commons plans to integrate research tools used actively by the Harvard community with data shared and managed using Harvard Dataverse and access to cloud computing and storage provided by the New England Research Computing (NERC) and Northeast Storage Exchange (NESE). This proposed model for a Data Commons at Harvard could be applied to other institutions and research organizations in New England.
Presenter: Mercè Crosas is the University Research Data Management Officer, with Harvard University Information Technology (HUIT), and Chief Data Science and Technology Officer at Harvard’s Institute for Quantitative Social Science (IQSS). Before re-joining Harvard in 2004, Dr. Crosas worked for six years in the educational software and biotech industries, initially as a software developer, and subsequently as director of the software development team. She contributed to the development of lab information management systems (LIMS) for SNP discovery and genotyping and mass spectrometry. Before that, she spent six years at the Harvard-Smithsonian Center for Astrophysics, first as a pre-doctoral fellow for her Ph.D. in Astrophysics from Rice University, and later as a post-doctoral fellow, researcher, and software engineer with the Radioastronomy division. There she worked on Monte Carlo simulations of radiative transfer in evolved stars and contributed to the software for the Submillimeter Array interferometer. She earned a B.S. in Physics from the Universitat de Barcelona, Spain.

RSpace electronic lab notebook: an inter-operable active data management tool forming part of the Harvard Data Commons, and plans for a pilot on NERC

Abstract: The talk will first give a brief overview of RSpace, an electronic lab notebook used in managing active research data. It will go on to explain how RSpace fits into the Harvard Data Commons ecosystem of research tools. Finally, it will discuss how having an ELN local to the datacenter where the storage and compute reside could be transformative, and present plans for integrating RSpace with Jupyter notebooks and deploying it on NERC.

Presenter: Rory Macneil is founder and CEO of Research Space, which provides the RSpace electronic lab notebook. RSpace serves as a digital data hub for managing data produced in life sciences and related research fields, in academic and commercial settings. The connected approach that informs the design and development of RSpace reflects Rory’s passion for providing the research community with an ecosystem of inter-operable tools that work together to support a wide range of workflows. Rory is based in Cambridge.

The Open Storage Network: Distributed Storage Cyberinfrastructure for Data-Driven Science

Abstract: The Open Storage Network (OSN) is a network of storage nodes distributed across the US that is designed to simplify sharing of active scientific data sets. While other uses may emerge over time, the OSN is intended initially to serve two principal needs: (1) facilitate smooth flow of large data sets between data and computing resources such as instruments, synthetic data projects, campus or regional data centers, and cloud providers; and (2) make it easy to expose long tail data sets to the entire scientific community. The talk will describe the current pilot deployment, which consists of Ceph storage nodes at five sites across the US, and some of the scientific use cases that are starting to use the pilot system.

Presenter: John Goodhue is the Executive Director of the Massachusetts Green High Performance Computing Center (MGHPCC), which is dedicated to supporting the growing scientific computing needs of faculty-driven research at MIT, University of Massachusetts, Boston University, Northeastern University, and Harvard University. John is a business and technical leader with 30 years of experience in networking and high performance computing. He has held senior engineering management, general management, and technology leadership positions at Cisco Systems, where he led the development and marketing of Internet routers for service providers, and BBN Technologies, where he led projects to develop Internet routing and High Performance Computing technologies. He has also been on the early management teams for several Boston-area startup companies. John holds a B.S. in Computer Engineering from the Massachusetts Institute of Technology.

What is FABRIC?

Abstract: FABRIC is a unique national research infrastructure to enable cutting-edge and exploratory research at scale in networking, cybersecurity, distributed computing and storage systems, machine learning, and science applications.

It is an everywhere-programmable nationwide instrument composed of novel extensible network elements equipped with large amounts of compute and storage, interconnected by high-speed, dedicated optical links. It will connect a number of specialized testbeds (5G/IoT PAWR, NSF Clouds) and high-performance computing facilities to create a rich fabric for a wide variety of experimental activities.

This talk will be an introduction to FABRIC with emphasis on establishing collaboration with cloud testbeds and facilities in Massachusetts.

Presenter: Paul Ruth is an Assistant Director for Network Research and Infrastructure at RENCI-UNC Chapel Hill. He is a core member of the RENCI team that has an established history of building and supporting cloud and networking testbeds for computer systems and networking researchers. He spent many years working on the NSF ExoGENI testbed and is now using that experience to enable the widest array of experiments possible on NSF Cloud Chameleon and NSF’s newly funded nationwide programmable core networking testbed called FABRIC.  He is excited about the future of these, and other, testbeds and is working toward enabling experiments that span several testbeds composing large federated experimental facilities.

Where is Ironic, and where is it going?

Abstract: OpenStack Ironic is the Bare Metal as a Service project within the OpenStack community. This foundation, developed and maintained by the open source community, helps orchestrate bare metal resources. The community has made great strides as we see further adoption, increased usage, and new use cases appearing. So the question becomes: where are we, and where have we been?

Presenter: Julia Kreger is a Principal Software Engineer with Red Hat. While she has only been with Red Hat for a few years, she has been working on the OpenStack Ironic project for the past five years. Improving the quality of life of data center operators is one of her passions, for she is a former data center operator herself. She has been the Ironic Project Team Leader (PTL) for the last few development cycles and currently serves on the OpenStack Board of Directors. She has previously served on the OpenStack Technical Committee (TC).