34th International Conference
on Massive Storage Systems
and Technology (MSST 2018)
May 14 — 16, 2018

Sponsored by Santa Clara University,
School of Engineering


Since the conference was founded by the leading national laboratories, MSST has been a venue for massive-scale storage system designers and implementers, storage architects, researchers, and vendors to share best practices and discuss building and securing the world's largest storage systems for high-performance computing, web-scale systems, and enterprises.
    



Hosted at
Santa Clara University
Santa Clara, CA


2018 Conference


In 2018, MSST focused on distributed storage system technologies, including persistent memory, long-term data retention (tape, optical disks...), solid state storage (flash, MRAM, RRAM...), software-defined storage, OS- and file-system technologies, cloud storage, big data, and data centers (private and public). The conference focused on current challenges and future trends in storage technologies.

MSST 2018 included a day of tutorials and two days of invited papers. The conference was held, once again, on the beautiful campus of Santa Clara University, in the heart of Silicon Valley.



2018 Program

Tutorial, Monday, May 14th
9:00am—1:00pm

LOCKSS (Lots of Copies Keeps Stuff Safe)
Workshop on Distributed Digital Preservation


This workshop will focus on:

  1. Trends in digital preservation
  2. Information resources
  3. Distributed digital preservation concepts
  4. LOCKSS preservation features and storage framework
  5. The methodology of the LOCKSS open source software re-architecture
  6. Open discussion: Opportunities for technology collaboration

LOCKSS Background:


LOCKSS is a leading technology for peer-to-peer distributed digital preservation. LOCKSS Java software is going through a major update and revision that will affect existing LOCKSS users and also future community innovators. Originally designed and launched in the late 1990s, LOCKSS was utilized in the library community to help maintain and preserve eBooks and eJournals. The present software re-design will change not only the focus and capabilities of the LOCKSS technology, but also the collaborative partnerships, support structure, and market focus. It will make LOCKSS technology applicable to new types of users, integrations, and content.

Instructors:


Thib Guicherd-Callin, LOCKSS Development Manager, Stanford University Libraries (bio)

Nicholas Taylor, Program Manager, LOCKSS and Web Archiving, Stanford University Libraries (bio)

Art Pasquinelli, LOCKSS Partnership Manager, Stanford University Libraries, Preservation and
Archiving Special Interest Group (PASIG) Steering Committee (bio)

Presentations: 1, 2, 3, 4



Tutorial, Monday, May 14th
2:00pm—6:00pm

Big Data for Big Problems
(and the technology that underpins it)


This tutorial will present a Big Data Analytics project as a case study in handling big data and the software technology that is used to construct it. The platform is the key component of a comprehensive research program at Georgetown University to enable transformative multidisciplinary integrative research. The tutorial covers research methodologies and computational techniques to model, process, and analyze massive amounts of data efficiently, while assuring privacy. Target data technologies that will be covered include big data, data privacy, and high-assurance systems.

This class is for data scientists, system engineers and technical team managers.

Tutorial objectives:

Class Structure:

  1. Overview of the architecture
  2. The Toolkit
  3. The Visualization Utility (Advance VU)
  4. The ontology and taxonomies
  5. The road-map for the future, questions, and discussion

Instructors:


Norman R. Kraft, Georgetown University

Helen E. Karn, Georgetown University

Stephen Baird, AdaCore

Presentations: 1, 2



Invited Track, Tuesday, May 15th
Welcome Address
Dr. Alfonso Ortega, Dean, School of Engineering, Santa Clara University
Persistent Memory, NVM Programming Model and NVDIMMS (presentation, video)
Dr. Thomas Coughlin, President, Coughlin Associates (bio)
SNIA, its technical work, and its outreach initiatives are key contributors to an ecosystem driving system memory and storage into a single, unified “persistent memory” entity. Learn how the SNIA Non Volatile Memory Technical Work Group is delivering specifications describing the behavior of a common set of software interfaces that provide access to non volatile memory, and how hardware and OS support persistent memory today.
Erasure Coding for Scale-Out, Shared Storage Infrastructures in Converged Data Centers
(presentation, video)
James Jackson, Excelero (bio)
Storage capacity and performance requirements are higher than ever. The flash revolution enables meeting both requirements without increasing, in most cases even decreasing, the storage footprint. This makes individual drives even more valuable as they store more and more data, which requires protection. Data protection can be achieved through replication or erasure coding. Both methods have their benefits and costs: mirroring has virtually no impact on performance but there is a higher cost as each copy requires more capacity. Erasure coding drastically reduces the capacity requirement, for similar or better data protection levels, but you have a performance trade-off. In this talk, we will show how to pay less for the same level of protection, without the performance trade-off.
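As a rough illustration of the capacity trade-off described above, the following sketch compares the raw capacity consumed by replication with that of a k+m erasure code. The function names and the example parameters (3 copies, 8 data + 2 parity fragments, 100 TB of user data) are assumptions chosen for illustration; they are not taken from the talk.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when keeping `copies` full replicas."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies)


def erasure_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data with k data and m parity fragments."""
    if k < 1 or m < 0:
        raise ValueError("k must be at least 1 and m must be non-negative")
    return (k + m) / k


def raw_capacity_tb(user_tb: float, overhead: float) -> float:
    """Raw capacity (TB) needed to hold `user_tb` of user data at a given overhead."""
    return user_tb * overhead


# Hypothetical comparison: both schemes survive the loss of any two devices
# (3 replicas tolerate 2 lost copies; 8+2 tolerates 2 lost fragments).
three_way = raw_capacity_tb(100, replication_overhead(3))  # 300.0 TB raw
ec_8_2 = raw_capacity_tb(100, erasure_overhead(8, 2))      # 125.0 TB raw
```

Under these assumed parameters the erasure code needs well under half the raw capacity of three-way replication for the same failure tolerance; the sketch says nothing about the performance cost the abstract mentions.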
Cross-Cloud Distributed File Systems: Bursting File System Dependent Workflows
to the Public Cloud (presentation)
Dr. Allon Cohen, VP, Elastifile (bio)
As enterprises seek to handle more projects in shorter time frames, the benefits of cloud integration have become increasingly apparent. The scalability and elasticity of cloud infrastructure make it an ideal solution for scaling compute resources and/or for offloading surplus jobs. In this session, we will explain how Elastifile's data management platform enables enterprises to efficiently burst file system dependent workloads to cloud...allowing them to apply cloud resources to existing tools and scripts, without application refactoring.
Scaling Out Data Center Storage (presentation, video)
Ted Deffenbaugh, Seagate
As data grows, the desire to scale out at the lowest cost to store everything has become the norm. This presentation will cover some of the technical and business fundamentals that enable lower physical hardware costs. We'll look at how hard drive performance can lower the overall cost at the technology level.
Building cost-effective data density in the pre-DNA-storage Era (presentation, video)
Mark Pastor, Quantum Corporation (bio)
Another speaker will discuss DNA as a storage medium. Its appeal is extreme density, very low cost, and ease of replication. While we are waiting for the commercialization of DNA storage, what is the best option today for durable, exabyte scale storage? What storage technology is dominant for these use cases? What trends are we seeing today related to use cases, and advances in high-capacity storage such as density, low cost, performance, durability and integration with high performance storage?
Programming Models for Accessing NVM Over RDMA (presentation, video)
Wendy Elsasser, Arm
Application programming models for using RDMA capable networks have begun to incorporate support for persistent memory. Because RDMA APIs have been designed under the assumption that the remote data being accessed is, in fact, byte addressable memory, the facility to use NVM as storage (or something storage-like) opens up new challenges and opportunities. This talk will give a comparison and overview of the current state of libraries and programming models for NVM access over RDMA networks.
Incorporating NVM into Data-Intensive Scientific Computing (presentation)
Dr. Philip Carns, Argonne National Laboratory
Two concurrent trends are motivating the HPC community to rethink scientific data service architectures: the emergence of NVM devices with radically different performance characteristics, and a growing interest in specialized data services that provide performance, convenience, or features beyond those of a conventional file system. The convergence of these trends will bring about a fundamental shift in the productivity of data-intensive scientific computing, but only if we capitalize on NVM characteristics through the use of efficient, portable, and flexible interfaces that complement HPC network and CPU capabilities. This talk will highlight those challenges from an HPC perspective and discuss how the state of the practice can be adapted to meet them.
Programming with Persistent Fabric-Attached Memory (presentation, video)
Dr. Kimberly Keeton, Hewlett Packard Enterprise
Recent technology advances in high-density, byte-addressable non-volatile memory (NVM) and low-latency interconnects have enabled building large-scale systems with a large disaggregated fabric-attached memory (FAM) pool shared across heterogeneous and decentralized compute nodes. In this model, compute nodes are decoupled from FAM, which allows separate evolution and scaling of processing and fabric-attached memory. The large capacity of the FAM pool means that large working sets can be maintained as in-memory data structures. The fact that all compute nodes share a common view of memory means that data sharing and communication may be done efficiently through shared memory, without requiring explicit messages to be sent over heavyweight network protocol stacks. Additionally, data sets no longer need to be partitioned between compute nodes, as is typically done in clustered environments. Any compute node can operate on any data item, which enables more dynamic and flexible load balancing.

This talk will describe the OpenFAM API, an API for programming with persistent FAM that is inspired by partitioned global address space (PGAS) models. Unlike traditional PGAS models, where each node contributes local memory toward a logically shared global address space, FAM isn't associated with a particular node and can be addressed directly from any node without the cooperation or involvement of another node. The OpenFAM API enables programmers to manage memory allocations, access FAM-resident data structures, and order FAM operations. Because state in FAM can survive program termination, the API also provides interfaces for naming and managing data beyond the lifetime of a single program invocation.
APIs for Persistent Memory Programming (presentation, video)
Andy Rudoff, Intel
This talk will cover the current state of persistent memory APIs available on various operating systems. It will describe the low-level APIs provided by the operating system vendors, as well as higher level libraries and language support, covering a variety of use cases for persistent memory.
MarFS and Multi-Tier Erasure (presentation, video)
Garrett Ransom, Los Alamos National Laboratory (bio)
The ever-increasing bandwidth and capacity requirements of HPC data sets have spurred the development of innovative solutions throughout the storage stack. MarFS, an open source file system providing a near-POSIX interface and metadata/data storage abstraction, has leveraged object storage semantics and a multi-tiered erasure structure to provide resilient data storage parallelized across tens of petabytes of commodity disk. Known as MarFS Multi-Component, this implementation serves as the near-archive, Campaign Storage tier at LANL. Providing months of residency, the system offers a cheaper and faster alternative to traditional tape archiving solutions. This talk will focus on the design principles, production experiences, and future development of Campaign Storage and the MarFS Multi-Component system.
Ten Year Storage Technology Landscape for HDD, NAND, and Tape (presentation, video)
Dr. Robert E. Fontana, Jr., IBM Systems (bio)
Data in the “Cloud” are stored in isolated regions or bits on media in storage components, i.e. in magnetic bits on tape media in “Linear Tape Open” (LTO) cartridges, in magnetic bits on disk platters in hard disk drives (HDD), or in electric charge levels on silicon-based Flash Memory (NAND) chips. This paper examines the storage landscape associated with technologies that produce TAPE, HDD, and NAND components by presenting the past 10-year trends in these storage technologies, with all data available from publicly available sources. The landscape includes both economic metrics, i.e. component revenue and bit shipments, as well as technology metrics, i.e. bit densities, cost per bit, and manufacturing capacity. Using these data, future trends that may influence storage in the “Cloud” are then projected.
Short Talks
Attendees and vendors can sign up in advance, or at the conference, to give 5-15 minute
works-in-progress or summary updates on work of interest to conference attendees.
Modernizing Xroot Protocol (presentation)
Dr. Michal Simon, CERN (bio)
XRootD is a low-latency, file-access framework based on a scalable architecture and a robust communication protocol. It is widely used in the high-energy physics community, notably at CERN. This talk will outline recent advancements in the xroot protocol, including TLS encryption, support for extended attributes and latency-reducing enhancements like vector writes and bundled requests.
Cluster-Aware RAID in Linux (presentation, video)
Guoqing Jiang, SUSE (bio)
The cluster multi-device (Cluster MD) is a software-based RAID storage solution for a typical Linux cluster. This short talk will describe the background and current status of the feature.


Invited Track, Wednesday, May 16th
Memory Technologies and the Evolution of Distributed Storage Software
(presentation, video)
Dr. Peter Braam, Campaign Storage, LLC (bio)
A whirlwind of new memory and storage devices has begun and will continue to change data centers. Handling multiple storage tiers, performance aligned with that of memory, and new consistency models illustrate the breadth of new requirements for storage software. In the context of large-scale HPC, we will review how storage systems have changed and overcome many difficulties. From there we proceed to look at what is planned and anticipated going forward, indicating roles for technologies such as containers, file systems, object storage and access libraries. This is an area with many exciting opportunities and presently only a handful of solutions are available or under development.
Scale Challenges of the MeerKAT Radio Telescope (presentation, video)
Thomas Bennett, Square Kilometre Array (bio)
This talk discusses the MeerKAT Radio Telescope, currently nearing completion in the Karoo desert region of South Africa. It begins with a quick introduction to radio astronomy data processing and the scale challenges inherent therein. The solutions to the challenges posed will be discussed, with a particular focus on the various data storage regimes in our processing and analysis pipelines. Our use of CEPH, including our self-built hardware and 20PB science archive, will be the central theme.
Building Extreme-Scale File Services in the Oracle Public Cloud
Ed Beauvais, Oracle (bio)
Do you have to choose between scalability and performance? Modern big data applications are generating massive amounts of structured, unstructured and streaming data, which can overwhelm traditional storage platforms. Oracle Cloud Infrastructure is a public cloud that is built to enable both enterprise applications and next generation workloads. This talk covers Oracle’s Cloud Storage offerings with particular focus on Oracle’s newest storage service. Oracle’s File Storage Service was just recently launched and is designed from the ground up with full elasticity to support dynamic application workload challenges. Learn how our architecture is ready to enable performance, security and scale for file-specific workloads.
Maintaining a Large Scale, Very Active Tape Archive (presentation, video)
Stephen Richards, European Centre for Medium-Range Weather Forecasts (bio)
ECMWF is a research and operational centre that produces medium and seasonal weather forecasts for its European member states and many organisations around the world. The Centre focuses its research efforts on improving its global weather model, using HPC with a rich archive of past observations and experimental data. To support this it maintains one of the most active tape archives, providing an affordable means to address 240PB. The presentation will be a short review of the current storage environment, together with a summary of our LTO-7 and IBM TS11xx tape testing. It will also cover the options we are considering for moving the archive to our new data centre in Italy and outline the changes we plan to make in the primary and secondary tape environments.
The Medium-Term Prospects For Long-Term Storage (video)
David Rosenthal, Stanford University Libraries, Retired (bio)
At scale, storage is organized as a hierarchy, with small amounts of "hot" storage at the top, and large amounts of "cold" storage at the bottom. The hot layers have been evolving rapidly as flash displaces spinning disk; the cold layers, not so much. Will this change in the medium term? What are the factors driving this part of the storage market?
Panel: Metadata Management at Scale
Chair: Wendy Poole
Managing Lustre Metadata with HDFS
Aaron Steichen, ExxonMobil Technical Computing Company (bio)
HPC applications generate massive amounts of files and data. This presents challenges for managing the file systems. The MySQL-based Robinhood Policy Engine is not an ideal fit for custom queries and becomes harder to rely on as inode count increases. We wanted a quicker, more flexible way to analyze the data on our Lustre file systems. We attempted to replace Robinhood's functionality with an HDFS solution. We found that while HDFS was not a good fit for the depth first searches required to rebuild file paths, the custom queries ran orders of magnitude faster in HDFS once the paths were pre-generated in MySQL. The increased speed of the queries and flexibility of the data structure has allowed us to run queries that were not feasible before putting the data in HDFS.
JGI Archive and Metadata Organizer (JAMO) (presentation)
Chris Beecroft, Lawrence Berkeley Laboratory (bio)
The Department of Energy Joint Genome Institute (JGI) is a national user facility that generates petabytes of data from instruments and analysis. Over the 2000-2018 timeframe, the JGI has experienced exponential growth in data generation. In 2013 the JGI deployed a hierarchical data management system to handle this data deluge. This system, called the JGI Archive and Metadata Organizer (JAMO), enables JGI staff and scientists to write pipelines that automatically associate the files generated from instruments and analysis pipelines with a rich set of metadata. The JAMO system has saved JGI countless FTE hours that were historically spent trying to locate data on various storage systems for sharing internally or with collaborators. In this talk I will provide a high level overview of the system and how it was deployed at the JGI.
Metadata in Feature Animation Film Production
Scott Miller, Dreamworks (bio)
Visual complexity, audience expectation and competition for eyeballs are increasing. Metadata and analytics are driving efficiency in character and environmental design, overall film design, application implementation, resource scheduling and workflow management to help create even more compelling Feature Animated films than before. This talk provides a brief glimpse into the film making process and how metadata is making a difference.
Massive Scale Metadata Efforts and Solutions (presentation)
Dave Bonnie, Los Alamos National Laboratory (bio)
This talk will provide an overview of scalable metadata management solutions in research, development, and production at Los Alamos National Lab. The talk will target three primary efforts. The Grand Unified File Indexing (GUFI) System is a hybrid indexing capability using both file system trees and embedded SQL to enable a fast and efficient file metadata indexing system that can be used by both system administrators and users due to its unique approach to securing access to the index. Delta-FS is a user space namespace that utilizes concepts from git to allow applications to "check out" namespaces and "merge" namespace changes, enabling namespace operations to scale with the application. The Hexa-Dimensional Hashing Indexing Middleware (HXHIM) system is a user space, linkable, parallel, many dimensional key value store framework that allows applications to plug in their favorite multi-dimensional key value store/database and have hundreds to thousands of copies instantiated in parallel to form a distributed/parallel multi-dimensional indexing capability.
Write-Optimization for Metadata (presentation)
Dr. Rob Johnson, Stony Brook University (bio)
This talk will describe how BetrFS, a file system built from the ground up on write-optimized data structures, uses write-optimization to accelerate file-system metadata operations, such as atime updates and file and directory creations, renames, and deletes. BetrFS offers orders-of-magnitude performance improvement on some of these operations compared to conventional "update-in-place" file systems, without suffering from the fragmentation challenges of log-structured file systems.

The talk will also discuss challenges and opportunities for using write-optimization in file systems to accelerate application-level metadata maintenance.
What is the Right Amount of Metadata?
Kent Blancett, BP (bio)
What is metadata and when do you know you have something useful? We’ll discuss some of the issues regarding metadata, not so much where and how to store it, but what metadata is valuable, when you have enough of it, and what is required to keep it updated. Upstream Oil and Gas is an evolution of educated guesswork, some would say art based on science and math. Of course, we can talk about the trivial metadata such as X and Y coordinates, but that doesn’t come close to being useful. What other information is useful to us now and in the future?
LBNL/NERSC Archival Storage Overview (presentation, video)
Nick Balthaser, LBNL/NERSC (bio)
A brief overview of the LBNL/NERSC archive storage environment. Topics include current hardware and software environment, recent challenges, and future directions.
Short Talks
Attendees and vendors can sign up in advance, or at the conference, to give 5-15 minute
works-in-progress or summary updates on work of interest to conference attendees.
DNA's Niche in the Storage Market (video)
David Rosenthal, Stanford University Libraries, Retired (bio)
DNA has many attractive properties as an archival storage medium, being extremely dense, very stable in shirt-sleeve environments, and very cheap to replicate so Lots Of Copies Keep Stuff Safe. Since 2012, both the popular and technical presses have hyped the various lab demonstrations of writing and reading data using DNA as a medium. How does this technology work? What needs to happen to move it from the labs to the market?
Recent Progress in DNA Memory (video)
Brian Bramlett, Twist Bioscience (bio)
Over the past 6 years, interest in using DNA as a digital storage medium has intensified along with levels of industry and government funding. This talk focuses on programs to accelerate research and development toward a practical, commercial offering.
File Transfer Service at Exabyte Scale (presentation, video)
Dr. Michal Simon, CERN (bio)
The File Transfer Service (FTS) is an open-source, data-movement solution developed at CERN. It has been used for almost four years for the CERN LHC experiments' data distribution in the WLCG infrastructure. During this period, the usage has been extended to non-CERN and non-HEP users, reaching almost an Exabyte of transferred data in 2017. The talk will focus on the service architecture and main features, like the transfer optimizer, the multi protocol support, cloud extensions and monitoring.
Local Data De-duplication in Solid State Drives (video)
Hongmei Xie and Erich F. Haratsch, Seagate Technology
Local data deduplication (LDD) algorithms are proposed to remove duplicated data blocks within a local range. Duplication space locality is measured over a large data set and shows a favorable distribution. Results show that LDD reaches a high deduplication success rate at extremely low cost.
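The general idea of removing duplicate blocks only within a local range can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm presented in the talk: the function name `local_dedup`, the use of SHA-256 fingerprints, and the policy of remembering only the `window` most recently seen unique blocks are all hypothetical choices.

```python
import hashlib
from collections import OrderedDict


def local_dedup(blocks, window=4):
    """Deduplicate blocks against only the `window` most recently seen unique blocks.

    Returns a list of entries: ("data", block) for a block that must be stored,
    or ("ref", index) pointing at the position of an identical earlier block.
    """
    recent = OrderedDict()  # fingerprint -> index of the stored block
    out = []
    for i, block in enumerate(blocks):
        digest = hashlib.sha256(block).digest()
        if digest in recent:
            # Duplicate found inside the local window: store a reference only.
            out.append(("ref", recent[digest]))
            recent.move_to_end(digest)  # keep recently matched blocks in the window
        else:
            out.append(("data", block))
            recent[digest] = i
            if len(recent) > window:
                recent.popitem(last=False)  # forget the oldest fingerprint
    return out


# Hypothetical input: the second b"A" falls inside the window, the third does not.
result = local_dedup([b"A", b"B", b"A", b"C", b"D", b"E", b"A"], window=2)
```

Bounding the fingerprint table keeps memory use small, at the price of missing duplicates that are far apart; this is the kind of trade-off that measuring duplication locality would inform.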


2018 Organizers
Conference Co-Chairs     Dr. Ahmed Amer,  Dr. Sam Coleman
Program Committee     Gary Grider, Dr. Matthew O'Keefe, Arthur Pasquinelli, Gaz Salih
Industry Chair     Arthur Pasquinelli
Communications Chair     Meghan Wingate McClelland
Registration Chair     Yi Fang


Page Updated January 12, 2024