IEEE Computer Society MSST

Sponsored by the IEEE Mass Storage Systems
Technical Committee
IEEE Computer Society

April 16–20, 2012
Asilomar Conference Grounds
Pacific Grove, CA


MSST 2012 Speaker

Matt Foley, Hortonworks

Tutorial Abstract: Petabyte-scale Data with Apache HDFS

A practical overview of how to configure a multi-hundred-petabyte storage system with Apache Hadoop's HDFS. We will briefly review the HDFS architecture, then discuss JBOD versus RAID, rack topology awareness, federated metadata servers, and how to calculate the probability of data loss.

MSST Abstract: High Availability HDFS

For years, Apache Hadoop's HDFS metadata services appeared to have a single point of failure. We will consider in what ways this concern was and was not valid, then look at the new "HA" feature in HDFS v23, which seeks to resolve it completely. We will also briefly cover other features in the new release.



Member of Technical Staff at Hortonworks, Inc., focused on enabling and expanding the Hadoop ecosystem. "As leader of infrastructure engineering at Yahoo! spin-out Hortonworks, and as an Apache Hadoop Committer and PMC member, I am working with my team to make Hadoop bigger, better, and easier to install, manage, and use." Previously at Yahoo!, working on HDFS and Y!Mail. Over 20 years in the industry. Computer Engineering MS, Santa Clara University, 1991; Education MA, Stanford, 1982; and Physics BS, Stanford, 1981.


Page Updated January 12, 2024