Community Office Hours

Many organizations use EMR to run big data analytics in the public cloud. However, reading and writing data to S3 directly can result in slow and inconsistent performance. Alluxio is a data orchestration layer for the cloud; in this use case it caches S3 data, providing high and predictable performance as well as reduced network traffic.

In this Office Hour I'll go over:

  • How to set up Alluxio with the EMR stack so that Presto jobs can seamlessly read from and write to S3
  • How the performance of Presto on EMR compares with that of Presto and Alluxio on EMR
  • An open session for discussion of any topic, such as solving the separation of compute and storage problem, and more

Thanks for your interest! This event has concluded.

Running Presto with Alluxio on Amazon EMR

Alex Ma is an open source veteran. Prior to Alluxio, he worked for Couchbase, where he was the Director of Solutions Engineering and Principal Architect. 

Director of Solutions Engineering at Alluxio

Speaker: Alex Ma

Prior to Alluxio, Nakkul worked as a consultant where he built and supported an entirely open source Hadoop platform for financial services clients.

Software Engineer at Alluxio

Speaker: Nakkul Sreenivas

Alluxio is a virtual distributed file system that provides a unified data access layer for hybrid and multi-cloud deployments.

Alluxio resides between storage systems, such as Amazon S3 or Apache HDFS, and computation frameworks and applications, such as Apache Spark or Presto.

With Alluxio, your data is centralized and applications have a single common interface and namespace for data access.

Alluxio is an...