Community Office Hours

In this Office Hour you'll learn about:

  • Using Alluxio as the input and output for Spark applications
  • Saving and loading Spark RDDs and DataFrames with Alluxio
  • Open discussion on any topic, such as the challenges of separating compute and storage, unifying multiple storage systems, and more

Thanks for your interest! This event has concluded.

Running Apache Spark with Alluxio for Fast Data Analytics

Speaker: Bin Fan

Bin Fan is a founding engineer of Alluxio, Inc. and a PMC member of the Alluxio open source project. Prior to Alluxio, he worked at Google, where he won the Technical Infrastructure Award. Bin received his Ph.D. in Computer Science from Carnegie Mellon University, where he worked on distributed systems.

Evangelist and Founding Member at Alluxio, a virtual distributed file system that provides a unified data access layer for hybrid and multi-cloud deployments.

Alluxio sits between storage systems such as Amazon S3 or Apache HDFS and computation frameworks and applications such as Apache Spark or Presto.

With Alluxio, your data is centralized and applications have a single common interface and namespace for data access.
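
As a rough illustration of that single interface, the sketch below shows how a PySpark job might address data through `alluxio://` URIs when loading and saving RDDs and DataFrames. It is a minimal sketch under stated assumptions, not the material presented in the session: the helper names `alluxio_uri` and `round_trip`, the host `localhost`, and all paths are hypothetical. It assumes a running Alluxio master on its default RPC port (19998) and the Alluxio client jar on Spark's classpath.

```python
# Hypothetical helper names; host and paths are placeholders for illustration.
ALLUXIO_DEFAULT_PORT = 19998  # default Alluxio master RPC port


def alluxio_uri(path, host="localhost", port=ALLUXIO_DEFAULT_PORT):
    """Build an alluxio:// URI for a path in the Alluxio namespace."""
    return f"alluxio://{host}:{port}/{path.lstrip('/')}"


def round_trip(spark, in_path, rdd_out, df_out):
    """Read text through Alluxio, save it as an RDD and as a DataFrame,
    then load the DataFrame back.

    `spark` is an existing pyspark.sql.SparkSession (assumed to be
    configured with the Alluxio client jar).
    """
    sc = spark.sparkContext

    # RDD path: load lines from Alluxio and save them back as text files.
    # saveAsTextFile fails if the output directory already exists.
    lines = sc.textFile(alluxio_uri(in_path))
    lines.saveAsTextFile(alluxio_uri(rdd_out))

    # DataFrame path: read the same input and write it out as Parquet.
    df = spark.read.text(alluxio_uri(in_path))
    df.write.parquet(alluxio_uri(df_out))

    # Load the saved DataFrame again through the same namespace.
    return spark.read.parquet(alluxio_uri(df_out))
```

With a `SparkSession` named `spark`, calling `round_trip(spark, "/Input", "/OutputRDD", "/OutputDF")` would exercise both paths. The only Alluxio-specific part of the job is the URI scheme, which is the point of a common interface: the Spark calls themselves are the same ones used for any other file system.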

Alluxio is an...