Session: 2 for 1: Real Time Architectures Real Fast / Fueling Fast Data: Apache Gluten and the Next Gen of Spark Performance

Real Time Architectures Real Fast – Hugh Evans

Real-time data is data that is available almost the instant it is created, often in less than a second. Building a real-time data architecture can take much, much longer. Data engineers use their knowledge of streaming, database engineering, caching, and DevOps to build infrastructure that gets data where it needs to be, when it needs to be there. But what if you want to build for real time without having to go deep on all those topics first?

In this talk, Hugh will walk through an example of a very simple real-time data architecture, built entirely from OSS projects, that tracks aircraft in real time and can be adapted to fit many common real-time use cases.

Fueling Fast Data: Apache Gluten and the Next Gen of Spark Performance – Binwei Yang & Chengcheng Jin

Apache Gluten (incubator) is an emerging open-source project in the Apache software ecosystem. It’s designed to enhance the performance and scalability of data processing frameworks such as Apache Spark. By leveraging cutting-edge technologies like vectorized execution, columnar data formats, and advanced memory management techniques, Apache Gluten aims to deliver significant improvements in data processing speed and efficiency.

The primary goal of Apache Gluten is to address the ever-growing demand for real-time data analytics and large-scale data processing. It achieves this by optimizing the execution of complex data processing tasks and reducing the overall resource consumption. As a result, organizations can process massive datasets more quickly and cost-effectively, enabling them to gain valuable insights and make data-driven decisions faster than ever before.

Presenters: