Earlier this year, at Flink Forward 2024 Berlin, we announced Fluss, and today we are thrilled to announce that we are open-sourcing the project. Fluss is a streaming storage system designed to power real-time analytics. It changes how organizations approach real-time data by acting as the real-time data layer for the Lakehouse. Its cutting-edge design enables businesses to achieve sub-second latency, high throughput, and cost efficiency for data analytics, making it the ideal solution for modern data-driven applications.
Historically, we have invested significant effort into advancing the data streaming ecosystem, including major contributions to Apache Flink, Apache Flink CDC, and Apache Paimon. As part of that commitment, Fluss is now open source under the Apache 2.0 license and available on GitHub, and we invite users to build the next generation of real-time architectures with it.
The need for real-time insights has grown rapidly, especially with the recent explosion of Artificial Intelligence (AI). However, the tools and architectures we’ve relied on for years weren’t designed with streaming-first analytical workflows in mind. Traditional architectures often involve complex integrations between message queues like Kafka, processing engines like Flink, and storage systems that are oriented more toward batch than real-time workloads. This approach not only increases latency but also adds operational overhead and cost. Fluss addresses this by offering a unified streaming storage layer purpose-built for real-time analytics.