Announcing Apache Arrow DataFusion is now Apache DataFusion

submitted by
Style Pass
2024-05-07 13:30:05

The Arrow PMC and the newly created DataFusion PMC are happy to announce that, as of April 16, 2024, the Apache Arrow DataFusion subproject is now a top-level Apache Software Foundation project.

Apache DataFusion is a fast, extensible query engine for building high-quality data-centric systems in Rust, using the Apache Arrow in-memory format.

When DataFusion was donated to the Apache Software Foundation in 2019, the DataFusion community was not large enough to stand on its own and the Arrow project agreed to help support it. The community has grown significantly since 2019, benefiting immensely from being part of Arrow and following The Apache Way.

The community discussed graduating to a top-level project publicly for almost a year, as the project seemed ready to stand on its own and would benefit from more focused governance. For example, earlier in DataFusion’s life many people contributed to both arrow-rs and DataFusion, but as DataFusion has matured, many contributors, committers and PMC members have focused more and more exclusively on DataFusion.

The future looks bright. There are now tens of known projects built with DataFusion, and that number continues to grow. We recently held our first in-person meetup, passed 5000 stars on GitHub, wrote a paper that was accepted at SIGMOD 2024, and began work on Comet, an Apache Spark accelerator initially donated by Apple.
