Today, we announced the launch of the Databricks Feature Store, the first of its kind that has been co-designed with Delta Lake and MLflow to accelera

Databricks Announces the First Feature Store Co-designed with a Data and MLOps Platform

submited by
Style Pass
2021-05-29 21:00:16

Today, we announced the launch of the Databricks Feature Store, the first of its kind that has been co-designed with Delta Lake and MLflow to accelerate ML deployments. It inherits all of the benefits from Delta Lake, most importantly: data stored in an open format, built-in versioning and automated lineage tracking to facilitate feature discovery. By packaging up feature information with the MLflow model format, it provides lineage information from features to models, which facilitates end-to-end governance and model retraining when data changes. At model deployment, the models look up features from the Feature Store directly, significantly simplifying the process of deploying new models and features.

Raw data (transaction logs, click history, images, text, etc.) cannot be used directly for machine learning (ML). Data engineers, data scientists and ML engineers spend an inordinately large amount of time transforming data from its raw form into the final “features” that can be consumed by ML models. This process is also called “feature engineering” and can involve anything from aggregating data (e.g. number of purchases for a user in a given time window) to complex features that are the result of ML algorithms (e.g. word embeddings).

This interdependency of data transformations and ML algorithms poses major challenges for the development and deployment of ML models.

Leave a Comment