Introducing Databricks Unity Catalog: Fine-grained Governance for Data and AI on the Lakehouse

Submitted by
Style Pass
2021-05-27 08:30:04

Data lake systems such as S3, ADLS, and GCS store the majority of data in today’s enterprises thanks to their scalability, low cost, and open interfaces. Over time, these systems have also become an attractive place to process data thanks to lakehouse technologies such as Delta Lake that enable ACID transactions and fast queries. However, one area where data lakes have remained harder to manage than traditional databases is governance; so far, these systems have only offered tools to manage permissions at the file level (e.g. S3 and ADLS ACLs), using cloud-specific concepts like IAM roles that are unfamiliar to most data professionals.

That’s why we’re thrilled to announce Unity Catalog, which brings fine-grained governance and security to lakehouse data using a familiar, open interface. Unity Catalog lets organizations manage fine-grained data permissions using standard ANSI SQL or a simple UI, enabling them to safely open their lakehouse for broad internal consumption. It works uniformly across clouds and data types. Finally, it goes beyond managing tables to govern other types of data assets, such as ML models and files. Thus, enterprises get a simple way to govern all their data and AI assets.
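To give a feel for what table-level permissions in standard SQL look like compared with file-level ACLs, here is a minimal sketch in Python that only builds an ANSI-style GRANT statement as a string. The helper names (`quote_ident`, `build_grant`), the table `sales.orders`, and the group `analysts` are hypothetical and chosen for illustration; the exact syntax Unity Catalog accepts may differ, and nothing here connects to a real catalog.

```python
# Illustrative sketch only: the names below are assumptions, not a Unity Catalog API.
ALLOWED_PRIVILEGES = {"SELECT", "INSERT", "UPDATE", "DELETE"}


def quote_ident(name: str) -> str:
    """Quote an identifier with ANSI double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_grant(privilege: str, table: str, grantee: str) -> str:
    """Build an ANSI-style GRANT statement for one privilege on one table."""
    priv = privilege.upper()
    if priv not in ALLOWED_PRIVILEGES:
        raise ValueError(f"unsupported privilege: {privilege}")
    # Quote each part of a dotted name (e.g. schema.table) separately.
    qualified = ".".join(quote_ident(part) for part in table.split("."))
    return f"GRANT {priv} ON TABLE {qualified} TO {quote_ident(grantee)}"


if __name__ == "__main__":
    # Hypothetical table and group names.
    print(build_grant("select", "sales.orders", "analysts"))
```

The point of the sketch is the shape of the statement: the permission names a table and a group of users, rather than a storage path and a cloud-specific role.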

Although all cloud storage systems (e.g. S3, ADLS and GCS) offer security controls today, these tools are file-oriented and cloud-specific, both of which cause problems as organizations scale up. We’ve often seen customers run into four problems:
