

submitted by
Style Pass
2024-10-04 15:30:03

PII Detective is a web application designed to identify, classify, and protect Personally Identifiable Information (PII) in data platforms such as BigQuery and Snowflake. It uses LLMs to identify PII column names and, after human-in-the-loop validation, applies Dynamic Data Masking Policies to enforce Access Control Lists (ACLs) while minimizing user friction.

Dynamic Data Masking is an extremely powerful and user-friendly way to protect sensitive data such as PII. SHA256 hashing lets data scientists work with PII data (filtering, aggregations, relational JOINs, etc.) without having to view the raw values. Data platforms such as BigQuery and Snowflake make data masking very easy to set up. However, knowing where the PII columns are can be a massive challenge, especially if the platform is used heavily across multiple functions in your organization.
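To see why hashed values remain useful for JOINs and aggregations, here is a minimal Python sketch. It is not how BigQuery or Snowflake implement masking internally; the `mask` helper and the sample `users` and `orders` rows are made up for illustration. The point is only that SHA256 is deterministic, so equal inputs produce equal digests and can still be matched.

```python
import hashlib


def mask(value: str) -> str:
    """Return the SHA-256 hex digest of a value (deterministic and one-way)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical sample rows; the email column is the PII we want to hide.
users = [
    {"email": "ana@example.com", "plan": "pro"},
    {"email": "bo@example.com", "plan": "free"},
]
orders = [
    {"email": "ana@example.com", "total": 30},
    {"email": "ana@example.com", "total": 12},
    {"email": "bo@example.com", "total": 5},
]

# What an analyst without raw-PII access would see: digests instead of emails.
masked_users = [{"email": mask(u["email"]), "plan": u["plan"]} for u in users]
masked_orders = [{"email": mask(o["email"]), "total": o["total"]} for o in orders]

# Join on the masked column, then aggregate order totals per plan.
plan_by_email = {u["email"]: u["plan"] for u in masked_users}
totals_by_plan = {}
for order in masked_orders:
    plan = plan_by_email[order["email"]]
    totals_by_plan[plan] = totals_by_plan.get(plan, 0) + order["total"]

print(totals_by_plan)
```

The join succeeds even though no raw email address appears in either masked table.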

For example, GCP has a Sensitive Data Protection service that promises similar functionality, but it can become extremely costly since it runs hundreds of regex queries on the entire contents of the table. By comparison, PII Detective only uses table metadata such as table and column names, so you can detect PII in thousands of tables for less than $5 of OpenAI credits!
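The metadata-only idea can be sketched roughly as follows. This is an illustrative assumption, not PII Detective's actual prompt or code: `build_metadata_prompt`, `parse_flagged_columns`, the table name, and the sample reply are all hypothetical, and no LLM is called here. The sketch only shows that the request carries table and column names, never row contents, and that the model's answer is treated as a list of candidates for human review.

```python
import json


def build_metadata_prompt(table: str, columns: list[str]) -> str:
    """Build a prompt from table metadata only; no row data is included."""
    payload = json.dumps({"table": table, "columns": columns})
    return (
        "Given only this table metadata, list the column names likely to hold PII. "
        "Reply with a JSON array of column names.\n" + payload
    )


def parse_flagged_columns(reply: str, columns: list[str]) -> list[str]:
    """Keep only flagged names that really exist in the table."""
    flagged = json.loads(reply)
    known = set(columns)
    return [name for name in flagged if name in known]


# Hypothetical table metadata.
columns = ["user_id", "email_address", "signup_ts", "phone"]
prompt = build_metadata_prompt("analytics.users", columns)

# Hypothetical model reply; "ssn" is not a real column, so it is dropped.
candidates = parse_flagged_columns('["email_address", "phone", "ssn"]', columns)
print(candidates)
```

Filtering the reply against the known column list guards against hallucinated names before the candidates reach a human reviewer.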
