Introduction
Protecting personal and sensitive data, and complying with security and privacy regulations, is high priority for organizations. Inability to protect sensitive user and customer data not only violates regulations but also leads to loss of reputation and hence loss of current and future customers, along with costs to correct a data breach. The average cost of a data breach in 2020 was $3.86 million worldwide and $8.64 million in the U.S. Data privacy regulations such as HIPAA, GDPR, PCI require organizations to ensure that only the people who need to see data can see it and that they only see it when they should. One of the ways to ensure compliance is to redact or mask sensitive data when accessed internally and/or externally.
Data redaction obfuscates all or part of the data, reducing unnecessary exposure of sensitive data while at the same time maintaining its usability. Various terms such as data masking, data obfuscation and data anonymization or de-identification are used to describe this functionality in databases. Data redaction allows an organization to:
- Meet regulations such as HIPAA, PCI, PHI, GDPR
- Protect against insider threats
- Use production data for non-production use cases (e.g., testing and training)
In this blog, we show how to assure privacy of sensitive data stored in Kinetica.
Column Data Security in Kinetica
For the purpose of masking, Kinetica allows users to define data privacy at a column level.
Kinetica allows a user to have full access to the data stored in a column, masked access to the data stored in a column, or not have access to the data stored in a column at all. This is done so that the security permission logic is at a database level and is transparent to applications. For example, in the same application, different levels of data access will be permitted for two different users based on the user’s security policy in the database. If you are familiar with how table permissions work in Kinetica, you will find that column permissions naturally fit in the GRANT construct.
Full read access to all the columns of a table
To grant ability to select all columns from a table to a user,
GRANT SELECT ON <table name> TO <user name | role name>
Example:
GRANT SELECT on employee to user_dave;
User user_dave will be able to see all the columns of the table employee, when the user executes “SELECT * from employee” against the Kinetica database.
name | office | phone | ssn | emp_id |
John | London | 202-555-1212 | 123-45-6789 | 1001 |
Paul | Ballston | 202-555-2222 | 111-11-1111 | 1002 |
George | Liverpool | 202-555-3333 | 222-22-2222 | 1003 |
Ringo | LA | 202-555-4444 | 333-33-3333 | 1004 |
Restricted read access to only certain columns of a table
To grant access only certain columns of a table to a user or a role
GRANT SELECT ON <table name> (column1, column2, …)
TO <user name | role name>
Example:
GRANT SELECT ON employee (name, office, phone) TO manager
Any user with role manager will only be able to see name, office and phone columns when the user executes “SELECT * from employee” against the Kinetica database.
name | office | phone |
John | London | 202-555-1212 |
Paul | Ballston | 202-555-2222 |
George | Liverpool | 202-555-3333 |
Ringo | LA | 202-555-4444 |
Masked read access to certain columns of a table
To grant access to masked data for a column of a table to a user or a role
GRANT SELECT ON <table name>
(MASK(col1, <start>, <length>[,‘<char>’]), col2, col3, ... )
TO <user name | role name>
Example:
GRANT SELECT ON employee
(MASK(ssn,1,6,'x'),name,office, phone) TO hr;
This allows all users in the “hr” role to see the “ssn” column in the “employee” table, but not the first 6 characters, which will be replaced with ‘x’ characters. And for the name, office, and phone column the user will be able to see the actual value. So, when the user executes “SELECT * from employee” against the Kinetica database, the following results will be displayed.
name | office | phone | ssn |
John | London | 202-555-1212 | xxxxxx-6789 |
Paul | Ballston | 202-555-2222 | xxxxxx-1111 |
George | Liverpool | 202-555-3333 | xxxxxx-2222 |
Ringo | LA | 202-555-4444 | xxxxxx-3333 |
If you prefer to work through UI instead of SQL commands, you can go to Advanced Table security to configure the security policy for masking the SSN column as shown below.
Anonymized or Obfuscated read access to certain columns of a table
To grant access to obfuscated data for a column of a table to a user or a role
GRANT SELECT ON <table name>
(OBFUSCATE(column1), column2, column3, ...)
TO <user name | role name>
Example:
GRANT SELECT ON employee_transactions(OBFUSCATE(empId), *) TO user_joel;
This allows the user “user_joel” to see a hashed value of the “empId” column data in the “employee_history” table and the actual value of the data for all other columns. They would not be able to see any actual empId values, but they would be able to see which were made by the same customer.
So when the user executes “SELECT * from employee_transactions” against the Kinetica database, the user will get these results. As you can see empId is anonymized instead of original values 1001 and 1002.
empId | office | transactionId | date | item | qty |
593324567 | London | 10001 | 12/31/2019 | 5 | 20 |
256734568 | Ballston | 10010 | 1/1/2020 | 7 | 30 |
593324567 | London | 10020 | 1/1/2020 | 3 | 8 |
For more details on how to set up the security policy, please see our documentation for Kinetica Column Level Security.
If you would like to allow users to look at only a subset of rows based on user profile – which we call “Row Level Security,” read how to “Manage Table Permission at Row Level” in Kinetica.
To experience Kinetica Streaming data warehouse, try it for yourself or contact us to learn more.
References :
- Kinetica Column Level Security
- Manage Table Permission at Row Level
- What is the cost of a security breach?
- The biggest data breach fines, penalties and settlement so far
Dipti Joshi is director of product management at Kinetica.