Data Masking - Get started
A thorough, well-designed data masking plan maximizes security while preserving business value. Following the best-practice guidelines below helps ensure that masking effectively protects sensitive data.
The responsibilities table details which steps are the customer's responsibility and which are cegedim.cloud's.
Create a catalogue of each data source. Document the following types of information:
Accessed from: onshore, offshore, or both.
Usage: development, QA, testing, training or other types of usage.
Database type: Oracle, MS SQL Server, IMS, DB2 for z or other databases.
Data movement: list of data feeds into the environment.
Frequency of refresh: yearly, quarterly, ad hoc, or other intervals.
Owners: database and application owners.
Risk level: high, medium, or low.
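As an illustration, a catalogue entry covering the fields listed above could be sketched as a small Python record. The class name, field names, and sample values are all hypothetical; any structured format (spreadsheet, CMDB entry, etc.) that captures the same information would serve equally well.

```python
from dataclasses import dataclass, field
from typing import List

ACCESS_LOCATIONS = {"onshore", "offshore", "both"}
RISK_LEVELS = {"high", "medium", "low"}


@dataclass
class DataSourceEntry:
    """One catalogue entry per data source (hypothetical structure)."""
    name: str
    accessed_from: str            # "onshore", "offshore" or "both"
    usage: str                    # e.g. "development", "QA", "training"
    database_type: str            # e.g. "Oracle", "MS SQL Server"
    data_feeds: List[str] = field(default_factory=list)
    refresh_frequency: str = "ad hoc"
    owners: List[str] = field(default_factory=list)
    risk_level: str = "medium"

    def __post_init__(self):
        # Reject values outside the categories used in the catalogue.
        if self.accessed_from not in ACCESS_LOCATIONS:
            raise ValueError(f"unknown access location: {self.accessed_from}")
        if self.risk_level not in RISK_LEVELS:
            raise ValueError(f"unknown risk level: {self.risk_level}")


# Sample entries (invented for the example).
catalogue = [
    DataSourceEntry("crm_test", "both", "QA", "Oracle",
                    data_feeds=["nightly CRM export"],
                    refresh_frequency="quarterly",
                    owners=["DBA team", "CRM application owner"],
                    risk_level="high"),
    DataSourceEntry("training_db", "onshore", "training", "MS SQL Server",
                    risk_level="low"),
]

# High-risk sources are natural candidates to address first.
high_risk = [entry.name for entry in catalogue if entry.risk_level == "high"]
```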
It is also important to define which regulations apply to the customer organization and determine the type of data that is sensitive or confidential to the business.
Research which regulations apply to the masking project (e.g. GDPR, HDS) and ensure that the related obligations have been fulfilled (e.g. informing data subjects, adding the process to the record of processing activities, carrying out a data protection impact assessment (DPIA) if necessary).
Designate someone with good knowledge of the privacy challenges of the project database. This person must have good knowledge of the database to be masked, its dependencies and the integrity constraints within.
This can be, for example, the project manager or the database administrator, provided they have sufficient information about the privacy context.
This person will be the preferred point of contact for completing the expression of needs and the project information file.
Define a list of sensitive or confidential data domains, such as last name, first name, credit card, etc.
Describe the characteristics of each data domain including the probable data type, data sensitivity, descriptions, and data and metadata patterns. This step ensures collaboration between business, security, data governance, and IT.
Identify integrity constraints within the database: constraints such as foreign keys must be identified before the masking job can be performed.
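Foreign keys can usually be listed from the database's own metadata. The sketch below uses SQLite purely as a stand-in, because it ships with Python; Oracle, MS SQL Server and DB2 expose the same information through their own catalog views, and the query would need to be adapted. The table names and the `list_foreign_keys` helper are hypothetical.

```python
import sqlite3


def list_foreign_keys(conn):
    """Return (table, column, referenced_table, referenced_column) tuples."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    foreign_keys = []
    for table in tables:
        # Each PRAGMA row is: id, seq, table, from, to, on_update, on_delete, match
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            foreign_keys.append((table, row[3], row[2], row[4]))
    return foreign_keys


# Tiny example schema (invented for the example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, last_name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customer(id))"
)

constraints = list_foreign_keys(conn)
```

Columns that appear in such a list need consistent masking on both sides of the relationship, otherwise the masked database loses referential integrity.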
Some information is needed before performing the masking jobs:
Type of masking: in-place or in-stream
In-place: one database. Masking is performed on the database.
In-stream: this is the most commonly used masking type. It involves two databases: a "source" database containing the data the customer wants to mask, and a "destination" database. Masking is performed "on the fly" while copying data from the source to the destination database. The destination database must be empty.
Complexity: Simple, Medium or Complex. These categories represent the overall difficulty of implementing the masking project. The complexity depends directly on the complexity of the database. It also takes into account whether the masking specifications require custom settings (masking rules or dictionaries).
Complexity | Number of columns | Masking complexity |
---|---|---|
Simple | < 10 | Only pre-existing masking rules, few different masking rules |
Medium | > 10 and < 20 | |
Complex | > 20 | Specific masking rules, many constraints (foreign keys, etc.), specific dictionaries |
Database type: Oracle, MS SQL Server, IMS, DB2 for z or other databases.
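The difference between the two masking types described above can be sketched as follows. This is a minimal illustration that uses SQLite in-memory databases as stand-ins; the `person` table, the `mask_last_name` rule, and the helper names are all hypothetical and do not reflect how the actual masking tool is implemented.

```python
import sqlite3


def mask_last_name(value):
    # Hypothetical rule: keep the first letter, replace the rest with "x".
    if value is None:
        return None
    return value[:1] + "x" * (len(value) - 1)


def new_db(rows=()):
    """Create an in-memory database with one example table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, last_name TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", rows)
    conn.commit()
    return conn


def mask_in_place(conn):
    """In-place: one database, values are overwritten where they are."""
    rows = conn.execute("SELECT id, last_name FROM person").fetchall()
    conn.executemany(
        "UPDATE person SET last_name = ? WHERE id = ?",
        [(mask_last_name(name), pid) for pid, name in rows],
    )
    conn.commit()


def mask_in_stream(source, destination):
    """In-stream: mask on the fly while copying source -> empty destination."""
    count = destination.execute("SELECT COUNT(*) FROM person").fetchone()[0]
    if count:
        raise ValueError("destination database must be empty")
    for pid, name in source.execute("SELECT id, last_name FROM person"):
        destination.execute(
            "INSERT INTO person (id, last_name) VALUES (?, ?)",
            (pid, mask_last_name(name)),
        )
    destination.commit()


source = new_db([(1, "Martin"), (2, "Durand")])
destination = new_db()
mask_in_stream(source, destination)
```

With in-stream masking the source keeps its original values and only the destination holds masked data; with in-place masking the original values are overwritten.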
To perform the masking, we need to understand how the customer wants the data to be masked.
To do so, the customer needs to define, for each column of each table of the database, which "masking rule" should be applied. One masking rule is applied to one column.
For example, the customer can use the following techniques:
Nullify highly sensitive data
Use non-unique repeatable substitution (based on dictionaries).
Use random masking with a range (for numeric values)
Use special techniques (credit card masking, IP address masking, etc.)
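To make these techniques concrete, here is a minimal Python sketch of one possible function per technique. The function names, the sample dictionary, and the `secret` value are assumptions made for the example; they are not the rules shipped with the masking tool.

```python
import hashlib
import random

# Hypothetical substitution dictionary.
FIRST_NAMES = ["Alice", "Bruno", "Chloe", "David"]


def nullify(value):
    """Nullify: discard the value entirely."""
    return None


def substitute(value, dictionary, secret="demo-secret"):
    """Non-unique repeatable substitution based on a dictionary.

    The same input always maps to the same entry (repeatable), but several
    inputs may share an entry (non-unique).
    """
    digest = hashlib.sha256((secret + value).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(dictionary)
    return dictionary[index]


def random_in_range(low, high, rng=random):
    """Random masking: replace a number with a random one in [low, high]."""
    return rng.randint(low, high)


def mask_credit_card(number):
    """Special technique: hide every digit except the last four."""
    total_digits = sum(ch.isdigit() for ch in number)
    keep_from = total_digits - 4
    seen = 0
    masked = []
    for ch in number:
        if ch.isdigit():
            masked.append(ch if seen >= keep_from else "X")
            seen += 1
        else:
            masked.append(ch)  # keep separators such as spaces or dashes
    return "".join(masked)


masked_card = mask_credit_card("4111 1111 1111 1111")  # "XXXX XXXX XXXX 1111"
masked_name = substitute("Jean", FIRST_NAMES)
masked_age = random_in_range(18, 65)
```

Repeatable substitution is useful when the same value appears in several tables, because every occurrence receives the same masked value.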
Custom masking rules can be defined for the need of complex projects.
Designing masking rules may require detailed explanation (see the column "advanced masking rule specification"), more time, and more exchange of information between both parties.
This information is needed to implement the masking rules. The customer should fill in the attached "specification" document given in appendix 1. It should contain all the information needed to set up the masking rules: tables, columns, a description of the masking the customer wants to perform, the corresponding masking rule to apply, and any specific requests (e.g. custom masking rule, bijection of the masking).
The deliverable is the database containing the masked data. As cegedim.cloud does not have direct access to the database, it is up to the customer to set up validation rules to verify that all sensitive data is masked in the non-production environment according to the desired specifications.
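One simple validation rule is to compare sensitive columns before and after masking and flag any value that is unchanged. The sketch below assumes rows are available as dictionaries in the same order on both sides; the column names and the `find_leaks` helper are hypothetical. A dictionary-based substitution can occasionally map a value to itself, so flagged rows call for review rather than automatic rejection.

```python
def find_leaks(original_rows, masked_rows, sensitive_columns):
    """Return (row_index, column) pairs whose value survived masking unchanged."""
    leaks = []
    for index, (original, masked) in enumerate(zip(original_rows, masked_rows)):
        for column in sensitive_columns:
            value = original[column]
            # A NULL in the source carries no sensitive data to leak.
            if value is not None and masked[column] == value:
                leaks.append((index, column))
    return leaks


# Sample data (invented for the example).
original_rows = [
    {"last_name": "Martin", "card": "4111 1111 1111 1111"},
    {"last_name": "Durand", "card": None},
]
masked_rows = [
    {"last_name": "Mxxxxx", "card": "XXXX XXXX XXXX 1111"},
    {"last_name": "Durand", "card": None},
]

leaks = find_leaks(original_rows, masked_rows, ["last_name", "card"])
```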