Data Masking

Description

The Data Masking solution transforms sensitive data contained in databases into less sensitive data according to your own specification and needs.

The solution allows companies to keep using this data while reducing the impact of a data breach. It helps address security and regulatory requirements (e.g. GDPR).

A large number of masking techniques and pseudonymisation parameters can be applied while keeping the data consistent.

The solution is adaptable and covers a wide variety of database types, such as Oracle, DB2, SQL Server, MySQL, MongoDB, Sybase and Teradata.

Data Masking Service

To benefit from the service, please contact your TAM/SDM to evaluate the complexity of the database to be pseudonymised and to obtain the related quotation.

Data Masking Service is built on top of TDM (Test Data Management).

Concepts

Dictionaries

A dictionary is a flat file or relational table that contains substitute data and a serial number. Dictionaries can be used to replace sensitive data in a table.

Various dictionaries already exist in TDM (first name, last name, country names, job position, etc.). Dictionaries can be used in Masking Rules in order to create substitution masking rules.

Custom dictionaries can be imported or created:

  • If results need to contain specific data. Example: for first names, only a subset of the name list; for countries, only EU countries.

  • If the existing dictionaries do not fit a project's specifications.
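
As an illustration only, a dictionary can be pictured as a list of (serial number, substitute value) pairs, and a custom dictionary as a filtered copy of it. The sketch below is not TDM's import format; the names `COUNTRIES`, `EU_COUNTRIES` and `build_custom_dictionary` are hypothetical, and the country subset is deliberately partial.

```python
# Hypothetical in-memory dictionary: (serial number, substitute value) pairs.
COUNTRIES = [
    (1, "France"),
    (2, "Japan"),
    (3, "Germany"),
    (4, "Brazil"),
    (5, "Spain"),
]

# Illustrative subset only, not a complete list of EU member states.
EU_COUNTRIES = {"France", "Germany", "Spain"}


def build_custom_dictionary(entries, allowed):
    """Keep only the allowed values and renumber serial numbers from 1."""
    kept = [value for _, value in entries if value in allowed]
    return [(serial, value) for serial, value in enumerate(kept, start=1)]


eu_dictionary = build_custom_dictionary(COUNTRIES, EU_COUNTRIES)
print(eu_dictionary)
```

Renumbering the serial numbers keeps them contiguous, which is an assumption of this sketch rather than a documented TDM requirement.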

Masking rules

Masking rules are tools used to mask data. Different types of masking rules can be defined and many masking rules exist by default.

Examples of masking rule properties:

  • Repeatable output: returns the same (deterministic) value whenever the source value and the seed are the same. The result depends on the seed, which can be changed at will.

  • Unique substitution data: returns a unique value for each unique entry and seed.

  • Dictionary: dictionary used in the rule

  • Masked value: determines the value that will be masked in the database (that is, the column to which the rule is applied)

  • Lookup Column: for more advanced rules, defines a lookup condition that can be used to trigger conditional results

  • Serial Number column: id used by the application (transparent for the user)
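
The repeatable output property can be sketched as follows. This is a minimal illustration, assuming a keyed hash is used to pick the dictionary entry; it is not a description of how TDM implements the property, and `FIRST_NAMES` and `substitute` are hypothetical names.

```python
import hashlib
import hmac

# Hypothetical substitution dictionary.
FIRST_NAMES = ["Alice", "Bruno", "Chloe", "David", "Emma", "Farid"]


def substitute(value, dictionary, seed):
    """Pick a dictionary entry from a keyed hash of the source value.

    The same (value, seed) pair always yields the same entry.
    Changing the seed may yield a different entry.
    """
    digest = hmac.new(seed.encode(), value.encode(), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(dictionary)
    return dictionary[index]


first = substitute("John", FIRST_NAMES, seed="seed-A")
again = substitute("John", FIRST_NAMES, seed="seed-A")
print(first == again)  # repeatable for the same source value and seed
```

This sketch does not guarantee unique substitution: two different source values can map to the same dictionary entry, so uniqueness would need an additional mechanism.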

Advanced custom masking rules can be created if needed.

Examples:

  • Masking rule based on several entries. Example: first name + last name concatenation

  • Advanced masking rule combining several masking rules. Example: masking a field using different dictionaries depending on the content of another field in the database.
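
The two examples above can be sketched as a single row-level function. All names here (`CITIES_BY_COUNTRY`, `FALLBACK_CITIES`, `FULL_NAMES`, `pick`, `mask_row`) and the column names are assumptions made for illustration; the actual rule definition in TDM is done through the application, not through code like this.

```python
import hashlib

# Hypothetical dictionaries, keyed by the value of the lookup column.
CITIES_BY_COUNTRY = {
    "FR": ["Lyon", "Nantes", "Lille"],
    "DE": ["Bonn", "Kiel", "Jena"],
}
FALLBACK_CITIES = ["Springfield", "Riverside"]
FULL_NAMES = ["Ana Silva", "Tom Berg", "Lea Roux", "Omar Haddad"]


def pick(source, dictionary, seed):
    """Choose a dictionary entry deterministically from source and seed."""
    digest = hashlib.sha256(f"{seed}|{source}".encode()).digest()
    return dictionary[int.from_bytes(digest[:8], "big") % len(dictionary)]


def mask_row(row, seed="demo-seed"):
    masked = dict(row)
    # Rule based on several entries: the input is first name + last name.
    full_name = f"{row['first_name']} {row['last_name']}"
    new_first, new_last = pick(full_name, FULL_NAMES, seed).split(" ")
    masked["first_name"], masked["last_name"] = new_first, new_last
    # Conditional rule: the dictionary depends on another column (country).
    cities = CITIES_BY_COUNTRY.get(row["country"], FALLBACK_CITIES)
    masked["city"] = pick(row["city"], cities, seed)
    return masked


row = {"first_name": "Jean", "last_name": "Dupont", "country": "FR", "city": "Paris"}
print(mask_row(row))
```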

Below is a non-exhaustive list of masking techniques:

  • Substitution: Replaces a column of data with similar data from a dictionary

  • Randomisation: Produces random, non-repeatable results, even for the same source data and masking rule

  • Blurring: Returns a random value that is close to the original value

  • Key: Produces deterministic results for the same source data, masking rule and seed value

  • Expression: Applies an expression to the data and returns the masked or changed data. Example: concatenation

  • Nullification: Replaces a column of data with null values

  • Encryption: Transforms data into unintelligible data using a cryptographic algorithm and a defined key

  • Credit Card: Applies a built-in technique to disguise credit card numbers

  • IP Address: Applies a built-in technique to disguise IP addresses

  • Phone: Applies a built-in technique to disguise phone numbers

  • Email: Applies a built-in technique to disguise email addresses

  • Shuffle: Randomly reassigns sensitive column values from one row to another row in the same table

  • Advanced: Applies a customisable masking technique to multiple input and output columns
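
A few of these techniques can be sketched in a handful of lines. The functions below are simplified illustrations written for this page, not the built-in TDM techniques; in particular, `mask_card_number` simply hides all but the last four digits, which is an assumption about what "disguising" a card number might look like.

```python
import random


def nullify(value):
    """Nullification: replace any value with null (None in Python)."""
    return None


def blur(value, spread, rng):
    """Blurring: return a random number within +/- spread of the original.

    spread is a fraction of the original value, e.g. 0.1 for 10 percent.
    """
    delta = abs(value) * spread
    return rng.uniform(value - delta, value + delta)


def shuffle_column(rows, column, rng):
    """Shuffle: redistribute one column's values across the rows of a table."""
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


def mask_card_number(number):
    """Keep the last four digits, replace the others with X, keep separators."""
    total_digits = sum(ch.isdigit() for ch in number)
    seen = 0
    out = []
    for ch in number:
        if ch.isdigit():
            seen += 1
            out.append(ch if seen > total_digits - 4 else "X")
        else:
            out.append(ch)
    return "".join(out)


rng = random.Random(42)  # fixed seed so the demo is reproducible
people = [{"id": 1, "salary": 3000}, {"id": 2, "salary": 4200}, {"id": 3, "salary": 5100}]
print(shuffle_column(people, "salary", rng))
print(blur(3000, 0.1, rng))
print(mask_card_number("4111-1111-1111-1234"))
```

Passing the random generator explicitly makes the difference between randomised and repeatable behaviour visible: a fresh unseeded generator gives different results on every run, while a fixed seed gives the same ones.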

Projects and jobs

A project is an entity that defines the link between masking rules and databases.

Project connections: Each project is associated with one database (for in-place masking) or several databases (for in-stream masking) through connections (e.g. an Oracle JDBC connection string).

Rule mapping: Once created, rules can be mapped to database entries in "Projects". For each column that needs to be masked, a masking rule is added.

Once all mapping attributes have been entered according to the client's needs, the project can be launched.

Job execution: Each project can be executed to perform the masking. The masking operation is called a "job". During a job, the TDM application masks the database "on the fly". The database is not stored locally on the worker node, nor is it retained in the application.
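
Conceptually, a rule mapping associates each column to mask with a rule, and a job applies that mapping row by row without keeping the data. The sketch below illustrates the idea with a generator; `run_job`, `mapping` and the column names are hypothetical and do not reflect the TDM project format.

```python
def run_job(rows, mapping):
    """Yield masked rows one at a time; the job itself accumulates nothing."""
    for row in rows:
        masked = dict(row)
        for column, rule in mapping.items():
            masked[column] = rule(row[column])
        yield masked


# Hypothetical mapping: one masking rule per column that needs masking.
mapping = {
    "email": lambda value: "masked@example.invalid",
    "ssn": lambda value: None,
}

source_rows = [
    {"id": 1, "email": "a@corp.test", "ssn": "123-45-6789"},
    {"id": 2, "email": "b@corp.test", "ssn": "987-65-4321"},
]

for masked_row in run_job(source_rows, mapping):
    print(masked_row)
```

Columns without an entry in the mapping, such as `id` here, pass through unchanged.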

Nodes

A node is the virtual machine on which a job is performed. One node is created for each client. The performance required of the node depends on the complexity of the masking job.

In-Stream or In-Place

Two types of masking can be performed, depending on the databases used:

  • In-Stream: This is the most commonly used masking type. It involves two databases: a "source" database containing the data you want to mask, and a destination database, which must be empty. Masking is performed "on the fly" while the data is copied from the source to the destination.

  • In-Place: This involves a single database. Masking is performed directly on that database.
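
The difference between the two modes can be illustrated with two in-memory SQLite databases. This is a sketch under simplifying assumptions: the `customers` table, the `mask_email` rule and both functions are invented for the example, and a real job would go through the project connections rather than `sqlite3`.

```python
import sqlite3

SCHEMA = "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)"


def mask_email(row_id):
    """Placeholder rule: build a synthetic address from the row id."""
    return f"user{row_id}@example.invalid"


def make_source():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "alice@corp.test"), (2, "bob@corp.test")],
    )
    conn.commit()
    return conn


def mask_in_stream(source, destination):
    """Copy rows to an empty destination, masking them on the fly."""
    destination.execute(SCHEMA)
    for row_id, _email in source.execute("SELECT id, email FROM customers"):
        destination.execute(
            "INSERT INTO customers VALUES (?, ?)", (row_id, mask_email(row_id))
        )
    destination.commit()


def mask_in_place(database):
    """Overwrite the sensitive column directly in the single database."""
    ids = [row[0] for row in database.execute("SELECT id FROM customers")]
    database.executemany(
        "UPDATE customers SET email = ? WHERE id = ?",
        [(mask_email(row_id), row_id) for row_id in ids],
    )
    database.commit()


source = make_source()
destination = sqlite3.connect(":memory:")
mask_in_stream(source, destination)
print(destination.execute("SELECT * FROM customers ORDER BY id").fetchall())
```

In the in-stream case the source keeps its original values, whereas `mask_in_place` modifies the only copy of the data.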
