Is Your Production Data for Testing GDPR Compliant?

The General Data Protection Regulation (GDPR) is a European regulation. In the Netherlands, the GDPR replaces the Personal Data Protection Act. It has become mandatory for companies to closely monitor the security of their email traffic. The GDPR protects personal data regardless of the technology used to process and store it; in all cases, personal data is subject to the protection requirements set out in the GDPR.

You must ensure that the personal details of your employees and customers remain secure at all times.

Can I use production data as test data and be compliant with the GDPR?

The short answer is no. In our latest webinar, guest speaker Arie van der Deijl from Aareon conveyed the importance of production data and GDPR compliance. The GDPR applies to all organizations dealing with sensitive data, regardless of whether this is personal data, production data, or both. When production data is duplicated to a non-production testing environment, organizations must be able to ensure that this data remains secure and GDPR compliant while they improve their internal processes and efficiencies.

A company may note that its terms and conditions state that data is used for testing purposes only. However, this does not justify the use and exposure of that data. Under the GDPR, anything that falls into the category of personal data must still be protected.

What is personal data?

Personal data is any information that relates to an identified or identifiable living individual. Different pieces of information which, collected together, can lead to the identification of a particular person also constitute personal data. A few examples of personal data are first names, surnames, and home addresses.

Personal data that has been de-identified, encrypted or ‘pseudonymized’ but can still be used to re-identify a person remains personal data and falls within the scope of the GDPR. By contrast, personal data that has been rendered anonymous in such a way that the individual is not or no longer identifiable is no longer considered personal data. For data to be considered truly anonymized in accordance with the GDPR, the anonymization must be irreversible.
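
To make the difference concrete, here is a minimal Python sketch of why a pseudonym can remain re-identifiable. It assumes, purely for illustration, that email addresses were replaced with unsalted SHA-256 hashes; the function names and the sample addresses are hypothetical and do not come from the webinar.

```python
import hashlib
from typing import List, Optional


def pseudonymize(email: str) -> str:
    """Replace an email address with its unsalted SHA-256 digest."""
    return hashlib.sha256(email.lower().encode("utf-8")).hexdigest()


def reidentify(pseudonym: str, candidates: List[str]) -> Optional[str]:
    """Hash each known candidate and return the one that matches, if any."""
    for candidate in candidates:
        if pseudonymize(candidate) == pseudonym:
            return candidate
    return None


# Hypothetical data for illustration only.
token = pseudonymize("jan.devries@example.com")
known_emails = ["p.jansen@example.com", "jan.devries@example.com"]

# Anyone holding a list of plausible addresses can reverse the mapping.
recovered = reidentify(token, known_emails)
```

Because the same input always produces the same hash, anyone with a list of candidate values can link the pseudonym back to a person. In this sketch the data is pseudonymized, not anonymized.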

What can we do about it? How can we use test data and be compliant with the GDPR?

To allow data to be used in testing or training, it must first be anonymized irreversibly. This is known as ‘Data Obfuscation’ (DO): a form of data masking in which data is purposely scrambled to prevent unauthorized access to sensitive material. The result is unintelligible or confusing data. Masking is the primary means of data obfuscation. It is the process of scrambling, blurring, or replacing existing data with data of approximately the same length and format.
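
As a rough illustration of replacing data with data of the same length and format, the Python sketch below swaps every letter for a random letter and every digit for a random digit, leaving spaces and punctuation in place. The function name `mask_preserving_format` and the sample value are assumptions made for this example; a real masking tool would typically add further safeguards.

```python
import random
import string


def mask_preserving_format(value: str, rng: random.Random) -> str:
    """Replace letters and digits with random ones, keeping length and layout."""
    masked = []
    for ch in value:
        if ch.isdigit():
            masked.append(rng.choice(string.digits))
        elif ch.isalpha():
            # Keep the case so the masked value still looks like the original.
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            masked.append(rng.choice(pool))
        else:
            # Spaces and punctuation carry the format, so they stay as-is.
            masked.append(ch)
    return "".join(masked)


# Hypothetical account number, used for illustration only.
original = "NL91 ABNA 0417 1643 00"
masked = mask_preserving_format(original, random.Random())
```

The masked value has the same length and the same letter, digit, and space positions as the original, so test systems that validate the layout keep working while the real value is gone.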

Data masking

Data masking is an important technique for creating a structurally similar but inauthentic version of a company’s data, which can be used for purposes such as user training and software testing. The main aim is to protect the original data by providing a functional substitute for situations in which the real data is not necessary.


Source: Arie van der Deijl’s (Aareon) Presentation


Advantages and disadvantages of masking data

Numerous data masking tools have been developed to help organizations comply with regulations while continuing to develop and improve their own work processes. Data masking can be carried out in a number of ways. Each method has its own pros and cons, and each is usually best suited to a certain data type.


Data masking methods

We’ve compiled a list of different data masking methods and their strengths and weaknesses below:

  • Substitution – randomly substituting the contents of a column of data with similar-looking but completely unrelated data.
    • + effectively preserves the look and feel of existing data.
    • - too cumbersome when dealing with vast amounts of data, as it may be too difficult to find sufficiently large quantities of relevant substitute data.

  • Shuffling – like Substitution, but in this instance the substitute data is taken from the column itself. The data in a column is randomly shuffled between rows until it no longer correlates with the remaining information in the row.
    • + effectively preserves the look and feel of existing data and deals quickly and efficiently with large amounts of data.
    • - ineffective when dealing with small amounts of data. Also, since the original data is still present, it may be “unshuffled” if the algorithm used was not sufficiently sophisticated.

  • Number and Date Variance – each number or date value in a column is algorithmically modified by some random percentage of its real value.
    • + data masking tools can reasonably mask numeric data while keeping the range and distribution of values within existing limits.
    • - only applicable to numeric data.

  • Encryption – data is algorithmically scrambled and only those with access to the appropriate key can view the encrypted data.
    • + masks data.
    • - encryption destroys the formatting as well as the look and feel of the data, so it is easy to see when data has been encrypted. Also, with enough effort, almost any encryption can be broken. In addition, in test or development databases anyone with the appropriate key can access the data, which renders the encryption useless.

  • Nulling Out/Truncating/Deletion – removal of the sensitive data.
    • + useful in circumstances where the data is not required.
    • - not appropriate for test database environments, where data or at least a realistic approximation of the data is required by the test teams.
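
The methods above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular masking tool works: the sample rows, the `FAKE_NAMES` pool, the function names, and the 10 percent and 30 day limits are all hypothetical. Encryption is left out because it would need a cryptography library.

```python
import random
from datetime import date, timedelta

# Hypothetical sample rows; every value is made up for illustration.
rows = [
    {"name": "Anna Bakker", "city": "Utrecht", "salary": 4200.0,
     "hired": date(2019, 3, 1), "phone": "0612345678"},
    {"name": "Tom Visser", "city": "Rotterdam", "salary": 3900.0,
     "hired": date(2020, 7, 15), "phone": "0687654321"},
    {"name": "Sara Smit", "city": "Groningen", "salary": 5100.0,
     "hired": date(2018, 11, 5), "phone": "0611223344"},
]
FAKE_NAMES = ["Alex Jansen", "Kim de Boer", "Robin Mulder", "Sam Peters"]

# Fixed seed only so the sketch is repeatable; do not fix it in real use.
rng = random.Random(42)


def substitute(values, replacements, rng):
    """Substitution: swap each value for a random, unrelated look-alike."""
    return [rng.choice(replacements) for _ in values]


def shuffle_column(values, rng):
    """Shuffling: redistribute the column's own values across the rows."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def vary_number(value, max_pct, rng):
    """Number variance: shift a value by a random percentage within +/- max_pct."""
    factor = 1 + rng.uniform(-max_pct, max_pct) / 100
    return round(value * factor, 2)


def vary_date(value, max_days, rng):
    """Date variance: shift a date by a random number of days within +/- max_days."""
    return value + timedelta(days=rng.randint(-max_days, max_days))


names = substitute([r["name"] for r in rows], FAKE_NAMES, rng)
cities = shuffle_column([r["city"] for r in rows], rng)

masked_rows = [
    {
        "name": name,
        "city": city,
        "salary": vary_number(row["salary"], 10, rng),
        "hired": vary_date(row["hired"], 30, rng),
        "phone": None,  # Nulling out: the test does not need this field.
    }
    for row, name, city in zip(rows, names, cities)
]
```

With only three rows, a shuffle can easily leave a value in its original row, which illustrates the point above that shuffling is ineffective for small amounts of data.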

Adhering to old and new regulations still causes many problems in organizations today. Many organizations are unaware of data laws, while many continue to push their luck and run the risk of receiving heavy fines as a result.

Smartlockr takes this worry away by completely unburdening you, so that you as an organization can focus on what you are good at: your core business.
