Real Drivers for Data Anonymization

As an enterprise architect and researcher in the cloud computing and security space, I seem to hear of a new data breach every month at one company or another, exposing social security numbers, credit card numbers, or other sensitive information. According to the Ponemon Cost of a Data Breach Report for 2011, the organizational cost of a data breach in 2011 was $5.5 million, and the cost per record was $194. Another study indicated that 95% of records lost included personal information, compared to 1% in 2010, highlighting a significant shift to targeted attacks specifically looking for personally identifiable information (PII).

Protecting this type of information is receiving increasing attention, for example through the Payment Card Industry Data Security Standard (PCI DSS). Globally, and especially in the European Union, stringent data protection laws can carry serious legal and financial consequences for companies in the event of a data breach. The combination of the PCI DSS and EU companies’ concerns about using U.S.-based clouds is catapulting data anonymization from the realm of academia to a critically important aspect of doing business in today’s global, cloud-based computing environments.

Data anonymization is the process of obscuring published data to prevent the identification of key information. This can be accomplished in a number of ways. For example, “shifting” adds a fixed offset to the numerical values, while “truncation” shortens data. Another technique is to add fictitious data records, to obscure patterns and relationships. Data anonymization can help protect sensitive data stored in the cloud. It can also help alleviate some of the potential legal problems encountered by U.S. companies that store data associated with customers living in the EU.
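To make the three techniques just described more concrete, here is a minimal Python sketch. The function names (shift, truncate, add_fictitious) and the sample record layout are assumptions chosen for illustration; they do not come from any particular product, and real anonymization schemes need far more care than this.

```python
import random

def shift(values, offset):
    """Shifting: add the same fixed offset to every numeric value."""
    return [v + offset for v in values]

def truncate(text, keep):
    """Truncation: keep only the first `keep` characters of a value."""
    return text[:keep]

def add_fictitious(records, count, make_fake, rng):
    """Mix `count` fictitious records in among the real ones.

    `make_fake` is a caller-supplied (hypothetical) function that
    builds one fake record from the random generator `rng`.
    """
    mixed = list(records) + [make_fake(rng) for _ in range(count)]
    rng.shuffle(mixed)  # hide which positions hold the real records
    return mixed

# Illustrative usage with made-up data.
salaries = shift([52000, 61000], 1500)      # [53500, 62500]
zip_prefix = truncate("94107-1234", 5)      # "94107"

rng = random.Random()
real = [{"zip": "94107"}, {"zip": "10001"}]
padded = add_fictitious(
    real, 3, lambda r: {"zip": str(r.randint(10000, 99999))}, rng
)
```

Note that a fixed offset is easy to reverse once a single original value is known, which is one reason these techniques are usually combined rather than used alone.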

Tokenization (also called “permutation”) is one technique that can aid in data anonymization. For example, the Intel® Expressway Tokenization Broker (Tokenization Broker) can help protect credit card data. The Tokenization Broker generates secure, fixed-length tokens in place of primary account number data, and can serve as a secure proxy for authorization requests to credit card processors using standards-based interfaces, such as HTTP, SOAP, and WSDL. The Tokenization Broker also provides a method of de-tokenization, that is, mapping the token back to the original value.
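The general idea of tokenization and de-tokenization can be sketched in a few lines of Python. This is only an illustration of the concept, not the Tokenization Broker's interface or implementation: the TokenVault class, its method names, and the 16-digit token length are all assumptions, and an in-memory dictionary stands in for what would need to be a hardened, access-controlled store.

```python
import secrets

class TokenVault:
    """Hypothetical in-memory vault mapping tokens to original values."""

    def __init__(self, token_length=16):
        self._token_length = token_length
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        """Return a random fixed-length numeric token for `value`."""
        if value in self._value_to_token:
            # Reuse the existing token so one value maps to one token.
            return self._value_to_token[value]
        while True:
            token = "".join(
                secrets.choice("0123456789")
                for _ in range(self._token_length)
            )
            # Retry on the rare collision, or if the token equals the value.
            if token not in self._token_to_value and token != value:
                break
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        """Map a token back to the original value."""
        return self._token_to_value[token]

# Illustrative usage with a well-known test card number.
vault = TokenVault()
token = vault.tokenize("4111111111111111")
original = vault.detokenize(token)
```

Because the token is random rather than derived from the account number, someone who sees only the token cannot compute the original value; recovering it requires access to the vault.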

Intel is developing a hybrid cloud usage model, in which we use a combination of our enterprise private cloud and secure external clouds. As part of that effort, Intel IT is actively exploring data anonymization to see how it may enhance the security of Intel’s data in the public cloud while still allowing the data to be analyzed and used. For a detailed discussion of our investigation and results, read the recently published IT@Intel white paper, “Enhancing Cloud Security Using Data Anonymization.”