Tech Firms Can Easily Identify You Using Anonymized Data On Yourself

Chitanis - Aug 27, 2019



Researchers have recently pointed out that even if your personal information has been anonymized, advanced technology can still identify you.

Just by living in the modern world, you hand a lot of personal information to services and institutions. Many of them promise to keep your data as private and secure as possible, yet they often share anonymized versions of it with third parties, either for profit or for research. New research shows that anonymized data isn’t so anonymous.

Anonymized data is not as anonymous as you think

Researchers at Imperial College London recently published a paper titled “Estimating the success of re-identifications in incomplete datasets using generative models,” which shows that the techniques currently used to anonymize datasets are insufficient. Before sharing a dataset, companies delete directly identifying information (names, e-mail addresses, etc.). But even with those identifiers removed, it isn’t difficult to match the attributes that remain against what is already known about a person and work out, with high accuracy, whose record it is.

The researchers analyzed 210 datasets drawn from five sources, including the US government, with information on more than 11 million individuals. According to the study, using a machine learning model and 15 identifying attributes (gender, birth date, age, marital status, ZIP code, etc.), the researchers could re-identify up to 99.98% of people in an anonymized dataset. The researchers describe the work as proposing and validating “a statistical model to quantify the likelihood for a re-identification attempt to be successful, even if the disclosed dataset is heavily incomplete.”
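To see why adding attributes matters, here is a minimal sketch in Python. It is not the study's generative model; it only counts how many records in a small, invented table are the sole record with their combination of attributes. The records, the attribute names and the `unique_fraction` helper are all assumptions made for illustration.

```python
from collections import Counter

# Toy records invented for illustration; no names or e-mail addresses are present.
RECORDS = [
    {"gender": "F", "zip": "90001", "birth_year": 1985, "marital": "single"},
    {"gender": "F", "zip": "90001", "birth_year": 1985, "marital": "married"},
    {"gender": "M", "zip": "90001", "birth_year": 1985, "marital": "single"},
    {"gender": "M", "zip": "90002", "birth_year": 1990, "marital": "single"},
    {"gender": "M", "zip": "90002", "birth_year": 1990, "marital": "single"},
    {"gender": "F", "zip": "90002", "birth_year": 1972, "marital": "married"},
]


def unique_fraction(records, attrs):
    """Share of records whose combination of `attrs` appears exactly once."""
    keys = [tuple(record[attr] for attr in attrs) for record in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(records)


if __name__ == "__main__":
    all_attrs = ["gender", "zip", "birth_year", "marital"]
    # Grow the attribute list one column at a time and report uniqueness.
    for n in range(1, len(all_attrs) + 1):
        attrs = all_attrs[:n]
        print(attrs, round(unique_fraction(RECORDS, attrs), 2))
```

In this toy table nobody is unique by gender alone, while two thirds of the records are unique once all four attributes are combined. Real datasets are far larger, but the same effect is what makes a list of 15 attributes so revealing.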

Current anonymization techniques are insufficient

The study offers a hypothetical scenario: a health insurance company publishes a dataset of 1,000 anonymous customers, 1% of its total customers in California. The dataset includes each person’s ZIP code, gender, date of birth and whether they have been diagnosed with breast cancer. An employer finds a record of a man with the same date of birth and ZIP code as one of its employees; according to the dataset, he has breast cancer and his stage IV treatment did not succeed. The insurer can still argue that, even though the employer’s information matches this one record, the record could belong to anyone else among the tens of thousands of people it insures.
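The insurer's argument can be put to a rough test with a back-of-the-envelope calculation. The Python sketch below is a simplification, not the statistical model from the paper: it assumes every customer's attributes are independent and evenly spread, and all the probabilities are hypothetical. Only the customer count follows from the scenario (1,000 records being 1% of customers implies 100,000). If no other customer shares the employee's combination, the matching record can only be his.

```python
def prob_no_other_match(p_combo, population):
    """Probability that none of the other `population - 1` people share a
    combination that a random person has with probability `p_combo`,
    assuming people are independent of one another (a simplification)."""
    return (1.0 - p_combo) ** (population - 1)


# Hypothetical numbers, not taken from the study.
POPULATION = 100_000            # 1,000 published records are 1% of customers
P_GENDER = 0.5                  # assumed share of men among customers
P_BIRTH_DATE = 1 / (365 * 60)   # one specific day across roughly 60 birth years
P_ZIP = 1 / 500                 # one ZIP code out of an assumed 500, equally likely

P_COMBO = P_GENDER * P_BIRTH_DATE * P_ZIP

if __name__ == "__main__":
    share = prob_no_other_match(P_COMBO, POPULATION)
    print(f"Chance nobody else shares the combination: {share:.3f}")
```

Under these assumed numbers the chance that no other customer shares the combination is roughly 99.5%, so "it could be anyone" is a weak defense. The figure shifts with the assumptions, which is why the researchers fit a model to real data instead of relying on a simple product of probabilities.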

One of the paper's authors, Luc Rocher, a researcher at Université Catholique de Louvain, said: “While there might be a lot of people who are in their thirties, male, and living in New York City, far fewer of them were also born on 5 January, are driving a red sports car, and live with two kids (both girls) and one dog.”

Yves-Alexandre de Montjoye, the paper's senior author, characterized these attributes as “pretty standard information for companies to ask for.”

The scenario is not just fiction. In June, a patient at the University of Chicago Medicine sued both Google and the university for sharing his personal data without his permission. The medical center had supposedly de-identified the dataset, but it still gave Google records of patients' vital signs, height, weight, diagnoses, medical procedures and medications, along with date stamps. The complaint not only highlighted the privacy gap in sharing personal data without consent, it also pointed out that even when data is anonymized, powerful tech corporations can use their tools to reverse engineer it and identify individuals.

Google and a private research university were both sued for sharing medical data without patient consent

Many companies now collect datasets with enough information to identify someone, and the fact that the researchers could re-identify users from only 15 characteristics shows that we need to reevaluate what makes an anonymized dataset ethical.

“Companies and governments have downplayed the risk of re-identification by arguing that the datasets they sell are always incomplete,” Mr. de Montjoye said. “Our findings contradict this and demonstrate that an attacker could easily and accurately estimate the likelihood that the record they found belongs to the person they are looking for.”

We need better standards for anonymization techniques

According to the researchers, policymakers are responsible for setting better standards for anonymization techniques so that sharing datasets does not become an invasion of privacy. “The goal of anonymization is so we can use data to benefit society,” said Mr. de Montjoye. “This is extremely important but should not and does not have to happen at the expense of people’s privacy.”
