GDPR: Data Minimisation

I believe that the legislators drafting the GDPR took the concept of "behavioral surplus" into account. The two can co-exist as long as companies exercise good judgement in the processing of data. According to Article 5(1)(c) of the GDPR, personal data must be "adequate, relevant and limited to what is necessary" [1]. The crux here is the term "personal data": once companies anonymize the data, it no longer falls within the scope of the GDPR. Companies can then indulge in "all you can process" behaviour.

The obvious follow-up question is whether anonymized data would suffice for AI and Big Data purposes. Big Data, as its name suggests, thrives on volume; it is hardly concerned with data at the level of the individual. Zuboff, citing a Google research paper, quotes the phrase "applying learning algorithms to understand and generalize" [2]. The aim of big data is to filter out personal quirks and identify general trends, so anonymized data should work just as well. As a more concrete example, incidental data relating to Google Search, such as "how a query is phrased, spelling, punctuation, dwell times, click patterns", was collected to provide a "broad sensor of human behaviour" [3]. Once again, the goal is to identify human behaviour in general, not behaviour specific to any individual. I believe that companies can take proper safeguards to anonymize data and reap the benefits of big data while remaining compliant with the GDPR.
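
To make the idea concrete, here is a minimal sketch in Python (with entirely hypothetical field names and records, not any real company's pipeline) of what such processing might look like: the direct identifier is dropped and the incidental search signals are collapsed into per-query aggregates, so the output describes behaviour in general rather than any individual.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw search-log records; user_id is the only directly
# identifying element in this toy schema.
raw_logs = [
    {"user_id": "u123", "query": "wether forecast", "dwell_seconds": 4.2, "clicked": True},
    {"user_id": "u123", "query": "weather forecast", "dwell_seconds": 35.0, "clicked": True},
    {"user_id": "u456", "query": "weather forecast", "dwell_seconds": 28.5, "clicked": False},
]

def anonymize(record):
    """Drop the direct identifier, keeping only the incidental signals
    needed to study behaviour in aggregate."""
    return {k: v for k, v in record.items() if k != "user_id"}

def aggregate_by_query(records):
    """Collapse anonymized records into per-query trends (volume, average
    dwell time, click-through rate) with no link back to any individual."""
    buckets = defaultdict(list)
    for r in records:
        buckets[r["query"]].append(r)
    return {
        q: {
            "volume": len(rs),
            "avg_dwell_seconds": round(mean(r["dwell_seconds"] for r in rs), 1),
            "click_through_rate": sum(r["clicked"] for r in rs) / len(rs),
        }
        for q, rs in buckets.items()
    }

trends = aggregate_by_query(anonymize(r) for r in raw_logs)
print(trends)
```

Of course, in practice dropping an identifier is rarely enough: data only counts as anonymized, and hence falls outside the GDPR, if individuals cannot reasonably be re-identified from the attributes that remain.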

Nonetheless, the concerns are not unfounded: there are certain purposes which cannot be fulfilled using anonymized data. Later in the chapter, Zuboff explores "targeted" advertising [4], which necessarily requires personal data so that the outcome of the processing can be applied back to the individual concerned. In such scenarios, companies have to be transparent and include, inter alia, targeted advertising as a purpose for data processing. This would give the company the mandate to collect and process the data required for targeted advertising.
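
As a rough illustration (with entirely hypothetical class and field names), such a purpose-limitation safeguard can be as simple as gating any processing of personal data on the purposes that were declared transparently to the data subject:

```python
from dataclasses import dataclass, field

@dataclass
class ProcessingRecord:
    """Hypothetical record of the purposes declared to a data subject
    at the time of collection."""
    data_subject: str
    declared_purposes: set = field(default_factory=set)

def may_process(record: ProcessingRecord, purpose: str) -> bool:
    """Allow processing only if the purpose was declared up front."""
    return purpose in record.declared_purposes

alice = ProcessingRecord("alice", {"service provision", "targeted advertising"})
bob = ProcessingRecord("bob", {"service provision"})

for person in (alice, bob):
    allowed = may_process(person, "targeted advertising")
    print(f"{person.data_subject}: targeted advertising "
          f"{'permitted' if allowed else 'not declared'}")
```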

AI systems and big data applications are not sentient. They cannot discern whether the processing they perform is "fair", free from bias, and in line with what a reasonable person would expect. Thus, it is important for us to be very deliberate in deciding what data to process and how to interpret the results of that processing. Allowing these technologies to operate unchecked could result in violations of data privacy or human rights, or even compromise the vital interests of a human being.

[1] Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) [2016] OJ L 119/1, Article 5

[2] Shoshana Zuboff, 'The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power' (Profile Books 2019), Chapter 3

[3] ibid, Chapter 3, Section II ('A Balance of Power')

[4] ibid, Chapter 3, Section IV ('The Discovery of Behavioral Surplus')