Balancing Innovation and Privacy: AI in the Age of Regulation
In recent years, artificial intelligence (AI) has become an integral part of technology and an active assistant in everyday life. However, implementing AI in business solutions requires compliance with industry-specific regulations. While the European Commission works on comprehensive regulation in the Artificial Intelligence Act, the GDPR remains the main basis for compliance.
One of the most popular types of AI is generative AI, such as ChatGPT. Tools of this kind use information received from users to train and improve their underlying models.
The use of such tools must nevertheless remain within the legal framework, in particular the requirements of personal data protection law. In March 2023, the Italian supervisory authority imposed temporary restrictions on OpenAI's processing of Italian users' personal data, effectively suspending ChatGPT in Italy over numerous GDPR violations. The story had a positive outcome: OpenAI brought its operations into line with the GDPR, and the global AI sector drew many useful lessons.
Therefore, to develop AI in accordance with the GDPR, it is necessary, in addition to the standard requirements, to use a GDPR-compliant dataset for model training: all data must be collected specifically for the purpose of training AI algorithms.
The legal basis for processing this data is likely to be either the consent of the AI user (or of a person whose data is used to train the algorithm without that person actively interacting with the AI), or the company's legitimate interest (provided that it is not overridden by the interests and rights of the data subjects). Using data that was obtained for the purpose of performing a contract with a user to train algorithms is a violation of the GDPR.
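In engineering terms, this purpose-limitation rule means a training pipeline should filter records by the legal basis and purposes declared at collection time. The sketch below illustrates one way to do this; the record schema, field names, and values are hypothetical, not a real compliance API.

```python
from dataclasses import dataclass

# Hypothetical record schema: field names and values are illustrative only.
@dataclass
class DataRecord:
    subject_id: str
    content: str
    legal_basis: str   # e.g. "consent", "contract", "legitimate_interest"
    purposes: tuple    # purposes declared to the data subject at collection

# Bases that can support training use (legitimate interest only after a
# balancing test against the data subject's rights).
TRAINING_BASES = {"consent", "legitimate_interest"}

def usable_for_training(record: DataRecord) -> bool:
    """A record may feed model training only if 'model_training' was a
    declared purpose AND the legal basis supports it. Data collected solely
    to perform a contract is excluded."""
    return ("model_training" in record.purposes
            and record.legal_basis in TRAINING_BASES)

records = [
    DataRecord("u1", "...", "consent", ("model_training",)),
    DataRecord("u2", "...", "contract", ("service_delivery",)),
]
training_set = [r for r in records if usable_for_training(r)]
# Only u1's record survives the filter.
```

The point of the design is that the purpose check happens before any data reaches the training code, so contract-only data can never leak into the dataset.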
Such data must also comply with the principle of accuracy, i.e. it must be correct. The dataset should be sufficiently diverse and account for differences in gender, socialisation, background, ethnicity, etc., so that the AI does not later exhibit discriminatory behaviour or bias against certain individuals.
In the context of AI, it is also important to comply with the GDPR rules on automated processing, including profiling that produces legal effects, as provided for in Article 22 of the GDPR.
As a general rule, data subjects have the right not to be subject to a decision based solely on automated processing that has serious consequences for them. For example, suppose a bank uses an AI-based automated system to assess the creditworthiness of its customers. If a loan is denied solely on the basis of the AI's decision, without any human intervention, the data subject must have the right to contest the decision, express their point of view, and obtain human intervention (a review of the decision).
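Architecturally, this safeguard means an adverse automated outcome should route to a human reviewer instead of becoming final. The sketch below illustrates one such human-in-the-loop gate; the scoring rule, threshold, and function names are hypothetical, not a real banking system.

```python
# Illustrative Article 22 safeguard: an adverse automated credit decision
# cannot become final without human review. All names are hypothetical.

def automated_score(applicant: dict) -> float:
    # Stand-in for an AI model's creditworthiness output in [0, 1].
    return 0.9 if applicant.get("income", 0) > 50_000 else 0.3

def decide(applicant: dict, reviewer=None) -> str:
    score = automated_score(applicant)
    if score >= 0.5:
        return "approved"
    # Adverse automated outcome: route to a human instead of finalising.
    if reviewer is None:
        return "pending_human_review"
    return reviewer(applicant, score)  # the human may overturn the model

print(decide({"income": 20_000}))  # prints "pending_human_review"
```

Note that the review must be meaningful: the reviewer receives the applicant's data and the score and can reach a different outcome, rather than rubber-stamping the model.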
However, these rules do not apply if the decision is based on explicit consent, is necessary for entering into or performing a contract, or is authorised by Union or Member State law, provided that the rights and freedoms of data subjects are adequately protected.
Machine learning is a subfield of AI that focuses on creating algorithms that learn from data. The underlying idea is that computer programs can improve themselves, continually refining their algorithms, provided they are given enough data. Companies engaged in machine learning, or actively implementing it in their activities, may process large amounts of personal data and should be guided by the GDPR in doing so.
Since machine learning is a subfield of AI, the requirements above apply to it as well. There are, however, ways to mitigate the risks of using personal data in machine learning. A widespread example is federated learning, a data-minimisation technique popularised by Google: the model is trained on the devices or systems where the data already resides, and only the resulting model updates are combined into the central model during development. Personal data is used, but it never actually leaves the system where it is stored. Although federated learning avoids many GDPR challenges, compliance with some of its provisions (e.g. the transparency principle) remains a priority, because the system still processes personal data.
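The data-minimisation idea can be sketched as federated averaging: each client computes a model update on its own data, and the server averages only those updates, never seeing the raw records. The toy below uses a one-parameter least-squares model; it is a minimal illustration of the technique, not Google's implementation.

```python
# Minimal federated-averaging sketch. Each client holds (x, y) pairs locally;
# only model updates, never the raw data, reach the server.

def local_update(global_w: float, local_data, lr: float = 0.1) -> float:
    """One gradient step on the client's own data for the model y = w * x."""
    grad = sum(2 * (global_w * x - y) * x for x, y in local_data) / len(local_data)
    return global_w - lr * grad

def federated_round(global_w: float, clients) -> float:
    """Server step: average the clients' locally computed updates."""
    updates = [local_update(global_w, data) for data in clients]
    return sum(updates) / len(updates)

# Two clients whose private data is consistent with the true model w = 2.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
# w converges to 2.0 without any client data leaving its "device".
```

The privacy gain is structural: the server's only inputs are the averaged updates, which is what lets the raw personal data stay where it was collected (though, as noted above, the updates themselves are still personal-data processing under the GDPR).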