Artificial Intelligence in Your Business
Artificial Intelligence or “AI” has existed in some form for more than half a century and was theorised earlier still. Things have come a long way from Christopher Strachey’s 1951 checkers program, described by Britannica as the “earliest successful AI program”. Over the past year, however, there has been a sharp rise in AI, particularly generative AI, unlike anything we have seen before.
Generative AI is as it sounds – an AI program that generates content, such as images, text, or programming code. The past year has seen the launch of the AI programming tool GitHub Copilot; the AI image creation tools Midjourney, Stable Diffusion, and DALL-E 2; and OpenAI’s ChatGPT in November 2022, followed in March 2023 by the latest underlying model, GPT-4. AI tools are also increasingly being used for data processing and assisting in decision-making.
These recent developments in AI are far from being obscure tools that rarely reach beyond GitHub and niche tech publications. Microsoft invested $10bn in OpenAI in January this year and quickly released its AI-powered Bing chatbot. The results, which included declarations of love and apparent veiled (and not so veiled) threats, quickly became the stuff of hilarity and some justified concern.
Such high-profile misadventures aside, many businesses are rushing to jump onto the AI bandwagon. Some are offering services utilising AI and others are looking at ways to use AI within their business for content creation and decision-making. As has always been the case with new technology, the law is unable to keep up. New regulatory approaches are still being considered and the existing framework of applicable laws does not always make for an ideal fit. Moreover, in the excitement to be up to date with the latest trends, many developers and users of these new tools risk overlooking the legal issues and could face significant problems down the road. What, then, should you be considering if you are using AI in your business, and what is the outlook for the regulation of AI here in the UK?
The UK’s Proposed Approach to AI Regulation
At the end of March this year, the UK Government published its proposals for the UK’s regulatory framework for AI. Compared to other jurisdictions, the UK proposals set out a light-touch “common sense” approach to regulation with hopes that this will “[put] the UK on course to be the best place in the world to build, test and use AI technology”.
Five key principles form the basis for the proposed regulatory approach:
- Safety, security, and robustness – AI systems should function in a robust, secure, and safe way;
- Appropriate transparency and explainability – AI systems should be suitably transparent and explainable;
- Fairness – AI systems should not undermine the legal rights of individuals or organisations, discriminate unfairly against individuals, or create unfair market outcomes;
- Accountability and governance – Measures should be in place to ensure effective oversight of the supply and use of AI systems with clear lines of accountability established;
- Contestability and redress – Users, impacted third parties, and actors in the AI lifecycle should be able to contest an AI decision or outcome that is harmful or creates material risk of harm.
The Government does not plan to enshrine these principles in legislation at first, the view being that, “New rigid and onerous legislative requirements on businesses could hold back AI innovation and reduce our ability to respond quickly and in a proportionate way to future technological advances.” For anyone who has observed lawmakers’ ability (or lack thereof) to keep pace with technology in the past, these will not be encouraging words.
There is much work to be done when it comes to regulating AI and a great deal of input from regulators (such as the ICO), industry bodies, business, and academia should be expected over the coming months and years. At present, however, we are mostly faced with the task of interpreting, applying, and complying with existing laws that intersect with new applications of AI, however well (or not) they might fit.
What Are the Concerns?
A relaxed policy approach that seemingly seeks to position the UK to outmanoeuvre Silicon Valley is one thing, but there are more immediate issues that businesses looking to work with AI should be aware of. Among those most relevant to businesses are:
- Intellectual Property (copyright in particular); and
- Data Protection and Privacy.
Much has been made of the ability of AI models such as ChatGPT to write material that is almost indistinguishable from that written by a human. This may be advertising copy, social media posts, blog posts, or something more authoritative. Indeed, it is posing a particular problem in education, with students using AI to write their assignments.
Large language models like this need to be trained. Training in this sense means providing the model with large quantities of data such as articles, books, or web pages. To go into the complexities of the training is far beyond the scope of this newsletter, but the key point is that the model is trained on pre-existing knowledge or data.
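To make that key point concrete, the toy example below trains a word-level lookup table on a short piece of text. This is nothing like how a real large language model is trained (the corpus, function names, and output are invented purely for illustration), but it captures the essential limitation: the model can only reproduce patterns that were present in the data it was fed.

```python
from collections import defaultdict

def train(corpus):
    """Build a table mapping each word to the words seen following it."""
    model = defaultdict(list)
    words = corpus.split()
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return model

# A tiny, invented "training set" standing in for articles, books, or web pages.
corpus = "the model learns patterns the model repeats"
model = train(corpus)

print(model["the"])         # ['model', 'model'] – patterns learned from the data
print(model.get("unseen"))  # None: the model knows nothing beyond its training data
```

However sophisticated the real training process, the same principle applies: anything outside the training data is simply not there to draw on.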
This also means that a model will only be trained up to a certain point in time. GPT-4, for example, was trained on a dataset with a cutoff of September 2021. The risks here are obvious. Prompting GPT-4 to write a business blog post or, more significantly, a report or portion thereof, about the latest developments in your field would likely result in something that, while looking and sounding highly plausible to the uninitiated, could be out of date at best and plain wrong at worst.
Whatever your proposed use of AI, it is vitally important to consider accuracy, whether you are using AI to produce content or provide a service, incorporating it into your products, or developing it yourself. Considering the proposed application of AI and the significance of that application at all stages is key. Will your business be making decisions based on the AI’s output? Will your customers or clients?
The temptation with exciting new technology is often to rush ahead, but it is important to avoid complacency. AI is undoubtedly clever, but it is not infallible and failure to take adequate care with its development and/or output could result in reputational damage or worse.
Always ensure that you know about the data that the AI will be or has been trained on (and, as addressed below, be particularly careful to identify personal data and third-party intellectual property). Make sure you know that the data is reliable and be aware of any limitations inherent in the data such as reliability, accuracy, bias, scope, and age.
Intellectual Property and Copyright
Generative AI, or at least its output, seems to be everywhere at present in a variety of written and visual forms. As noted above, however, before an AI can produce an academic article or a photo of Joe Biden on a hoverboard (albeit with curiously long fingers), it must be trained. As the training data is fed in, machine learning creates a map of sorts based on patterns in the data. The more data the AI is trained on, the more it adjusts its neural network and “learns”.
Choosing the right data to train the AI, then, is particularly important. From a legal perspective, however, knowing where that data comes from, who owns it, what rights you have to use it, and what for, is just as important.
The internet has become a valuable and almost inexhaustible source of content for people and so it is for AI models. Some of the biggest names in AI image generation, for example – DALL-E, Midjourney, and Stable Diffusion – are trained on existing images, that is, existing copyright works in many cases. The images in question are taken from the internet via a process known as “scraping”. Software automatically gathers data from online sources such as social media and stock photography websites. In some cases, even sites set up by individual artists to showcase their own work will be scraped.
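To illustrate the mechanics only, the sketch below shows the extraction step of scraping: parsing a page and harvesting every image reference it contains. In practice the HTML would be fetched from a live website over HTTP (which is precisely where the copyright and terms-of-use issues discussed here arise); the HTML snippet and URLs used are invented for illustration.

```python
from html.parser import HTMLParser

class ImageScraper(HTMLParser):
    """Collects the src attribute of every <img> tag encountered."""
    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src":
                    self.image_urls.append(value)

# A static snippet stands in for a downloaded web page.
page = '<html><body><img src="/photos/a.jpg"><img src="/photos/b.png"></body></html>'

scraper = ImageScraper()
scraper.feed(page)
print(scraper.image_urls)  # ['/photos/a.jpg', '/photos/b.png']
```

Run at scale across millions of pages, this simple mechanism is how vast image training sets are assembled, whether or not the rights holders are aware of it.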
In many cases, such scraping takes place without the knowledge or consent of the copyright owners whose images are being used to train the AI. If the data that has been scraped constitutes a copyright work, then copying it in this way and for this purpose (or indeed any purpose for which an exception doesn’t exist in copyright law) can amount to copyright infringement. Indeed, to cite a particularly high-profile example, Getty Images is suing Stability AI, the creators of Stable Diffusion, for precisely that.
While text and images are two of the most common forms of copyright work associated with AI, one should remember that copyright can subsist in many different types of work including literary works, dramatic works, musical or artistic works, sound recordings or broadcasts, or databases (which may also be protected by database rights).
There are a limited number of “fair dealing” exceptions to copyright which may be of some assistance where AI training is concerned in some contexts, including:
- non-commercial research and private study;
- text and data mining for non-commercial research;
- criticism, review, and reporting current events; and
- parody, caricature, and pastiche.
These exceptions, however, are likely to be of limited use in a business context.
Whether your business is considering using a pre-existing AI system or developing its own, IP – and especially copyright – is essential to keep in mind. Above all, the training data must be properly licensed, with careful consideration given to the proposed use or uses of that material. More information on copyright licensing and a range of templates are available here.
Data Protection and Privacy
As with training data that incorporates IP rights, large amounts of training data may also incorporate personal data. Furthermore, AI may be used to process personal data for a number of purposes. Given the strict laws that govern and protect personal data, it is essential to ensure that it is used in conjunction with AI in a compliant way.
In either case, personal data must always be accurate, adequate, relevant, and limited in accordance with the principles of the UK GDPR. When training an AI, the ICO recommends using privacy-preserving techniques such as perturbation (which it describes as adding “noise” to the data), synthetic data, or federated learning in order to minimise the personal data being processed.
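As a deliberately simplified sketch of the perturbation technique the ICO describes, the example below adds random noise to a set of numeric records so that no individual value is reported exactly, while aggregate statistics remain usable. The `perturb` helper, the noise scale, and the ages are all invented for illustration; a real deployment would use a properly calibrated privacy-preserving mechanism, not this sketch.

```python
import random

def perturb(values, scale=1.0, seed=None):
    """Return a copy of `values` with zero-mean Gaussian noise added,
    obscuring individual records while roughly preserving aggregates."""
    rng = random.Random(seed)
    return [v + rng.gauss(0, scale) for v in values]

ages = [34, 29, 41, 52, 38]  # illustrative personal data
noisy_ages = perturb(ages, scale=2.0, seed=42)

# Individual values change, but the mean is approximately preserved.
print(sum(ages) / len(ages))
print(sum(noisy_ages) / len(noisy_ages))
```

The design trade-off is the one the ICO highlights: more noise means stronger protection for individuals but less accurate outputs, so the scale must be chosen with the processing purpose in mind.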
Data should also be selected with care, particularly when training an AI, so as to avoid bias and discrimination. Imbalanced training data can lead to imbalanced outcomes when using the model. As noted above when considering accuracy, it is important to be aware of the scope and limitations of the data that has been used (or is being used) to train the AI in question. More information on the use of AI and personal data is available in the ICO’s guide: How to use AI and personal data appropriately and lawfully (PDF).
In its recent response to the Government’s proposals for AI regulation (see above), the ICO, while supporting the proposals’ ambitions, set out its own views on the principles put forth by the Government, in particular fairness, and contestability and redress.
Regarding the fairness principle, the ICO believes that, much like its UK GDPR counterpart, it should cover the stages of an AI system’s development, not just its use. As to contestability and redress, the Government’s proposals state that regulators will be expected to clarify existing routes to contestability and redress, and to implement proportionate measures to ensure that the outcomes of AI use can be contested where relevant. The ICO seeks clarity here, pointing out that it is the organisations using AI, which have oversight of their own systems, that should provide that kind of clarification and implementation. It suggests that the role of regulators such as the ICO might be better described as making individuals more aware of their rights where AI is involved.
The ICO also addresses interactions with Article 22 of the UK GDPR (relating to automated decision-making, including profiling). The Government’s white paper states that where an AI system has a legal (or similarly significant) effect on an individual, regulators should consider the suitability of requiring AI system operators to provide an appropriate justification for that decision to the affected parties. The ICO reminds us that if personal data is involved, the UK GDPR would make it a requirement for the operator of the AI system to provide a justification, not merely a consideration.
As with other areas of AI regulation, the relationship between personal data and AI is developing rapidly, and more guidance and potential regulation is surely on the horizon. In any case, when using personal data with AI, whether for training the AI or using the AI itself for processing, the requirements and principles of data protection law should always be front and centre in your approach. More information on data protection and privacy and a wide range of templates including privacy policies, data protection policies, audits and assessments, and more is available here.
The Road Ahead and What You Can Do Now
AI presents exciting possibilities and concerning pitfalls in equal measure for the unwary business. There is clearly much work to be done by lawmakers and regulators both in terms of reshaping existing laws to better accommodate AI and in terms of creating a new regulatory framework that successfully balances the need to establish rules and compliance while encouraging investment and development.
It is equally clear, however, that by understanding how existing law and regulation applies in the AI context, ensuring that your use or development of AI complies with it, and avoiding the temptation to rush ahead come what may, you can take a safer, more pragmatic, and more legally compliant approach to exploring the development and applications of AI in your business.
The contents of this Newsletter are for reference purposes only and do not constitute legal advice. Independent legal advice should be sought in relation to any specific legal matter.